Archive for April, 2004

Old enough for a Real Language

Tuesday, April 27th, 2004


Old fart posting…


Bill Clementson and Dave Roberts point out a thread on comp.lang.lisp that is
unearthing a thought-provoking sample of people who only became fans of Lisp late in life.


I made a commitment to Lisp when I was in my mid-30s. It was quite calculated. I’d written a huge amount of software in various languages by that point, including L* back in the 70s, which had similar depth to Lisp. I had spent a few years suffering thru writing a big ugly Ada program and I’d become quite disillusioned with strong typing. It worked but it was so damn hard. Just to debug the damn thing I’d ended up writing a fun system in Prolog. Just to build the system we had to write lots of programs that in turn wrote the program that we would then compile and ship. The build cascades of that system were neat, wonderfully ad hoc. A little sed, a little m4, a little awk, a custom lex/yacc doohickey, etc. etc. Amazingly weird make files.


At about that point Guy Steele’s first version of the Common Lisp manual came out. The most elegantly written language specification I’ve ever read. Just chapter after chapter of language facilities that I’d had to build by hand again and again. All beautifully designed. It was clear that Lisp would just subsume all the junk I’d had to write both before and after writing my Ada system.


So I went off and got myself a job at an AI company that managed to go down the tubes with great vigor.

But, I got to write some really fun software there. For example I wrote a system that would allow you to declare what I called a data-structure-tour. You would sketch how to tour each individual data structure, usually next to its declaration. You could then write statements that said things like “Tour e over X as a window-display doing redraw(e).” Man was that fun.

TypeKey - The Central Authority

Tuesday, April 27th, 2004


Let me take a stab at solving the second of the two large problems I
see with the current TypeKey design.


The Problem


In its most brutal form the second problem with TypeKey is that it is
a land grab. It puts Six Apart in the position, intentionally or not,
of making exactly the same mistake that Microsoft made with Passport.


The design, presumably for simplicity, assumes that there is one
authority that everybody using TypeKey will turn to for their
authentication services; i.e. www.typekey.com. That makes
typekey.com the central authority for some universe of
authentication. Today this is blog comments. Tomorrow it might be
wiki contributions. The next day - who knows?


While at first blush this appears to be very valuable turf to grab, if
you are too greedy you destroy the value of the turf you’re getting.
That’s the lesson that Microsoft hopefully has learned from Passport.


Sure, you can build a system with a single central authority. Yes,
people will sign up for it. Trouble is you then force other serious players
into a subordinate position. Other powerful players don’t like that.
They have no interest in playing a subordinate role to you.


The second problem with having a single central authority is that it
encourages the emergence of a monopoly. While I may think very highly
of the folks at Six Apart, I don’t think so highly of them as to
believe they should be encouraged to grab a dominant role in the
authentication of contributors of open/free content.


Solving the problem, technically, isn’t that hard. It is harder
than solving the design flaw of revealing a global unique identifier
for everybody, though.


A Solution



What is required?


Users and sites need to be able to sign up with multiple
authentication services.


The more the merrier. In fact if you design with the presumption that
there will be a few hundred or thousands of these “authorities” that
would be best.


The TypeKey design bounces the user over to the single central authority.


In a design that avoids a single central authority the site needs to
infer one or more authorities to bounce the user over to.


The key added complexity is getting a list of these authorities before
we bounce the user over to them for authentication. This takes two steps.
The simpler TypeKey design requires only one. First we look up the
user’s preferred authorities. Second we ask one or more of them to
authenticate/vouch for the user.
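The two steps can be sketched roughly as follows. This is only an illustration: the function names, the in-memory directory, and the example hostnames are all made up for the sketch, and none of it is part of TypeKey.

```python
# Hypothetical sketch of the two-step flow; nothing here is TypeKey's API.


def lookup_preferred_authorities(user_hint, directory):
    """Step 1: find the authorities this user prefers, in preference order."""
    return directory.get(user_hint, [])


def choose_authorities(preferred, trusted):
    """Step 2 (first half): keep only the authorities the site itself
    trusts, preserving the user's preference order."""
    return [authority for authority in preferred if authority in trusted]


# Made-up data: one user's preferences and one site's trust list.
directory = {"fido": ["auth.example.org", "www.typekey.com", "id.example.net"]}
site_trusts = {"www.typekey.com", "id.example.net"}

preferred = lookup_preferred_authorities("fido", directory)
candidates = choose_authorities(preferred, site_trusts)
print(candidates)
```

The site would then bounce the user to one or more of the remaining candidates, exactly as it bounces them to the single authority today.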


A Possible Implementation


How to look up the user’s preferred list of authorities?


There are lots of fanciful ideas for how to do this - most are not
practical. We might modify the installed base of web browsers so that
users could send their preferred set of authorities as part of their
browsing. We might have the user run thru a proxy server provided by
his ISP and that proxy server could insert the list of preferred
authorities.


There are two reasonably practical approaches.


First we could introduce a central authority whose only role is to
return the user’s list of preferred authorities. The TypeKey folks
could volunteer to do that, in effect offering to redirect queries
about a given user to other authorities if that user has asked for
that.


Alternately we could use tricks involving browser cookies. Each site
would then use these cookies to get the user’s authority list. This
solution is somewhat better, at least it’s faster, than the first
solution. It has a similar design challenge in that somebody would
have to manage the domain used to hold the cookies shared over all
these sites.


Neither of these solutions is too hard to implement. Both solve the
problem of the design enabling a single dominant vendor for the role that
TypeKey is working to fill.


One final point. I don’t believe that introducing these mechanisms will
reduce the success of TypeKey as a major player in the blog
authentication industry. In fact I suspect that making changes along
these lines will accelerate adoption because it will reduce the
paranoia that is created by making the role of authentication server
scarce.

TypeKey - Eliminating the Universal ID

Monday, April 26th, 2004


The first of the two big problems I see with TypeKey is easy to solve. Just don’t do it.


The first problem is that TypeKey encourages the wide distribution of a single unique identifier for its users - all over the net. Each site that uses TypeKey is given the same unique identifier for a user. This makes it significantly easier to invade the privacy of the users. For example if I visit sites on depression, cats, damsels in distress, and terrorist strategies, this unique identifier enables other parties to collect all that information about my behavior.

The fix is simple. Don’t hand out a universal identifying token!

Instead TypeKey should hand out a different token to each site. If the site wishes to obtain additional information about its users it then has two choices. It can ask the user. It can go back to TypeKey. That matches up with users’ expectations. Users do not expect sites that want to know more about them to go around conspiring with other sites.


Of course an identity system isn’t much use if you can’t use it to find out more about users. So the current design allows sites to query TypeKey: “Tell me about user 12.” This needs to change so they ask: “This is site 35, tell me more about user 14.” TypeKey can then ensure it only tells sites it trusts this additional information. As an added benefit this also allows TypeKey to let users configure exactly what things should be revealed to which sites.
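One way an authority could hand out a different token to each site is to derive it with a keyed hash over the site and user identifiers. This is a minimal sketch of that idea, not a description of anything TypeKey does; the secret, the function name, and the identifier format are all assumptions.

```python
import hashlib
import hmac

# Assumption: the authority holds a secret that it never shares with sites.
AUTHORITY_SECRET = b"example-secret-known-only-to-the-authority"


def site_specific_token(site_id: str, user_id: str) -> str:
    """Derive a token that is stable for one (site, user) pair
    but differs from site to site."""
    message = f"{site_id}:{user_id}".encode("utf-8")
    return hmac.new(AUTHORITY_SECRET, message, hashlib.sha256).hexdigest()


# The same user looks different to two different sites...
token_at_site_35 = site_specific_token("35", "12")
token_at_site_36 = site_specific_token("36", "12")
# ...but a site sees the same token every time the user returns.
token_again = site_specific_token("35", "12")
```

Because two sites cannot compute each other's tokens without the secret, comparing notes about a user gets them nowhere unless they go back through the authority.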


Nothing about this alternative design precludes users from revealing as much as they like. What’s key is not to build a system that enables unnecessary revealing. Particularly not revealing by parties who users just happen to interact with incidentally.


I’m sure that it wasn’t Six Apart’s intent to create a foundation that helps to enable invading users’ privacy. But sadly that’s what they are headed for.

TypeKey - Two Flaws

Monday, April 26th, 2004


Having written a longer discussion of how TypeKey
appears to work I want to try a shorter summary.


TypeKey tackles an important problem - spam on open web sites. Open
sites want to encourage lots of contributions. Open sites do not want
their openness abused by spammers. TypeKey provides authentication services to help address the problem.


TypeKey provides a central authority. Sites use it to authenticate
visitors. This works by bouncing those users over to their central
authority which then sends them back with some identifying
info. Users might not even be aware it was happening since in simple cases it requires no user interaction.


The design is flawed in at least two ways.


First. The system reveals a universal identifier for the user. That
empowers random sites to invade the privacy of their users by
conspiring with other sites to aggregate a model of each user from
whatever the user revealed at various sites. Worse, it reveals this universal identifier to anyone eavesdropping on internet traffic. That’s a gift to
snoopers.


Second. No mechanism is provided to support multiple central
authorities. This makes the role of authentication authority scarce.
It creates a single point of failure. It has the unfortunate side effect
of putting Six Apart in the position of hoarding that role. That
will cause suspicion. It will delay widespread adoption. All that
is good for the spammers and bad for the open web.


Neither of these is necessary. Both can be fixed.

TypeKey - Revealed

Saturday, April 24th, 2004


Over here
somebody has been doing some reverse engineering of the TypeKey
authentication protocol. I want to pick that apart a bit since I think
this is going to spread fast once it’s released. It’s a good thing
for people to know what’s going on. There are things here I don’t approve
of.


This is a pretty typical cross site sign-on system; people have been
building stuff like this ever since browsers started to support
automatic redirect - so for a very long time in Internet time.


I find it helps when telling these stories to personify the players
a bit. We need three players in our little story:

  • Victoria: The site that would like to authenticate a visitor.
  • Fido: The visitor to that site. “On the Internet nobody knows you’re a dog.”
  • Arthur: The authentication service. (Arthur’s an authority)


Well, maybe four because meanwhile off stage there is always Mr. Evil.


Victoria would like to trust Fido so she can let him put stuff on her
site. Victoria trusts Arthur to vouch for Fido. Fido has some light
weight relationship with Arthur that allows Arthur to assert that he
is actually dealing with Fido today.


These mechanisms work like the “letter of introduction” that travelers
used to bring with them when they traveled to a foreign city. That
letter would introduce the young man, that’s Fido in our story, to somebody
reputable in the foreign city, that’s Victoria in our story. The letter
would be written by somebody that Victoria knew and trusted, that’s Arthur
in our story.


This mechanism needs to get a letter of introduction to move from
Arthur via Fido to Victoria. We can do that with browser redirection.
When Fido shows up at Victoria’s site she bounces him over to Arthur’s
site and Arthur bounces him back with the “letter of introduction.”


That can all happen pretty quickly, and it doesn’t even need to
require any interaction with Fido. You can do it by redirecting a small
embedded image so it’s pretty low impact on the user experience. Of
course it might be polite to ask Fido before you do it.


Arthur, who wants to have a good relationship with Victoria and with
Fido, needs to be careful about the politeness thing. For example,
Fido’s not going to be pleased if Arthur starts introducing him to all
kinds of strangers. Understanding how to get systems like this to be
respectful of people’s privacy demands thinking thru the incentives
for the various parties as well as thinking about worst-case scenarios.


In TypeKey the role of Arthur is played by the folks at Six Apart who
make TypePad and MovableType. A lot of people, me included, think these
are good people.


Here is how the bounce is implemented. Victoria
bounces Fido over to Arthur using a URL like this:


   https://www.typekey.com/t/typekey/login?t=<blog_token>&_return=<return_url>



This in effect says “Yo, Arthur? - You know this dude? - Victoria.”
or in this example it’s “Yo www.typekey.com? - You know this dude? -
<blog_token> <return_url>.” The return_url is where Arthur should
reply to. Blog_token is a very obscure thing
(e.g. “LjRd2DpifL51sB0iFeYT”, or “R234”), but it’s just Arthur’s pet
name for Victoria. That token is meaningless (opaque); it’s used only
in conversations between Arthur and Victoria.


   https://www.typekey.com/t/typekey/login?t=R234&_return=http://viki.com/handler.cgi?a=_return
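For anyone who wants to poke at this, Victoria's side of the bounce might be assembled along these lines. The endpoint and the `t` and `_return` parameter names come from the URL pattern above; the function name is made up. The examples here show the return URL unescaped, and this sketch percent-escapes it, which is an assumption about what the service accepts.

```python
from urllib.parse import urlencode

LOGIN_ENDPOINT = "https://www.typekey.com/t/typekey/login"


def build_login_redirect(blog_token: str, return_url: str) -> str:
    """Build the URL Victoria redirects Fido's browser to."""
    # urlencode percent-escapes the return URL so that its own "?" and "="
    # cannot be confused with the outer query string.
    query = urlencode({"t": blog_token, "_return": return_url})
    return f"{LOGIN_ENDPOINT}?{query}"


redirect = build_login_redirect("R234", "http://viki.com/handler.cgi?a=_return")
print(redirect)
```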



Arthur won’t answer these questions from just anybody. He only answers them
for folks like Victoria he has a good relationship with. So that pet
name is important. Arthur’s going to look up the status of Victoria’s
relationship with him using that.


Before he replies he’s likely to do a couple of things. For example he’s
likely to check if Fido is ok with this automatic authentication
bounce back thing. If Fido’s not approved that, Arthur might tell
Victoria that she needs to ask Fido to let down his privacy guard a bit.


Arthur is also responsible for being careful to check that he doesn’t
answer questions from Mr. Evil just because he happens to know
Victoria’s pet name. One part of that is to only reply to a return_url
that was explicitly registered by Victoria.
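A rough sketch of that check on Arthur's side might look like this. The registry, its contents, and the function name are hypothetical; the point is only that the return URL has to match something Victoria registered in advance, so knowing her pet name alone is not enough.

```python
# Hypothetical registry: blog token -> return URLs registered by that site.
REGISTERED_RETURN_URLS = {
    "R234": {"http://viki.com/handler.cgi?a=_return"},
}


def may_reply(blog_token: str, return_url: str) -> bool:
    """Arthur replies only if the token is known and the return URL
    exactly matches one that the site registered in advance."""
    allowed = REGISTERED_RETURN_URLS.get(blog_token, set())
    return return_url in allowed


ok = may_reply("R234", "http://viki.com/handler.cgi?a=_return")
evil = may_reply("R234", "http://mr-evil.example/steal.cgi?a=_return")
unknown = may_reply("nope", "http://viki.com/handler.cgi?a=_return")
```

Exact matching is the conservative choice here; prefix or pattern matching would give Mr. Evil more room to maneuver.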


Of course Arthur’s also going to check and see if this is Fido.
Recall that we are actually bouncing Fido’s web browser over to
Arthur; so Arthur can use cookies he left there to figure out if he
knows this dude or not.


Assuming Arthur can convince himself that this is Fido then he bounces
Fido’s browser back to Victoria using a URL that looks like this:


  <return_url>&ts=<timestamp>&email=<hashed_email>&name=<name>&nick=<display_name>&sig=<crypto_signature>



For example:


  http://viki.com/handler.cgi?a=_return&ts=1294252341&email=adg3adfa3&name=Fred&nick=Good+Dog&sig=vuXeBVRJG2cR4xl81+HoeJMbKYs=:DBASoTXIQtlxs07jRblTLRk=
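To see exactly what Victoria receives, the example reply can be pulled apart with Python's standard URL tools. This is just a reading aid for the example above, not TypeKey's own handler code.

```python
from urllib.parse import parse_qs, urlsplit

reply = (
    "http://viki.com/handler.cgi?a=_return&ts=1294252341&email=adg3adfa3"
    "&name=Fred&nick=Good+Dog"
    "&sig=vuXeBVRJG2cR4xl81+HoeJMbKYs=:DBASoTXIQtlxs07jRblTLRk="
)

# parse_qs returns a list per key; take the first value of each.
query = urlsplit(reply).query
fields = {key: values[0] for key, values in parse_qs(query).items()}

print(fields["name"])   # Fred
print(fields["nick"])   # Good Dog  ("+" in a query string decodes to a space)
print(fields["ts"])     # 1294252341

# Caveat: the same "+"-to-space rule also rewrites the "+" inside sig, so a
# real handler would need the signature percent-encoded in transit, or would
# have to read it from the raw query string before verifying it.
```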



I don’t think much of this reply’s design. Let’s pick it apart.


In the trade this reply is known as a “statement” or an “assertion.”
It is the letter of introduction of old. Like that letter it is
signed, by Arthur, so its recipient, Victoria, can know who is making this
statement.


The entire assertion should be encrypted. As it is, the statement is
revealing things about Fido to folks listening in on the conversation
between Victoria and Arthur. That’s just rude of them. In the best
of worlds this would be sent using https. That would require that
Victoria have an https server. That’s a pain. But, if Victoria can
check that signature then she can probably decrypt the reply packed
into the message.


You can see what Arthur has decided to tell Victoria about Fido. Three
things: “name,” “nick,” and “email.”


I don’t know why it’s good to tell Victoria all this info about Fido.
The usual argument people make for such revealing is that it allows
Victoria to fill out forms for Fido faster, giving him a more pleasing
user experience. But shouldn’t that be up to Fido? In the example I’ve
shown above they have revealed that Arthur happened to know that Fido’s
name is Fred, and that he has used the nickname “Good Dog.”


I don’t see why that was any of Victoria’s business, since at this point
Fido and Victoria have just met and all Victoria needs to know is if it’s ok to allow Fido to post a comment on her blog.


The email address isn’t really an email address. It’s another bizarre
token: “adg3adfa3” (actually it was really:
“a4426c6a28b21941e3de8b14541e10e5aabb24e8”).


Most people’s first reaction to a jumble like that is “ok that’s safe.”
It’s a good habit to get into to then think. “I wonder what database
that might be a key into?” or “I wonder what message is encrypted in
that?”


In this case that’s the key for looking up what’s known as a FOAF file.
FOAF is the mnemonic for the cheerful sounding “Friend of a Friend” file.
Recall at this point that “cookie” sounds nice too; but that doesn’t
prevent them being used for evil.


Recalling our letter of introduction example, the reason Arthur
included that in his reply to Victoria was similar to writing “If you
want to know more about my friend Fido, take a look here.”

FOAF files are cool. For example here’s a view of my FOAF file
(http://www.netspade.com/tools/foaf?uri=http%3A%2F%2Fwww.cozy.org%2Fben%2Ffoaf.rdf).
I decided to reveal the information you see there. Did I get permission from all those folks (my so called “friends”) in there to reveal their names? Nope. That was kind of rude of me.


That complex looking string is known as a “mbox_sha1sum”. It is based on an
email address, but let’s call it the foaf universal identifier. Or maybe we
should call it the universal id. Or maybe we should just call it the devil’s
mark. The problem is that once you know somebody’s mbox_sha1sum you’re all
set to go looking for their FOAF file in any database that happens to have
it.
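For the curious, the FOAF vocabulary defines mbox_sha1sum as the SHA-1 hash of the mailbox's `mailto:` URI, so it takes one line to compute. The address below is made up, and whether TypeKey normalizes the address first (case, whitespace) is something I'm not assuming either way.

```python
import hashlib


def mbox_sha1sum(email: str) -> str:
    """SHA-1 hex digest of the mailto: URI, as FOAF defines mbox_sha1sum."""
    return hashlib.sha1(f"mailto:{email}".encode("utf-8")).hexdigest()


# A made-up address. The same input always gives the same 40-character
# token, so every party that receives it can match it against every other
# database keyed the same way.
token = mbox_sha1sum("fido@example.org")
print(token)

# Anyone who can guess the address can confirm the guess.
guess_matches = mbox_sha1sum("fido@example.org") == token
```

So the hash hides the address only from people who have no idea what it might be, and it hides nothing about which records belong to the same person.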


This is the same problem that arises if you have a national ID number. It
makes it far too easy for data about an individual to be sought out from
multiple sources. It is far too likely that if a database of personal data is
leaked by mistake, Mr. Evil will be able to tie that data to other
data from other sources. It encourages identity theft.


Arthur and Victoria should create a private handle that they use to discuss
Fido. I notice that they used just such a private handle for Victoria
when they began their conversation about Fido: what I called a “pet name” above. One step in that direction would be to use a one off email address for Fido that Arthur gives him. That would help avoid revealing Fido’s personal information stored in other FOAF files. That would help a little.


But the fact is that Fido may want to reveal some things to Victoria and other things to other folks. That requires an entirely private handle for Arthur’s discussions with Victoria about Fido. Without that it is possible for Victoria to trade private information about Fido with others who interact with Arthur. That’s exactly the problem that cookies enabled. It’s what allows a company like DoubleClick to track your web browser as it travels over a set of sites of their clients to create a more complete model of you. But this is somewhat worse because Arthur has enabled the trading of information about Fido and he doesn’t have any means to control the situation going forward.


There is one additional thing that bothers me about the TypeKey design.


But first I want to make perfectly clear that I believe that it is not practical to build useful authentication systems in the web without encouraging folks like Arthur to appear on the scene. Both Fido (aka Fred) and Victoria need trusted third parties to help make introductions and lubricate the flow of that information which they want to reveal. The design challenge is to be sure that we don’t encourage the emergence of a single Arthur. We want to encourage a huge number of them.

So the last thing, or maybe the first, that concerns me about the TypeKey design is that it lacks a mechanism to encourage more than one Arthur’s emergence. That’s not that hard. It just demands that when Victoria starts the conversation with Arthur she has a way to discover which Arthur(s) Fido would prefer that she use. She can then select one or more that she trusts and proceed from there. But, you’ve got to get that right up front.

Other junk about identity that I’ve written.