Category Archives: identity

TypeKey – Eliminating the Universal ID

The first of the two big problems I see with TypeKey is easy to solve. Just don’t do it.

The first problem is that TypeKey encourages the wide distribution of a single unique identifier for its users – all over the net. Each site that uses TypeKey is given the same unique identifier for a user. This makes it significantly easier to invade the privacy of the users. For example, if I visit sites on depression, cats, damsels in distress, and terrorist strategies, this unique identifier enables other parties to collect all that information about my behavior.

The fix is simple. Don’t hand out a universal identifying token!

Instead TypeKey should hand out a different token to each site. If the site wishes to obtain additional information about its users it then has two choices. It can ask the user. It can go back to TypeKey. That matches up with users’ expectations. Users do not expect sites that want to know more about them to go around conspiring with other sites.

Of course an identity system isn’t much use if you can’t use it to find out more about users. So the current design allows sites to query TypePad: “Tell me about user 12.” This needs to change so they ask: “This is site 35, tell me more about user 14.” TypePad can then ensure it only tells sites it trusts this additional information. As an added benefit this also allows TypePad to let users configure exactly what things should be revealed to which sites.
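
A minimal sketch of how an authority could hand a different token to each site. Nothing here is taken from TypeKey: the secret, the function name, and the use of HMAC-SHA256 are all assumptions made for illustration. The idea is only that the token is stable for one (site, user) pair but cannot be matched up across sites by anyone who lacks the secret.

```python
import hashlib
import hmac

# Hypothetical secret held only by the authentication service.
SERVICE_SECRET = b"example-secret-known-only-to-the-service"


def site_scoped_token(site_id: str, user_id: str) -> str:
    """Derive the token that one particular site sees for one particular user.

    Assumes neither id contains a colon, so the joined message is unambiguous.
    """
    message = f"{site_id}:{user_id}".encode("utf-8")
    return hmac.new(SERVICE_SECRET, message, hashlib.sha256).hexdigest()


# The same user looks like two unrelated people to two different sites.
token_at_site_35 = site_scoped_token("35", "fido")
token_at_site_36 = site_scoped_token("36", "fido")
```

When site 35 later comes back with its token, the service can recompute the mapping (or look it up) and decide what, if anything, that site is allowed to learn.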

Nothing about this alternative design precludes users from revealing as much as they like. What’s key is not to build a system that enables unnecessary revealing. Particularly not revealing by parties who users just happen to interact with incidentally.

I’m sure that it wasn’t Six Apart’s intent to create a foundation that helps to enable invading users’ privacy. But sadly that’s what they are headed for.

TypeKey – Two Flaws

Having written a longer discussion of how TypeKey
appears to work I want to try a shorter summary.

TypeKey tackles an important problem – spam on open web sites. Open
sites want to encourage lots of contributions. Open sites do not want
their openness abused by spammers. TypeKey provides authentication services to help address the problem.

TypeKey provides a central authority. Sites use it to authenticate
visitors. This works by bouncing those users over to their central
authority which then sends them back with some identifying
info. Users might not even be aware it was happening since in simple cases it requires no user interaction.

The design is flawed in at least two ways.

First. The system reveals a universal identifier for the user. That
empowers random sites to invade the privacy of their users by
conspiring with other sites to aggregate a model of the user from
whatever the users revealed at various sites. Worse, it reveals this universal identifier to anyone eavesdropping on internet traffic. That’s a gift to
snoopers.

Second. No mechanism is provided to support multiple central
authorities. This makes the role of authentication authority scarce.
It creates a single point of failure. It has the unfortunate side effect
of putting Six Apart in the position of hoarding that role. That
will cause suspicion. It will delay widespread adoption. All that
is good for the spammers and bad for the open web.

Neither of these is necessary. Both can be fixed.

TypeKey – Revealed

Over here
somebody has been doing some reverse engineering of the TypeKey
authentication protocol. I want to pick that apart a bit since I think
this is going to spread fast once it’s released. It’s a good thing
for people to know what’s going on. There are things here I don’t approve
of.

This is a pretty typical cross site sign-on system; people have been
building stuff like this ever since browsers started to support
automatic redirect – so for a very long time in Internet time.

I find it helps when telling these stories to personify the players
a bit. We need three players in our little story:

  • Victoria: The site that would like to authenticate a visitor.
  • Fido: The visitor to that site. “On the Internet nobody knows you’re a dog.”
  • Arthur: The authentication service. (Arthur’s an authority)

Well, maybe four because meanwhile off stage there is always Mr. Evil.

Victoria would like to trust Fido so she can let him put stuff on her
site. Victoria trusts Arthur to vouch for Fido. Fido has some light
weight relationship with Arthur that allows Arthur to assert that he
is actually dealing with Fido today.

These mechanisms work like the “letter of introduction” that travelers
used to bring with them when they traveled to a foreign city. That
letter would introduce the young man, that’s Fido in our story, to somebody
reputable in the foreign city, that’s Victoria in our story. The letter
would be written by somebody that Victoria knew and trusted, that’s Arthur
in our story.

This mechanism needs to get a letter of introduction to move from
Arthur via Fido to Victoria. We can do that with browser redirection.
When Fido shows up at Victoria’s site she bounces him over to Arthur’s
site and Arthur bounces him back with the “letter of introduction.”

That can all happen pretty quickly, and it doesn’t even need to
require any interaction with Fido. You can do it redirecting a small
embedded image so it’s pretty low impact on the user experience. Of
course it might be polite to ask Fido before you do it.

Arthur, who wants to have a good relationship with Victoria and with
Fido, needs to be careful about the politeness thing. For example,
Fido’s not going to be pleased if Arthur starts introducing him to all
kinds of strangers. Understanding how to get systems like this to be
respectful of people’s privacy demands thinking through the incentives
for the various parties as well as thinking about worst-case scenarios.

In TypeKey the role of Arthur is played by the folks at Six Apart who
make TypePad and MovableType. A lot of people, me included, think these
are good people.

Here is how the bounce is implemented. Victoria
bounces Fido over to Arthur using a URL like this:

   https://www.typekey.com/t/typekey/login?t=<blog_token>&_return=<return_url>

This in effect says “Yo, Arthur? – You know this dude? – Victoria.”
or in this example it’s “Yo www.typekey.com? – You know this dude? –
<blog_token> <return_url>.” The return_url is where Arthur should
reply to. Blog_token is a very obscure thing
(e.g. “LjRd2DpifL51sB0iFeYT”, or “R234”), but it’s just Arthur’s pet
name for Victoria. That token is meaningless (opaque); it’s used only
in conversations between Arthur and Victoria.

   https://www.typekey.com/t/typekey/login?t=R234&_return=http://viki.com/handler.cgi?a=_return
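
For concreteness, here is a small sketch of how Victoria’s side might assemble that bounce URL. The endpoint and the two parameter names (t and _return) come from the URL above; the function name is made up, and percent-encoding the return URL is an assumption on my part, since the examples here show it unescaped.

```python
from urllib.parse import urlencode

# Endpoint as shown in the example URL above.
LOGIN_ENDPOINT = "https://www.typekey.com/t/typekey/login"


def build_login_url(blog_token: str, return_url: str) -> str:
    """Build the URL that bounces the visitor over to the authority.

    urlencode percent-encodes the return URL so that its own '?' and '='
    cannot be confused with the outer query string (an assumption; the
    original examples show the return URL unescaped).
    """
    query = urlencode({"t": blog_token, "_return": return_url})
    return f"{LOGIN_ENDPOINT}?{query}"


login_url = build_login_url("R234", "http://viki.com/handler.cgi?a=_return")
```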

Arthur won’t answer this question for just anybody. He only answers it
for folks like Victoria he has a good relationship with. So that pet
name is important. Arthur’s going to look up the status of Victoria’s
relationship with him using that.

Before he replies he’s likely to do a couple things. For example he’s
likely to check if Fido is ok with this automatic authentication
bounce back thing. If Fido has not approved that, Arthur might tell
Victoria that she needs to ask Fido to let down his privacy guard a bit.

Arthur is also responsible for being careful to check that he doesn’t
answer questions from Mr. Evil just because he happens to know
Victoria’s pet name. One part of that is to only reply to a return_url
that was explicitly registered by Victoria.
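
The return_url check might look something like the sketch below. The registry, its contents, and the function name are hypothetical. I use an exact string match because that is the simplest rule that keeps Mr. Evil from steering the reply to a page he controls; the real service may well be looser or stricter.

```python
# Hypothetical registry: blog token -> return URLs that site registered.
REGISTERED_RETURN_URLS = {
    "R234": {"http://viki.com/handler.cgi?a=_return"},
}


def is_allowed_return_url(blog_token: str, return_url: str) -> bool:
    """Reply only to a return URL that this blog token explicitly registered."""
    return return_url in REGISTERED_RETURN_URLS.get(blog_token, set())


ok = is_allowed_return_url("R234", "http://viki.com/handler.cgi?a=_return")
spoofed = is_allowed_return_url("R234", "http://evil.example/steal.cgi")
```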

Of course Arthur’s also going to check and see if this is Fido.
Recall that we are actually bouncing Fido’s web browser over to
Arthur; so Arthur can use cookies he left there to figure out if he
knows this dude or not.

Assuming Arthur can convince himself that this is Fido then he bounces
Fido’s browser back to Victoria using a URL that looks like this:

  <return_url>&ts=<timestamp>&email=<hashed_email>&name=<name>&nick=<display_name>&sig=<crypto_signature>

For example:

  http://viki.com/handler.cgi?a=_return&ts=1294252341&email=adg3adfa3&name=Fred&nick=Good+Dog&sig=vuXeBVRJG2cR4xl81+HoeJMbKYs=:DBASoTXIQtlxs07jRblTLRk=
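
To make the pieces of that reply easier to see, here is a sketch that pulls the example URL apart. The field names come from the URL above; the helper names and the five minute freshness window are assumptions. My guess is that the timestamp is there so Victoria can reject stale replies, but nothing above says so. Signature checking is deliberately left out, because the signing scheme isn’t described here.

```python
from urllib.parse import parse_qs, urlsplit


def parse_assertion(url: str) -> dict:
    """Split the return URL into its query fields, keeping the first value of each."""
    fields = parse_qs(urlsplit(url).query)
    return {key: values[0] for key, values in fields.items()}


def is_fresh(ts: int, now: int, max_age_seconds: int = 300) -> bool:
    """Hypothetical policy: accept only replies issued in the last few minutes."""
    return 0 <= now - ts <= max_age_seconds


reply = (
    "http://viki.com/handler.cgi?a=_return&ts=1294252341&email=adg3adfa3"
    "&name=Fred&nick=Good+Dog"
    "&sig=vuXeBVRJG2cR4xl81+HoeJMbKYs=:DBASoTXIQtlxs07jRblTLRk="
)
assertion = parse_assertion(reply)
# Note: standard query decoding turns '+' into a space, which is right for
# the nickname but would mangle a literal '+' inside the signature unless
# the sender percent-encodes it.
```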

I don’t think much of this reply’s design. Let’s pick it apart.

In the trade this reply is known as a “statement” or an “assertion.”
It is the letter of introduction of old. Like that letter it is
signed, by Arthur, so it’s recipient Victoria can know who is making this
statement.

The entire assertion should be encrypted, as it is the statement is
revealing things about Fido to folks listening in on the conversation
between Victoria and Arthur. That’s just rude of them. In the best
of worlds this would be sent using https. That would require that
Victoria have an https server. That’s a pain. But, if Victoria can
check that signature then she can probably decrypt the reply packed
into the message.

You can see what Arthur has decided to tell Victoria about Fido. Three
things: “name,” “nick,” and “email.”

I don’t know why it’s good to tell Victoria all this info about Fido.
The usual argument people make for such revealing is that it allows
Victoria to fill out forms for Fido faster, giving him a more pleasing
user experience. But shouldn’t that be up to Fido? In the example I’ve
shown above they have revealed that Arthur happened to know that Fido’s
name is Fred; and that he has used the nickname “Good Dog.”

I don’t see why that was any of Victoria’s business, since at this point
Fido and Victoria have just met and all Victoria needs to know is if it’s ok to allow Fido to post a comment on her blog.

The email address isn’t really an email address. It’s another bizarre
token: “adg3adfa3” (actually it was really:
“a4426c6a28b21941e3de8b14541e10e5aabb24e8”).

Most people’s first reaction to a jumble like that is “ok that’s safe.”
It’s a good habit to get into to then think. “I wonder what database
that might be a key into?” or “I wonder what message is encrypted in
that?”

In this case that’s the key for looking up what’s known as a FOAF file.
FOAF is the mnemonic for the cheerful sounding “Friend of a Friend” file.
Recall at this point that “cookie” sounds nice too; but that doesn’t
prevent them being used for evil.

Recalling our letter of introduction example, the reason Arthur
included that in his reply to Victoria was similar to writing “If you
want to know more about my friend Fido, take a look here.”

FOAF files are cool. For example here’s a view of my FOAF file. I decided to reveal the information you see there. Did I get permission from all those folks (my so called “friends”) in there to reveal their names? Nope. That was kind of rude of me.

That complex looking string is known as a “mbox_sha1sum”, it is based on an
email address, but let’s call it the foaf universal identifier. Or maybe we
should call it the universal id. Or maybe we should just call it the devil’s
mark. The problem is that once you know somebody’s mbox_sha1sum you are all
set to go looking for their FOAF file in any database that happens to have
it.
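
For the curious, this is roughly how an mbox_sha1sum gets made. As I understand the FOAF vocabulary, it is the SHA-1 hash of the mailbox URI, mailto: prefix included; the address below is a made-up example. Because anyone can run the same computation, anyone who already knows an email address can produce the same key and go looking.

```python
import hashlib


def mbox_sha1sum(email: str) -> str:
    """SHA-1 of the 'mailto:' URI for an address, as a 40 character hex string."""
    return hashlib.sha1(f"mailto:{email}".encode("utf-8")).hexdigest()


# The same address yields the same identifier wherever it is computed.
seen_at_one_site = mbox_sha1sum("fido@example.org")
seen_at_another = mbox_sha1sum("fido@example.org")
```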

This is the same problem that arises if you have a national ID number. It
makes it far too easy for data about an individual to be sought out from
multiple sources. It is far too likely that if a database of personal data is
leaked by mistake Mr. Evil will be able to tie that data to other
data from other sources. It encourages identity theft.
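
A toy illustration of why a shared identifier makes this so easy. The data and names are invented. Two sources that key their records on the same identifier can be merged with a few lines of code; if each source had been handed its own private handle for the same person, the join comes up empty.

```python
def link_records(source_a: dict, source_b: dict) -> dict:
    """Merge the records of any identifier that appears in both sources."""
    return {
        uid: {**record, **source_b[uid]}
        for uid, record in source_a.items()
        if uid in source_b
    }


# Hypothetical leaked databases keyed by one universal identifier.
pet_forum = {"a4426c": {"interest": "cats"}}
health_forum = {"a4426c": {"interest_2": "depression"}}
profile = link_records(pet_forum, health_forum)

# The same two databases keyed by per-site handles cannot be joined.
pet_forum_private = {"handle-x": {"interest": "cats"}}
health_forum_private = {"handle-y": {"interest_2": "depression"}}
nothing = link_records(pet_forum_private, health_forum_private)
```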

Arthur and Victoria should create a private handle that they use to discuss
Fido. I notice that they used just such a private handle for Victoria
when they began their conversation about Fido; what I called a “pet name” above. One step in that direction would be to use a one off email address for Fido that Arthur gives him; that would help avoid revealing Fido’s personal information stored in other FOAF files. That would help a little.

But the fact is that Fido may want to reveal some things to Victoria and other things to other folks. That requires an entirely private handle for Arthur’s discussions with Victoria about Fido. Without that it is possible for Victoria to trade private information about Fido with others who interact with Arthur. That’s exactly the problem that cookies enabled. It’s what allows a company like DoubleClick to track your web browser as it travels over a set of sites of their clients to create a more complete model of you. But this is somewhat worse because Arthur has enabled the trading of information about Fido and he doesn’t have any means to control the situation going forward.

There is one additional thing that bothers me about the TypeKey design.

But first I want to make perfectly clear that I believe that it is not practical to build useful authentication systems in the web without encouraging folks like Arthur to appear on the scene. Both Fido (aka Fred) and Victoria need trusted third parties to help make introductions and lubricate the flow of that information which they want to reveal. The design challenge is to be sure that we don’t encourage the emergence of a single Arthur. We want to encourage a huge number of them.

So the last thing, or maybe the first, that concerns me about the TypeKey design is that it lacks a mechanism to encourage more than one Arthur’s emergence. That’s not that hard. It just demands that when Victoria starts the conversation with Arthur she has a way to discover which Arthur(s) Fido would prefer that she use. She can then select one or more that she trusts and proceed from there. But, you’ve got to get that right up front.
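
The selection step can be very small. In the sketch below the authority names and the function are hypothetical, and the hard part (how Victoria discovers Fido’s preferences in the first place) is left out entirely. It only shows that once she has his list, picking is a matter of keeping the ones she trusts, in his order of preference.

```python
def choose_authorities(preferred_by_visitor: list, trusted_by_site: set) -> list:
    """Keep the visitor's preferred authorities that the site also trusts,
    preserving the visitor's order of preference."""
    return [name for name in preferred_by_visitor if name in trusted_by_site]


# Hypothetical names: Fido prefers three Arthurs, Victoria trusts two.
fido_prefers = ["arthur-one.example", "arthur-two.example", "arthur-three.example"]
victoria_trusts = {"arthur-three.example", "arthur-one.example"}
usable = choose_authorities(fido_prefers, victoria_trusts)
```

If the list comes back empty, Victoria falls back to whatever she does today for strangers: moderation, or simply asking Fido.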

Other junk about identity that I’ve written.

Tearing off the disguise

Clay brings to our attention this April Fools Joke.

This joke proceeded in phases. First the prankster hid his true identity, i.e. he made himself more anonymous. Then he entered a number of communities where he revealed himself to be a person compatible with those communities. This was a lie. As a gift he offered the folks in those communities the chance to display a badge on their sites in support of an issue they cared about. Many took him up on his offer. The trap was set.

Then on April Fools he changed the badge to one that was in opposition to the issue these folks cared about.

Clay was interested in how he was able to toggle from fragmented intimate activity into global broadcast activity.

I’m interested in how he was able to manage his revealing to toggle: first between his real identity into an anonymous one, then into a false fragmented one, and finally suddenly revealing a real one again.

I’m also amused by how this joke is structured. There is a whole kind of humor that depends on the punch line, the sudden revealing of the truth. Ironic humor depends on allowing your audience to know the truth as the scene plays out with those on stage blind (willfully or otherwise) to that truth. Sadistic humor does that with an element of cruelty. Sarcastic humor adds teasing of those on stage. This prank has them all.

In a stage play this device is used to amuse the audience. The character of the pompous strutting husband would be made clear to the loving wife. The loving wife would appear in disguise. The audience would enjoy watching her observe, or even flirt with, the pompous husband. Of course in the end the wife would rip off her disguise and the audience would all smile at the husband’s discomfort.

This joke works, at least for some, because a large portion of the prankster’s audience considers the groups he misled to be full of themselves.

It’s also a great example of the name/thing problem. The entire prank depends on its victims posting the badge on their sites using a URL that fetches the badge from a site the prankster controls. That URL (uniform resource locator) is being used as a name or identifier for the badge. When our prankster can switch the content the loose coupling between name and thing becomes very clear. It’s a miniature example of identity theft, i.e. the identity of that URL was hacked.

A great example of identity hacking.

We know where you live.

Ha! What a stunt. Recipe:

  • One mailing list.
  • One geographic information system with satellite images.
  • One high speed custom printer.
  • One magazine for self absorbed selfish people.
  • 40 thousand subscribers.

Custom magazine covers. Each cover has a photo of the subscriber’s home on it.

Story in the New York Times, requires subscription.

This reminds me of a product that Lotus tried to sell back in the mid 80s. It consisted, at its core, of a CD with most of the nation’s population on it, along with estimates of their salary and other factoids. Such data is widely available and cheap. The outrage kept Lotus from selling the product … widely.

I wish I could, but I certainly don’t, see how you can regulate these kinds of activities. It is so clearly rude; but what to do?

(thanks Brian)

Serving up the self.

Back to the Internet identity problem; or how we solve the problem of letting users safely and easily reveal personal information, as well as the problem of how that interacts with various authoritative entities.

In the real world we reveal substantial amounts of information casually – hair color, language, approximate age, various fashion statements. We adjust that revealing dynamically. We wear different clothing in different contexts. For better or worse the folks around us use this revealed information to assemble a model of who we are. They create a model of our identity. The revealing feeds into a dialog about their model of us. That model then feeds into the relationship between us.

In a commercial context we might call those relationships “accounts.” In the marketing literature they even talk about relationship marketing. Doc Searls makes the point that as the Internet shifts the balance between them and us, markets become conversations. Solving the identity problem is all about finding a way to enable that conversation to proceed in a practical, useful, and safe manner.

In the Internet no standard methods exist to help users begin the conversation. When I walk into Amazon, or eBay, or a mailing list I am naked. Actually it’s worse than that. I’m actually so thin a presence that I don’t even have a place to hang a few rags that might help me to project a persona.

[Image: Dog wearing a tie.]

That didn’t happen by chance. Fixing the identity problem demands that we tackle two very hard problems (revealing and authorities). These are so hard we have kept pushing the problem outward. At each round in the design game it has been easier to just minimize the revealing and push the authority problem out to the network’s edges. That approach has its benefits. Nobody knows you’re a dog, America is all about second chances, etc. etc..

But I want to be wearing clothes when I go out. So today’s puzzle is to look at what a revealing mechanism might look like, from 40 thousand feet.

I find it amusing to say that what’s needed is a “self-server.” A place we can refer others to. Want to learn about me, go here. This place enables them to have a dialog about who I am.

Let’s look at a scenario.

I visit a Wiki. I want to contribute some fresh content. The server running the Wiki wants to check out my reputation before it lets that happen without moderation. So it turns around and via some Rube Goldberg device it manages to get access to my self-server and using that it asks about my reputation as a contributor of reasonable content. Since Mr. Wiki doesn’t trust me, of course, it’s pleased when the self-server recommends an authority that Mr. Wiki can trust. “Ask Mr. Trusty,” it says. Of course when asked Mr. Trusty (a reputation authority) says very nice things about me! Happy day, the Wiki server lets me post.

Clearly a lot of details are getting glossed over here. But notice that at no point was anything revealed about me except my reputation as a contributor of open content. The Wiki didn’t learn my name, or my social security number, or the name of my dog. Presumably before I’d allow my self-server to do that I’d have to grant permission.
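
Here is the scenario above as a toy sketch. Every class, method, and name in it is hypothetical, and the Rube Goldberg device is replaced by an ordinary function call. What it tries to show is the shape of the exchange: the self-server only hands out referrals its owner has permitted, the Wiki only accepts an authority it already trusts, and the only fact that moves is a yes or no about an opaque handle.

```python
class ReputationAuthority:
    """Hypothetical third party that vouches for contributors by opaque handle."""

    def __init__(self, name, good_handles):
        self.name = name
        self._good_handles = set(good_handles)

    def vouches_for(self, handle):
        return handle in self._good_handles


class SelfServer:
    """Hypothetical self-server: answers only what its owner has permitted."""

    def __init__(self, referrals, permitted):
        self._referrals = referrals        # attribute -> (authority, opaque handle)
        self._permitted = set(permitted)   # attributes the owner agreed to reveal

    def refer(self, attribute):
        if attribute not in self._permitted:
            return None
        return self._referrals.get(attribute)


def wiki_allows_post(self_server, trusted_authority_names):
    """Allow unmoderated posting only if a trusted authority vouches."""
    referral = self_server.refer("posting_reputation")
    if referral is None:
        return False
    authority, handle = referral
    if authority.name not in trusted_authority_names:
        return False
    return authority.vouches_for(handle)


mr_trusty = ReputationAuthority("Mr. Trusty", good_handles={"h-7f3a"})
my_self_server = SelfServer(
    referrals={"posting_reputation": (mr_trusty, "h-7f3a")},
    permitted={"posting_reputation"},
)
allowed = wiki_allows_post(my_self_server, {"Mr. Trusty"})
```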

If this were the 1980s the self-server would be documented with an RFC and a simple protocol. In the 90s we might have used a web page, and if we wanted to encourage a dialog it would have included a CGI script. These days we call it a web services listener. “Whatevvver.” We are modern guys, so we can assume lots of encryption and opaque identity tokens etc. etc.

In the Liberty Alliance design this thing I’ve called a “self-server” is a set of web services spread out all over the Internet. That allows, in our example, lots of posting reputation services. It allows many many authorities of varying kinds and seriousness. Your bank, your house, your hairdresser, your clubs can all play a role. That’s as it should be. We certainly don’t want to create one grand central authority.

The only “central authority” they need to rendezvous with is the standard protocols. Those protocols break out into two layers. One is generalized glue that helps you find various kinds of services. The second is a suite of services. Once you get the first layer right then any number of members can pile onto the suite of services. If you want to have a service that helps users reveal their fashion preferences then have at it.

Well, actually there are three layers of protocols because we also need to design the Rube Goldberg device that helps you find the self-server-directory starting from whatever is available in the legacy protocols.

Revealing, Authorities, Identity

Open systems thrive when contributors are encouraged to toss a little value into the pot. Stone soup and all that. Of course we all know that there are bad actors out there, so if you run an open server you’re going to get evil contributions from bad actors. Open up your email address, your cell phone, your cooperative wiki, your blog comments and some twit is going to show up sooner or later posting porn, selling unregulated herbal remedies, and spraying rude graffiti on the walls.

Three solutions:

  • Authorize only trusted contributors.
  • Moderate contributions via some trusted mechanism.
  • Maintain, i.e. trusted folks fix the damage after the fact.

So it’s all about trust, authentication, etc. etc.

Problem is that we only know how to solve that problem in one way. The contributor must reveal something to us, so we can authenticate him, and we need to have some central authority that vouches for the guy.

Those two terms are trouble:

  • Reveal
  • Central Authority

Those two go a long way toward explaining why the Internet identity problem is so hard. On the one hand the solution needs to enable revealing (which sounds a lot like privacy intrusion, embarrassment, and identity theft) and on the other it needs to enable the emergence of central authorities (which sounds a lot like authoritarian police state, abusive monopoly, and single point of failure).

This stuff just doesn’t conform well to the end-to-end principle. Sure, sure, you can run your own “authority” out at the edge. Your blog, wiki, or email client can sit there and infer the trustworthiness of your contributors from various implicit and explicit info. That’s all well and good but it’s socially dysfunctional.

People are social creatures. They project a personality. They manifest assorted behaviors and attributes so that we can construct a model of them. A world where everyone is totally anonymous is just bizarre.

Worse is that solving this problem at the edges means that trust and reputation aren’t fungible. Not fungible means lock-in and hence increased power-law reinforcement. Not good. Why spend time contributing to a dinky open source project when you could spend the time contributing to a famous project and hence gain a reputation that’s “worth something?” Let’s say it takes 10 good postings at Bob’s community before I’m allowed to post without moderation and join in the discussion. Once I’ve climbed over that barrier why would I bother to go join Sam’s community?

This doesn’t make for an open system, this makes for a compartmentalized system.

Now I’m not arguing that reputation can be fungible like ounces of gold; but I am arguing that a design that declares that reputation shouldn’t be fungible from the get go is wrong.

So this is the rub. We have a problem that demands that we figure out how to design the central authorities in a manner that avoids them becoming too powerful – we know a lot about that. We have a problem that demands that we empower the users to reveal what they wish to reveal.

It’s just blind foolishness that pretends we can design a system with no authorities and no revealing. Worse is to believe that systems of that kind encourage a more open system. Open systems thrive on having complex porous membranes. No membrane is fatal.

Liberty Momentum?

I consider this a very healthy sign that the Liberty Alliance specs for solving the problem of allowing users and firms to manage authentication, data sharing, and privacy might just be getting some traction.

Automaker General Motors Corp. has just completed a successful pilot with a 401(k) retirement-services provider as part of a plan to link its employee intranet portal, called Socrates, with a variety of employee-benefit service providers, says Rich Taggart, director of enterprise architecture. The technical part, based on Liberty Alliance specs, wasn’t hard. “That part of it went rather well,” he says. “Everything is very interoperable.”

Momentum is a key aspect of an emerging standard. There are lots of phases in building the momentum of a thing like this, for example getting big names to throw their reputation on the bandwagon. Short of standardized-exchanges/second the most significant sign is actual adoption. So it’s very encouraging to see people talking about that.

The real challenge was getting both companies comfortable with the idea of linking their systems, so when an employee logs on to the GM portal, he or she is automatically able to access the financial-services provider’s systems as well. There were lots of discussions over “the comfort level of doing this and how it could affect the business relationship and relationship with the employee,” Taggart says. There were lots of conversations between legal groups, he adds.

Indeed. This problem is very hard and deserves a good open solution. I am quite encouraged by the Liberty designs. We may actually get something that addresses everybody’s concerns.

Full disclosure: I have had a tiny part in the Liberty project.

Out of Office

I sent email to a large private email list recently. I have gotten 20 “out of office” replies. This is dumb. The out of office robot ought not reply if the sender isn’t one of your correspondents for some reasonable definition of correspondent. How hard is that? For example, if you have never sent X email, never replied to X’s email, never even participated in an email thread with X, and X isn’t in your address book – what does this robot think it’s doing sending X details of your personal life? There must be some really amusing stories of email robots revealing personal info.
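
How hard is it? Here is a sketch of the rule, using the four tests listed above. The data structure and names are invented, and a real mail system would need to get this history from somewhere; the check itself is a one-liner.

```python
from dataclasses import dataclass, field


@dataclass
class MailHistory:
    """Hypothetical record of who the mailbox owner has dealt with.
    Addresses are assumed to be stored in lower case."""
    sent_to: set = field(default_factory=set)
    replied_to: set = field(default_factory=set)
    thread_participants: set = field(default_factory=set)
    address_book: set = field(default_factory=set)


def should_auto_reply(sender: str, history: MailHistory) -> bool:
    """Send the out of office note only to someone who counts as a correspondent."""
    sender = sender.strip().lower()
    return (
        sender in history.sent_to
        or sender in history.replied_to
        or sender in history.thread_participants
        or sender in history.address_book
    )


history = MailHistory(sent_to={"colleague@example.org"})
reply_to_colleague = should_auto_reply("Colleague@example.org", history)
reply_to_list_stranger = should_auto_reply("stranger@example.net", history)
```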

Social Networking Cartoons Chapter II

I’ve ranted before: people are missing the point of what’s going on in the explosion of the social network web sites.

But wait! There’s something else to be said. This industry, the social networking industry, has another classic pattern.

Markets tend to sort out into a population of players distributed along a power-law curve. Some industries have greater or lesser slopes on those curves. For example: while there are many independent garden centers there are fewer and fewer independent hardware stores. Michael Porter has a list of some of the reasons this happens.

How will the social networking market sort out? Lists like Porter’s give you hints. They help forecast the future. They tell you what levers to try and pull if you want a particular outcome.

If there are strong forces that push us toward a single huge social networking site then we are in for a bit of serious competition. If there is only going to be one then that’s a very valuable piece of real estate.

We are seeing a lot of these sites because of the low barrier to entry for creating one. If you can lower that barrier even more then you will get more.

The drivers toward a single site are always scale advantages. It is a huge pain for a person to maintain a presence at N such sites. If most of your community of contacts gathers at one site then there are strong network effects for you to settle in there and just ignore the rest of the sites.

This combination of a low barrier to entry and a strong network effect makes for a particularly high stakes game. You can get into the game cheap. The winner may carry off a huge prize.

I certainly hope we don’t end up with one site. The barriers to entry can be made even lower with open source. The network effect can be tempered with open standards for linking these sites to each other. I certainly hope that happens, because it would be a very weird outcome to discover that almost all the clubs and organizations on the planet start pitching their tent inside one firm’s walled garden.

The forces that push industries toward one outcome or another are not invisible, they are not blind, and they are not inevitable. They are the consequence of the actions taken by the participants in those markets both individually and cooperatively.

Complaining about how autistic the social networking sites are misses the point. The key question is: what shape do you want the market to take?