Archive for July, 2004

Lying with Statistics

Saturday, July 31st, 2004

… Mr. Bush said Mr. Kerry had been rated “the most liberal member of the United States Senate.”

“And he chose a fellow lawyer who is the fourth-most liberal member of the United States Senate,” Mr. Bush went on. …



   – New York Times

Piffle!


This illustration shows one dot for each senator in the current Senate. The blue dots are Democrats, the red dots are Republicans, and the single green dot toward the left is the one independent senator. The dots are sorted left to right to show their ranking from liberal to conservative. The data is from the gold standard in such ranking.

Senate108.png

Edwards’s score on this chart is 19.5; Kerry’s is 24.5; and Bush isn’t a lawyer only because the University of Texas declined to admit him.


Groupthink

Saturday, July 31st, 2004

“Groupthink” appeared a lot in the coverage of the recent congressional commission reports. I hadn’t heard that term for 20 years, which sent me off to see whence it came: Irving Janis’s work on fiascos and decision making, apparently. Why did it fall out of common usage? Why’s it making a return appearance?

My insta-theory for its disappearance is that Americans don’t like to admit that groups have any control over our behavior, and we particularly don’t like to admit that this control might be unconscious and unspoken. For example, if you suggest that people are sometimes manipulated by cults, most people will argue either that it was the controlling, abusive influence of the cult leaders or that the victims were weak and/or willing.

Janis tries to avoid studying the fiascos that come out of totalitarian or other highly controlled groups. It’s no surprise when such groups fall victim to narrow-minded problem solving, blind spots, and fiascos. Janis is looking at the much more interesting cases: groups of well-meaning, highly capable people who fall into highly controlling patterns of behavior. How does that happen? Collective behavior.

It is amusing to contrast this with the work on how complex behavior emerges from the combination of simple bits and pieces: the behavior of piles of sand, or groups of insects, or simplistic electrical networks. These days we have dozens of examples of systems where seemingly globally coordinated behavior emerges from surprisingly primitive individual players. For example, a field of fireflies can synch their flashes with only a few neurons, and an entire national electric grid can collapse given the failure of a handful of elements.

It’s not a pretty thought that groups of highly capable people might fail in similar ways. So people decline to embrace the idea that such modalities might arise in human groups. It’s not a very empowering idea. If nobody is conscious of it then who do we blame?

This kind of model is very popular in one scenario though. After the fiasco, when the group writes the report that explains what went wrong. This framework is just what the doc ordered. It offers the people writing the report a chance to avoid blaming anybody.

I’m not surprised that the Iraq commission fell upon this diagnosis.

Consider Janis’s model of the symptoms of groupthink.

  • illusion of invulnerability, shared by most or all the members, which creates excessive optimism and encourages taking extreme risks;
  • collective efforts to rationalize in order to discount warnings which might lead the members to reconsider their assumptions before they recommit themselves to their past policy decisions;
  • an unquestioned belief in the group’s inherent morality, inclining the members to ignore the ethical or moral consequences of their decisions;
  • stereotyped views of enemy leaders as too evil to warrant genuine attempts to negotiate, or as too weak and stupid to counter whatever risky attempts are made to defeat their purposes;
  • direct pressure on any member who expresses strong arguments against any of the group’s stereotypes, illusions, or commitments, making clear that this type of dissent is contrary to what is expected of loyal members;
  • self-censorship of deviations from the apparent group consensus, reflecting each member’s inclination to minimize to himself the importance of his doubts and counter arguments;
  • a shared illusion of unanimity concerning judgments conforming to the majority view (partly resulting from self-censorship of deviations augmented by the false assumption that silence means consent);
  • the emergence of self-appointed mindguards - members who protect the group from adverse information that might shatter their shared complacency about the effectiveness and morality of their decisions.

Fits the Iraq fiasco like a glove, doesn’t it!

Research like this can be empowering. Once you identify the ailment then you can begin to look for symptoms, treatments, cures.

Laying the blame at the feet of groupthink just doesn’t cut it. Sure, groupthink is a failure mode of decision making groups. There are plenty of such failure modes! A whole laundry list in fact. For example: overreacting to one constraint in the problem space and thus becoming blind to all the other constraints.

Competent people know how to reduce the severity of these failure modes. That’s what you get when you put “proven executives” in charge. While you can’t expect each and every person in a group to have this kind of expert knowledge - the knowledge of how to navigate around all the failure modes of group problem solving - is it too much to ask of the groups assembled to tackle the top few problems facing the nation?

Reputation and Privacy

Thursday, July 29th, 2004

Joe’s privacy erodes when two or more parties conspire to merge their models of him. To frustrate this we have a design principle: avoid handing out a unique identifier. If the library and the video store both index their records by my national id number, then the only barriers to merging those records are current law and rapidly evaporating technological inconvenience.
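
Here is a minimal sketch of why the shared identifier matters. Everything in it is made up for illustration: the records, the id number, and the `pairwise_id` helper, which is just one possible way (an assumption, not an established scheme from this post) for a user to hand a different identifier to each party.

```python
import hashlib
import hmac

# Hypothetical records held by two unrelated parties, both keyed by the
# same national id number. All values are invented for illustration.
library = {"123-45-6789": ["Victims of Groupthink"]}
video_store = {"123-45-6789": ["Dr. Strangelove"]}

def merge(a, b):
    """Join two record sets on whatever keys they happen to share."""
    return {k: {"a": a[k], "b": b[k]} for k in a.keys() & b.keys()}

def pairwise_id(user_secret: bytes, party: str) -> str:
    """Derive a per-party identifier from a secret only the user holds."""
    digest = hmac.new(user_secret, party.encode(), hashlib.sha256)
    return digest.hexdigest()[:16]

# With a shared key the join is trivial.
shared_key_profiles = merge(library, video_store)

# With per-party identifiers the two record sets have no key in common.
secret = b"known only to the user"
library_p = {pairwise_id(secret, "library"): library["123-45-6789"]}
video_p = {pairwise_id(secret, "video-store"): video_store["123-45-6789"]}
pairwise_profiles = merge(library_p, video_p)

print(len(shared_key_profiles), len(pairwise_profiles))
```

The second merge comes back empty because there is nothing to join on. It does nothing, of course, about parties that correlate on a name or an address instead.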

For example, I have a number of standing queries at Google. This morning one of these reported that a member of my family seems to have an account at a certain site; clicking thru revealed a list of the content they had viewed at that site. In this case the unique identifier in my query was their name. Some days these problems seem a little hopeless.

My idea that groups could empower their members by a letter of introduction scheme to create anonymous personas is one attempt to tackle this problem. Key, in my thinking about that idea, was that the persona would be bootstrapped by the reputation of the group and the user/member of that group.

The letter of anonymous introduction is a means of narrowing the model of a user: it creates only a fragment of reputation that can then be used to gain rights in some other community, say the world of open systems.

It’s interesting to note that a user model isn’t exactly the same thing as reputation; it’s only a kind of model. Is it the kind of model that creates rights?

Generally reputation accumulates for a persona based on a history of transactions involving that persona. The video store’s records are an example of just that kind of transaction record. Statistical summaries of those records, stating derived facts, are less revealing. And different summaries reveal different things: knowing that 20% of the videos weren’t returned on time tells us something different from knowing that 82% of the movies rented were rated PG.
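
To make the distinction concrete, here is a small sketch that distills a rental history into two derived facts. The `Rental` record, the field names, and the sample data are all hypothetical; the point is only that the summary carries no titles.

```python
from dataclasses import dataclass

@dataclass
class Rental:
    title: str
    rating: str
    returned_late: bool

# Invented transaction history for one persona.
history = [
    Rental("Title A", "PG", False),
    Rental("Title B", "PG", True),
    Rental("Title C", "R", False),
    Rental("Title D", "PG", False),
    Rental("Title E", "PG", False),
]

def summarize(rentals):
    """Reduce full transaction records to two statistical facts."""
    n = len(rentals)
    return {
        "late_return_rate": sum(r.returned_late for r in rentals) / n,
        "pg_share": sum(r.rating == "PG" for r in rentals) / n,
    }

summary = summarize(history)
print(summary)
```

A party handed only `summary` learns how reliable the renter is and roughly what they watch, but not which movies they rented.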

While the letter of anonymous introduction is an interesting edge case, all these distillations from fully fleshed-out transaction details into statistical abstractions have some slight element of increasing the privacy of what’s getting revealed about the persona involved.

This is similar to the privacy schemes used in the census data that reveal aggregate statistics for a region but not for a household. It’s a step in the right direction.

All this gets me thinking that there might be a middle ground regarding the unique identifier question. That there might be schemes that allow roughly unique identifiers. In a sense that’s what the anonymous letter of introduction scheme is creating. It is a means that allows the creation of a persona that says no more than “this is Mr. X of Group Y.”
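
One way such a letter might look, as a rough sketch under loud assumptions: the group vouches for a freshly generated persona by attaching a keyed tag to the claim. The names `GROUP_KEY`, `issue_letter`, and `verify_letter` are hypothetical, and the HMAC here is only a stand-in; since an HMAC verifier needs the group's key, a real scheme would more plausibly use a public-key signature.

```python
import hashlib
import hmac
import secrets

GROUP_KEY = b"group-y-signing-key"  # hypothetical; held by the group

def _claim(persona: str, group: str) -> bytes:
    return f"{persona} is a member of {group}".encode()

def issue_letter(group: str, key: bytes) -> dict:
    """Vouch for a fresh random persona; no real identity is included."""
    persona = secrets.token_hex(8)
    tag = hmac.new(key, _claim(persona, group), hashlib.sha256).hexdigest()
    return {"persona": persona, "group": group, "tag": tag}

def verify_letter(letter: dict, key: bytes) -> bool:
    """Check that the group really vouched for this persona."""
    expected = hmac.new(
        key, _claim(letter["persona"], letter["group"]), hashlib.sha256
    ).hexdigest()
    return hmac.compare_digest(expected, letter["tag"])

letter = issue_letter("Group Y", GROUP_KEY)
print(verify_letter(letter, GROUP_KEY))
```

The letter contains a persona, a group name, and the tag, and nothing else. Whoever accepts it learns "this is Mr. X of Group Y" and borrows the group's reputation, not the member's history.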

Regulating Hearsay

Wednesday, July 28th, 2004


As outlined in this model one way to look at the identity problem is that information about a user flows out from their activities and then is passed - behind their back so to speak - to third parties that then aggregate models of the user. Those third parties then sell those heuristically built models to fulfill the demand for a better user model.


That process makes users reluctant to reveal information to other parties. They have no assurance that the information they reveal will “go no further”.


At the nub of this problem is the contract, implicit or explicit, that governs that revealing. For example, if the user could get a signed non-disclosure agreement each time he reveals some information, his confidence that his privacy could be protected would increase. The identity business could help with this.


This is one reason that corporate people often say that technically the identity problem is easy, but the business agreements are very very hard.


Today users’ contractual dealings with the web services they reveal information to are governed by the privacy policies of those sites or the PPA (Privacy Protection Acts) of the relevant industries. The protections afforded are typically very thin. Web sites want to keep their options open. Web sites can’t afford to negotiate a distinct binding agreement with each and every user.


It isn’t practical to assume that one privacy contract/regulation will cover all cases. I want a different one for my home address, my phone number, my library records, my academic records, my performance reviews etc.


Problems of this kind can be solved in only two ways. You can set a very low baseline and then add protections, or you can set a very high baseline and then open permissions. In a safe environment you can adopt the first design pattern; in a risky environment you must adopt the second.
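
The two design patterns can be sketched as a single check with a switchable default. This is only an illustration: the function name, the `"open"`/`"closed"` labels, and the field and purpose strings are all made up.

```python
def is_allowed(field, purpose, baseline, exceptions):
    """Decide whether a field may be used for a purpose.

    baseline "open":   low baseline, exceptions are added protections.
    baseline "closed": high baseline, exceptions are opened permissions.
    """
    listed = (field, purpose) in exceptions
    if baseline == "open":
        return not listed
    return listed

# Hypothetical per-case rules.
protections = {("home address", "marketing")}
permissions = {("phone number", "delivery")}

# Low baseline: anything not explicitly protected leaks.
print(is_allowed("library records", "marketing", "open", protections))

# High baseline: anything not explicitly permitted stays private.
print(is_allowed("library records", "marketing", "closed", permissions))
```

Notice what happens to the case nobody thought about in advance: under the open baseline the library records are fair game, under the closed one they are not.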


This is possibly the hardest transition that must be managed in solving the identity problem. How do we bootstrap a very strict contractual/regulatory centerpiece and then empower the users and web sites to relax that for individual cases?


Consider the FOAF directory service outlined in a previous posting. Could we write a contract, binding on the web services, that strictly limits what they can do with the information they glean from the FOAF we helped them find? Would that frustrate any attempt to get the system to grow?

FOAF magic

Wednesday, July 28th, 2004

Often when I encounter tech-savvy folks and mention that I spend a lot of my time working on the identity problem, they mention FOAF. FOAF, or Friend of a Friend, for those unfamiliar, is a spec for how to write down a mess of information about a person. It uses XML; well, actually RDF serialized as XML. Oh, and it can be used for any entity, not just for a person.

FOAF is a nice format for solving various problems about how to reveal information about a person. It doesn’t try to tackle other issues, like privacy and its complement: how to broker the data exchanges. It is a pretty good solution to the question: “What might a user model look like?”

Thus if you embrace the “magic happens” model of this identity business, FOAF is an acceptable first draft of what we want the magic to return so the web site can begin to make a better experience for its visitors.

If we cheerfully ignore issues of privacy then the magic we are seeking is “just a directory” problem. The web site queries the magic and gets back a pointer to the user’s FOAF file.

You could thread that needle with web bugs, shared cookies, or redirection bouncing.

For example, redirection bouncing works like this. The user visits the web site. The web site asks the user’s web browser to redirect to the FOAF directory service. That service then redirects the user’s web browser back to the web site, passing a pointer to the user’s FOAF file.
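
The bounce can be sketched as three small functions that only build and read URLs, with no real HTTP involved. The directory URL, the `return_to` and `foaf` parameter names, and the idea that the directory recognizes the browser by its own cookie are all assumptions made for this sketch.

```python
from urllib.parse import parse_qs, urlencode, urlparse

DIRECTORY = "https://foaf-directory.example/lookup"  # hypothetical service

# What the directory knows: its own browser cookie -> FOAF file location.
DIRECTORY_COOKIES = {"cookie-abc": "https://alice.example/foaf.rdf"}

def site_redirect(return_to: str) -> str:
    """The web site sends the browser off to the directory."""
    return f"{DIRECTORY}?{urlencode({'return_to': return_to})}"

def directory_redirect(request_url: str, browser_cookie: str) -> str:
    """The directory sends the browser back, with a pointer if it has one."""
    return_to = parse_qs(urlparse(request_url).query)["return_to"][0]
    foaf = DIRECTORY_COOKIES.get(browser_cookie)
    if foaf is None:
        return return_to
    return f"{return_to}?{urlencode({'foaf': foaf})}"

def site_receive(url: str):
    """Back at the web site: read the pointer out of the query string."""
    return parse_qs(urlparse(url).query).get("foaf", [None])[0]

hop1 = site_redirect("https://shop.example/welcome")
hop2 = directory_redirect(hop1, "cookie-abc")
foaf_pointer = site_receive(hop2)
print(foaf_pointer)
```

A browser the directory has never seen simply bounces back with no pointer, so the site falls back to treating the visitor as a stranger.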

The FOAF directory service has a relationship with both sites and users. For the web sites it provides the magic. For users it provides the service of making their FOAF easier to find.

There are a huge number of issues to resolve before a service like this becomes useful. Here are a few examples:

  • Achieving scale.
  • Rate limiting the sites.
  • Guarding user privacy.
  • Avoiding incidental revealing to incidental observers.
  • Efficient authentication of sites using the service.
  • Denial of service attacks.
  • Tempering the problem that a browser is not identical to a user.
  • Avoiding gossip-like consolidation (without permission) into a joined user model.
  • Tempering the concentration of power such a hub implies.
  • Enabling distribution, extensibility, and compartmentalization of the user model.