Monthly Archives: March 2004

Talent Scraping

I’m fascinated by the universe of systems that capture a little value from a very large number of people and then aggregate that into something of high value. The list of such systems is huge. Just for example:

Open Source Projects.
Amazon reviews.
The original Oxford English Dictionary.
The Internet Movie Database.
CMU’s clever picture categorization game.
This trick for fooling a Turing test.
Slashdot
Blogger, Live Journal, Friendster, Orkut
Netnews, Yahoo Groups
Google
the web
this semi-porn site
This toy.
Society
del.icio.us
Everything 2
Wikipedia
Sheep Market

Here’s a fun little idea I had today along these lines. There is a general classification problem about dialogs. For example, say you wanted a classifier for angry dialogs. How could you get such a thing?

Well you could create a Bayesian filter and let a large population of volunteers train it to do the classification. This is the pattern above of a large number of contributors summing up into an aggregated thing of value. In this case the thing of value is a classifier that can recognize a class of text or dialog.
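A minimal sketch of that idea, assuming a plain multinomial naive Bayes filter with add-one smoothing. The class name, the labels "angry" and "calm", and the sample messages are all made up for illustration; a real volunteer-trained system would need far more data and some defense against bad labels.

```python
import math
from collections import Counter, defaultdict


class CrowdTrainedBayes:
    """Tiny naive Bayes text filter, trained one volunteer-labeled message at a time."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of training messages

    @staticmethod
    def tokenize(text):
        # Lowercase words with surrounding punctuation stripped.
        words = (w.strip(".,!?;:\"'").lower() for w in text.split())
        return [w for w in words if w]

    def train(self, text, label):
        """Called each time a volunteer labels a message."""
        self.doc_counts[label] += 1
        self.word_counts[label].update(self.tokenize(text))

    def classify(self, text):
        """Return the most probable label, or None if nothing has been trained."""
        vocab = set()
        for counts in self.word_counts.values():
            vocab.update(counts)
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in self.doc_counts:
            # Log prior plus log likelihood of each word (add-one smoothing).
            score = math.log(self.doc_counts[label] / total_docs)
            denom = sum(self.word_counts[label].values()) + len(vocab)
            for word in self.tokenize(text):
                score += math.log((self.word_counts[label][word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical volunteer contributions.
clf = CrowdTrainedBayes()
clf.train("you are an idiot and I hate this", "angry")
clf.train("this is stupid garbage I hate you", "angry")
clf.train("thanks for the helpful answer", "calm")
clf.train("I appreciate the patch, thanks", "calm")
print(clf.classify("I hate this stupid thread"))
```

Each volunteer only has to call train once or twice; the value piles up in the shared counts.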

For example, here we have a guy who hand-built a system to entrap child molesters in online forums by posing as a child. I suspect it would be easy to build a classifier that could monitor dialogs in such chat rooms and do the same thing, and it wouldn’t be hard to find the transcripts to train it with.

It would be very interesting to try this on the messages in a mailing list, creating a set of classifiers for various kinds of speech acts: baiting, constructive, helpful, query, answer, discussion, debate, argument, etc. I bet that the spooks have such systems but I wonder how they built them.

I keep trying to find a good name for this class of systems. Brain Farming? Talent Scraping? Help Hoarding?

Lost their bottle?

“But these figures do seem to seriously undermine the slur that the Spaniards
lost their bottle after the bombs and decided to cave into Bin Laden.”

What a great insult: “Ah, you’ve lost your bottle.” I assume its origin is in a sentence like “Oh, poor baby! Did baby lose her bottle? Here, here, let me help you.”

Portknocking

Port knocking is a trick for adding additional security to your machines on the public internet. The idea is that only after a peculiar series of packets appears on your machine’s external interface do you then lower your firewall and begin listening for connections to a given service. For example, only after somebody attempts to connect to ports a, b, and d do you then start listening for ssh connections.
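The bookkeeping behind that idea can be sketched as a small state machine. This is only an illustration, not any real port knocking daemon: the class name, the port numbers, and the ten second window are assumptions, and actually watching packets or opening the firewall is left out.

```python
import time


class KnockTracker:
    """Tracks, per source address, how far through the secret knock sequence it has got."""

    def __init__(self, sequence, window_seconds=10.0):
        self.sequence = list(sequence)   # hypothetical secret ports, in order
        self.window = window_seconds     # the whole sequence must fit in this window
        self.progress = {}               # source ip -> (next index, time of first knock)

    def knock(self, source_ip, port, now=None):
        """Record one connection attempt; return True once the full sequence is seen."""
        now = time.monotonic() if now is None else now
        index, started = self.progress.get(source_ip, (0, now))
        if now - started > self.window:
            index, started = 0, now      # too slow: forget earlier knocks
        if port == self.sequence[index]:
            index += 1
        elif port == self.sequence[0]:
            index, started = 1, now      # wrong knock, but it begins a fresh attempt
        else:
            index, started = 0, now      # wrong knock: start over
        if index == len(self.sequence):
            self.progress.pop(source_ip, None)
            return True                  # caller would now open the ssh port for source_ip
        self.progress[source_ip] = (index, started)
        return False


tracker = KnockTracker([7000, 8000, 9000])
for t, port in enumerate([7000, 8000, 9000]):
    opened = tracker.knock("192.0.2.1", port, now=float(t))
print(opened)
```

Note that anybody who can sniff the knocks can replay them, which is why the sequence alone is a thin secret.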

I don’t get it. Why is this such a great idea? I often configure machines so that certain listeners are only available after something else happens. For example, after a certain email is received or after a particular http request takes place. I often configure an email address that takes signed pgp encrypted messages, decodes them and executes the commands found therein; a glorified batch queue. I’d much rather have the battle tested smtp or httpd server exposed to the outside world than a cool new innovative port knocking daemon.

Of course it should be pointed out that all these techniques work best if there is a gloss of challenge response or public/private key usage laid over the mechanisms.
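Here is a rough sketch of what such a challenge response gloss might look like, assuming a shared secret and HMAC-SHA256 from the Python standard library. The Gatekeeper class and function names are invented for illustration; a public/private key variant would swap the HMAC for a signature check.

```python
import hashlib
import hmac
import secrets


def sign_challenge(shared_secret: bytes, challenge: str) -> str:
    """What the client computes to prove it holds the secret."""
    return hmac.new(shared_secret, challenge.encode(), hashlib.sha256).hexdigest()


class Gatekeeper:
    """Issues one-time challenges and checks the responses (hypothetical helper)."""

    def __init__(self, shared_secret: bytes):
        self.secret = shared_secret
        self.outstanding = set()

    def issue(self) -> str:
        challenge = secrets.token_hex(16)   # fresh random value each time
        self.outstanding.add(challenge)
        return challenge

    def accept(self, challenge: str, response: str) -> bool:
        if challenge not in self.outstanding:
            return False                    # unknown or already used: reject replays
        self.outstanding.discard(challenge)
        expected = sign_challenge(self.secret, challenge)
        return hmac.compare_digest(expected, response)


gate = Gatekeeper(b"not-a-real-secret")
challenge = gate.issue()
response = sign_challenge(b"not-a-real-secret", challenge)
print(gate.accept(challenge, response))
```

Because each challenge is random and usable once, an eavesdropper who records a successful exchange gains nothing by replaying it.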

WMD: Warming of Mass Destruction

It would be sad if my blog became entirely political over the next few months; we shall see.

A friend of mine (Hi Claude!) points out that the Bush national security policy, which holds that we can’t wait for a clear unambiguous signal that X is a threat before we apply a can of preemptive whoop-ass to X, clearly leads to the conclusion that now is the time to get down and dirty on global warming.

I suspect that this only goes to show that the current national security policy is missing the footnote explaining that these rules of engagement only apply if the experts happen to agree with our preconceived notions on the topic at hand. Actually I’m not sure that experts are required, just notions.

The picture at right shows the increase in vegetation over the last 20 years in the area north of the latitude 30 degrees north.

What if the Turing Test has a social network?

Via David Chess we learn that evil web spidering robots have found a way around yet another security device. The CAPTCHA system works by showing the visitor a picture, see at right, and then asking them to type in the word hidden in that picture. It’s a Turing test of whether the visitor has at hand a human visual system.

The robot’s authors hacked around this by having the robot turn around and ask its human friends to answer the question for it.

But at least one potential spammer managed to crack the CAPTCHA test. Someone designed a software robot that would fill out a registration form and, when confronted with a CAPTCHA test, would post it on a free porn site. Visitors to the porn site would be asked to complete the test before they could view more pornography, and the software robot would use their answer to complete the e-mail registration.

This reminds me of a story I heard a long time ago about how the SAT folks became curious about the high scores at one school. They traveled out there to discover that the class was solving the test as they did all other problems in their community, cooperatively.

Choice

A couple notes on choice.

This is a nice review of a book I need to read on the excess of choice in modern life. One story from the review: many young people arrive in their thirties having failed to make any choice about their line of work. The modern world does not constrain their choices. They are taught to value above all else a diverse portfolio of options, i.e. freedom. They haven’t specialized. They have no depth of expertise.

Then on NPR I hear this story. This guy took an oath to surf every single day. He’d taken the oath the last time there was a February 29th on a Sunday and he swore to continue until the next time it happened. Seems like a reasonable workaround for the problem of excessive choice.

This is one of my primary interests, i.e. the question of loyalty. Consistent behavior is one of the ways we provide a model for others that they can react to. Consistent behavior is one of the ways we simplify the near infinite cognitive load of moving thru day to day life. Consistent behavior is one of the cornerstones of durable collective action. Consistent behavior is what gives us culture, structures, standards. Consistency is the means we use to lower negotiation costs. It’s just too bizarre to pretend that the collective society could be renegotiating the entire social contract every few moments.

So I’m more than a little surprised that apparently philosophers have settled into the presumption that to “honor sunk costs” is a fallacy. Seems to me that the philosophers have become a bit too loyal to the catechism of the church of portfolio theory.

Finally I recall that when I visited Ireland one of the citizens cornered me wanting to know what I thought of the high divorce rate in the US. Much later I learned that the percentage of folks who are married in the US is substantially higher than that found in Ireland. (You may consider all this to be hearsay.) A fact that reminded me then and now of how the lack of a dominant church in the US seems to create a higher level of church membership than that found in countries with a more centralized church.

My take on all this is that clearly if you limit choice substantially (demanding very high loyalty) you get rigid cartoons of real groups. If you create extremely high levels of choice you get very transitory groups that never achieve any depth to their collective activities. It’s one of those not too hot, not too cold problems.

Bonus link. It appears that German has a word for Slovenly Peter.

Terror, Spain, Elections

I do hope that nobody models the example presented by the terror and election in Spain. While it seems implausible that anybody could have predicted beforehand the consequence of the terrorist act upon the election, now that we have one example people may be tempted to overgeneralize.

I wonder if there are procedures that could be introduced to buffer an election from such attempts to manipulate it. I can’t think of any off the top of my head.

Bleck.

Eroding the Garden Wall

Let me take another try at this problem of why firms might need to commoditize their own markets.

It appears to me that the high value is found in solving hard problems. The way we solve problems is that we bring together a collection of component bits that were not previously integrated. As we do this we complain: the design should be more elegant, the parts more modular, the guiding principles of the design less random, the coordination problems less tedious. These complaints are the symptoms of the hard high value problem.

As time passes the solution to the hard problem becomes better understood. Any number of processes drive this transition. Design knowledge emerges and becomes widely known. Workable modularity is distilled out of tangled spaghetti-like systems. The existence of exchange/interaction bottlenecks encourages standards to emerge. Rituals that lower coordination costs become better known.

This shift in a given problem domain from hard or tangled problem solving toward easier more modular problem solving causes two complementary shifts. The modularity enables a substantial increase in the size of the market along with an associated increase in total value generated. At the same time it knocks down the garden walls around the firms that built highly integrated solutions. It wreaks havoc on their discriminatory pricing.

All this is consistent with the insight that discriminatory pricing is likely when a firm

Out of Office

I sent email to a large private email list recently. I have gotten 20 “out of office” replies. This is dumb. The out of office robot ought not reply if the sender isn’t one of your correspondents for some reasonable definition of correspondent. How hard is that? For example, if you have never sent X email and never replied to X’s email, and never even participated in an email thread with X, and X isn’t in your address book, what does this robot think it’s doing sending X details of your personal life? There must be some really amusing stories of email robots revealing personal info.
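How hard is that? Not very, at least on paper. Here is a sketch of the rule, assuming the robot can see the owner’s address book and mail history; the MailHistory structure and the function names are hypothetical, not any real mail server’s API.

```python
from dataclasses import dataclass, field


@dataclass
class MailHistory:
    """What the auto-responder knows about its owner's correspondence (hypothetical)."""
    address_book: set = field(default_factory=set)
    sent_to: set = field(default_factory=set)              # people the owner has written or replied to
    thread_participants: set = field(default_factory=set)  # people on threads the owner took part in


def normalize(address: str) -> str:
    return address.strip().lower()


def is_correspondent(history: MailHistory, sender: str) -> bool:
    """One 'reasonable definition of correspondent': any prior relationship at all."""
    known = history.address_book | history.sent_to | history.thread_participants
    return normalize(sender) in {normalize(a) for a in known}


def should_send_out_of_office(history: MailHistory, sender: str) -> bool:
    # Strangers, including everyone posting to a big list, get silence.
    return is_correspondent(history, sender)


history = MailHistory(
    address_book={"friend@example.org"},
    sent_to={"boss@example.org"},
)
print(should_send_out_of_office(history, "Friend@Example.org"))
print(should_send_out_of_office(history, "list-poster@example.net"))
```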

Info Axioms – Metcalfe’s Law

Via Sam we find this site that is attempting to collect a set of fundamental information axioms. A very impressive set of people are involved. This enterprise is reminiscent of the wonderful book Information Rules as well as a number of other efforts such as those mentioned here.

I’m finding it fun to treat these axioms as a list of homework assignments. I.e. “Critique axiom #N”.

For example here is their first axiom:

Axiom 1 – Metcalfe’s Law

If there are n people in a network, and the value of the network to each of them is proportional to the number of other users, then the total value of the network (to all users) is proportional to n × (n − 1) = n² − n (Shapiro and Varian, 184).

Member aggregation is more important than the type or amount of resources owned (Hagel and Armstrong, 14).

Recognizing that a network’s value might in fact rise as O(n²) is another way of saying that there is a scale advantage to any system that has network nature; and it tries to give the reader some intuition about the magnitude of that advantage. N² seems like a lot of advantage.

Most of us encounter scale advantages first on the production side of things. A large car manufacturer is more efficient than a small one because he can share certain fixed costs (R&D, administration, accounting, whatever) more widely. Of course that kind of scale advantage isn’t N²; in fact it tends to drop off so that after a while the benefits that accrue from merging two huge automakers are pretty minimal.

What has come to bother me about Metcalfe’s law is that I suspect there are similar limits to its power; limits that arise from the limited attention and limited flexibility of the parties around the edge of the network. Each time you double the population on the edge of the network it will take me some time to find out that some of the new members should replace my current correspondents, since they are more valuable correspondents than my old ones. The new participants are, in effect, a sample of the universe of all possible correspondents. Additionally, there is an argument along these lines: at some point the sample becomes large enough that new samples do little to increase the value of my total set of correspondents.
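The sampling argument can be made concrete with a toy model. Assume, purely for illustration, that each potential correspondent’s value to me is an independent draw from a uniform distribution on 0 to 1, and that my attention only stretches to k correspondents. Then the expected value of the j-th best of n draws is (n − j + 1)/(n + 1), and the value of my best k is just the sum of those terms. The function name and the choice of k = 10 are my own.

```python
def expected_top_k_value(n: int, k: int) -> float:
    """Expected total value of my k best correspondents out of n candidates,
    assuming (as a toy model) values are i.i.d. uniform on [0, 1]."""
    keep = min(k, n)  # I can't keep more correspondents than exist
    return sum((n - j + 1) / (n + 1) for j in range(1, keep + 1))


k = 10
previous = None
for n in (10, 100, 200, 400, 800, 1600):
    value = expected_top_k_value(n, k)
    gain = "" if previous is None else f"  gain from growing the network: {value - previous:.3f}"
    print(f"n={n:5d}  value to me: {value:.3f}{gain}")
    previous = value
```

Under these assumptions my value can never exceed k no matter how big the network gets, and each doubling buys roughly half as much as the one before. That is a long way from every member’s value growing in proportion to n.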

My second concern about Metcalfe’s law is revealed by Reed’s Law, i.e. that the issue isn’t how many correspondent pairs are enabled by the network but rather the number of groups that the network enables to form. That number is truly mind-bogglingly large. So large that if these groups are actually able to form it suggests that the network is assured of triggering a total denial of service attack on our attention. A syndrome I’m sure some of us have noticed from time to time.
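To get a feel for the gap, here is a quick comparison of the two counts: the n × (n − 1) from the axiom quoted above against the number of possible groups of two or more members, which is 2ⁿ − n − 1. The function names are mine; the formulas are just counting.

```python
def metcalfe_value(n: int) -> int:
    """Pairwise count from the axiom: each of n users values the other n - 1."""
    return n * (n - 1)


def reed_groups(n: int) -> int:
    """Number of subsets of n members with at least two people in them."""
    return 2 ** n - n - 1


for n in (2, 10, 20, 40, 80):
    print(f"n={n:3d}  pairs-based: {metcalfe_value(n):6d}  possible groups: {reed_groups(n)}")
```

At ten members the groups already outnumber the pairwise count by more than ten to one, and at eighty the group count runs to twenty-five digits. Nobody has that much attention.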

I find Reed’s law a much more convincing model than Metcalfe’s. I find it a much more compelling model because I’m convinced that groups (i.e. graphs) are a more useful unit of conceptualization than links. Of course it too must have scale limiting syndromes that need to be understood.

My third problem with both Metcalfe’s law and Reed’s law is the manner in which they gloss over the issue of what the hell do you mean by “value.” Value doesn’t exist in the abstract. Value only exists in terms of some constituency.

If a conversation takes place, intermediated by the network, between P1 and P2, the value created there may accrue to P1 or P2; or it may accrue to any of the entities that deploy the network N1..Nn; or it may accrue to the groups G1..Gn that enclose those players. For the economist gazing down from his ivory tower it may be all well and good to just say “the economy,” naming thus one of these groups, or maybe the union of these groups; but down in the trenches these issues are central to the problem at hand.

For example, you can design a network that encourages value generation for different ones of those entities. It’s not good enough to just say the network will have value and therefore we should go get one. You really have to make choices about how to encourage or channel the value that emerges.

In the end that’s what’s wrong with Metcalfe’s law, important as it is. It implies that the value arises in the pairs, rather than say the groups. The model of value colors how you think about the network; and if you get it wrong you’ll be surprised in unfortunate ways. For example, I think Metcalfe’s law is about right for the telecom network, while Reed’s law is closer to right for the Internet. That’s one key reason why the telecom industry is burnt toast.

But, not to put too fine a point on it and remembering this is Axiom #1 for God’s sake, I think that Metcalfe’s law gets the order of the overall value too low, and that Reed’s law gets the order of the value too high. The right model is one revealed in the power-law distribution that emerges from networks, and that model’s controlling terms are those revealed in the slope and bounding box around that curve. Somebody more mathematically inclined than I will have to frame that in a manner that competes with Reed’s law and Metcalfe’s law on their own modeling terms.

Ah, that was fun … if you can’t rant in your own blog and all that…

I do love an ellipsis…

Ascription is an Anathema to any Enthusiasm

Ben Hyde
