Category Archives: open source

Ha!

This Slashdot comment reflects a common misunderstanding about how open source works…

“I just setup an Apache web server for use at home, and now I’ve got 4 Apache developers living in my basement. When they showed up, they said they were my Apache community overhead and I had to let them stay there. Oh, and I apparently have to feed them too!”

You do not need to feed them. If the demand for food is strong enough somebody will volunteer.

What is Social Software

There is a thread unfolding over here about this one-liner:

“The whole point of social software is to replace the social with software”

But the thread has descended into a who-said-exactly-what discussion that avoids the provocative nature of the standalone statement.

The statement is obviously true for some situations.

For example consider a bug database. These tools allow a group of people to engage in the work of resolving bugs more effectively by draining most of the social interaction out of the work. They enable a bug to be found by one actor and resolved by another without the two interacting socially at all, much the same way that doctors can treat disease without engaging in any particularly social interaction with their patients.

In a second example, look at a problem that arises in source control systems. Two individuals are hacking away and their changes happen to overlap. Mr. Speedy gets his changes into the source control system first. Mr. Methodical shows up later and discovers that his changes conflict with Speedy’s. There is nothing more commonly used as an example of the social than conflict resolution, and here we have exactly that problem in a tiny dishpan model. The conflict resolution that happens at this point might demand a social interaction, but we have discovered that in a surprisingly large number of cases it works out well to just dump the whole problem into Mr. Methodical’s lap and let him puzzle out a solution. In this case the software design has done exactly what the quote suggests: replaced the social with software.
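For the curious, the software’s solution is as blunt as it sounds. A merge tool in the CVS/RCS tradition just splices both versions into Mr. Methodical’s working copy between conflict markers, roughly like this (the file and revision labels are invented for illustration):

<<<<<<< foo.c (Mr. Methodical's working copy)
buffer = allocate(size);
=======
buffer = allocate(size + 1);
>>>>>>> revision 1.7 (Mr. Speedy's change)

No meeting is scheduled and no negotiation happens; the tool hands the later arrival both texts and walks away.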

Or consider the wiki. I stuff some useful content into the wiki. Another actor dives in and rephrases it into grammatical English. A third actor repairs a date I got wrong. That process is hyper-effective because none of the actors need to engage in a social interaction. Each actor bears only the cost of his contribution, and none of us have to orchestrate a social relationship with each other. Most people’s initial reaction to wikis is bewilderment because they are so extremely a-social. It takes a while before you discover that this can be a positive.

Social relationship creation and maintenance is costly. In numerous situations it is absolutely worth those costs. But that does not mean it should be automatically tacked onto every interaction. Fixing a bug, resolving a source code conflict, or touching up a wiki entry can be an opportunity to meet new people and make new friends; but they do not need to be forced to serve that function.

I was quite conscious of this when I added the “report a typo” link that appears below all my blog postings. I know that most people don’t complain about my many typos because doing so creates a delicate social dynamic. The form found under that link therefore doesn’t even prompt for an email address. I made the choice that I was more likely to get useful typo reports without the social aspect, and that was a better balance of design than improving my ability to say thank you to the people who provide typo reports.

That tiny example shows the kind of tuning of the social that systems of this kind enable.

A more accurate statement might say:

One exciting aspect of social software is the option of removing social aspects from the interactions.

If you want to get all big picture-ish … the whole point of the scientific revolution was the discovery that you could make amazing progress on some problems if you discarded all the important stuff. That how fast an object falls is not related to how much you love it, and the weather tomorrow is not related to your attitude toward the rain gods. Autism can be surprisingly useful.

One way to frame the problems social software is dealing with is to label them as coordination problems. The bug fixing, wiki refining, and source control conflict resolution are all coordination problems at their core. They all run the risk of reaching bogus outcomes if you drain off the social elements entirely. The system failures that arise when that happens are well known. For example there are libraries full of books on what happens when product development becomes divorced from the end user’s needs and situations.

What’s exciting about open source is that it lets you experiment with exactly how to set the knob on how much social you leave in the coordination scheme you deploy.


Via the typo link, this extremely insightful addition: “And social interactions often draw from a limited pool, so by removing the need for them with software, this pool can be conserved and applied to actions with a greater possible return.”

Reading Old Code

About 25 years ago Symbolics and Lisp Machines spun off from the MIT AI lab. MIT licensed the intellectual property around a machine known as the CADR, or MIT CADR, to these companies. This was a landmark event in the history of open source, one that clarified a lot of people’s thinking, particularly Richard Stallman’s.

Licensing the CADR community’s code to private interests shifted the incentives around that Lisp community. In effect MIT transferred ownership of the community, which it did not own, to private interests. Where previously the motivations for community members were intrinsic, social, and in the final analysis about common cause, afterward the motivations were explicit, commercial, and foremost monetary. The resulting boom and bust of the community was quite a ride, and just short of fatal.

Again and again over the last 25 years various parties have made a run at getting the MIT intellectual property licensing office to relinquish the rights to the CADR, to move them into the commons. At long last Brad Parker has succeeded. Go Brad!

Looking at the code from today’s vantage point is kind of recreational. Most of the files are from 1980. There is one file dated 1973, making it the oldest file on my machine whose date seems accurate.

Lisp machine instruction sets were sympathetic to the implementation of dynamically typed languages, with support for ephemeral garbage collection, exception handling, dynamic binding, etc. The CADR was microcoded, and the sources include all the microcode.

The 1970s marked the acceptance that people would get their own computer with a big display and related software. The code contains a complete window system.

While object-oriented programming runs back at least as far as Simula circa 1968, it didn’t really begin to win the day over efficiency until people found they were racing each other to see how fast they could build a window system. We still see people searching for the right cut point between the class system and primitive types. It’s interesting to see that the class system in the CADR appears to have treated all the types, all the way down, as first-class instances. But it looks like this code doesn’t use multiple inheritance!

When MIT did this, the larger Lisp community had lots of branch offices, each with its own dialect of Lisp. Common Lisp, the ANSI standard dialect that attempted to reduce that diversity and create a standard around which people could rendezvous, came later. The elegance of the Common Lisp dialect, particularly the beauty of its original specification, was an important part of what triggered my decision to switch over to Lisp.

The dialect of Lisp used for the CADR is interesting. Consider this routine that returns a tree of cons cells representing the class hierarchy:

;Returns a tree whose leaves are instances of CLASS-CLASS
(DEFMETHOD (CLASS-CLASS :CLASS-CLASS-HIERARCHY) ()
  (CONS SELF
        (COND ((EQ CLASS-SYMBOL 'OBJECT-CLASS) NIL)
              ((ENTITYP SUPERCLASS)
               (<- SUPERCLASS ':CLASS-CLASS-HIERARCHY))
              (T (MAPCAR (FUNCTION (LAMBDA (X) (<- X ':CLASS-CLASS-HIERARCHY)))
                         SUPERCLASS)))))

Methods were apparently invoked by the function “<-” and named using symbols from the keyword package, i.e. :CLASS-CLASS-HIERARCHY. In Common Lisp methods are simply functions, so where this code has (<- SUPERCLASS ':CLASS-CLASS-HIERARCHY), in Common Lisp you’d write (CLASS-CLASS-HIERARCHY SUPERCLASS).

In that routine the object the method is working on is lexically bound to the variable SELF, as are the fields (or slots) of the class instance. That kind of syntactic sugar has fallen out of favor; these days you’d have to bind those explicitly.
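For comparison, here is a rough sketch of how that routine might be phrased in modern Common Lisp with CLOS. The class definition is my guess at the shape of the data, not a faithful port of the Lisp Machine class system, and I’ve read ENTITYP as a test for a single superclass versus a list of them:

;; Hypothetical CLOS rendering of the CADR routine above.
(defclass class-class ()
  ((class-symbol :initarg :class-symbol)
   (superclass :initarg :superclass)))

;; The method is a plain generic function, so (<- X ':CLASS-CLASS-HIERARCHY)
;; becomes an ordinary call, and WITH-SLOTS makes the slot binding explicit.
(defmethod class-class-hierarchy ((self class-class))
  (with-slots (class-symbol superclass) self
    (cons self
          (cond ((eq class-symbol 'object-class) nil)
                ((typep superclass 'class-class)
                 (class-class-hierarchy superclass))
                (t (mapcar #'class-class-hierarchy superclass))))))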

I wonder about the quote on ':CLASS-CLASS-HIERARCHY; was it already vestigial?

The sources include some amusing litter. At the time MIT used a networking design known as Chaosnet, a variant of Ethernet. Apparently the physical addresses of hosts on that net were selected so they revealed the machine’s physical location along the cable: sort of how many nanoseconds your machine is from the terminator on the cable. I’d totally forgotten how fickle those early networks were.

I actually doubt that history would have unfolded very differently if MIT had relinquished the license back in 1980 rather than in 2005; but it’s a debatable point.

Update: John Wiseman writes: “I worry that the Lisp community’s fascination with the past is mostly pathology at this point.”

Absolutely true. All tiny ethnic enclaves have that problem. Members of a diaspora can and ought to say such things. Outsiders are best advised to mind their own business.

In Praise of Tweaking


In Praise of Tweaking: A Wiki-like Programming Contest by Ned Gulley, The MathWorks, Inc. is very very cool. The folks at MathWorks make a programming platform for scientists and engineers. So they have a developer network that they manage consciously. Like most developer networks they have contests. Contests can be great for generating some excitement around your platform. If done right they can cause the community to discover and reveal innovations that might have otherwise remained hidden.

This paper is neat because it shows a really cool hybrid of open source collaborative development, wikis, massively multiplayer games, and objective-based management techniques. (I think I deserve a very high score for that sentence of buzzword bingo!)

What they have set up is a way to run contests where the entries are like pages in a wiki. Moves in the resulting game consist of revisions to an existing entry, or possibly the creation of an entirely new entry. Because they can set up a simple score (say, number of lines and CPU time) each entry can be scored instantly. This creates a kind of stadium where players and spectators can rendezvous.
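To make the instant-scoring point concrete, here is a toy version in Lisp. The real contest metric is MATLAB-specific and more elaborate; the blend and the weight below are invented for illustration:

;; A naive score: smaller and faster entries win. Lower is better.
(defun score-entry (line-count cpu-seconds &key (time-weight 100.0))
  (+ line-count (* time-weight cpu-seconds)))

;; (score-entry 120 0.35) => 155.0

Because the score is purely mechanical, it can be posted the instant an entry arrives; no social process is needed to judge quality, which is what makes the stadium possible.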

Let’s look at some of the ways this solves canonical problems in the open source design space. For example one model of open source says that the work proceeds in two phases: the developer refines the existing code to achieve a benefit that is local, and then in a second phase he reveals the work publicly. Open source projects have trouble creating a climate where the second step happens. In this example you can’t play the game unless you reveal your work, so an incentive is created to reveal. But better yet there is an incentive to reveal as soon as possible, because somebody else might reveal faster. It creates both an incentive to reveal and a bias for action.

Unlike an eBay auction, where bidding last is best, this is a game where bidding early and often is the better strategy. Interestingly, the sooner you enter the game the more chance your entry will become the basis for later entries. Since entries show their pedigree, there is an interesting additional bias for action: early movers may become the basis for a large portfolio of entries. That large portfolio means you have an increased chance of being a contributor to the winning entry.

Another standard problem in open source is how to avoid having the project’s core group establish a membrane around the project that rejects contributions from the outside. This is the other side of how to encourage developers to reveal their contributions. The core group’s problem is how to balance the maintenance of various qualities (design, safety, limited forking, etc.) against the need to make the cost of revealing very very low for outside developers.

This example is a delightfully extreme case. The single naive metric used to compute the score becomes the only definition of quality, so that doesn’t need to be computed using social processes. The question of forking is solved by turning the knob to eleven: forking is the default. Every entry is a fork. At any moment in the game the entry that becomes best might evolve from any of the older entries. Crazy! It’s yet another example of how source control and the nature of information goods make these systems so different from real-world ones.

One of the questions around open source is how much of the value in an open source code base flows from the contributions of a few vs. from the many. People who think that status games are a key salient element of open source tend to presume that a few status-seeking contributors make the majority of the contributions. This is a question about the statistics of the contribution flows. I assume, and there is some data to support this, that the contribution flows are power-law distributed, i.e. they are highly skewed, with a few contributors doing a lot of work and a huge population of contributors doing small amounts of work. I also assume that the severity of the skew is due, in part, to how successfully the project solves the problem of lowering the barrier to revealing your work back to the core.
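To see what that skew looks like, here is a toy calculation; the contribution counts are invented:

;; Share of the total flow held by each contributor.
(defun contribution-shares (counts)
  (let ((total (float (reduce #'+ counts))))
    (mapcar (lambda (c) (/ c total)) counts)))

;; (contribution-shares '(400 120 40 15 10 5 4 3 2 1))
;; => in this made-up project the top two of ten contributors
;;    account for roughly 87% of the contributions.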

One sweet aspect of this paper is that given the simple metric of quality they use to score the game they can cough up some kind of answer to this question. The paper’s title “In Praise of Tweaking” gives it away. Most of the improvement in the final score emerged out of tiny tweaks. But that’s really too simple. I love this bit: “Long stretches of tweaking battles can be suddenly punctuated by dramatic shifts in the code. When one of these big shifts occurs, it also opens up fresh opportunities for tweaking, and swarms of curious competitors descend upon and begin tightening up the new leader.” That pattern happens, of course, in real open source projects, just not as quickly.

Well, read the paper! There must be hundreds of places around the edges of open source projects where these techniques could be tried.

Upgrade your Open Source License, Cash back on your cell phone bill!

This is a note about how to save a few hundred dollars on your Verizon cellphone bill, and why you should seriously consider switching from a BSD or old Apache-style license to the new, cooler Apache 2.0 license.

Standards reduce the diversity of behavior. Reducing that diversity creates efficiencies and frees up resources for other activities, other kinds of diversity. In some cases the efficiencies are huge, as in the example standard of driving on the right. In other cases the efficiencies are subtle, as in knowing somebody is in your tribe and can be trusted to share a stake in the tribe’s commons.

To get a feel for how diverse a range of behaviors appears in the real world it helps if you can get a statistical distribution. For example I’d love to know the distribution over various forms of greetings: the Quaker handshake, namaste, the high-five, etc.

Generally these distributions are power-law. The chart on the right shows the distribution of various open source licenses. It’s pulled from an earlier posting.

When a new kind of behavior appears on the scene you get a lot of diversity. People experiment with assorted approaches. Different people care about different things. Some people want a very short license. Some people want credit for their work. Some folks are concerned about protecting the commons. Other people want to encourage adoption. People revise sentences to make them more readable. Lawyers practice their craft, inserting proven boilerplate or looking out for whatever they happen to think is in their clients’ best interests.

These processes generate a lot of diversity, a lot of bogosity, and some innovation. Clearly the entire idea of an open source license was a huge innovation. The discovery that the license could protect the commons was huge. That licenses affect how your code creates and taps network externalities is still not fully understood and even less fully appreciated.

There is a lot of mimicry and random mutation. For example the Apache Group mimicked the license on BSD. A lot of people mimicked the Apache license. Some of those mimics just changed the name of who held the copyright, but a lot of them added, removed, or rewrote clauses for various reasons.

This early stage, the bloom of diversity, is followed by a period of consolidation. At one level that’s kind of sad. Some cool innovations die out; for example some of the rewrites that made the license more readable don’t survive. Some of the innovations fall by the wayside because they aren’t tied to the wagon of one of the big winners.

Some of it is good, very good. Craft knowledge accumulates. Interoperability is enabled. Resources aggregate around the winners. The good ideas are aggregated. The newer Apache license is a perfect example of this process at work. The new license may be a lot longer, which is sad, but it’s a lot more robust. It solves a number of important problems, problems that really need addressing. For example it is a lot more careful about protecting the code from malicious infection by contributor IP rights. It also solves some perfectly silly problems, like how to avoid having to put your entire license at the top of every source file.

It’s interesting how the revision of licenses is exactly like the problem of upgrading an installed base of software. All those licenses that mimic the older Apache license are like an installed base. It’s very hard to get them to upgrade. The classic metaphor for upgrading an installed base is: build them a golden bridge, and then set a fire behind them. I doubt anybody can implement that plan in the open source licensing world. I suspect people will try. But that metaphor is an interesting example of how a seemingly minor detail in the license in one time frame can become extremely valuable in a later time frame. It’s one reason that agreements between a firm and a consumer typically contain a clause that allows the vendor to casually change them later. I gather that Verizon recently changed their cell phone contract, and one fallout is that subscribers can bail without paying the early termination charges.

It is clear to me that people in the Apache or BSD licensing community would be well served by upgrading their licenses to the new Apache license. Just to be clear, that doesn’t imply assigning copyright to the foundation. The new license is just plain better than the old one.

The license is here and the FAQ is here.

Asterisk: The Greasemonkey of Telephony

I love it! GreaseMonkey as role model.

But it’s an excellent analogy. Asterisk is a cool platform for client-side hacking. I gather there was a time, long long ago, when all the innovation in telephony was taking place around PBXes being sold to corporations. I gather that bloom of innovation was rolled up and moved back into the central offices after a while.

There is another round of innovation happening around Asterisk.

Brian discussed some really creative uses, like an Asterisk-based system using a webcam and other components costing under $100 that will ring the line of your choice when it senses movement. Or one of his students, who’s built an Asterisk-based wakeup call system for and used by his peers. Or a system that pings connections for his wireless ISP, and after five timeouts places a call to the support technician with directions, a problem description, and more. Or a system that allows a business with offices in geographies as varied as Boston and Tokyo to not only route support calls automatically to the office that’s open, but allow for local calling between them.

Asterisk isn’t particularly friendly, but even so none of those is particularly difficult. I wonder if there is a web site like userscripts.org for Asterisk hacks; the closest thing is voip-info.org, but that’s more a developer support site than a hub of hacks.
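The wakeup call system gives a feel for how small these hacks can be. Asterisk will originate a call for any “call file” dropped into its outgoing spool directory; here is a sketch, with the trunk name, context, and paths standing in for whatever your own dialplan uses:

;; Hypothetical wakeup-call scheduler. Asterisk notices the file in the
;; spool, dials the number, and hands the answered call to the wakeup
;; context in the dialplan.
(defun schedule-wakeup-call (number)
  (with-open-file (out (format nil "/var/spool/asterisk/outgoing/wake-~a.call" number)
                       :direction :output :if-exists :supersede)
    (format out "Channel: SIP/mytrunk/~a~%" number)
    (format out "Context: wakeup~%Extension: s~%Priority: 1~%")))

In practice you’d write the file elsewhere and rename it into the spool so Asterisk doesn’t pick up a half-written file, but that’s the whole trick.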

Open Source Office

I see that Sun has set up an Open Source Office in a further attempt to bring some coherence to their strategy and tactics for relating to the open source phenomenon.

This kind of activity can be viewed from different frames. I, for example, haven’t the qualifications to view it thru the Java frame. But let me comment on it from two frames I think I understand pretty well.

Sun has done some reasonably clever standards moves over the years. As a technology/platform vendor the right way to play the standards game is to use it as a means to bring large risk-averse buyers to the table. Once you’ve got them there you work cooperatively with them to lower their risks and increase your ability to sell them solutions. Since one risk the buyers care about is vendor lock-in (and the anti-trust laws are always in the background) the standards worked out by these groups tend to be reasonably open. Standards shape and create markets. Open enables vendor competition.

This process is used to create new markets, and from the point of view of the technology vendor that requires solving two problems. First and foremost it creates a design that meets the needs of the deep-pocketed risk-averse buyers. Secondly it creates a market inside which the competition is reasonably collegial. The new market emerges when you get the risk perceived by all parties below some threshold.

Open source created a new venue, another table, where standards could be negotiated. Who shows up at this table has tended to be different folks with different concerns. That’s good and bad.

The open source model works if what comes out of the process is highly attractive to developers (i.e. it creates opportunities for them) and the work creates a sufficiently exciting platform that a broad spectrum of users show up to work collegially in common cause to nurture it.

The goals of the two techniques are sufficiently different that both approaches can use the word open while meaning very different things. It has been very difficult for Sun to get that. For example the large-buyer, risk-reducing, collegial-market-creating standards approach talks about a thing called “the reference implementation” and is entirely comfortable if that’s written in Lisp. The small-innovator, option-creating, collegial-common-cause-creating standards approach talks about the code base and is only interested in how useful it is as feedstock for the product they are deploying yesterday.

It’s nice to see that Sun has created an Open Source Office; it’s a further step in coming to terms with this shift in how standards are written and the terms that define the market are negotiated. But my immediate reaction was: “Where’s the C?” as in CTO, or CIO, etc.

What does the future hold? Will firms come to have a chief-level officer who’s responsible for managing the complex liaison relationships that are implicit in both those models of how standards are negotiated? I think so. This seems likely to become as key a class of strategic problems as business development, marketing, technology, information systems, etc.

Open source changes the relationship between software buyers and sellers. It has moved some of the power from firm owners and managers down and toward the software’s makers and users. But far more interestingly it has changed the complexity of the relationship. The relationship is less at arms length, less contractual, and more social, collaborative, and tedious.

This role hasn’t found a home in most organizations. On the buyer side it tends to be situated as a minor subplot of the CTO’s job, while of course the CIO ought to be doing some as well. On the seller side it’s sometimes part of business development or even marketing. That this role doesn’t even exist in most organizations is a significant barrier to tapping into the value that comes of creating higher-bandwidth relationships along the links in the supply chain.

This isn’t an argument about what the right answer is, because the answer is obviously some of both models. Some software will be sold in tight alignment with carefully crafted specifications, and CIOs will labor tirelessly to suppress any deviance from those specs. Some will be passed around in always-moving piles of code where developers and users will both customize and refactor platforms in a continuous dialog about what is effective. The argument here is about how firms are going to evolve to manage the stuff in the second category. That’s not about managing risk; that’s about creating, tapping, and collaboratively nurturing opportunities.

Giving it all away, just to get rich.

Paul Kedrosky writes about the source of superior investment returns: the key is to find the actionable investments where you disagree with the consensus. Certainty, say that gas prices will move in a direction different from the consensus estimate, is one thing; but then what action do you take?

In my noodle this got mixed up with a model of group forming I’ve been meaning to write up. It’s from a paper that Karim passed along. In this model the group has a consensus model of the world, while its members have their personal models. Call these vectors in belief space. Over time the group’s model shifts; if you are optimistic about the wisdom of crowds it shifts toward the truth. The member models also shift toward the group’s model via socialization, indoctrination, etc.
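A minimal sketch of those dynamics, with scalars standing in for the belief vectors and rates invented for illustration:

;; One round of the model: the group belief takes a damped step toward
;; the truth, and each member belief takes a damped step toward the group.
(defun step-beliefs (truth group members &key (group-rate 0.1) (member-rate 0.2))
  (let ((new-group (+ group (* group-rate (- truth group)))))
    (values new-group
            (mapcar (lambda (m) (+ m (* member-rate (- new-group m))))
                    members))))

Iterate it and everyone converges on the truth, with the members trailing the group: socialization and indoctrination as damped pursuit.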

It’s fun to draw analogies between that model and some of the other models (1, 2, 3) of groups.

More interestingly, though: one of the canonical dozen questions about open source is why members freely reveal, relinquishing ownership of what are presumably valuable bits of knowledge to the commons. You can ask that same question in an investment context.

Let’s say you are confident that we are about to transition over Hubbert’s peak and the price of oil will skyrocket. Do you reveal this knowledge to others? To first order the investment answer is no. Instead you go find actionable investments (say long-haul railroads on the upside and trucking companies on the downside) and then you just stand by and wait to get rich.

But investing is a funny thing. You get your prize when the consensus model comes into alignment with your investment. Predicting the movements of the consensus is what counts, and that’s only somewhat aligned with accurate predictions about reality. When calculating when you’ll be able to cash out the question isn’t “when will I be proven correct?” it’s “when will the crowd believe I’m correct?” The sooner that happens the better the investment opportunity.

Investments based on any number of certain events (peak oil, global warming, China’s currency, the US balance of trade, Iraq’s oil reserves, etc. etc.) are most highly leveraged just at the moment before the crowd begins to take them seriously.

For that reason you might freely reveal your more accurate model in the hope of accelerating the crowd’s phase transition out of its delusions.

One driver of free revealing can be an entirely selfish attempt to increase the net present value of your investments.

Mozilla Corp.

Cool, the Mozilla Foundation has budded off a commercial taxable subsidiary. I agree with Karim. This is a very exciting development.

We have seen numerous attempts by commercial firms to capture some of that open source magic. Most of these have come from people whose motives are principally commercial. Now there is nothing wrong with those motives, but they tend to color the attempts: the motivations that serve the establishment and stewardship of a rich open commons tend to move progressively (sic) to the back burner.

It is difficult to create a hybrid in the space between these two very distinct ethical frameworks. It is not entirely clear if one even exists. What is clear though is that a lot of people from the commercial side are searching really hard to find one. I’m always happy to see search parties heading out from the nonprofit side of the space.

This is a particularly important one though.

My bemused characterization of the driving force for most open source start-ups goes as follows: On the one hand we have free stuff! On the other hand we have rich CTO/CIOs! We will just stand in the middle and make money! It’s a plausible premise.

If you stick a firm into that gap there are a lot of other aspects to bridging between those two; it’s not just money. For example on the open side you have a high value placed on the creation of a huge pool of options, while on the commercial side you have a high value placed on minimizing risk and maximizing predictability. On the open side you have an enthusiasm for rapid release and adaptation; on the commercial side you’re required to synch up in tight lockstep with the buying organization’s schedules. On the open side the evolution of the project is a continuous negotiation among the project’s participants, a deep relationship, and participants are often locked in. On the commercial side the relationships are kept at arm’s length with contracts and specifications, and buyers strive to commoditize markets with multiple vendors, avoiding lock-in. I could go on.

There is an argument to be made that the CTO/CIO side of these businesses should adapt. I have no doubt that over time they will. For example I suspect that CTOs will adapt before CIOs. But it is always hard to shift an installed base. It’s obviously hard when you dig into all the APIs of a complex piece of software, like Microsoft Windows. But it’s even harder when you dig into the complex tissue of social webs. Changing the rules for how firms manage software isn’t easy. That’s why the CIO organizations will shift more slowly than the CTO organizations; one has a much more complex social web to adapt, at minimum a much larger one.

But back to the reason why the Mozilla move strikes me as important. It’s not just that I’m glad to see experimentation coming out of the open side of things.

Firefox is key. Installed base on the client side is key. To reach large swaths of market share the Mozilla community needs to solve a consumer marketing problem. That includes finding the ways and means to move the product down the existing distribution channels. Those channels are directly analogous to the gaps between the open source community and the needs of the CTO/CIO software users.

It’s my hope that the Mozilla Corp. can enable them to leverage those channels.

Just to mix the two examples together: consider how hard it is for a CIO to justify installing Firefox rather than IE, given how extensible it is. While for an open source guy that extensibility looks like opportunity, for the CIO it looks like increased risk and heightened support costs. An open source guy thinks Greasemonkey is cool; it makes the guys in the IT department quake in their boots. A variant of Firefox that addresses their concerns is a no-brainer. It gives the CIO access to the vibrant innovation around Firefox, but it allows him to limit the risks.

Exciting.

Digital Fountains in the Walled Garden

Bummer.

Returning yet again to the topic of how the producer of an information good can shift the distribution load out onto a swarm of consumers. The producer can package his content in ways that make it easier and more likely that the swarm can collaborate. For example he can distribute redundant copies of the data or provide lots of checksums.

The extreme case of this is sometimes called a digital fountain, like a water fountain. The producer sprays a stream of droplets, and if the consumer can catch a cup full then he can reassemble the original cup of content. And it turns out there are some very effective algorithms for enabling just that.

Here’s a short and simple survey paper on digital fountains (pdf).

…an idealized digital fountain should …

  • A source generates a potentially infinite supply of encoding packets from the original data. Ideally, encoding packets can be generated in constant time per encoding packet given the original data.
  • A receiver can reconstruct a message that would require k packets to send … once any k encoding packets have been received. This reconstruction should also be extremely fast, preferably linear in k.
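To make the property concrete, here is a toy of the idea in Lisp: each encoding packet is the XOR of a random subset of the k source blocks (small integers here), and a peeling decoder recovers blocks from whatever packets happen to arrive. This is not the patented codes; real fountain codes get their efficiency from a carefully designed degree distribution, where this sketch picks subsets naively:

;; Encode: a packet is (indices . xor-of-those-blocks).
(defun make-packet (blocks)
  (let* ((k (length blocks))
         (indices (or (loop for i below k
                            when (< (random 1.0) 0.3) collect i)
                      (list (random k)))))
    (cons indices
          (reduce #'logxor indices
                  :key (lambda (i) (aref blocks i))
                  :initial-value 0))))

;; Decode by peeling: any packet with exactly one unknown block reveals
;; that block; repeat until a pass makes no progress.
(defun peel-decode (packets k)
  (let ((known (make-array k :initial-element nil))
        (progress t))
    (loop while progress do
      (setf progress nil)
      (dolist (p packets)
        (let ((unknown (remove-if (lambda (i) (aref known i)) (car p))))
          (when (= (length unknown) 1)
            (setf (aref known (first unknown))
                  (reduce #'logxor (remove (first unknown) (car p))
                          :key (lambda (i) (aref known i))
                          :initial-value (cdr p))
                  progress t)))))
    known))

Catch packets until PEEL-DECODE fills every slot; with this naive degree choice that takes somewhat more than k packets, where the clever designs get impressively close to exactly k.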

Amazingly there are designs that come extremely close to that goal. But.

… Most of the work that has been described above has been undertaken by employees of or consultants with the company Digital Fountain, Inc. … the company has a number of patents issued and pending that appear to cover both the fundamental theory and specific implementation designs …

So, no network effect and this stuff will get designed into only those industrial standards that are pay to play. Damn.