Category Archives: General

Messages in Unusual Orders

I think I first saw this hack in an obfuscated programming contest, where it was used to implement the tape of a Turing machine. A Turing machine runs its tape both forward and backward. If you're simulating that, one obvious possibility is a doubly linked list, where each node holds a pointer to both the next node and the previous node.

So in node B you store a pointer to A and a pointer to C. The hack saves memory by xoring A and C together and storing that instead. This works because as you travel forward along the tape and arrive at node B you will have just left node A, so you can xor the stored A xor C with A and out pops C. You can do the same thing going in the other direction. Of course, if you arrived at node B via any other route the A xor C value is useless. For example, you lose the ability to delete B from the tape given a random pointer to it. I've never found a practical application of this hack other than saving space and obfuscating code. I've never run into an application other than a Turing machine where I wanted messages that could be read forward and backward; though now that I'm thinking about it, there might be some cases involving undo/redo.
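Here's a minimal sketch of the trick, just to make it concrete. It isn't from the contest entry; the names are made up, and since Lisp won't let you xor machine pointers the cells live in a vector and the "pointers" are plain integer indices. Everything else is the xor hack as described above.

    ;; Each cell stores only (logxor prev next); index 0 means "off the tape".
    (defstruct cell value link)

    (defun make-tape (values)
      "Build a tape from the list VALUES; cell i is linked to cells i-1 and i+1."
      (let* ((n (length values))
             (tape (make-array (1+ n) :initial-element nil)))
        (loop for v in values
              for i from 1
              for prev = (1- i)
              for next = (if (= i n) 0 (1+ i))
              do (setf (aref tape i)
                       (make-cell :value v :link (logxor prev next))))
        tape))

    (defun walk (tape start)
      "Traverse the tape from cell START, printing each value."
      (loop with prev = 0              ; we begin "off the tape"
            with here = start
            until (zerop here)
            do (let ((cell (aref tape here)))
                 (print (cell-value cell))
                 ;; the next index pops out by xoring the stored link
                 ;; with the index we just came from
                 (shiftf prev here (logxor (cell-link cell) prev)))))

    ;; (walk (make-tape '(a b c d)) 1) prints A B C D; starting the same
    ;; walk from the last cell reads the tape backward.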

In any case I was reminded of this by looking at some of the schemes used in peer to peer distribution systems. In some of those systems the message is fragmented by the publisher and scattered across the swarm of clients. They then collaborate to reassemble the message. It's better if you can arrange things so that the loss of a few fragments doesn't make it impossible to reassemble the message. A scarce fragment, like a patent on a key element of a standard, can frustrate the entire swarm. That makes the system brittle and raises the stakes for getting the coordination right.

The publisher can solve that. He can pump up his output and provide some redundant info. That removes the scarcity and relaxes the need for perfect coordination. Brittle no more. The easy idea is that he just distributes multiple copies of the content; but consider this trick, just as a very primitive example. The publisher breaks the content into two fragments A and B. He then ships three packets: A, B, and A xor B. Consumers can then reassemble the content given any two of the three packets.
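Here's that primitive example spelled out, just to make the arithmetic concrete; the fragment contents below are obviously made up.

    ;; Ship A, B, and (A xor B); any two of the three packets recover the third.
    (defun xor-bytes (x y)
      "Byte-wise xor of two equal-length octet vectors."
      (map 'vector #'logxor x y))

    (let* ((a #(1 2 3 4))
           (b #(9 8 7 6))
           (parity (xor-bytes a b)))      ; the A-xor-B packet
      ;; say packet B got lost: xoring the parity packet with A restores it
      (equalp b (xor-bytes parity a)))    ; => T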

Here are some links: used in streaming, used in file transfer, and a pretty picture of the example above.

Getting the publisher's content moved thru the swarm is the same problem as getting a message passed thru a lossy channel. That's a problem that's gotten a lot of attention for the last 50+ years. The world is full of lossy channels, so if you go look at any heavily used lossy channel you'll find tricks of the trade for solving this problem. Netnews, for example. When clueful people (and pornographers, of course) distribute large files thru netnews they use special archive formats that include a set of PAR or parity sets, redundancy that enables the consumer to reconstitute the archive given only a subset of the messages used to dispatch it.

52 Card Pickup

More on peer to peer. Recall that what's cool about peer to peer is that it allows a publisher to reach a bazillion consumers while only expending a few units of bandwidth. He shifts the distribution costs onto the consumers. It makes for a wonderful example of the classic standards problem – "If everybody would just…"

One aspect of this that I've been playing with is how the publisher might frame the game to encourage the swarm to collaborate. The BitTorrent technique of this kind is to fragment the content and scatter it into the swarm.

At that point the publisher sits back and says "Oh boys, you sort it out." It's a variant of 52 card pickup. Coordinating the resulting card game is the trick. In BitTorrent an entity known as the tracker helps along that coordination. There is a lot of design space to explore in how to orchestrate that coordination.

At the same time it's a special class of market making and clearing house design. Part of the cost borne by the consumers is the cost of keeping the market liquid. For example, can you design the system so it welcomes and forgives poorly behaving swarm members while at the same time holding onto the generous ones?

Headless Chickens

This is an amusing attempt to label the strategic approaches to open source available to a software vendor:

  1. The truly committed
  2. The mixed-codebase
  3. The pragmatics
  4. The anti-strategist
  5. The headless chickens
  6. The in-denial
  7. The anti-OSS

Open Source has upset the apple cart of consensus about how to run a software company. Just consider a handful of the sources of risk (from Porter's classic list). Open Source totally changed the nature of buyer power. Buyers can now engage much more deeply with the vendor, collaboratively evolving the product for joint gains. That change is rough for both sides.

The inputs to software firms have changed too, i.e. what Porter calls supplier power. The open developer community is a vast resource that you ignore at your peril. The software components you stand on have changed, and many are now open. Even if you avoid the viral power of the GPL, the culture of development has changed.

While not unique to open source (it's happening all across the information/knowledge industries), new entrants face much lower barriers today. That increases velocity as well as uncertainty.

In classical industrial frameworks an industry seems to sort itself out into a clear pecking order. It then settles into a kind of low-key rivalrous behavior along the lines of that ordering. Buyers and sellers rendezvous around the resulting argument.

In a disruptive period the rules aren't clear; the argument is always changing. And in businesses where a settled industry never emerges, the rules never become clear at all.

I'm amused that the list reflects an attempt to impose an ordering on the vendors, to frame the measures of quality for CIO/CTO buyers. As a piece of PR it's clearly biased toward the author's firm. That's typical; it's part of the argument that unfolds whenever an industry is in flux. What measures of quality will define the industry's pecking order after things get sorted out is the key issue during the disruptive phase. So of course every clueful vendor is desperate to make sure that whatever strengths they have become the leading attributes of quality.

Ice

New England used to ship ice all over the world. To store the product they built palaces in India. To assure they got nice clear ice, each fall they would dump poison into the lakes. In the winter they would cut blocks of ice and store them in huge buildings insulated with sawdust. Sometime after the industry collapsed those buildings burned in spectacular fires.

It's hot and muggy here. I got to thinking about using ice for air conditioning. Apparently you need a big 25-foot cube of ice. The phase transition between liquid and solid stores most of the calories. I gather that block has about 130 million BTU of cooling stored in it.

Converting that into the emerging standard unit of energy, the barrel of oil (5.8 million BTU each), that's about 22.4 barrels of oil. At $50 each that's roughly $1,120 worth of energy. How big is a barrel of oil?
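For what it's worth, here's the back-of-the-envelope arithmetic, using round figures (ice at roughly 57 lb per cubic foot, heat of fusion about 144 BTU per pound):

    (let* ((cubic-feet (expt 25 3))        ; a 25-foot cube
           (pounds     (* cubic-feet 57))  ; ~890,000 lb of ice
           (btu        (* pounds 144))     ; latent heat of fusion
           (barrels    (/ btu 5.8e6)))     ; barrels of oil equivalent
      (list :million-btu (round btu 1e6)
            :barrels     (round barrels)
            :dollars     (round (* barrels 50))))
    ;; => (:MILLION-BTU 128 :BARRELS 22 :DOLLARS 1106), close to the figures above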

I wonder how hard it would be to convert my garage into an ice house. Can you really be an eccentric if you don't have a grand scheme involving ice? Maybe the Tammany Ice Trust will rise from the grave.

Plausible Premise

In a posting over at Global Guerrillas there appears the phrase "plausible premise", like so: "Remember, al Qaeda (and to a lesser extent the US) set this new organizational structure in motion by providing a plausible premise for the war."

My first thought was that's a nice way to describe what you're doing in a startup. Such enterprises are held together by a plausible premise in spite of unlimited uncertainty and risk. For many years people's reaction to open source was that they found the entire premise implausible.

This seems like a phrase that would be fun to add to my toolkit. But you can get in trouble adopting a phrase casually. For example, while I love the phrase "bias for action" I was discomforted to discover its roots. So I poked around in various venues (Google, Google Print, Amazon's SIPS).

Apparently it lacks a strong bloodline. Sometimes it's used to describe the premise in the plot of a bit of fiction: "It's an all too plausible premise of what would happen if Earth was visited by a superior, technologically, alien race." It's used in philosophy to get a premise introduced early and casually. It is very occasionally used to describe a marketing process – "the email must create plausible premise that persuades the recipient to divulge personal information". There is something called plausible reasoning, an alternative to deductive reasoning. But I don't see any use of the term plausible premise in that vicinity.

Some enterprising airplane book author should domesticate it and breed up a purebred. Meanwhile, you find the weirdest phrases at Amazon: existential foothold.

Skype Developer Network

Skype is starting to flesh out their developer network. They hired a guy from Microsoft. Now they have a blog. I doubt it's a coincidence that the third posting is about the rules for using the Skype marks. In my experience developer network managers spend way too much time worrying that bone. Maybe soon they will have an RSS feed for it.

They have a really fascinating set of options for creating a platform: a significant footprint on the client side and some cool technologies (encrypted pairwise peer to peer streaming). It makes for a fascinating set of possibilities.

For example, there is no reason why their technology couldn't be used to stream data rather than voice. You can imagine factories where every machine is on Skype. Interested parties then subscribe to the machine's rich presence information and from time to time call it and dialog about its realtime status.

domain specific languages

Back in the 1970s when I first learned Unix, the key idea was that you built lots of clever tools which you'd then hook up in pipelines and shell scripts to solve larger problems. Tools like grep, sed, and awk were exemplars of this approach. Tools like lex and yacc (now flex and bison) made it trivial to invent new micro-languages for the problem at hand. It was common to work on systems with a dozen lexers and parsers. The Unix macro language m4 provides another scheme for getting a micro-language.

One reason I got into Lisp was that after building dozens of systems using those design patterns I began to yearn for something that would unify it all. Perl, which came later, is one example of that urge bearing fruit. Two things really got me interested in Lisp: symbols and macros.

Rainer Joswig put up a video (QuickTime via BitTorrent) showing a simple example of how you can use macros in Lisp to build a micro-language. It looks like he stumbled across an article illustrating the micro-language approach, and it made him sad to see how horribly complex the approach ends up being in other languages.

The example sets up a micro-language for parsing call records dumped from a phone switch. You get to watch a not-atypical Lisp coding session. He grabs some example data to play with. Then he grabs the text from the article that outlines the record formats. Thru the demo that text gets reformatted into statements in the new micro-language. The largest change to that text happens early, when he changes all the tokens so they conform to Lisp coding conventions.

He then writes a bit of code to parse one line. After testing that code he reformats it into a Lisp macro so he can stamp out parsers for each flavor of record. This is the cool part. For a non-Lisp programmer the going gets a bit rough at this point. The tricks of the trade for writing macros aren't explained, so it's probably hard to see how the macros work. For example, he uses something called "backquote" to create templates that guide the expansions. The backquote templates are denoted with nearly the smallest character available, the single backquote. When the templates are expanded they are copied, and as they are copied bits are filled in wherever a comma (and some other syntactic sugar) appears.
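To give a flavor of the idiom, here's a tiny illustration. This is not the code from the video; the record format and names are invented. The backquote marks a template, and each comma is a hole that gets filled in when the template is copied:

    (defmacro define-record-parser (name &rest field-names)
      "Stamp out a parser that pairs FIELD-NAMES with the fields of one record."
      `(defun ,name (fields)
         ;; ',field-names drops the literal list of keys into the template
         (loop for key in ',field-names
               for value in fields
               append (list key value))))

    ;; one statement of the micro-language expands, at compile time,
    ;; into an ordinary function definition
    (define-record-parser parse-call-record :caller :callee :duration)

    ;; (parse-call-record '("617-555-0100" "617-555-0199" 42))
    ;; => (:CALLER "617-555-0100" :CALLEE "617-555-0199" :DURATION 42)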

Lots of other things are glossed over, for example that when you compile code like this the macros are all expanded at compile time and don't necessarily even survive into the runtime environment.

But the final finished micro-language is small and concise, so it makes a nice introduction to the kind of programming I miss when working in other languages.

Thanks to Zach Beane for putting together the BitTorrent; it only took 7 minutes to download from 20+ peers in the swarm. Rainer Joswig's original post is here.

Meanwhile: can it really be true that there isn't a version of lex/flex that handles UTF-8 or 16-bit characters? That is too bizarre. It's as if the cornerstone of an entire culture of programming hasn't made the transition to the world of international data handling.

Robust scale-free networks

Scale-free networks are said to be robust. But they are robust only in a peculiar statistical sense. Consider, as an example, the road network. The roadways in most regions form a scale-free network. They are robust in the sense that if you select a random road in the network and remove it from service the effect on traffic is very slight. Obviously this is just a trick; the randomly selected road is typically going to be some minor dirt road way out on the tail of the distribution. The effect of its removal will likely go unnoticed until somebody decides to drive up there next fall to collect a load of firewood.

Random attacks are only one kind of risk systems need to guard against. Pick the right intersection and you can do plenty of harm to a region's road network. The random-failure model assumes that the source of trouble is uncorrelated with the topology of the network. Scale-free networks seem almost brittle if the failures are aligned with the hubs in the network.
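To make that concrete, here's a toy simulation. It's mine, not from the network literature, and the growth rule is the simplest preferential-attachment recipe I know: grow a network, then compare knocking out a random node with knocking out the best-connected one.

    (defun grow-network (n)
      "Adjacency lists for an N-node preferential-attachment graph (N >= 2).
    Each new node links to one existing node chosen with probability
    proportional to its degree, which yields the scale-free tail."
      (let ((adjacency (make-array n :initial-element '()))
            (endpoints (list 0 1)))    ; a node appears here once per edge end
        (push 1 (aref adjacency 0))
        (push 0 (aref adjacency 1))
        (loop for new from 2 below n
              for old = (nth (random (length endpoints)) endpoints)
              do (push old (aref adjacency new))
                 (push new (aref adjacency old))
                 (push new endpoints)
                 (push old endpoints))
        adjacency))

    (defun largest-component (adjacency removed)
      "Size of the largest connected component once node REMOVED is deleted."
      (let ((seen (make-hash-table)) (best 0))
        (setf (gethash removed seen) t)
        (dotimes (start (length adjacency) best)
          (unless (gethash start seen)
            (let ((size 0) (stack (list start)))
              (setf (gethash start seen) t)
              (loop while stack do
                (let ((node (pop stack)))
                  (incf size)
                  (dolist (next (aref adjacency node))
                    (unless (gethash next seen)
                      (setf (gethash next seen) t)
                      (push next stack)))))
              (setf best (max best size)))))))

    ;; (let ((net (grow-network 1000)))
    ;;   (list :random-failure (largest-component net (random 1000))
    ;;         :hub-attack (largest-component net
    ;;                       (position (reduce #'max (map 'list #'length net))
    ;;                                 net :key #'length))))
    ;; Typically the random failure leaves the network essentially intact,
    ;; while removing the hub carves off a visibly larger chunk.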

There are key intersections in the local road network that fail often, because traffic is perfectly aligned with the network's hubs and a simple fender bender can take the intersection out of service. It would be much worse than it is if the system didn't adapt. The failing hubs attract the attention of highway engineers, who adjust the traffic flows to reduce the incidence of traffic-related failures.

Various sources of failure rain down on systems like these. If the rain is totally random then the scale-free network is robust, because the hubs are hidden in a mist of far, far less critical nodes. If the threats are aligned with the topology of the network, like traffic on the roads, then the system can respond in one of three ways. It can add resources to the hub. It can change the topology of the network. Or it can go on the offense and try to stop the source of the failures.

For example, shift this over into a biological frame. Radiation rains down on the animal, continuously triggering numerous tiny failures, but the scale-free design keeps the chance of total system failure quite low. The body's various organs serve as hubs, and each one of them has evolved to accumulate redundancy and numerous special-case adaptations to deal with the problems that arise from the ebb and flow of the work they perform. And then there is always good hygiene.

But notice that changing the topology of the network isn't on that list. I suspect that's because I don't really know anything about biology. Maybe if you want to see topology change you have to shift from single species to ecologies?

It's hard for individuals to shift topology. The city may be at risk because it has only one bridge over the river, and the traffic engineer can say "you really need to build in some redundancy," but it is hard to get a redundant topology to emerge if the system has already condensed into a topology without one. It's particularly difficult if you're going to have to shift resources to make it happen: displacing people from their property, or taxing the efficiency of the existing system to fund the shift. For this reason systems tend to fail rather than adapt. It's notable that the scale-free topology is actually resistant to change.

It seems worth pointing out that there is a large section of the US highway system that isn't scale-free. The interstate highway system – built by military minds – isn't. That matters because if you fear an extremely smart attacker you may prefer to forgo some of the efficiency advantages of a perfectly scale-free network and instead design a system that is topologically more random. Maintaining that will be hard, though, because over time the system will evolve toward the scale-free pattern as it seeks the efficiency of hubs. In fact, over the decades the interstate highway system has become less a grid and more a scale-free network.

The assumption that adaptable systems can always respond to any problem is just plain false. We don't have two hearts, and we are unlikely to evolve a second one. It's only true when the attack on the system isn't fatal but merely painful enough to leave the system time to adapt. If the system is attacked at its hubs with sufficient vigor it will collapse.

Since scale-free systems appear in so many venues, these points apply widely. The military coming out of the Second World War was very sensitive to these issues, and the design of the interstate highway system reflected that: they traded efficiency for a more resilient topology. The design of the Internet was similarly laid out with that in mind, though much of that has since been lost. I suspect this is one reason why London's Circle Line, which forms a kind of hub, was a target of the terrorists' attacks.

Lousy Streaming

The point that most caught my fancy in this fun, ranting talk (RealPlayer) by the always interesting Andrew Odlyzko was one of his questions to the audience: why would you want to deliver a movie faster than real time?

He uses the Socratic method quite a few times during the talk, with the usual collateral damage that he loses control of the floor. The trick to getting past that problem is to enjoy the fun of trying to answer the question even though you're not there. Just ignore the other students :-).

When I walk back from the library with a DVD in hand I'm streaming that movie faster than real time. When you download an MP3 to your iPod you're moving it faster than real time. When an email message is deposited into my email client it's moving faster than it can be read or written.

His point is that much of the industry enthusiasm for streaming content is misplaced. It's enjoyably ironic to listen to that rant on a stream embedded in RealPlayer. If you're really lucky it will pause to fill its buffers in the middle of the part of the rant where he dismisses the argument that streaming content is amenable to property-rights protection.

This section resonated with me because I've been thinking a lot about streaming recently. And it sent me off thinking about the very idea of a stream. I get a stream of magazines and blog feeds into my life, for example. All those lumps of text and photos laid out on pages are a kind of buffering, a way of making something asynchronous rather than synchronous.

One thing I've been thinking about a lot recently is collaborative streaming, i.e. ways to coordinate the broadcast of a stream across a large pool of participants. I want to do that to lower the barrier to entry for the broadcaster by slightly raising the costs placed on the audience members. I want to do that to change the nature of the cloud thru which the broadcast transits, so that there is less power concentrated in a high-capacity hub. It shifts, in the design space, where the coordination, processing, and bandwidth problems are resolved: out of the hub and into standards. The exchange standards then orchestrate the broadcast instead of the broadcaster or an intermediary.

My strawman for this is to use swarming peer to peer techniques. The stream broadcaster atomizes the stream and distributes the droplets across a swarm of participants. They then exchange the droplets, much in the manner of BitTorrent, to reassemble the stream. Of course nothing is very new in that design.

The swarm can also provide the other features you want in a streaming architecture: time shifting, buffering, archiving, etc. I had fun puzzling about email lists from this point of view; might it be reasonable to shift to a model where email lists are distributed, archived, etc. via a peer to peer broadcast architecture? How would that be different from a group blog? It's common to see the most usable archives for mailing lists maintained by the community around the list rather than by the single point where the list comes together.

Part of his point in ranting about streaming (and I want to be clear that streaming was a very minor subplot in this talk) was how portions of the industry are caught in a set of interlocking delusions about what is important and thus what the future holds. The fixation on streaming content is codependent on the illusion that content is king, for example. The streaming enthusiasm is also codependent with quality-of-service arguments: if you're going to stream content you need very high quality of service.

That argument is surprisingly weak. Buffers are cheap. A lot of audience members aren't particularly interested in watching your content on your schedule – i.e. time shifting is the norm, not the exception. Dumb networks keep winning, so arguments that run counter to that trend are inherently suspect.

Which got me wondering: exactly when does high-quality streaming really matter? Two answers come to mind. There are the social reasons – there are things it's hard to do outside a crowd: applause, waving your lighter, choral singing, debate. Of course those arise from the coming together and don't strictly demand synchronicity. Then there are the options that expire. The betting window closes when the horse race starts. A PR person may want to nip a rumor in the bud sooner rather than later. The early bird catches the worm. Maybe it's only one reason; maybe these are both a question of managing what options for action you have.

High-quality streaming is a lot like colocation – you know, versus distributed work, outsourcing, etc. Dr. Odlyzko spends some time on how "distance is dead" is another industry delusion. So maybe he's trying to have it both ways.

(thanks to Paul for the pointer)