Monthly Archives: January 2009

Off to NYC

Wife and I are off to NYC. Taking the bus down tomorrow (OK, I’m an idiot, since there is a snow storm predicted), and then returning late in the day on Saturday. It would be fun to meet up with people. If anybody would like to get together, please get in touch! There’s a Korean place on 36th near 5th that I love, and hopefully we will have dinner there on Friday night. We are also hoping to see Dan Hurlin’s Disfarmer tomorrow night, if the gods cooperate. We have a foolish plan involving ferryboats!

Some Logging Rules

I seem to be building logging infrastructure today.  I keep recalling one or another of the rules for playing this game.  Might as well try to put them down.

  • Who? – The speaker’s unique ID and type should be in each log line.
  • Transcript – The speaker’s utterances should have a serial number, so you can notice gaps.
  • Checksum – A running checksum is a big help in proving things.
  • When – The utterances should have a time stamp (daemontools multilog t is good)
  • Synchronize our watches – NTP is a must everywhere.
  • Breadcrumbs – Jobs/tasks/work-items/requests should have a unique ID that is threaded through the logs and across processes/modules/machines.
  • Health – All processes (machines, threads …) should emit a heartbeat; heartbeats should include some health indicators so other parties can notice when they expire or get sick.
  • Replay – Logs that enable a rebuild from the last snapshot will save your butt. Often you’re close, and only some minor optimization (truncating output, discarding binary info, say) is preventing it. I once rebuilt an entire source repository from years of mail to prove an intrusion had not touched the sources.
  • Syntax – It’s good if the logs are well tokenized, i.e. embedded strings are escaped; and character encodings are worked out.
  • Standardized – It’s good, but it’s hopeless. This is the worst case of the 2nd part of “Be strict in what you send, but generous in what you receive.”
  • Innumerable – You can’t lay an ontology over the space of exceptions. Accept that, and then proceed as usual.
  • Now – The sooner the log analysis takes place the better.  Don’t wait until your patient is in intensive care.  Analogy: test driven development.
  • Email – The accumulated headers in modern email are full of lessons learned
  • FSEvents – The asynchronous file system journaling/notifications (in BeOS, et al.) are worth looking at closely.
  • Fast – I tend to embrace that writing the log is not transactional or even particularly reliable so I can have volume instead.
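
A minimal sketch of a few of these rules (Who, Transcript, Checksum, When, Breadcrumbs) in bash. Everything named here (LOG_WHO, log_line, find_gaps, and the field layout) is made up for illustration, and the running checksum uses the POSIX cksum CRC, which is enough to notice accidental gaps but would not resist deliberate tampering.

```bash
#!/usr/bin/env bash
# Sketch only: variable names, function names, and field order are assumptions.

LOG_WHO="${LOG_WHO:-worker/$(uname -n)/$$}"  # Who: speaker type and unique ID (no spaces)
LOG_SERIAL=0                                 # Transcript: per-speaker serial number
LOG_CHAIN=0                                  # Checksum: running checksum over the lines so far

# log_line JOB_ID MESSAGE...
# Emits: timestamp, who, serial, running checksum, job id, message.
# Call it directly (not inside $(...)) so the counters survive between calls.
log_line() {
    local job_id=$1; shift                   # Breadcrumbs: ID of the work item
    local msg=$*
    local stamp
    stamp=$(date -u +%Y-%m-%dT%H:%M:%SZ)     # When: UTC timestamp
    LOG_SERIAL=$((LOG_SERIAL + 1))
    # Fold the previous checksum into the new one, so a dropped line breaks the chain.
    LOG_CHAIN=$(printf '%s %s %s %s' "$LOG_CHAIN" "$LOG_SERIAL" "$job_id" "$msg" \
        | cksum | cut -d' ' -f1)
    printf '%s %s %s %s %s %s\n' \
        "$stamp" "$LOG_WHO" "$LOG_SERIAL" "$LOG_CHAIN" "$job_id" "$msg"
}

# Reader side: given one speaker's lines on stdin, report gaps in the serial numbers (field 3).
find_gaps() {
    awk '{ if (NR > 1 && $3 != prev + 1) print "gap after " prev; prev = $3 }'
}
```

Something like log_line job-42 "fetched bundle" >> /var/log/example.log would append one line, and find_gaps < /var/log/example.log would flag any missing serial numbers for that speaker (the path is only an example).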

Geek culture and its counterculture sell-by date

This is great, and my first reaction is yes, yes.

For many cultural and artistic movements there is a “counter-cultural moment” when a previously obscure and out-of-the-way tendency suddenly gains prominence. It may be the surrealists in the ’30s, Bauhaus in the ’20s, the hippy movement in the late ’60s, punk in the late ’70s – each had a few years in which it was both important and yet still opposed to the mainstream. But that counter-cultural moment is short, often just a handful of years, and once it is over, the movement either becomes the mainstream or fades away. Only nine years after the summer of ’67 the hippies were a thing of the past and punk set itself up in direct opposition to the pompous and irrelevant dinosaurs that “progressive rock” bands had become.

“Geek culture” has had its counter-cultural moment. It’s over.

It is now eight years today since Wikipedia started, four years since blogging, ten years of Google. These formerly “hungry upstarts” are now the establishment. Netflix partners with Wal-Mart, Google partners with CBS, Amazon is bigger than any of those “establishment” bookstores it challenged and pushes around publishers with impunity. Geek culture is now mainstream. Google, Amazon, Netflix have passed their counterculture sell-by date.

But yet again. There are some distinctions to be teased out. For example, some of what is “geek culture” is indistinguishable from the techno-scientific-industrial revolution, and that firestorm just keeps going and going. Some of what is “geek culture” is the rise of groups (scale free) condensing in the network, and that just keeps happening. And while the cultural bit is all true, it’s bigger than that – touching all the social sciences.

On balance, and I’ve felt this for quite a while now, we went thru a real phase change when the Internet went mainstream. At the moment it transitioned it was a real oh-dear/oh-my-goodness moment. We are on the other side of that. There is plenty of remaining turbulence yet to be played out. Plenty of solid bits waiting to be liquidated. Some, like the newspaper industry or Walmart, are plenty big. But the big bang appears to be behind us.

Managing EC2 instances

I’ve been meaning to write this up for the last few months but I wanted to get the code into a shape I liked first. The way EC2 works, you spin up fresh virtual machines by doing: ec2-run-instances AMI-123 … and so the natural thing to do, if you need machines configured for different roles, is to take the time to stamp out a different AMI (aka Amazon Machine Image) for each role. That’s tedious. Alternately you can gin up an image which learns what role it should fill as it starts up. Some folks already build their AMIs to let you do that. For example the kind folks at Alestic provide assorted AMIs that include this feature: “On first boot, runs instance user-data script if it starts with #!”.

So here’s what I do. First I arrange to have the custom personality packed up in what I call a start-up-bundle. I put that someplace. In the example below it’s at: <http://example.s3.amazonaws.com/starters/start-up-example>. I prefer to have the file publicly available, so that fetching it is trivial. But since its content is private I encrypt the file. In the example below the password for decrypting the file is “stirringdullroots”.

To start a machine up and have it take on this personality I then do: ec2-run-instances ami-115db978 -f start-up-example-userdata …

This is what you’d find in the file start-up-example-userdata, i.e. the script which grabs the starter bundle, decrypts it, and then recurs into the bundle to let the personality take shape.

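#!/bin/bash
# Report and stop as soon as any step fails.
trap 'echo ERROR Bootstrapping failed; exit 1' ERR
echo "Bootstrapping from http://example.s3.amazonaws.com/starters/start-up-example"
# Make a failure anywhere in the fetch/decrypt/unpack pipeline count as a failure.
set -o pipefail
cd /tmp
# Fetch the bundle, decrypt it, and unpack it into /tmp/start-up-bundle.
curl -sf "http://example.s3.amazonaws.com/starters/start-up-example" | openssl enc -d -aes-256-cbc -k stirringdullroots | tar xf -
echo "Unpacked: $(find start-up-bundle | wc -l) files"
cd start-up-bundle
echo "Run start-up scripts for various modules"
# Each module directory carries its own start-up.sh; run it from inside that directory.
for i in */start-up.sh ; do
    MODULE=$(dirname "$i")
    echo "Starting module $MODULE: via $i"
    (cd "$MODULE" && bash start-up.sh)
    echo "Finished starting module $MODULE"
done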

So each start-up bundle has two parts: the userdata script, and the encrypted lump-o-personality.

Of course there are scripts for building start-up-bundles and other orchestration of the scheme, but that’s the heart of the story.
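
For the curious, the building side can be sketched as the mirror image of the decrypting pipeline in the userdata script. This is only a guess at the shape of such a script, not the real one: build_bundle and its arguments are hypothetical names, and uploading the result to S3 is left out.

```bash
#!/usr/bin/env bash
# Sketch: pack a directory of modules into an encrypted start-up bundle.
# build_bundle and its three arguments are illustrative, not the author's scripts.
set -o pipefail

build_bundle() {
    local parent_dir=$1   # directory that contains start-up-bundle/
    local password=$2     # same passphrase the userdata script hands to openssl -k
    local out_file=$3     # where to write the encrypted bundle
    # tar the bundle to stdout, then encrypt with the cipher the userdata script expects.
    tar -C "$parent_dir" -cf - start-up-bundle |
        openssl enc -aes-256-cbc -k "$password" -out "$out_file"
}
```

Called as build_bundle ./work stirringdullroots start-up-example, it would produce a file that the userdata script above can fetch, decrypt, and unpack. Because the archive is rooted at start-up-bundle, the unpacked layout matches what the loop over */start-up.sh expects.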

As you can see each machine’s personality consists of a set of modules.  I have a collection of standard modules for things like java, perl, erlang, tinydns, rabbitmq, sbcl, etc. etc.  One nice feature of this design is that it should be possible to make and share these with other folks.

Witches

Economists will appreciate that anthropologists have noted that scarcity is one of the reasons why old ladies sometimes transition from Nana, to crone, to shunned, to witch, and are then murdered. Food gets scarce and somebody’s got to go. I see that Google let go “about a 100” recruiters.

And, I see that Google is setting some products aside. I’ll miss one of these: catalog search. It was fun, take a peek before it’s gone.

Those that live on the bottom lands, out on the long tail, suffer the most when the storms come or times get tough. Easier on the conscience if you label them as unworthy. Of course declaring them witches is a bit over the top. There are other moves. For example you can set them free. Sort of like how polygamist societies shun their young men due to a shortage of wives, you can always donate your weaker products to the open source community (don’t forget the patents).

Some of those kids will thrive. Sometimes they get adopted by a stronger family. I wonder if the mashup editor team is getting folded into the Google App Engine team. But then I also wonder about the health of Google App Engine. I recently paged thru their example application catalog and it’s a lot thinner than I would have expected. There was almost nothing in there that made me think: “Yeah! I should blog this.” That is not good.

Kremlin watching.

“You Asked for It!”

Idiots. Comcast upgraded their email system overnight and now I have thousands of duplicate messages in dozens of mail folders. And so does everybody in my household. Guess who gets to clean up this mess?

The email announcing this wonder (Smartzone(tm) Communications Center) includes the phrase: “You Asked for It!” It would have been more polite of course if they had given us all fair warning before throwing this punch. You know, as in: “Ok buddy, you asked for it.” This is an abusive relationship. Their product managers must be really really happy about their switching costs.

Demand the Surprises

While doing a bit of work helping The Echo Nest get their developer network rolling I got to observe an amazing, outrageously cool example of what can happen when you open up your technology.

This bends one of my blogging rules: no blogging about the job.  This time it’s a consulting client.  But the gig is all done and I’m not revealing anything proprietary.

I treasure examples of why relinquishing control of your technology is a good move, because bewilderment is often the first reaction when I suggest it. And then, most technology owners don’t seem to like the explanation, which seems straightforward to me.

My favorite answer for why this can work: searching for cool applications demands skills and attitudes that the firm lacks. These are on the demand side. They are close to the problem the user needs to solve. I love this answer because it’s symmetric – scarcity on both sides. The firm should not hoard its options because the knowledge to act on those options is scarce.

You can frame this answer as a search problem.  Searching the option space created by the new technology requires all the usual stuff: capital, talent, knowledge, an appetite for risk, and intimacy with a high value problem.  Delegating the search problem to the third parties works well because they bring increased knowledge, because they understand the problem being solved.  The firm only understands the technology being applied.  The developer in your developer network brings a heightened appetite to solve the problem, because it’s their problem.  It is perversely fun to note that the 3rd party will take risks the firm would never take; they might be small, foolish, impulsive, or very large and self-insured.

This isn’t the only workable model for a developer network (there are, just to mention three: commoditizing, standardizing, and lead generating models).

But if this is the model you’re using you can begin to set expectations. A successful developer network must create surprises. If the search created by the developer network does not turn up some surprising applications of your technology it’s probably not working yet.

When it works the open invitation to use your technology creates a stream of surprises. Expect to be bewildered. Curiously the somewhat bewildering decision to relinquish control, if successful, leads to yet more bewilderment. But surprise comes in many flavors. You may be envious because the third party discovers some extremely profitable application of your tech, as Microsoft was when the spreadsheet and word processor emerged in their developer network. You may be offended, as some of us in Apache were when violent or pornographic web sites emerged in the user base. You may be disappointed, as I was when the market research showed that most spreadsheets had no calculations in them. You are often delighted, as I suspect the iPhone folks were when somebody invented a wind instrument based on blowing on the phone’s microphone.

Dealing with the innovations created in the developer network can be quite distracting. It’s in their nature, since the best of them take place outside the core skills of the firm. That means that comprehending what they imply is hard. Because of that I seem to have developed a reflex that treasures these WTF moments.

So, one aspect of managing a developer network is digesting the surprises. To oversimplify, there are two things the developers bring to your network: a willingness to take risks, and domain expertise. The first means that you often think, golly, that seems rash, foolhardy, and irresponsible.

Consider an example. It is very common to observe a developer building a truly horrible contraption. They use bad tools in stupid, even dangerous, ways. And just as you’re thinking “oh dear” they get a big grin on their contented face. If that happens inside an engineering team you’d likely take the guy aside to discuss the importance of craftsmanship. Or, if you’re a bit wiser, you might move him into sales engineering. That kind of behavior is not bewildering; it’s a sign of somebody solving a problem, creating value. Value today, not tomorrow. It’s a sign of an intense need. Now intense need is not enough to signal a high value product opportunity; for that you also want the need to be widespread. Once you get over yourself, and learn to appreciate the foolhardy, you can start to see that it is actually a good sign.

But developers don’t just bring a willingness to take risks. They can also bring scarce knowledge that you don’t have. I love these because it’s like meeting somebody at a party who’s an expert in some esoteric art you know nothing about. It’s a trip to a foreign country. It’s the best kind of customer contact: they aren’t telling you about their problems, they are revealing intimate information about how to deal with those problems. Like travel to a foreign land it is, again, bewildering.

We caught one of these last fall at The Echo Nest.  I love it because it is so entirely off in left field.

The folks at The Echo Nest have pulled together a bundle of technology that knows a lot about the world of music. They have given open access to a portion of that technology in the form of a set of web APIs. So they have a developer network. They have breadth of music knowledge because their tools read everything on the web that people are saying about the world of music. They have in-depth understanding by virtue of software that listens to music and extracts rich descriptive features about individual pieces. It is all cool.

The surprise?

Last fall Philip Maymin (a Libertarian, a theoretical finance guy, a hedge fund manager, and one of the developers in The Echo Nest developer network) figured out how to use the APIs to guide stock market investments. How bewildering is that! He started with a time series – hit songs – and ran the music analysis software on that series. He then gleaned out correlations between the features of those songs and market behavior. He reports that work in this paper: Music and the Market: Song and Stock Volatility.

It is a perfect example of how the developers in your network bring unique talents to the party.  I doubt that anybody at The Echo Nest would have thought of it.

I often get asked where the money is in giving away your technology in some semi-open system. The question presumes that hoarding the options the technology creates is the safest way to milk the value out of them. If you start from that presumption it’s a long march to see that other approaches might generate value. What I love about this example is how it is a delightful counterpoint to the greed implicit in that hoarding instinct. What’s a more pure value generator than a market trading scheme?

Of course now I’m curious: anybody got any examples of trading schemes based on the iPhone, Facebook, or Roomba platforms?

More Three of a Kind

These are some three of a kind examples accumulated over the years.  Much thanks to my various correspondents.  I ought to sort these out a bit.

Many are the top three of power-laws, e.g. Hertz, Avis, Budget. Once you’ve noticed that you can use any sharp power-law to generate three. For example the three top words in English: spoken: the, you, I; written: the, of, and; adjectives: other, good, new. And many of the geographic ones are like that, just forced onto the landscape: England, Scotland, and Wales.

Many are just regions along some natural scale: federal, state, city for example. There are lots that are on linear or cyclic time.

There are a number that are triangles; and then you can create a plane, or balance your three legged stool.  Ordering the triangle – you have three objects and you link them into a circle with arrows rather than mere lines.  Then you can play rock paper scissors; or polish your mirror.

There are a number that are pairs with a middle: buyer, seller, middleman; man, woman, relationship; in, out, door; etc. In this context I find the triple reflective, transparent, opaque thought provoking.

Apparently in some languages there are three words: one for a thing near me, a second for a thing near you, and finally a word for a thing distant from both of us.  But foreign languages are not my thing. In Japanese? – koko/soko/asoko

I’m particularly amused by this group:

  • solid, liquid, gas
  • ground, sea, air
  • army, navy, air force
  • missiles, subs, bombers
  • beast, fish, fowl

Well, here goes:

  • Race, Language, and Culture
  • Buyer, Seller, Middleman
  • fast, good, cheap
  • culture, structure, market
  • ethics, choice, rules
  • 3-D
  • ON, OFF, Don’t Care (1,0,X)
  • “Is cup half empty or half full?”  … “Who dirtied the glass?”
  • “That sword cuts both ways.” … “Ok, let’s talk about the sword.”
  • “Men, Women” … “Shall we talk of relationships?”
  • Moe, Larry, and Curly
  • Groucho, Chico, and Harpo
  • Knife, Fork, and Spoon
  • Bell, Book, and Candle
  • Lock, Stock, and Barrel
  • Butcher, Baker, and Candle Stick Maker
  • Father, Son, and Holy Ghost
  • Motive, Means, and Opportunity
  • Red, Green, and Blue; Cyan, Magenta, and Yellow
  • Bombers, Missiles, and Subs
  • Army, Navy, Air Force
  • Three legged stool
  • near you, near me, away
  • Animal, Vegetable, Mineral
  • Three wise men
  • Harry, Ron, and Hermione
  • Thesis, Antithesis, Synthesis
  • Bad things come in threes
  • Bowen’s systems theory
  • Learning: acquisitive, integrative, mastery
  • Three points define a plane
  • The summer triangle: Deneb, Vega, and Altair
  • Meat, Fish, Fowl
  • England, Scotland, Wales; New York, New Jersey, Connecticut
  • Id, Ego, Superego
  • Earth, Heaven, Hell
  • Liquid, Solid, Gas
  • Faith, Hope, Charity
  • See, Hear, Speak (no evil)
  • Fates: Clotho, Atropos, Lachesis
  • Liberté, Fraternité, Egalité
  • Life, Liberty, and the Pursuit of Happiness
  • Three trials or tasks in fairy tales
  • Three Musketeers
  • Past, Present, Future
  • Grammar, Logic, Rhetoric
  • Reading, ‘riting, and ‘rithmetic
  • Perl, Php, and Python
  • Equatorial, Temperate, Arctic
  • Right, Left, Center
  • King, Queen, Jack
  • Black, and White, and Red all over
  • Waltzes: 3/4 time
  • Thesis, Antithesis, Synthesis
  • Lather, Rinse, Repeat
  • small, medium, large
  • Ford, GM, Chrysler
  • Solid, Liquid, Gas
  • I, you, (he, she, it)
  • Earth, Wind, Fire
  • Breakfast, Lunch, Dinner
  • Gas, Brake, Clutch
  • Sharp, Flat, Natural
  • Preprocess, Compile, Link
  • Stop, Drop, Roll
  • paper, scissors, rock
  • Oceania, Eurasia, Eastasia
  • thesis, antithesis, synthesis
  • up, down, strange (and continuing, charm, truth, bottom-beauty)
  • Judaism, Christianity, Islam
  • right, wrong, nuanced
  • certitude, discourse, terrorism
  • Brahma, Vishnu, Shiva (creator, preserver, destroyer)
  • Urth, Verthandi, Skuld (the Norns, representing past, present, and future)
  • Executive, Legislative, Judicial
  • Breakfast, Lunch, Dinner
  • Tom, Dick, and Harry
  • Huey, Dewey, and Louie
  • City, Suburb, Rural
  • foo, bar, baz
  • shake, rattle, roll
  • reflective, transparent, opaque
  • soprano, alto, tenor
  • dna, rna, proteins
  • beauty, truth, form
  • less-than, equal, greater-than (<, =, >)
  • binary trees: left, right, ancestor or mother, father, child
  • over constrained, well posed, under constrained
  • chew, digest, defecate

DVD -> Streaming

About once a year I dabble with trying to set up a video streaming server so I can watch movies on my Mac, which doesn’t have a DVD reader on it. Usually this all falls apart because the debug loop is the scale of a DVD, the free tools are all full of personality, and the world of video encoding is quite confusing. This time I think I got it. So here are some tricks I picked up along the way.

Darwin Streaming Server isn’t hard to set up. I have it serving up files whose extension is m4v, which I assume stands for mp4 video.

I use HandBrake on the Mac to rip the DVDs. In fact I use HandBrakeCLI to rip them. A command like: HandBrakeCLI --preset Universal --input /dev/disk3 --output foo.m4v  rips and converts the DVD into a single file that is 1 to 2.5 Gig in size.

While that foo.m4v file can be served by Darwin Streaming Server you’re better off if you add so-called hints to the file. These are apparently nonstandard markup that Apple invented to make it possible to seek to arbitrary points in your movie. While there are signs that “QT Sync” can do hinting I ended up using mp4creator to do it. In addition to hinting you can optimize, so while I have no idea what that does, I do that as well. Hinting is a pain because you need to do one pass to hint the video track and another pass to hint the audio track. So you end up doing something like this: mp4creator --hint=1 "foo.m4v" ;  mp4creator --hint=2 "foo.m4v" ;  mp4creator -optimize "foo.m4v"

In that example the numbers 1 and 2 denote the track numbers to hint. Those are reasonably dependable, but to be safe you can use  mp4creator -list foo.m4v to be sure. I wrote a bit of perl to glean out the right numbers and gin up the commands to do the hinting.
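
The same gleaning can be sketched in shell and awk. This is not the perl mentioned above, just an illustration, and it leans on an assumption about the listing format: that mp4creator -list prints one row per track, with the track number in the first column and the track type (video, audio, hint, …) in the second. The function name hint_commands is made up.

```bash
#!/usr/bin/env bash
# Sketch: turn a track listing (on stdin) into the hinting commands for one file.
# Assumes each track row starts with "<number> <type> ..."; header rows are ignored.
hint_commands() {
    local file=$1
    awk -v f="$file" '
        # One hint pass per audio or video track; existing hint tracks are skipped.
        $1 ~ /^[0-9]+$/ && ($2 == "video" || $2 == "audio") {
            printf "mp4creator --hint=%s \"%s\"\n", $1, f
        }
        # Finish with the optimize pass.
        END { printf "mp4creator -optimize \"%s\"\n", f }
    '
}
```

Used as mp4creator -list foo.m4v | hint_commands foo.m4v, it prints the commands rather than running them, so you can eyeball them before piping the result to sh.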

Usually when you stick a DVD into a Mac the operating system launches the DVD player, but you can change that behavior in the CDs and DVDs Preference Panel. Using that I launch an AppleScript. Since I’m more fluent in shell scripting I have that turn around and launch a shell script. That AppleScript is derived from other examples I found around the net. The esoteric bits-o-clever in it are useful, so it appears below. It has an unfortunate flaw: the heuristic for guessing which disks are DVDs sometimes gets the wrong answer, but it’s good enough for now.

My shell script then: 1) runs HandBrakeCLI, 2) ejects the DVD, 3) hints and optimizes, and 4) moves the result into the appropriate directory for the Darwin Streaming Server. A little cgi script then enumerates what’s available to machines on my household’s local network.
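
Those four steps might look roughly like the sketch below. It is not the real snarf_dvd.sh: the function names, the DRY_RUN switch, the use of diskutil eject, the /tmp working file, and the default MOVIES_DIR (a guess at where Darwin Streaming Server looks for media) are all assumptions for illustration. It also hard-codes tracks 1 and 2 for hinting, as in the example commands above.

```bash
#!/usr/bin/env bash
# Sketch of the rip / eject / hint / publish steps. Names and paths are assumptions.
MOVIES_DIR="${MOVIES_DIR:-/Library/QuickTimeStreaming/Movies}"  # assumed media directory

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

snarf_dvd() {
    local dvd=$1                  # path handed over by the AppleScript
    local title out
    title=$(basename "$dvd")      # name the movie after the disk
    out="/tmp/$title.m4v"
    run HandBrakeCLI --preset Universal --input "$dvd" --output "$out" || return 1  # 1) rip
    run diskutil eject "$dvd" || return 1                                           # 2) eject
    run mp4creator --hint=1 "$out" || return 1                                      # 3) hint video
    run mp4creator --hint=2 "$out" || return 1                                      #    hint audio
    run mp4creator -optimize "$out" || return 1                                     #    optimize
    run mv "$out" "$MOVIES_DIR/" || return 1                                        # 4) publish
}
```

Setting DRY_RUN=1 before calling snarf_dvd /Volumes/SOME_DISK prints the six commands without touching the disk, which helps given how long the real debug loop is.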

 

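global dvdPath
try
    -- Heuristic: treat any disk smaller than 8 GiB (8.589934592E+9 bytes) as a DVD.
    tell application "System Events" to set DVDs to name of disks whose capacity is less than 8.589934592E+9
    set dvdPath to quoted form of (POSIX path of item 1 of DVDs)
    -- Hand the disk over to the shell script that does the real work.
    do shell script "/Users/bhyde/bin/snarf_dvd.sh " & dvdPath
end try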