Parallax

David Huynh and I worked together on the Simile project (simile.mit) for the past few years. We burnt through our funding and many of us have moved on; he moved to MetaWeb. MetaWeb makes the horribly named Freebase. Freebase is this era’s version of AI knowledge representation (think frames). People these days tend to mumble Semantic Web, Wikipedia, and Social as well when they talk about these ideas.

At Simile we were focused on more formal collections, like library catalogs, and so our data sets tended to be more homogeneous than the sloppy mess you get in collections that emerge in open social contexts like Wikipedia. We were often stuck with collections with a limited assortment of item kinds (books, authors, subjects followed by a short tail … satellite photos … audio tapes …). Things get more “interesting” as the collections become more unruly.

David has a beautiful new demonstration of some of the ideas, often his ideas, about how to work on one of the tough problems in this space. What’s refreshing about David’s work is he’s found a portion of the problem space that isn’t cluttered with other people and prior work. Most of the Semantic Web work suffers from stepping on the toes of – or worse, ignoring – the excellent work done on knowledge representation over the last 50+ years. What David’s found is fresh turf to work on in the problem of how to allow users, mere mortals, to browse and search on top of these knowledge bases. That’s fresh because the AI crowd have always been so fixated on the singularity, rather than on helping people. Not a very social crowd, the AI guys.

Before I start musing about what David’s demo shows, to me, you really have to watch the video. The rest of this will make little sense without that.

The idea here is that if you have done a search resulting in a set of objects, you should be able to use that set to drive the next step in the search. In a sense this is the classic Unix pipe idea refreshed, where each linkage in the pipeline passes a set to the following step. Or if you’re of a more mathematical mindset, then each transform in the pipeline is a many-to-many mapping from items to items.
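To make the pipe analogy concrete, here is a minimal sketch in Python. It is my own toy illustration, not how Parallax or Freebase is implemented: the `TRIPLES` data, the relation names, and the `follow` and `pipeline` helpers are all made up for the example.

```python
from functools import reduce

# Hypothetical toy data: (subject, relation, object) triples.
TRIPLES = {
    ("Chicago", "has_building", "Willis Tower"),
    ("Chicago", "has_building", "Robie House"),
    ("New York", "has_building", "Guggenheim Museum"),
    ("Willis Tower", "designed_by", "Bruce Graham"),
    ("Robie House", "designed_by", "Frank Lloyd Wright"),
    ("Guggenheim Museum", "designed_by", "Frank Lloyd Wright"),
}


def follow(relation):
    """Build a step that maps a set of items to every item reachable via `relation`."""
    def step(items):
        return {o for (s, r, o) in TRIPLES if r == relation and s in items}
    return step


def pipeline(*steps):
    """Chain steps so each one receives the set produced by the previous one."""
    def run(items):
        return reduce(lambda current, step: step(current), steps, set(items))
    return run


# Slide from a set of cities, to their buildings, to the architects of those buildings.
cities_to_architects = pipeline(follow("has_building"), follow("designed_by"))
architects = cities_to_architects({"Chicago", "New York"})
```

Each step is many-to-many: one city fans out to several buildings, and several buildings can collapse onto the same architect, which is why the steps pass sets rather than single items.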

One of our problems in making these ideas work when we were at MIT was the lack of a sufficiently heterogeneous universe of items; you really need a vast universe with lots of item types. His cities/buildings/architects example, or his presidents/offspring/schools example, helps to demonstrate that. What Freebase, with its aggressive rolling up of things like Wikipedia, provides is a step toward that rich universe of items.

If your user is very sophisticated then a command line UI like Unix pipelines would be sufficient. You’d then give him various operators for mapping item sets into other item sets. He, being sophisticated, would then demand the ability to code up his own mappings (e.g. within 20 miles/years/generations/links etc.), and to annotate the items he is working with (e.g. scoring and statistics). And needless to say he’d want the pipelines to persist and generate feeds for other pipelines of various kinds.

The problem becomes much harder if you want to draw in the non-sophisticated mere mortals. What faceted browsing does, and what you can see here repurposed, is prompt the user with a reasonably clear signal as to what his next options are for moving forward in the search space. So if you’re shopping for lawn mowers, a faceted browsing UI can prompt with price categories, power sources, etc. to draw the user into the next step of the search. This is all well and good until you meet real world searches where the number of facets is huge and the number of options in each of them is even larger. Even if you pick something as dull as library books, the number of facets runs up into the thousands. Faceted browsing actually works best when the goal is to drive the user down the path to a single choice – i.e. shopping. No need to offer the shoe shopper a facet for labor practices used during manufacture, or materials used in assembly, or nation involved in the manufacturing.
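For readers who haven’t built one, the core of a faceted browser is small. The sketch below is my own toy version, assuming a made-up `MOWERS` catalog and hypothetical `facet_counts` and `narrow` helpers; it only shows how the “next options” prompt gets computed from the current result set.

```python
from collections import Counter

# Hypothetical catalog; names, facets, and values are invented for illustration.
MOWERS = [
    {"name": "M1", "power": "gas", "price_band": "$300+"},
    {"name": "M2", "power": "electric", "price_band": "under $300"},
    {"name": "M3", "power": "gas", "price_band": "under $300"},
    {"name": "M4", "power": "manual", "price_band": "under $300"},
]


def facet_counts(items, facets):
    """For each facet, count how many of the current items carry each value."""
    return {
        facet: Counter(item[facet] for item in items if facet in item)
        for facet in facets
    }


def narrow(items, facet, value):
    """Keep only the items whose `facet` equals `value` (one click in the UI)."""
    return [item for item in items if item.get(facet) == value]


# What the UI would offer before any choice has been made.
options = facet_counts(MOWERS, ["power", "price_band"])

# After the user clicks power = gas, the remaining options are recomputed.
gas_mowers = narrow(MOWERS, "power", "gas")
remaining = facet_counts(gas_mowers, ["price_band"])
```

With four items and two facets this is trivial. The trouble described above starts when the list of facets runs into the thousands and somebody has to decide which handful to show.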

David’s sliding plays the same card to help the user. He throws up a set of options for the user’s next moves. Note how the faceted browsing UI is on the left, and the sliding is on the right. Note also how rich, and therefore difficult to present, the options are for where to slide next. It’s admirable that the video doesn’t gloss over that – as illustrated by the example where he slides from offspring of presidents into educational institutions they attended.

I guess this note comes off sounding a bit cranky. Hopefully it won’t be taken that way. The problem is very hard and real. Search of this kind is going to be common for some class of users in the days to come. The more progress that can be made on making the UI accessible to mortals, the more widespread it will be. The brilliance of David’s (and Stefano’s, and David Karger’s) strategy in working through this problem has been to ground their search for solutions in demonstrations that draw in actual users and work on actual pools of data. It keeps them honest and it creates feedback loops they desperately need.

Ascription is an Anathema to any Enthusiasm

Ben Hyde
