Previously I wrote up a sketch for how blog readers might guard their privacy by forming reading clubs. The club would reveal the union of what everybody is reading, but inside the club it wouldn’t be possible to discern what each member was reading. Last time I stated that engineering this wouldn’t be particularly difficult, but I didn’t reveal my sketch for how it might be done.
My idea is/was that the club members would form into a circle. A stream of encrypted traffic would arrive from the left and be passed on to the right. This traffic would be chat about club business, i.e. what blogs the club is interested in along with notices of the state of those blogs. For example an entry might state “The club is interested in blog vile.example.com as of Jan 12 2006.” or “The blog at embaress.example.com was checked at 11:14am Jan 21; it last changed at 2:37 Dec 17th.” If you record the assertions in the stream for a sufficient period of time you can form a complete model of the blog reading club’s interests and the state of the blogs it reads.
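To make the "record the stream and you get a model" step concrete, here is a minimal sketch in Python. The type names (`Interest`, `Status`, `BlogRecord`) and the function `build_model` are made up for illustration, and the years and the a.m. reading of "2:37" in the sample data are my assumptions; nothing here is a real protocol.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Iterable, Optional, Union


@dataclass(frozen=True)
class Interest:
    """'The club is interested in blog X as of T.' (hypothetical shape)"""
    blog: str
    as_of: datetime


@dataclass(frozen=True)
class Status:
    """'Blog X was checked at T; it last changed at U.' (hypothetical shape)"""
    blog: str
    checked_at: datetime
    last_changed: datetime


Assertion = Union[Interest, Status]


@dataclass
class BlogRecord:
    interest_as_of: Optional[datetime] = None
    checked_at: Optional[datetime] = None
    last_changed: Optional[datetime] = None


def build_model(stream: Iterable[Assertion]) -> Dict[str, BlogRecord]:
    """Fold the passing assertions into one record per blog, keeping the newest."""
    model: Dict[str, BlogRecord] = {}
    for a in stream:
        rec = model.setdefault(a.blog, BlogRecord())
        if isinstance(a, Interest):
            if rec.interest_as_of is None or a.as_of > rec.interest_as_of:
                rec.interest_as_of = a.as_of
        else:
            if rec.checked_at is None or a.checked_at > rec.checked_at:
                rec.checked_at = a.checked_at
                rec.last_changed = a.last_changed
    return model


# Sample data; the years (2005/2006) and the a.m. time are assumptions.
model = build_model([
    Interest("vile.example.com", datetime(2006, 1, 12)),
    Status("embaress.example.com",
           checked_at=datetime(2006, 1, 21, 11, 14),
           last_changed=datetime(2005, 12, 17, 2, 37)),
])
```

Anyone who listens long enough ends up holding the same `model`, which is the point: the stream is just the club's records in motion.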
My presumption was that individual club members would inject their interests into the stream. This turns out to be harder than I thought. If I inject my interest in lame.example.com into the stream my upstream neighbor can only tell that the club he’s a member of has increasingly lame interests, but if he collaborates with my downstream neighbor then he can pin the blame on me for that. Not good. Ben Laurie pointed this out to me.
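The collusion is easy to sketch. Assume, purely for illustration, that my two neighbors can each see the assertions in plaintext and that assertions are comparable strings; the function name `attribute_injections` is hypothetical. Whatever my downstream neighbor received that my upstream neighbor never sent must have come from me.

```python
from collections import Counter
from typing import List


def attribute_injections(sent_by_upstream: List[str],
                         received_by_downstream: List[str]) -> List[str]:
    """Return the assertions that appeared between the two colluding neighbors.

    A multiset difference is used so that a repeated assertion injected by
    the member in the middle is still noticed.
    """
    extra = Counter(received_by_downstream) - Counter(sent_by_upstream)
    return sorted(extra.elements())


# What my upstream neighbor handed to me.
upstream = ["interest:vile.example.com", "status:embaress.example.com"]
# What my downstream neighbor got from me.
downstream = upstream + ["interest:lame.example.com"]

blamed_on_me = attribute_injections(upstream, downstream)
```

Either neighbor alone learns only that the club's interests changed; it takes the pair of observations to pin the change on the member sitting between them.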
The full set of assertions collected by listening to the passing stream is a substitute for a centralized club house where the club keeps its records. That club house is a substitute for the ping aggregation service, i.e. the intermediary the club was meant to avoid. The whole point of this exercise is to hide the reading interests of individual club members from the intermediary.
Ben’s somewhat spontaneous suggestion for how to organize this club is to build a club house but run it so the individual members’ interactions are kept anonymous. Systems like TOR or the anonymous email remixing system illustrate how to let the members communicate with the club house anonymously.
The club house could be a simple web site that enumerates all the blogs the club is keeping an eye on. It drops blogs off the list if nobody signals an interest in that blog for a period of time. Club members randomly poll blogs on the list and report what they find back to the club house, including the RSS feed should it have changed. When a member wants to read his blogs he syncs his copy of the club house data. This synchronization can be done in public if the club member is willing to reveal that he is a member of the club. Of course he should synchronize the full database from the club house, since otherwise he’d reveal his peculiar interests. By establishing an anonymous connection to the club house, à la TOR, the member could avoid pulling down the entire database.
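Here is a rough in-memory sketch of that club house, just to see how few moving parts it has. Every name in it (`ClubHouse`, `signal_interest`, `interest_ttl`, and so on) is invented for illustration, the 30 day expiry is an arbitrary assumption, and the anonymity layer is left out entirely; in a real system every call would have to arrive over an anonymous channel.

```python
import random
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Dict, Optional


@dataclass
class Entry:
    last_interest: datetime
    feed: Optional[str] = None          # last RSS feed reported by a member
    checked_at: Optional[datetime] = None


@dataclass
class ClubHouse:
    interest_ttl: timedelta = timedelta(days=30)  # assumed expiry period
    entries: Dict[str, Entry] = field(default_factory=dict)

    def signal_interest(self, blog: str, now: datetime) -> None:
        """A member (anonymously) says somebody still cares about this blog."""
        entry = self.entries.setdefault(blog, Entry(last_interest=now))
        entry.last_interest = now

    def expire(self, now: datetime) -> None:
        """Drop blogs nobody has signaled interest in for interest_ttl."""
        stale = [b for b, e in self.entries.items()
                 if now - e.last_interest > self.interest_ttl]
        for blog in stale:
            del self.entries[blog]

    def pick_blog_to_poll(self, rng: random.Random) -> Optional[str]:
        """Members poll a random blog, so polling reveals nothing about taste."""
        return rng.choice(sorted(self.entries)) if self.entries else None

    def report(self, blog: str, feed: str, now: datetime) -> None:
        """Record a member's poll result; keep the feed only if it changed."""
        entry = self.entries.get(blog)
        if entry is None:
            return
        entry.checked_at = now
        if feed != entry.feed:
            entry.feed = feed

    def full_sync(self) -> Dict[str, Entry]:
        """Hand back a copy of everything; a partial sync would leak interests."""
        return {b: Entry(e.last_interest, e.feed, e.checked_at)
                for b, e in self.entries.items()}


house = ClubHouse()
t0 = datetime(2006, 1, 12)
house.signal_interest("vile.example.com", t0)
house.signal_interest("lame.example.com", t0)
house.signal_interest("vile.example.com", t0 + timedelta(days=20))
house.expire(t0 + timedelta(days=40))   # lame.example.com has gone stale
```

The only operation that is safe to do in the open is `full_sync`, because it is the same for every member; `signal_interest` is the one that most needs the anonymous connection.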
While last time I wrote that engineering a system like this is straightforward, this time I’m less confident of that.
I’m not particularly happy with the introduction of a central club house into the design. Who’s going to volunteer for that thankless task? I’d rather liked the idea that the club members were all asked to carry the same proportion of the load. But now I’m thinking that the streaming around the circle approach is just a scam for relocating the club’s records, and that I’ve gotten myself out on a limb.
Designing distributed anonymous peer-to-peer databases that enable clubs like this to form looks like a more meaty design problem than I expected. While I’m sure that’s fun for some folks, it’s a bit of a barrier to making progress on the problem I care about.