The privacy of crowds

Most solutions to the RSS/Ping/Forward-chaining problem have privacy issues.

Consider the simple big-service-in-the-sky approach, i.e. introduce an intermediary. The intermediary does the polling on behalf of the users, and the users check in with the intermediary from time to time. The privacy problem with this approach is that the intermediary can build a model of each user’s reading habits. That’s a bummer.

Here’s an alternate design that tempers that problem, offered mostly just to show it’s possible.

Groups of users band together to form a crowd, and this crowd fills the role of the intermediary. The crowd collaboratively polls all of the crowd’s interests. Each node knows the full set of crowd interests, but it has no way to know who injected a particular interest into the crowd. Interests expire after a while, but interested nodes can re-insert them before that happens.
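To make the interest pool concrete, here’s a minimal sketch in Python. The names (InterestPool, insert, active, the ttl_seconds parameter) are all mine, made up for illustration, and it shows only one node’s local copy of the pool. How interests get gossiped between nodes without revealing who inserted them is the hard part, and it’s left out here.

```python
import time


class InterestPool:
    """One node's copy of the crowd's interests (hypothetical sketch).

    Only the feed URL and an expiry time are stored. Nothing records
    which member inserted an interest.
    """

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._expires = {}  # feed URL -> time at which the interest lapses

    def insert(self, url, now=None):
        """Add an interest, or refresh it if it is already in the pool."""
        now = time.time() if now is None else now
        self._expires[url] = now + self.ttl

    def active(self, now=None):
        """Drop lapsed interests and return the set still alive."""
        now = time.time() if now is None else now
        self._expires = {u: t for u, t in self._expires.items() if t > now}
        return set(self._expires)
```

A node that still cares about a feed just calls insert again before the timeout; since the pool keeps no author field, a refresh looks the same no matter who sent it.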

With two exceptions, each node runs much as it would otherwise: it randomly polls sites whenever its model of a site’s status gets too stale. First, it has a much larger cache, since it’s drawing sites from the union of the crowd’s interests. Second, it has the added work of maintaining that larger cache and keeping in sync with its neighbors.
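The polling rule can be sketched in a few lines of Python. Again this is only an illustration under my own assumptions: pick_site_to_poll, last_polled, and max_age are hypothetical names, and a real node would also need the neighbor sync, which is skipped here.

```python
import random


def pick_site_to_poll(last_polled, interests, now, max_age, rng=random):
    """Pick a random site whose cached status is too stale, or None.

    last_polled maps site -> time of the freshest status this node knows
    about. Sites that have never been polled count as infinitely stale.
    """
    stale = sorted(
        site
        for site in interests
        if now - last_polled.get(site, float("-inf")) > max_age
    )
    return rng.choice(stale) if stale else None
```

Choosing at random among the stale sites, rather than always taking the stalest, is one way to keep the nodes in a crowd from all hitting the same site at once.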

Once groups form, it’s fine, at least from a privacy point of view, for them to turn around and subscribe to the services of a large-scale intermediary like Feedster, Technorati, or Pubsub. Those updates can be pulled into the crowd in bulk. They can even be pushed to the crowd from the intermediary.

The engineering for a thing like this isn’t very complex. Reengineering the existing feed readers would be.

Ben Hyde