The first time I saw bloom filters used was in a spelling corrector. They provide a quick way to check whether X has any chance of being a member of a set, so they make a good, fast front end in lots of validation situations.
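For readers who haven’t met one, here is a minimal sketch of a bloom filter in Python. The class name, the default sizes, and the choice of SHA-256 as the hash are my assumptions for illustration; real implementations usually pick faster hashes and tune the sizes to the expected number of entries.

```python
import hashlib


class BloomFilter:
    """Illustrative bloom filter; sizes and hash choice are assumptions."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        # False means "definitely not in the set";
        # True means "possibly in the set" (false positives can happen).
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


address_book = BloomFilter()
address_book.add("alice@example.com")
print(address_book.might_contain("alice@example.com"))  # True: it was added
```

The asymmetry is the whole point: a "no" is certain, a "yes" is only a maybe, which is why the filter works as a front end rather than as the final check.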
The most recent time I saw bloom filters suggested for something was Loaf. The core idea in Loaf is that there is a lot of useful information in people’s address books that could help filter spam. The challenge is how to capture that information so it can do good in the world while protecting the privacy of all concerned.
I wrote a critique of Loaf here. The problem with Loaf is that you can gain some knowledge of what’s in your correspondents’ address books, and even some is probably too much. I just updated that posting to point out that there is a workaround for that. Here’s the workaround.
I am a member of a club. I reveal my address book’s bloom filter to the club. The club then strips off my identity and reveals the filter. Users of the filter don’t know it’s mine; they only know that it’s the filter of some random club member. If the trust distribution over members of the club is good enough (large and generally trustworthy), then the collected filters of a mess of members can go do useful work in the world.
This is similar to the idea of having a club generate an anonymous persona for a member, mentioned in this posting. Both ideas allow an individual to gain some anonymity by getting lost in the crowd of club members. In that sense it’s similar to the way that TOR provides anonymity by letting you fold your internet traffic in with the traffic of a crowd of other users (i.e. members of the TOR-using club).
The operation the club is performing on my address book’s bloom filter in the example above can be generalized. I pass a lump of data to the club along with proof of my membership in the club. I ask the club to sign the data and to state on that signature: treat this data as having been signed by some member of the club. That strips off my identity. If I didn’t want to strip off my identity, then I could have signed it myself, possibly using keys I got from the club’s certificate authority.
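The exchange described above can be sketched roughly as follows. Everything here is hypothetical: the `Club` class, the random membership proofs, and above all the use of an HMAC as a stand-in for the club’s signature. A real club would presumably use a public-key signature so that outsiders could verify it without asking the club.

```python
import hashlib
import hmac
import secrets


class Club:
    """Hypothetical club that signs data on behalf of 'some member'."""

    def __init__(self, name):
        self.name = name
        self._key = secrets.token_bytes(32)  # stand-in for a signing key
        self._membership_proofs = set()

    def enroll(self):
        # Hand the new member a secret that proves membership later.
        proof = secrets.token_hex(16)
        self._membership_proofs.add(proof)
        return proof

    def _tag(self, statement, data):
        message = statement.encode("utf-8") + b"\x00" + data
        return hmac.new(self._key, message, hashlib.sha256).hexdigest()

    def sign_as_some_member(self, proof, data):
        if proof not in self._membership_proofs:
            raise PermissionError("not a member of " + self.name)
        # The proof is checked and then dropped: the output names the club,
        # never the member.
        statement = f"signed by some member of {self.name}"
        return {"statement": statement, "data": data,
                "tag": self._tag(statement, data)}

    def verify(self, signed):
        expected = self._tag(signed["statement"], signed["data"])
        return hmac.compare_digest(expected, signed["tag"])


club = Club("example club")
my_proof = club.enroll()
signed = club.sign_as_some_member(my_proof, b"my address book's bloom filter")
print(club.verify(signed))  # True: the club recognizes its own signature
```

Note that the anonymity here is only as good as the club’s discretion and the size of its membership; the club itself still sees who submitted what.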
This becomes a privacy service of the club.
Consider this scenario. I join a library, anonymously. They give me a membership card, presumably some digital object signed by the library. I didn’t tell them who I am, or where I live, etc. I really value my privacy! But the library isn’t willing to let me borrow any books. They need to know two things first: do I live in town, and will somebody they trust vouch that I return my library books? So I go to the town’s privacy server and have it add an assertion to the library card that says “lives in town,” and I go to my university’s privacy server and have it add an assertion that says “four years a student, returned all books.” My true identity is still not revealed on the library card, but now they know all they need to know, and they are willing to grant me the right to borrow books.
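To make the library scenario concrete, here is a rough sketch of a card that accumulates assertions from privacy servers. All of the names (`Card`, `PrivacyServer`, `may_borrow`) are invented for illustration, and the HMAC tags are again a simplification standing in for real signatures: in this sketch the library has to ask each server to check its own tag, where a real design would let the library verify a public-key signature on its own. How each server convinces itself that the claim is true is outside the sketch.

```python
import hashlib
import hmac
from dataclasses import dataclass, field


@dataclass
class Card:
    card_id: str                                    # no name or address
    assertions: list = field(default_factory=list)  # (issuer, claim, tag)


class PrivacyServer:
    """Hypothetical server that vouches for a claim about a card holder."""

    def __init__(self, name, key):
        self.name = name
        self._key = key

    def _tag(self, card_id, claim):
        message = f"{self.name}|{card_id}|{claim}".encode("utf-8")
        return hmac.new(self._key, message, hashlib.sha256).hexdigest()

    def add_assertion(self, card, claim):
        # Bind the claim to this particular card so it cannot be copied
        # onto someone else's card.
        card.assertions.append(
            (self.name, claim, self._tag(card.card_id, claim)))

    def check(self, card_id, claim, tag):
        return hmac.compare_digest(self._tag(card_id, claim), tag)


def may_borrow(card, trusted_servers, required):
    """Return True if every required (issuer, claim) pair is verified."""
    verified = set()
    for issuer, claim, tag in card.assertions:
        server = trusted_servers.get(issuer)
        if server is not None and server.check(card.card_id, claim, tag):
            verified.add((issuer, claim))
    return required <= verified


town = PrivacyServer("town", b"town-demo-key")
university = PrivacyServer("university", b"university-demo-key")
trusted = {"town": town, "university": university}
required = {("town", "lives in town"),
            ("university", "four years a student, returned all books")}

card = Card(card_id="card-0001")
town.add_assertion(card, "lives in town")
university.add_assertion(card, "four years a student, returned all books")
print(may_borrow(card, trusted, required))  # True: both assertions verify
```

The card never carries a name; the library’s decision rests entirely on which issuers it trusts and which claims it requires.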