# Bloom Filters

A friend of mine saw an article on slashdot about the LOAF system. LOAF uses an algorithm called a "Bloom Filter", named for it's inventor Burton H. Bloom, that allows you to share your contact list in a social network kind of way. It's pretty dang cool by itself. But uppon further investigation of the algorithm, the bloom filter is just as interesting.

There are several examples of this algorithm in varuous languages such as perl and python but I couldn't find any in a .NET language so I created the following implementation as a starting point

public class BloomFilter
{
private BitArray _bits;
private SHA1 sha1 = new SHA1Managed();

private ushort[] GetIndexes(string key)
{
// TODO: make this call multiple hash functions or salt SHA1
byte[] hash = sha1.ComputeHash(Encoding.UTF8.GetBytes(key));

// SHA1 generates a 20byte hash (160 bits) so treat those
// 20 bytes as 10 16 bit integers.
ushort[] idxs = new ushort[10];
for (int i=0,j=0; i<20; i+=2,j++)
{
idxs[j] = (ushort)(hash[i]<<8);
idxs[j] |= hash[i+1];
idxs[j] = (ushort)(idxs[j] % _bits.Count);
}
return idxs;
}

public BloomFilter(int size)
{
_bits = new BitArray(size);
}

{
foreach (int idx in GetIndexes(key))
_bits.Set(idx, true);
}

public bool Contains(string key)
{
// check that each bit of the hashes
// are set in the filter
foreach (int idx in GetIndexes(key))
if (!_bits.Get(idx))
return false;
return true;
}
}

I know I'm kind of cheating by using only one SHA1 hash but hey, this is just the first shot at it.

Right off hand there seem to be hundreds of places that this could be used to speed up performance. Email opt out lists, spell checkers,  mmorpg games with lots of inventory items.

"My mind is aglow with whirling, transient nodes of thought careening thru a cosmic vapor of invention." - Hedley Lamarr in Blazing Saddles. :-)