The subject matter discussed in the background section should not be assumed to be prior art merely as a result of its mention in the background section. Similarly, a problem mentioned in the background section or associated with the subject matter of the background section should not be assumed to have been previously recognized in the prior art. The subject matter in the background section merely represents different approaches, which in and of themselves may also be inventions.
A Bloom filter is a space-efficient probabilistic data structure that can be used to test whether an item is a member of a set of items. An empty Bloom filter is a bit array of m bits that are all equal to 0. The Bloom filter uses k different hash functions, each of which hashes one item to one of the m bits in the bit array. When an item is added to a set of items, each of the k hash functions is applied to the item to generate k hash values, and the bits that correspond to the k hash values are set to 1 in the bit array. Therefore, when n items are added to a set of items, the k hash functions are applied to each of the n items to generate n*k hash values, and approximately n*k bits are set in the array, provided that relatively few of the hash functions generate the same hash values for different items. After all of the additions of items to the set of items, the proportion of bits that are still equal to 0 may be approximated as (m−(n*k))/m, or 1−(n*k/m). Therefore, if any item that is not in the set of items is hashed by any one of the k hash functions, the likelihood that the resulting hash value corresponds to a bit that is already set in the array is approximately n*k/m, which is the proportion of bits that are set. To test whether an item is in the set of items, the k hash functions are applied to the tested item to generate k hash values, and the bits that correspond to these k hash values are tested in the bit array. If any of these tested bits is 0, the item is definitely not in the set of items, because all of the tested bits would have been set to 1 if the item had been added to the set of items. If all of the tested bits are 1, then either the item is in the set of items, or these tested bits have by chance been set to 1 during the insertion of other items, which would be a false positive match.
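For illustration only, the add and test operations described above may be sketched in Python as follows. The class name BloomFilter, the method names add and might_contain, and the derivation of the k hash functions by salting SHA-256 with an index are assumptions made for this sketch and are not part of the description above; any k sufficiently independent hash functions could be substituted.

```python
import hashlib


class BloomFilter:
    """Minimal illustrative Bloom filter: m bits, k hash functions."""

    def __init__(self, m, k):
        self.m = m              # number of bits in the bit array
        self.k = k              # number of hash functions
        self.bits = [0] * m     # an empty filter has every bit equal to 0

    def _positions(self, item):
        # Assumption: the k hash functions are simulated by prefixing the
        # item with an index i before hashing it with SHA-256.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item):
        # Set the bit that corresponds to each of the k hash values.
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # False: the item is definitely not in the set.
        # True: the item is possibly in the set (may be a false positive).
        return all(self.bits[pos] for pos in self._positions(item))
```

In this sketch, an item that has been passed to add always causes might_contain to return True, while an item that was never added causes might_contain to return True only if all k of its bit positions happen to have been set by other items.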
Since false positive matches are possible and false negative matches are not, a test of whether an item is in a set of items results in either the conclusion that the item is possibly in the set of items or the conclusion that the item is definitely not in the set of items. The more items that are added to the set of items, the larger the probability of false positive matches.
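The growth of the false positive probability with the number of added items may be illustrated with a short calculation. The sketch below uses the commonly cited approximation (1 − e^(−k*n/m))^k, which is not stated in the text above and which assumes that the k hash functions are independent and uniformly distributed over the m bits; the function name false_positive_rate is hypothetical.

```python
import math


def false_positive_rate(m, k, n):
    """Approximate false positive probability for m bits, k hashes, n items.

    Assumes independent, uniformly distributed hash functions.
    """
    # Expected fraction of bits set to 1 after n items have been added.
    fraction_set = 1.0 - math.exp(-k * n / m)
    # A false positive requires all k tested bits to be set.
    return fraction_set ** k


if __name__ == "__main__":
    # The estimate rises as more items are added to a fixed-size filter.
    for n in (100, 200, 400, 800):
        print(n, false_positive_rate(m=1000, k=3, n=n))
```

For fixed m and k, the estimate is 0 for an empty filter and approaches 1 as n grows, which is consistent with the observation that adding more items increases the probability of false positive matches.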