A bloom filter is a space-efficient probabilistic data structure used to test whether an element is a member of a set. It was first introduced in B. Bloom, “Space/Time Tradeoffs in Hash Coding with Allowable Errors,” Communications of the ACM 13:7, pp. 422-426 (1970). The bloom filter is a succinct representation of a set of data, and is more efficient to operate on and/or electronically transmit than the data set it represents. An empty bloom filter is a bit array of m bits, all set to zero (0). The filter has k different hash functions defined, each of which maps or hashes a set element to one of the m array positions with a uniform random distribution. To add an element, the element is fed to each of the k hash functions to get k array positions, and the bits at all of these positions are set to one (1). To query for an element (test whether it is in the set), the element is fed to each of the k hash functions to get k array positions. If any of the bits at these positions is 0, the element is definitely not in the set; if it were, all of those bits would have been set to 1 when it was inserted. If all are 1, then either the element is in the set, or the bits have by chance been set to 1 during the insertion of other elements, resulting in a false positive. However, for certain data applications in which the bloom filter is used, occasional false positives can be tolerated in exchange for the operational and transmission efficiencies that flow from its use.
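The insert and query operations described above can be sketched in a few lines of Python. This is a minimal illustration only, not a reference implementation: the class name `BloomFilter`, the method names, and the choice to simulate the k hash functions by salting SHA-256 with an index are all assumptions made for the example. The description requires only k hash functions that distribute elements uniformly over the m positions.

```python
import hashlib


class BloomFilter:
    """Basic bit-based bloom filter with m bits and k hash functions."""

    def __init__(self, m, k):
        self.m = m
        self.k = k
        self.bits = [0] * m  # an empty filter: all m bits are 0

    def _positions(self, item):
        # Assumption: the k hash functions are simulated by prefixing the
        # element with an index i and hashing with SHA-256.
        return [
            int.from_bytes(hashlib.sha256(f"{i}:{item}".encode()).digest(), "big") % self.m
            for i in range(self.k)
        ]

    def add(self, item):
        # Set the bit at each of the k hashed positions to 1.
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # False means "definitely not in the set";
        # True means "in the set, or a false positive".
        return all(self.bits[pos] for pos in self._positions(item))
```

For example, after `bf = BloomFilter(64, 3)` and `bf.add("apple")`, the call `bf.might_contain("apple")` returns `True`, while a query against a freshly created filter always returns `False`.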
Assume, however, that a data set changes over time, whereby elements of the set can be inserted and/or deleted. Inserting elements into a bloom filter, as explained above, is easily accomplished by hashing the element k times and setting the resulting bits to 1. However, deleting an element cannot be accomplished simply by reversing the insertion process. If the element to be deleted is hashed and the corresponding bits are set to 0, a location that is also hashed to by some other element in the set may be errantly set to 0. In this case, the bloom filter no longer correctly represents all elements in the data set, and a query for that other element would wrongly report that it is absent. To address this problem, L. Fan et al., “Summary Cache: A Scalable Wide-Area Web Cache Sharing Protocol,” IEEE/ACM Transactions on Networking 8:3, pp. 281-294 (2000), introduced a counting bloom filter. In a counting bloom filter, each entry in the bloom filter is a counter rather than a single bit. Thus, when an element is inserted into the bloom filter, the corresponding counters are incremented, and when an element is deleted from the bloom filter, the corresponding counters are decremented. While it is beneficial that a counting bloom filter supports both element insertion and deletion operations, a counting bloom filter typically requires about three to eight times more memory space for storage than a basic (bit-based, as described above) bloom filter.
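The counter-based scheme can be sketched as follows. As before, this is an illustrative sketch under stated assumptions rather than the implementation from the cited paper: the class name `CountingBloomFilter`, its method names, the salted SHA-256 hashing, and the use of unbounded Python integers as counters are all hypothetical choices. A practical filter would typically use small fixed-width counters, which is the source of the additional memory cost.

```python
import hashlib


class CountingBloomFilter:
    """Bloom filter variant in which each entry is a counter, not a bit."""

    def __init__(self, m, k):
        self.m = m
        self.k = k
        self.counters = [0] * m  # an empty filter: all m counters are 0

    def _positions(self, item):
        # Assumption: k hash functions simulated by salting SHA-256 with i.
        return [
            int.from_bytes(hashlib.sha256(f"{i}:{item}".encode()).digest(), "big") % self.m
            for i in range(self.k)
        ]

    def add(self, item):
        # Insertion increments the counter at each hashed position.
        for pos in self._positions(item):
            self.counters[pos] += 1

    def remove(self, item):
        # Deletion decrements the same counters. The guard rejects elements
        # the filter reports as absent, so no counter drops below zero.
        # Removing an element that was never inserted but passes the check
        # (a false positive) would still corrupt the filter.
        if not self.might_contain(item):
            raise ValueError("element is not in the filter")
        for pos in self._positions(item):
            self.counters[pos] -= 1

    def might_contain(self, item):
        # A position counts as "set" when its counter is greater than zero.
        return all(self.counters[pos] > 0 for pos in self._positions(item))
```

Because a shared position holds a count of two or more, removing one element leaves the counters of the other elements that hash to the same position above zero, so those elements are still reported as present.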