Bloom filters are a special type of data structure that can be used to indicate whether a specific data pattern has been previously observed. The basic operation of a bloom filter is depicted in FIGS. 1a through 1c.
As observed in FIG. 1a, an input value 100 (e.g., a plurality of bits) is presented to the bloom filter 101. The input value 100 is then used as an input to N different hash functions 102_1 to 102_N. The output of each hash function corresponds to a location in a data store 103. Thus, the presentation of the input value 100 generates the identity of N different locations in the data store 103.
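By way of illustration only, the mapping of an input value to N locations may be sketched as follows. The function name hash_locations, the use of SHA-256 with a prepended index to obtain N different hash functions, and the data store size of 1024 are assumptions of this sketch rather than features of FIG. 1a.

```python
import hashlib


def hash_locations(value: bytes, n_hashes: int, store_size: int) -> list[int]:
    """Map an input value to n_hashes locations in a data store of store_size entries."""
    locations = []
    for i in range(n_hashes):
        # Assumption: hash function i is SHA-256 applied to the index i followed by the value.
        digest = hashlib.sha256(i.to_bytes(4, "big") + value).digest()
        # Reduce the digest to a valid location in the data store.
        locations.append(int.from_bytes(digest[:8], "big") % store_size)
    return locations


# Hypothetical input value and sizes, chosen only for the example.
locations = hash_locations(b"input value 100", n_hashes=4, store_size=1024)
```

The same input value always yields the same N locations. This sketch does not prevent two of the hash functions from occasionally producing the same location.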
As observed in FIG. 1b, each of the N different locations is then “looked up” from the data store 103. In a traditional implementation, each location in the data store keeps one bit of information (a P bit). The data store is originally initialized with all such bits being set to zero. Assuming input value 100 represents the first input value presented to the bloom filter 101 after its initialization, the lookup of the N different locations will produce N zeros (i.e., each looked up position in the data store 103 will present a zero). The bloom filter then proceeds to write a value of 1 into each of the N locations of the data store 103. In this case, all N locations will flip their storage from a 0 to a 1.
FIG. 1c represents the bloom filter at some later time when the same value 100 is again presented to the bloom filter 101. Execution of the N hash functions 102_1 through 102_N will cause the same N locations as previously identified to be looked up from the data store 103. This time, however, all N bits that are looked up will be equal to one (having been written into that state at the completion of the operation of FIG. 1b). All looked up bits being set equal to one signifies that the input value has been presented to the bloom filter previously.
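The sequence of FIGS. 1b and 1c may be sketched, under stated assumptions, as a lookup followed by a write. The class name BloomFilter, the method name present, the SHA-256 based hash functions and the default sizes are hypothetical and are used here only to make the example self-contained.

```python
import hashlib


class BloomFilter:
    """Minimal sketch of a traditional bloom filter with one bit per location."""

    def __init__(self, store_size: int = 1024, n_hashes: int = 4):
        # The data store is initialized with every bit set to zero.
        self.store = [0] * store_size
        self.n_hashes = n_hashes

    def _locations(self, value: bytes) -> list[int]:
        # Assumption: hash function i is SHA-256 over the index i followed by the value.
        return [
            int.from_bytes(
                hashlib.sha256(i.to_bytes(4, "big") + value).digest()[:8], "big"
            ) % len(self.store)
            for i in range(self.n_hashes)
        ]

    def present(self, value: bytes) -> bool:
        """Look up the N bits for value, then write a 1 into each of the N locations."""
        locations = self._locations(value)
        # All looked up bits equal to one indicates the value was presented before.
        seen_before = all(self.store[loc] == 1 for loc in locations)
        for loc in locations:
            self.store[loc] = 1
        return seen_before


bloom = BloomFilter()
first = bloom.present(b"value 100")   # lookup of a freshly initialized store
second = bloom.present(b"value 100")  # same value presented again
```

In this sketch, first is False because the data store was all zeros at the first presentation, and second is True because the first presentation wrote ones into the same N locations.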
Thus, where a system is sensitive to whether the same value has appeared previously, the bloom filter 101 can be used to identify whether or not a particular value has appeared before. According to the mathematical properties of a traditional bloom filter, it is possible that a lookup of N bits will yield all ones when in fact that input has not been presented before (“false positive”), because each of the N locations may have been set to one by other, different input values. However, a traditional bloom filter will not yield anything other than all ones if in fact the input value has been presented before (i.e., false negatives are not possible).
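The false positive and false negative properties described above may be illustrated with a deliberately undersized data store. The helper names, the store size of 8 bits, the use of 2 hash functions and the SHA-256 based hashing are all assumptions chosen so that the data store fills quickly; they are not taken from the figures.

```python
import hashlib

STORE_SIZE = 8   # deliberately tiny so that every bit is soon set to one
N_HASHES = 2


def hash_locations(value: bytes, n_hashes: int, store_size: int) -> list[int]:
    # Assumption: hash function i is SHA-256 over the index i followed by the value.
    return [
        int.from_bytes(hashlib.sha256(i.to_bytes(4, "big") + value).digest()[:8], "big")
        % store_size
        for i in range(n_hashes)
    ]


def add(store: list[int], value: bytes, n_hashes: int) -> None:
    """Write a 1 into each of the N locations for value."""
    for loc in hash_locations(value, n_hashes, len(store)):
        store[loc] = 1


def maybe_seen(store: list[int], value: bytes, n_hashes: int) -> bool:
    """Return True if all N looked up bits are one."""
    return all(store[loc] == 1 for loc in hash_locations(value, n_hashes, len(store)))


store = [0] * STORE_SIZE
presented = []
# Present distinct values until every bit in the data store has been set.
for i in range(10_000):
    if all(store):
        break
    value = f"item-{i}".encode()
    add(store, value, N_HASHES)
    presented.append(value)

# A value that was never presented still looks up all ones: a false positive.
false_positive = maybe_seen(store, b"never presented", N_HASHES)
# Every value that was presented still looks up all ones: no false negatives.
no_false_negatives = all(maybe_seen(store, v, N_HASHES) for v in presented)
```

Because bits are only ever changed from 0 to 1, a value that was presented can never later look up a zero, whereas a saturated or heavily loaded data store can report all ones for a value it has never seen.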