Today's computer systems are constantly being pushed to achieve ever-greater system performance. Computer engineers and software developers leverage a variety of techniques and approaches to increase performance. For example, software engineers may expend considerable effort optimizing computer code by utilizing space-efficient and/or time-efficient data structures and algorithms for solving particular computing problems. Research into general solutions (e.g., data structures, algorithms) aimed at solving commonly arising software problems has been prolific, though more solutions are regularly needed as new problems arise.
Meanwhile, rather than optimizing for a particular application, computer architects push system performance by concentrating on producing hardware that can execute more instructions in less time. Much of this effort has focused on exploiting instruction-level parallelism in applications. For example, over the years, computer processor speeds have been increased by utilizing deeper instruction pipelines, out-of-order and/or speculative instruction execution, effective branch prediction, and various other techniques. By exploiting instruction-level parallelism, computer systems may effectively mask the effects of high-latency instructions on system performance. However, more efficient techniques for exploiting instruction-level parallelism in computer architectures are needed.
In addition to producing techniques for exploiting instruction-level parallelism, in recent years, system architects have also concentrated on producing systems capable of exploiting thread-level parallelism, such as by utilizing multi-threaded and/or multi-processor systems. A multi-processor system may comprise multiple physical and/or logical processors, each capable of concurrently executing a different thread of instructions. In some systems, multiple concurrent threads may each access and/or operate on common memory locations (i.e., shared memory). For correct execution, such systems may require various concurrency control mechanisms, which may limit thread-level parallelism in some instances. Much research in recent years has concentrated on developing concurrency control techniques for ensuring correct program execution while also maximizing the amount of thread-level parallelism exposed to the system. For example, transactional memory is one such concurrency control technique. By exposing more thread-level parallelism, a multi-threaded system may be able to execute more efficiently and thereby increase performance. However, further optimizations for concurrency control techniques, such as transactional memory, are required.
One common problem faced by both software engineers and computer architects in implementing high-performance systems is that of quickly testing set membership. For example, a system may be configured to observe a series of values over time and then, given a query value, quickly and efficiently determine whether the query value was among the observed values (i.e., whether the query value is a member of the set of observed values).
Some systems may solve such a problem by employing a “Bloom filter,” which is a probabilistic data structure for testing set membership. Traditionally, a Bloom filter defines a binary array (initialized to all 0's) and K hash functions, each configured to output hash values that are valid indices into the binary array. A Bloom filter may “observe” a given element (of a set) by performing an insert operation on the given element. The insert operation includes calculating K indices into the binary array by applying each of the K hash functions to the element and using the output of each hash function as a separate index into the binary array. The insert operation then ensures that the binary array holds a value of 1 at each of the K indices. To determine whether a given element is in the set, a Bloom filter may be configured to again calculate K indices into the binary array by applying each of the K hash functions to the element. If the binary array is 1 at all of the K indices, then the Bloom filter determines that the element may be a member of the set (i.e., may have been previously inserted). If the binary array is 0 at any of the K indices, then the element cannot have been inserted and is therefore not a member of the set.
A Bloom filter may allow false positives but not false negatives. That is, a Bloom filter may definitively conclude that a given element is not in the set of observed elements, but it cannot definitively conclude that a given element is in the set. However, this guarantee is sufficient for many scenarios.
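The insert and query operations described above can be sketched as follows. This is a minimal illustration only, not a definitive implementation. The class name BloomFilter, the method names insert and may_contain, and the parameters num_bits and num_hashes are hypothetical names chosen for this example. The K hash functions are simulated, as an assumption, by salting SHA-256 with the number of each hash function; any K suitable hash functions could be substituted.

```python
import hashlib


class BloomFilter:
    """Minimal sketch: a binary array plus K hash functions (names are illustrative)."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits      # length of the binary array
        self.num_hashes = num_hashes  # K, the number of hash functions
        self.bits = [0] * num_bits    # binary array, initialized to all 0's

    def _indices(self, element):
        # Assumption: hash function number k is simulated by prefixing the
        # element with k before hashing. The modulo keeps every output a
        # valid index into the binary array.
        for k in range(self.num_hashes):
            digest = hashlib.sha256(f"{k}:{element}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def insert(self, element):
        # Ensure the binary array holds a 1 at each of the K indices.
        for index in self._indices(element):
            self.bits[index] = 1

    def may_contain(self, element):
        # True means "may be a member"; False means "definitely not a member".
        return all(self.bits[index] == 1 for index in self._indices(element))


observed = BloomFilter(num_bits=1024, num_hashes=3)
before_insert = observed.may_contain("apple")  # False: no bit is set yet
observed.insert("apple")
after_insert = observed.may_contain("apple")   # True: inserted elements always match
```

In this sketch, a result of False from may_contain is definitive, because at least one of the K bits that an insert would have set is still 0. A result of True is not definitive, because the K bits may have been set by inserts of other elements, which is the source of false positives.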