Hashing is a technique commonly used in sequential set implementations to ensure that calls to add, remove, or test an item for membership take constant time on average. A hash set (sometimes called a hash table) is an efficient way to implement sets of items. A hash set is typically implemented as an array, called the table. Each table element is a reference to one or more items. A hash function maps items to integers so that distinct items almost always map to distinct values. Java provides each object with a hashCode() method that serves this purpose. To add, remove, or test an item for membership, the hash function is applied to the item and the result is reduced (e.g., modulo the table size) to identify the table entry associated with that item. This is known as hashing the item. In some conventional hash-based set algorithms, each table element refers to a single item, an approach known as open addressing. In others, each table element refers to a set of items, traditionally called a bucket, an approach known as closed addressing.
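The hashing step described above can be sketched in Java as follows. This is a minimal sequential illustration under stated assumptions, not a production implementation: the class name BucketHashSet and the helper indexOf are hypothetical, the table size is fixed, and closed addressing is used so that each table entry holds a bucket.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative closed-addressing hash set; all names here are hypothetical.
class BucketHashSet<T> {
    private final List<T>[] table;   // each table entry refers to a bucket of items

    @SuppressWarnings("unchecked")
    BucketHashSet(int capacity) {    // capacity is assumed to be positive
        table = (List<T>[]) new List[capacity];
        for (int i = 0; i < capacity; i++) {
            table[i] = new ArrayList<>();
        }
    }

    // Hash the item: reduce hashCode() modulo the table size.
    // Math.floorMod keeps the index non-negative for negative hash codes.
    private int indexOf(T item) {
        return Math.floorMod(item.hashCode(), table.length);
    }

    boolean contains(T item) {
        return table[indexOf(item)].contains(item);
    }

    boolean add(T item) {
        List<T> bucket = table[indexOf(item)];
        if (bucket.contains(item)) {
            return false;            // already present, sets hold no duplicates
        }
        bucket.add(item);
        return true;
    }

    boolean remove(T item) {
        return table[indexOf(item)].remove(item);
    }
}
```

Each operation touches only the one bucket selected by the hash, which is why the cost stays roughly constant as long as buckets remain short.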
Any hash set algorithm must deal with collisions, that is, it must determine what to do when two distinct items hash to the same table entry. Open-addressing algorithms typically resolve collisions by applying alternative hash functions to test alternative table elements. Closed-addressing algorithms place colliding items in the same bucket until that bucket becomes too full. In both kinds of algorithms, it is sometimes necessary to resize the table. In open-addressing algorithms, the table may become too full to find alternative table entries; in closed-addressing algorithms, buckets may become too large to search efficiently.
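As an illustration of collision handling and resizing in the open-addressing case, the sketch below uses linear probing, where the i-th alternative hash function is simply the original slot plus i, modulo the table length. Linear probing, the class name ProbingHashSet, and the half-full resize threshold are assumptions chosen for brevity; removal is omitted because it needs extra bookkeeping in open addressing.

```java
// Illustrative open-addressing set with linear probing; names and the
// half-full resize threshold are assumptions made for this sketch.
class ProbingHashSet<T> {
    private Object[] table;          // each table entry refers to at most one item
    private int size;

    ProbingHashSet(int capacity) {
        table = new Object[Math.max(capacity, 2)];
    }

    // i-th alternative slot for an item: (home slot + i) mod table length.
    private static int probe(Object item, int i, int length) {
        int home = Math.floorMod(item.hashCode(), length);
        return Math.floorMod(home + i, length);
    }

    boolean contains(T item) {
        for (int i = 0; i < table.length; i++) {
            Object slot = table[probe(item, i, table.length)];
            if (slot == null) {
                return false;        // an empty slot ends the probe sequence
            }
            if (slot.equals(item)) {
                return true;
            }
        }
        return false;
    }

    boolean add(T item) {
        if (contains(item)) {
            return false;
        }
        if (2 * (size + 1) > table.length) {
            resize();                // keep the table at most half full
        }
        insert(table, item);
        size++;
        return true;
    }

    // Place an item in the first empty slot along its probe sequence.
    private static void insert(Object[] target, Object item) {
        for (int i = 0; i < target.length; i++) {
            int slot = probe(item, i, target.length);
            if (target[slot] == null) {
                target[slot] = item;
                return;
            }
        }
        throw new IllegalStateException("no empty slot found");
    }

    // Double the table and rehash every item into the new array.
    private void resize() {
        Object[] old = table;
        table = new Object[2 * old.length];
        for (Object item : old) {
            if (item != null) {
                insert(table, item);
            }
        }
    }

    int capacity() {
        return table.length;
    }
}
```

Resizing before the table fills guarantees that every probe sequence eventually reaches an empty slot, so both lookups and insertions terminate.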
Conventional hashing suffers from a variety of problems when it is applied to the multicore or multithreaded processors that are increasingly common in computer systems. At the same time, conventional hash tables are playing an increasingly important role as search structures for concurrent programs, because they provide search functionality with low cache coherence overheads. Such conventional hash tables are also used extensively in state-of-the-art virtual machines and software transactional memories. Resizing these conventional hash tables is an increasingly important challenge, since these structures tend to grow significantly with use over time.
Specifically, a problem exists with all current resizable lock-based hash tables, including the ones in the Java concurrency package. Such conventional implementations resize the hash table bucket structure but not the set of locks that protect the buckets. Instead, an approach is used in which a given lock protects several buckets. As the table increases in size, the fixed set of locks is forced to protect a growing number of buckets. Eventually, contention for these locks among competing threads takes a toll on performance, as multiple threads seek access to the ever-growing number of buckets that each lock is forced to protect.
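The fixed-lock arrangement described above can be sketched as follows. This is a hedged illustration of the general idea only, not the code of the Java concurrency package: the class name StripedHashSet, the helper bucketsPerLock, and the resize threshold of four items per bucket are all assumptions. The lock array is allocated once, while the bucket array doubles on each resize, so the number of buckets guarded by each lock doubles as well.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Illustrative lock-striped set: the bucket table grows, the lock array does not.
class StripedHashSet<T> {
    private final Lock[] locks;                  // fixed for the lifetime of the set
    private volatile List<T>[] table;            // replaced by a larger array on resize
    private final AtomicInteger size = new AtomicInteger();

    @SuppressWarnings("unchecked")
    StripedHashSet(int capacity) {               // capacity is assumed to be positive
        locks = new Lock[capacity];
        table = (List<T>[]) new List[capacity];
        for (int i = 0; i < capacity; i++) {
            locks[i] = new ReentrantLock();
            table[i] = new ArrayList<>();
        }
    }

    // The lock depends only on the hash code and the fixed lock count. Because the
    // table length stays a multiple of locks.length, each bucket maps to one lock.
    private Lock lockFor(T item) {
        return locks[Math.floorMod(item.hashCode(), locks.length)];
    }

    boolean contains(T item) {
        Lock lock = lockFor(item);
        lock.lock();
        try {
            List<T>[] t = table;
            return t[Math.floorMod(item.hashCode(), t.length)].contains(item);
        } finally {
            lock.unlock();
        }
    }

    boolean add(T item) {
        boolean grow = false;
        Lock lock = lockFor(item);
        lock.lock();
        try {
            List<T>[] t = table;
            List<T> bucket = t[Math.floorMod(item.hashCode(), t.length)];
            if (bucket.contains(item)) {
                return false;
            }
            bucket.add(item);
            grow = size.incrementAndGet() > 4 * t.length;   // assumed threshold
        } finally {
            lock.unlock();
        }
        if (grow) {
            resize();
        }
        return true;
    }

    // Acquire every lock in index order, then double the bucket array only.
    @SuppressWarnings("unchecked")
    private void resize() {
        int oldLength = table.length;
        for (Lock l : locks) {
            l.lock();
        }
        try {
            if (table.length != oldLength) {
                return;                          // another thread already resized
            }
            List<T>[] next = (List<T>[]) new List[2 * oldLength];
            for (int i = 0; i < next.length; i++) {
                next[i] = new ArrayList<>();
            }
            for (List<T> bucket : table) {
                for (T item : bucket) {
                    next[Math.floorMod(item.hashCode(), next.length)].add(item);
                }
            }
            table = next;
        } finally {
            for (Lock l : locks) {
                l.unlock();
            }
        }
    }

    // Number of buckets each lock currently protects.
    int bucketsPerLock() {
        return table.length / locks.length;
    }
}
```

Starting from two locks and two buckets, adding twenty distinct integers triggers two resizes under the assumed threshold, leaving eight buckets still guarded by the same two locks.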