Computer systems often make use of hash tables to optimize data searches. Hash tables are widely used in computer program products as this construct can be used to provide fast access to many different kinds of data structures.
Data items referenced by a hash table are characterized uniquely using some properties of the data (the use of multiple properties to uniquely identify such data is common). For a hash table referencing a given data set “S”, the design and logic used to access the hash table will permit a computer program product to either locate target data (“D”) or determine that it does not exist in “S”. For a set “S” containing N data items, a linear search (comparing each item individually to “D”) could, in the worst case (where “D” is not present in the data set), require accessing all N items in the set “S”.
A hash table is a table whose size is typically much smaller than N. The table size is typically rounded to a value X which is either a prime number or the next power of 2 (still typically much smaller than N).
To implement a particular hash table, a hash function is selected based on properties identifying each data item “D”. The hash function may be applied to any data item “D” to generate a hash value. The properties used as an input to the hash function can be a subset of the total properties necessary to uniquely identify each data item. In typical applications, the hash value is the size of a word (often 32 or 64 bits). A hash index can then be generated from the hash value using modulo arithmetic carried out on the hash values of the elements in “S”. This is expressed as HASHVALUE % X = HASHINDEX (X is the size chosen for the hash table).
Any data item “D”, being defined by a unique set of properties, some or all of which are used in generating its hash value (and then ultimately through the hash value in its hash index), can thus only map to one entry in the hash table. Each entry in a hash table, and the associated data items, is called a hash bucket or bucket.
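The relationship HASHVALUE % X = HASHINDEX can be illustrated with a minimal sketch. The function names below are hypothetical, and the 32-bit FNV-1a hash is only an assumed stand-in for whatever hash function a given implementation selects; the text above does not prescribe one.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed stand-in hash function (32-bit FNV-1a) applied to the
 * identifying properties of a data item, given here as raw bytes. */
static uint32_t hash_value(const unsigned char *key, size_t len)
{
    uint32_t h = 2166136261u;      /* FNV offset basis */
    for (size_t i = 0; i < len; i++) {
        h ^= key[i];
        h *= 16777619u;            /* FNV prime */
    }
    return h;
}

/* HASHVALUE % X = HASHINDEX, where X is the table size. */
static uint32_t hash_index(uint32_t value, uint32_t table_size)
{
    return value % table_size;
}

/* When X is a power of 2, the modulo reduces to a bit mask. */
static uint32_t hash_index_pow2(uint32_t value, uint32_t table_size)
{
    return value & (table_size - 1u);
}
```

The mask variant is one reason a power-of-2 table size may be chosen; a prime size instead tends to spread poorly distributed hash values more evenly.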
Multiple threads (or processes) may access a hash table and the associated data items at any given time. In a typical application, such as a database system main memory buffer system, the set of database pages (or their header or directory information) that are present in main memory may be hashed. Typically in such an application the hash bucket contains a pointer to a first data item. Data items themselves include pointers that permit them to be arranged in a linked list. Adding an item to a hash table is accomplished by placing the item in the appropriate linked list pointed to by the appropriate bucket in the hash table.
A thread looking for a particular page “D” in the buffer system would calculate the hash index of that page and perform a lookup in the hash table to determine if the page is in the database's main memory buffer. If it is not present in the hash table buckets (the pointer in the hash table bucket points to a linked list and the linked list does not contain the page) then the thread will cause the required page to be read from disk to the main memory buffer. The page will be added to the linked list pointed to by the appropriate hash bucket so that the next thread that is looking for the same page will be able to use the hash table to locate the page in the memory buffer.
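The lookup and insertion just described can be sketched as follows. This is a single-threaded illustration only (no latching), and the names struct page, struct page_bucket, TABLE_SIZE, bucket_lookup and bucket_insert are hypothetical. As a simplifying assumption, the page identifier itself is used as the hash value.

```c
#include <stddef.h>
#include <stdint.h>

#define TABLE_SIZE 8u              /* hypothetical, tiny for illustration */

struct page {
    uint32_t page_id;              /* property identifying the page */
    struct page *next;             /* link to the next item in the bucket */
};

struct page_bucket {
    struct page *first;            /* pointer to the first data item */
};

/* Walk the linked list of the page's bucket; NULL means the page is
 * not in the buffer and would have to be read from disk. */
static struct page *bucket_lookup(struct page_bucket *table, uint32_t page_id)
{
    struct page *p = table[page_id % TABLE_SIZE].first;
    while (p != NULL && p->page_id != page_id)
        p = p->next;
    return p;
}

/* Add a page at the head of the linked list of its bucket. */
static void bucket_insert(struct page_bucket *table, struct page *pg)
{
    struct page_bucket *b = &table[pg->page_id % TABLE_SIZE];
    pg->next = b->first;
    b->first = pg;
}
```

A caller would first try bucket_lookup and, on a NULL result, read the page from disk and pass it to bucket_insert so that later lookups find it.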
Hash tables are typically kept consistent by the use of concurrency primitives called latches, with one latch for each hash bucket. This latch is often called the bucket latch. Threads wishing to parse the contents (the data items in the associated linked list) of a particular hash bucket will take the bucket latch to ensure that the contents of the bucket remain consistent while the threads access items in the bucket. A thread inserting a new item into a linked list pointed to by a bucket will obtain the bucket latch to ensure that no other threads are parsing through the linked list of the bucket while the thread modifies the content of the bucket or the linked list. The bucket latch is often implemented as a full function latch where threads that are parsing through the contents of the bucket take the latch in share mode (so that multiple threads can look at the contents of a bucket concurrently), while those that are inserting data items into, or removing them from, a bucket take the latch in exclusive mode. As the above indicates, a hash table bucket may include a latch and a pointer to the first data item in the linked list of data items that is related to that bucket.
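The compatibility rules of share mode and exclusive mode can be sketched with a small non-blocking latch built on C11 atomics. This is an assumed, simplified model rather than the latch of any particular system: struct latch and the latch_* functions are hypothetical names, and a real latch would also wait or queue when it cannot be granted.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* state: 0 = free, n > 0 = held by n sharers, -1 = held exclusively */
struct latch {
    atomic_int state;
};

/* Share mode: granted unless an exclusive holder is present. */
static bool latch_try_share(struct latch *l)
{
    int s = atomic_load(&l->state);
    while (s >= 0) {
        /* On failure the CAS reloads s, and the loop re-checks it. */
        if (atomic_compare_exchange_weak(&l->state, &s, s + 1))
            return true;
    }
    return false;
}

/* Exclusive mode: granted only when nobody holds the latch. */
static bool latch_try_exclusive(struct latch *l)
{
    int expected = 0;
    return atomic_compare_exchange_strong(&l->state, &expected, -1);
}

static void latch_release_share(struct latch *l)
{
    atomic_fetch_sub(&l->state, 1);
}

static void latch_release_exclusive(struct latch *l)
{
    atomic_store(&l->state, 0);
}
```

In this model, threads parsing a bucket would call latch_try_share, while a thread inserting or removing an item would call latch_try_exclusive and retry until it succeeds.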
The ability to handle concurrency in a hash table implementation becomes important when a hash table is resized. Resizing is desirable when the number of data items in the table becomes large relative to the number of buckets (causing the number of data items in the linked list for each bucket to grow). The most straightforward way to carry out a resizing operation is to first lock out all threads accessing the hash table and then to redistribute the data in the table. This can be done by creating one new latch (in addition to the existing bucket latches) for the entire hash table (or alternatively, by latching all buckets). Threads wishing to use the hash table would have to take the entire hash table latch (in share mode) before proceeding to access the bucket of interest. This is not a desirable approach because it locks out all users of the hash table during the resize.
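The redistribution step of the straightforward resize can be sketched as below. The sketch assumes the caller has already locked out every other thread (for example by holding a table-wide latch in exclusive mode), and that each item stores its hash value; struct item and rehash are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

struct item {
    uint32_t hash;                 /* stored hash value of the item */
    struct item *next;             /* link within a bucket's list */
};

/* Move every item from old_tab (old_n buckets) to new_tab (new_n
 * buckets). The caller must have excluded all other threads, and
 * new_tab must start out with every entry set to NULL. */
static void rehash(struct item **old_tab, size_t old_n,
                   struct item **new_tab, size_t new_n)
{
    for (size_t i = 0; i < old_n; i++) {
        struct item *it = old_tab[i];
        while (it != NULL) {
            struct item *next = it->next;   /* save before relinking */
            size_t j = it->hash % new_n;    /* index in the new table */
            it->next = new_tab[j];
            new_tab[j] = it;
            it = next;
        }
        old_tab[i] = NULL;
    }
}
```

Because every bucket is visited while the table is locked, the cost of the resize is paid in full by all concurrent users, which is the drawback noted above.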
Another approach avoids having to latch the whole hash table. Following this approach, each bucket is individually split (typically a binary split). In this approach, each bucket data structure includes a bit indicating whether the bucket is split or not, as well as data pointing to the new buckets, if any. This approach is also limited. For example, the granularity of the split is limited in this case. In addition, it is typical for a thread or process accessing a particular hash table to store a copy of a calculated hash index value in a local variable or a register (this is referred to as maintaining a cached copy of the index value). Keeping the hash index value in a local variable or register in this way permits the thread or process to access the data item using the hash table without having to recalculate the hash index for the data item. Where a bucket is split as described above, any stored index values must be discarded, as they will no longer be reliable.
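One way to picture the per-bucket binary split is sketched below. The split flag and the pointers to the new buckets follow the description above; how an item chooses between the two new buckets is not stated, so the use of the next unused bit of the hash value is purely an assumption, as are the names struct split_bucket and find_leaf.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct split_bucket {
    bool is_split;                     /* bit: has this bucket been split? */
    struct split_bucket *child[2];     /* the two new buckets, if split */
};

/* Find the bucket that currently holds items with the given hash value.
 * Assumption: each split consumes one further bit of the hash value
 * beyond those used to form the original hash index. */
static struct split_bucket *find_leaf(struct split_bucket *table,
                                      uint32_t table_size, uint32_t hash)
{
    struct split_bucket *b = &table[hash % table_size];
    uint32_t rest = hash / table_size; /* bits not used by the index */
    while (b->is_split) {
        b = b->child[rest & 1u];
        rest >>= 1;
    }
    return b;
}
```

In this sketch, a thread that cached only the original index would still land on the parent bucket after a split and would miss the redirection unless it re-examined the split flag, which illustrates why stored index values become unreliable.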
A hash table is itself typically implemented as a contiguous piece of computer memory (an array of 0 to X−1 hash bucket elements). The structure of each hash bucket in the generic hash table has two contents: i) the bucket latch and ii) a pointer to the start of the linked list of data items present in the bucket. Two adjacent buckets (e.g. 0 and 1) are thus separated by sizeof(Latch) + sizeof(FirstDataItemPtr). On a typical 32-bit computer system, a simple Latch is 4 bytes and a pointer in memory is also 4 bytes. The size of a hash bucket is therefore typically 8 bytes. Thus the starting points of two adjacent buckets (and hence two adjacent latches) would only be separated by 8 bytes of memory.
Since a typical data cache line on modern computer systems is 128 bytes (dependent on the processor architecture), the end result is that multiple buckets (and hence multiple bucket latches) will exist on the same cache line. On Symmetric Multi Processor (SMP) computer systems, multiple processors share the same main memory resources and have physically separate data caches (each processor typically has a data cache on the processor chip itself). As a result, on such systems there is a concept of cache line ownership. Since the data in the cache line is really shared data, expensive cache synchronization must occur across the processors to ensure that any cache lines that exist in more than one processor's cache are consistent.
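The 8-byte bucket layout can be made concrete with a short sketch. To mirror the 32-bit system described, both the latch and the pointer are modelled here as 32-bit integers (an assumption made so the sizes are the same on any build; a real pointer on a 64-bit system would be 8 bytes). The names struct bucket32 and bucket_offset are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Bucket as laid out on the 32-bit system described above. */
struct bucket32 {
    uint32_t latch;                /* simple latch: 4 bytes */
    uint32_t first_data_item_ptr;  /* 32-bit pointer, modelled as an integer */
};

/* Byte offset of bucket `index` from the start of the table array. */
static size_t bucket_offset(size_t index)
{
    return index * sizeof(struct bucket32);
}
```

With this layout, the latch of bucket 1 starts only 8 bytes after the latch of bucket 0.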
Since any thread T0 running on processor P0 and accessing bucket 0 must take that bucket's latch, the processor P0 will take ownership of the cache line that the latch exists in (since the latch is marked as “Taken”, the cache line has been “Dirtied” by processor P0). Thread T1, running on processor P1 and accessing bucket 1 (a completely separate bucket), must “take” that bucket's latch, causing the processor P1 to dirty the same cache line that P0 owns. This results in expensive cache synchronization across P0 and P1 (in some schemes, P0 will have to provide the updated cache line to P1 by snooping the bus to see when P1 requests the cache line). The net result is that even though the two threads T0 and T1 are accessing separate buckets, which are protected by separate latches, a false sharing of the latches occurs because the latches are on the same cache line. The caching of the hash table is therefore less efficient than could otherwise be the case. The common approach to address this issue is to pad each bucket with unused bytes so that each bucket is on its own cache line. If there is an 8 byte bucket size and a 128 byte cache line, there will be 120 unused bytes for each bucket.
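The padding approach and its cost can be sketched as follows, again modelling the 32-bit layout with 32-bit integer fields. The names plain_bucket, padded_bucket, CACHE_LINE and cache_line_of are hypothetical, and the sketch assumes the table array itself starts on a cache line boundary.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 128u            /* cache line size assumed in the text */

/* Unpadded 8-byte bucket: 16 of these share one 128-byte cache line. */
struct plain_bucket {
    uint32_t latch;
    uint32_t first_data_item_ptr;
};

/* Padded bucket: 120 unused bytes push each bucket onto its own line. */
struct padded_bucket {
    uint32_t latch;
    uint32_t first_data_item_ptr;
    unsigned char pad[CACHE_LINE - 2 * sizeof(uint32_t)];
};

/* Index of the cache line holding the start of bucket `index`,
 * assuming the table begins on a cache line boundary. */
static size_t cache_line_of(size_t index, size_t bucket_size)
{
    return (index * bucket_size) / CACHE_LINE;
}
```

With the unpadded layout, buckets 0 through 15 all map to cache line 0, so their latches can falsely share a line; with the padded layout each bucket maps to a different line, at the price of 120 unused bytes per bucket.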
Another aspect of hash table operation that affects the efficiency of a hash table is the frequency of cache misses. A cache miss occurs when a thread or process seeks to access a bucket or the data items in the linked lists pointed to by the hash table but the item is not present in the cache. This is typically the case when a process or thread “walks” the linked list of data items. Accessing each item in the linked list will typically result in a CPU stall as a result of the cache line being accessed not being present in the data cache. This is inefficient as CPU cycles are wasted in waiting for the data items in the linked list (not cached) to be read from main memory.
It is therefore desirable to have a method and system for hash table creation and maintenance that permits hash tables to be dynamically resized and that provides improved performance of hash tables by minimizing cache line misses.