1. Field
The present disclosure relates to hash table data structures. More particularly, the disclosure concerns adaptive hash table resizing for hash tables that support concurrent access by readers and writers using the read-copy update synchronization mechanism.
2. Description of the Prior Art
By way of background, hash tables provide useful data structures for many applications, with various convenient properties such as constant average time for accesses and modifications. When a hash table is shared for reading and writing by concurrent applications, a suitable synchronization mechanism is required to maintain internal consistency. One technique for supporting concurrent hash table access comes in the form of Read-Copy Update (RCU). RCU is a synchronization mechanism with very low overhead for readers, and thus works particularly well for data structures with significantly more reads than writes, such as hash tables. These properties allow RCU-protected hash tables to scale well to many threads on many processors.
RCU-protected hash tables are implemented using open hashing (also known as separate chaining), with RCU-protected linked lists being provided for the hash buckets. Readers traverse these linked lists without using locks, atomic operations or other forms of mutual exclusion. Writers performing updates to hash table elements protect the readers by waiting for a grace period to elapse before freeing any stale data that the readers may have been referencing.
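The reader and writer roles described above can be illustrated with a single-threaded toy model. This is only a sketch under stated assumptions, not an RCU implementation: the names `ToyRcuTable`, `retired` and `synchronize` are hypothetical, the hash function is assumed to be a simple modulo of an integer key, and the grace period is modeled as an explicit call rather than as real tracking of reader activity.

```python
class Node:
    """One element on a bucket's singly linked chain."""
    __slots__ = ("key", "value", "next")

    def __init__(self, key, value, next=None):
        self.key = key
        self.value = value
        self.next = next


class ToyRcuTable:
    """Hypothetical single-threaded model of an RCU-style chained hash table."""

    def __init__(self, nbuckets):
        self.buckets = [None] * nbuckets
        self.retired = []  # unlinked nodes awaiting a (modeled) grace period

    def _index(self, key):
        return key % len(self.buckets)  # assumed hash function: integer modulo

    def lookup(self, key):
        # Reader side: follows links only; takes no lock and writes nothing.
        node = self.buckets[self._index(key)]
        while node is not None:
            if node.key == key:
                return node.value
            node = node.next
        return None

    def insert(self, key, value):
        # Writer side: the node is fully initialized first, then published
        # with a single reference store at the head of the chain.
        i = self._index(key)
        self.buckets[i] = Node(key, value, self.buckets[i])

    def remove(self, key):
        # Writer side: unlink the node but leave node.next intact, so that a
        # reader already standing on the node can still reach the rest of
        # the chain. Reclamation is deferred.
        i = self._index(key)
        prev, node = None, self.buckets[i]
        while node is not None:
            if node.key == key:
                if prev is None:
                    self.buckets[i] = node.next
                else:
                    prev.next = node.next
                self.retired.append(node)
                return True
            prev, node = node, node.next
        return False

    def synchronize(self):
        # Stand-in for waiting for a grace period: only after this point
        # would stale nodes be freed. Returns how many were reclaimed.
        reclaimed = len(self.retired)
        self.retired.clear()
        return reclaimed
```

In this model a reader that has reached a node just before a writer removes it can still follow that node's `next` link, which is the property that deferring reclamation until after the grace period is meant to preserve.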
A challenge respecting RCU-protected hash tables is the need to support efficient hash table resizing. The need to dynamically resize a hash table stems from the fact that the performance and suitability of hash tables depend heavily on choosing the appropriate size for the table. Making a hash table too small will lead to excessively long hash chains and poor performance. Making a hash table too large will consume too much memory, reducing the memory available for other applications or performance-improving caches, and increasing hardware requirements. Many systems and applications cannot know the proper size of a hash table in advance. Software designed for use on a wide range of system configurations with varying needs may not have the option of choosing a single hash table size suitable for all supported system configurations. Furthermore, the needs of a system may change at run time due to numerous factors, and software must scale both up and down dynamically to meet these needs. For example, in a system that supports virtual computing environments, the ability to shrink a hash table can be particularly important so that memory can be reallocated from one virtual environment to another.
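The size trade-off described above can be made concrete with a small sketch. The function name `chain_stats` is hypothetical, and the sketch assumes integer keys distributed by a simple modulo hash, which the disclosure does not specify.

```python
from collections import Counter


def chain_stats(keys, nbuckets):
    """Summarize how a set of integer keys spreads over nbuckets chains.

    Assumes the hash function is key % nbuckets (an illustrative choice).
    """
    counts = Counter(k % nbuckets for k in keys)
    total = sum(counts.values())
    return {
        "load_factor": total / nbuckets,  # average chain length
        "longest_chain": max(counts.values(), default=0),
        "empty_buckets": nbuckets - len(counts),  # allocated but unused
    }
```

For 1024 sequential keys, `chain_stats(range(1024), 16)` reports a load factor of 64.0 with every chain 64 elements long (a table that is too small), while `chain_stats(range(1024), 4096)` reports a load factor of 0.25 with 3072 empty buckets (a table that is too large).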
Resizing an RCU-protected hash table so as to either increase or decrease the hash table size results in hash buckets being respectively added to or removed from the hash table, with a corresponding change being made to the hash function. This usually entails one or more hash table elements having to be relocated to a different hash bucket, which can be disruptive to readers if care is not taken to protect their operations during the resizing operation. Existing RCU-protected hash tables support reader-friendly hash table resizing using several approaches. However, there are shortcomings that are variously associated with these approaches, such as (1) the need to maintain duplicate sets of per-element list links, thereby increasing the hash table memory footprint, (2) the need to incur large numbers of grace period delays and to require readers to search two hash table versions during resizing, and (3) the need to copy hash table elements, which makes it difficult or impossible for readers to maintain long-lived references to such elements. The present disclosure presents a new technique that enables optimized resizing of RCU-protected hash tables while permitting concurrent read access without any of the above deficiencies.
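The relocation of elements that accompanies a change in the number of buckets can be illustrated as follows. This sketch assumes integer keys and a modulo hash function, and the names `bucket_of` and `relocated_keys` are hypothetical. It shows only which elements must move, not how any of the approaches mentioned above protects readers while they move.

```python
def bucket_of(key, nbuckets):
    """Assumed hash function: integer key modulo the number of buckets."""
    return key % nbuckets


def relocated_keys(keys, old_n, new_n):
    """Return the keys whose bucket index changes when the table is resized."""
    return [k for k in keys if bucket_of(k, old_n) != bucket_of(k, new_n)]
```

Under these assumptions, growing a table holding keys 0 through 15 from 4 to 8 buckets relocates keys 4 to 7 and 12 to 15, and each relocated key moves from bucket `b` to bucket `b + 4`. Shrinking from 8 back to 4 buckets relocates the same keys in the opposite direction.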