In data processing systems, a hash table is a data structure that maps keys to values. A key can be any block of data in a data processing system. The mapped value is information about the block of data that the data processing system stores for future use. To use a hash table, memory is allocated in which an array of equal-size hash table entries is stored. Information about blocks of data is stored in the hash table entries. The data processing system uses a hash function to determine which entry in the hash table contains information about a particular block of data. To use the hash function, the data processing system provides the block of data, i.e., the key, as an input to the hash function. The hash function computes a value based on the block of data, and this result is used as the address of a hash table entry in the hash table. A common implementation is to treat the hash table in memory as an array and to interpret the result from the hash function as an index into the hash table array.
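The mapping from a block of data to an array index can be illustrated with a minimal sketch. The table size, the use of SHA-256 as the hash function, and the names hash_index, put, and get are assumptions chosen for illustration; they do not come from any particular system. Collisions are deliberately ignored here, so a later key that maps to the same index simply overwrites the earlier entry.

```python
import hashlib

TABLE_SIZE = 8  # assumed number of equal-size hash table entries

# The hash table in memory, treated as an array of entries.
table = [None] * TABLE_SIZE


def hash_index(key: bytes, table_size: int = TABLE_SIZE) -> int:
    """Compute a value from the block of data and reduce it to an array index."""
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") % table_size


def put(key: bytes, value) -> None:
    """Store information about a block of data at the entry its hash selects."""
    table[hash_index(key)] = (key, value)


def get(key: bytes):
    """Return the stored information for key, or None if the entry holds another key."""
    entry = table[hash_index(key)]
    if entry is not None and entry[0] == key:
        return entry[1]
    return None


put(b"block-a", "info about block-a")
print(get(b"block-a"))
```

Reducing the hash result modulo the table size is one common way to keep the index inside the array; other reductions, such as masking with a power-of-two size, would serve the same purpose.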
Ideally, a hash function will produce a unique result for each block of data provided as an input. In practice, however, most hash functions will generate the same result for some different blocks of data. This situation is referred to as a hash collision. One method of dealing with collisions is to create hash chains by creating a separate hash table entry for each block of data that causes a collision and linking the entries having the same hash function result to each other as a linked list data structure. Thus, information associated with a second data block having the same hash function result is stored at a second hash table entry that is unused, i.e., no hash function result has been generated that is equal to the index of the unused entry, and the index of the second hash table entry is stored in a field of the first hash table entry. Information for subsequent data blocks having the same hash function result is stored in a similar way at unused hash table entries, with the address of each new hash table entry stored in a field of an already existing hash table entry in the chain.
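The chaining scheme described above, in which colliding entries occupy unused slots of the same array and are linked by index, can be sketched as follows. The names Entry, ChainedTable, and the next field are hypothetical, and the linear search for an unused slot is an assumption made for brevity. The sketch also does not prevent a chain from occupying a slot that a later key hashes to; in that case the later key simply joins the existing chain.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Entry:
    key: bytes
    value: object
    next: Optional[int] = None  # index of the next entry in the hash chain


class ChainedTable:
    def __init__(self, size: int, hash_fn: Callable[[bytes], int]):
        self.slots: List[Optional[Entry]] = [None] * size
        self.hash_fn = hash_fn

    def _find_unused(self) -> int:
        # Assumed policy: take the first empty slot found by a linear scan.
        for i, slot in enumerate(self.slots):
            if slot is None:
                return i
        raise MemoryError("hash table is full")

    def insert(self, key: bytes, value) -> int:
        idx = self.hash_fn(key) % len(self.slots)
        if self.slots[idx] is None:
            self.slots[idx] = Entry(key, value)
            return idx
        # Collision: walk to the end of the chain that starts at idx.
        while True:
            entry = self.slots[idx]
            if entry.key == key:
                entry.value = value  # key already present, update in place
                return idx
            if entry.next is None:
                break
            idx = entry.next
        # Store the new entry in an unused slot and link it from the chain tail.
        free = self._find_unused()
        self.slots[free] = Entry(key, value)
        self.slots[idx].next = free
        return free

    def lookup(self, key: bytes):
        idx = self.hash_fn(key) % len(self.slots)
        while idx is not None and self.slots[idx] is not None:
            entry = self.slots[idx]
            if entry.key == key:
                return entry.value
            idx = entry.next
        return None


# len() is used as a deliberately weak hash so that equal-length keys collide.
t = ChainedTable(4, len)
t.insert(b"aa", "first")
t.insert(b"bb", "second")
print(t.lookup(b"bb"))
```

In this usage example both keys hash to the same index, so the second entry lands in an unused slot and is reached by following the next field of the first entry.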
During continued operation of a data processing system, hash table entries may need to be deleted from a hash table. Because multiple entities can access a hash table in parallel, clean-up routines are called that scan through the hash table linearly and remove those hash table entries marked for deletion. One issue with such a hash table clean-up routine is that the routine is not guaranteed to remove all deleted entries from a particular hash chain without locking the entire hash table, during which time the hash table cannot be accessed. Additionally, having to call the clean-up routine requires additional attention by the application using the hash table. Because hash table performance may affect the overall performance of a data processing system, an effective ability to delete hash table entries is desirable.
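The mark-then-sweep approach and its cost can be illustrated with a minimal sketch. For brevity it assumes each array slot holds a Python list as its chain rather than index-linked entries, and it assumes a single table-wide lock; the names LockedTable, mark_deleted, and cleanup are hypothetical. The point it shows is that the linear sweep holds the lock for the whole scan, so no other accessor can use the table during that time, and that the application must remember to call the sweep.

```python
import threading


class Entry:
    def __init__(self, key, value):
        self.key = key
        self.value = value
        self.deleted = False  # set when the entry is marked for deletion


class LockedTable:
    def __init__(self, size: int = 8):
        self.buckets = [[] for _ in range(size)]
        self.lock = threading.Lock()  # assumed: one lock guards the entire table

    def _bucket(self, key):
        return self.buckets[hash(key) % len(self.buckets)]

    def insert(self, key, value) -> None:
        with self.lock:
            self._bucket(key).append(Entry(key, value))

    def lookup(self, key):
        with self.lock:
            for entry in self._bucket(key):
                if entry.key == key and not entry.deleted:
                    return entry.value
        return None

    def mark_deleted(self, key) -> None:
        # Entries are only flagged here; they stay in their chain until cleanup.
        with self.lock:
            for entry in self._bucket(key):
                if entry.key == key:
                    entry.deleted = True

    def cleanup(self) -> int:
        """Linearly scan every chain and drop marked entries; returns the count removed."""
        removed = 0
        with self.lock:  # the table is inaccessible to others for the whole scan
            for i, chain in enumerate(self.buckets):
                kept = [e for e in chain if not e.deleted]
                removed += len(chain) - len(kept)
                self.buckets[i] = kept
        return removed


table = LockedTable()
table.insert("a", 1)
table.insert("b", 2)
table.mark_deleted("a")
print(table.cleanup())
```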
The use of the same reference symbols in different drawings indicates similar or identical items.