A cache is often implemented with a hash map. FIG. 1a shows a traditional hash map implementation. Typically, a software routine uses various items of data over the course of its execution. Each item of data that may be stored in the cache is assigned a unique “key”. FIG. 1a shows a depiction of such an construction in which a key K1 102 and item of data D1 103 are appended together as a data structure 101. The key K1 is used to search for the item of data D1 in the cache.
Because the key K1 can be a random value and because the cache's resources are tied to memory having 105 a range of addressing space 106 and corresponding data space 107, the various keys associated with the various items of data must be able to “map” to the cache's addressing space 106. A hash function 104 is used to perform this mapping. According to the depiction of FIG. 1a, the hashing function 104 produces address value A1 in response to an input key value K1.
Thus, in order to store the data D1 in cache, its key K1 is provided to a hashing function 104. The hashing function 104 produces the appropriate address A1. The data structure containing the key K1 and the data D1 are then stored in the memory resources used to implement the cache (hereinafter referred to as a “table” 105 or “hash table” 105) at the A1 address.
Frequently the number of separate items of data which “could be” stored in the hash table 105 is much greater than the size of the hash table's addressing space 106. The mathematical behavior of the hashing function 104 is such that different key values can map to the same hash table address. For example, as depicted in FIG. 1b both of key values K1 and K2 map to the same address value A1 108. Various hash mapping techniques have been developed to handle the situation where different key values map to the same hashing function output value.
FIG. 1b shows a first approach that involves multiple hash tables. When a “collision” occurs, that is, when an attempt is made to store a second data structure that maps to a table location (also referred to as a “slot”) where a first data structure already resides (because the key values K1, K2 for the pair of data structures map to the same table address A1 and the first data structure was stored into the hash table before the second), the second data structure is stored into a next, “deeper” hash table. Here, the first table that is looked to is referred to the primary table 109 and the second table that is looked to is referred to as the secondary table 110.
As an example, consider the situation depicted in FIG. 1b in which an attempt is made to store a second data structure having key K2 into a cache at a moment in time when a first data structure having a key K1 is already located in the primary table 109 of the cache at the address A1 that both key K1 and key K2 map to. In attempting to store the second data structure, its key K2 is hashed by the hashing function 111 to produce the corresponding address A1. The primary table 109 is looked to first. As such, the slot at address A1 of the primary table 109 is accessed first. When it is discovered by way of this access that the A1 slot in the primary table 109 is already populated with the first data structure (having key K1), a re-hash operation is performed with a second hashing function 112.
The value produced by the second hashing function 112 produces a second address value AS1 from the K2 key that is to be used for accessing the secondary table 110. According to this example, the AS1 slot is empty and the second data structure is therefore stored in the AS1 slot of the secondary table 110. Depending on implementation, hash functions 111 and 112 may be the same hashing function or may be different hashing functions. Reading/writing a data structure from/to the primary table 109 should consume less time than the reading/writing a data structure from/to the secondary table 110 because at least an additional table access operation is performed if not an additional hash function operation.
FIG. 1c shows another technique referred to as linear probing. According to the linear probing technique, rather than use a secondary table, when a collision occurs, an offset ΔA is summed with the address A1 produced by the hashing function to produce a second slot address A1+ΔA in the hash table 113 where the data structure that seeks to be inserted into the table 113 can be placed.
According to the exemplary depiction of FIG. 1c, when an attempt is made to insert a second data structure having key K2 into the hash table 113, a hash on the K2 value is performed which generates the address A1. When the A1 slot is accessed it is realized that a first data structure (having key K1 and data D1) is already stored there. As such, the offset ΔA is summed with the A1 address to produce a next address of A1+ΔA. When the A1+ΔA address value is access it is found to be empty and the second data structure is stored there 114.