The present invention relates to a method of hashing a multi-bit input to a shorter bit-length output, where parameters of the hashing function used can be varied.
Hashing is a process by which a relatively long input is transformed into a shorter output. Hashing is used for many different purposes including memory look-up and for forming message authentication codes. A large number of different hashing functions are known.
FIG. 1 of the accompanying drawings illustrates a typical application of hashing in the context of memory look-up. An input, for example a 48-bit input 10, is to be used to retrieve data held in memory 11. In the current example, only a relatively small number of the possible values of the input 10 are of interest (for example, seven values). It would therefore be highly inefficient to have a respective location 12 in memory 11 for each possible value of the input 10. Instead, the memory is provided with a number of locations 12 equal, to or greater than, the number of input values of interest (but less than all possible input values), and the input 10 is applied to a hashing function 13 that transforms the input 10 into a shorter length index 14 matched to the size of the memory (in the current example, this might be a 3-bit index). The hashing function 14 is chosen such that every input value of interest produces a unique value of the index 14 and this latter value is then used to access a corresponding location of the memory 11xe2x80x94this location being that used to store the data that it is desired to associate with the input value concerned.
Of course, many values of the input 10 will give rise to the same index value but it is only desired to register a xe2x80x9chitxe2x80x9d (a valid output of data from the memory 11) for the input value of interest that hashes to that index value. Accordingly, along with the data to be output, each memory location stores a check value corresponding to the input value for which that data is valid. When a memory location 12 is accessed, the check value is compared in comparator 15 with the current value of the input 10 and a data valid output 16 is only produced if there is a match. If the check value and input value do not match, a xe2x80x9cmissxe2x80x9d occurs. In so-called xe2x80x9cperfectxe2x80x9d hashing where every input value of interest is mapped by the hashing function 13 into its own unique value of index, this is the end of the process; however, it is often more efficient to practice xe2x80x9cnon-perfectxe2x80x9d hashing in which there are some clashes between values of interest. In such cases, various strategies have been devised; for example, the indexed location may give access to a linked list of values of interest associated with that location. Another strategy is to effect xe2x80x9cre-hashingxe2x80x9d (that is, carrying out a hash retry) generally using a different hash function from that originally employed. Re-hashing strategies may extend to multiple re-hashing attempts each with a different predetermined hash function.
Generally, in any particular application the hash function is predetermined though, as already indicated, it is also known to have a set of predetermined hash functions from which the function to be used is chosen.
It is an object of the present invention to provide a more flexible hashing arrangement.
According to the present invention, there is provided a method of hashing a multi-bit input to a shorter bit-length output, the method involving the steps of:
(a)xe2x80x94receiving said input and a hash control code,
(b)xe2x80x94forming a transformed value by combining together at least portions of different versions of said input, said hash control code serving to set specific parameters of this forming, and
(c)xe2x80x94using said transformed value to provide said output.
Preferably, the different versions of the input are bit-shifted versions of the input, the number of these different versions and their respective degrees of bit shifting being determined by the hash control code. Advantageously, the hash control code is a multi-bit code, with the state of each of at least some of the bits of the hash control code determining the presence or absence, in the step (b) combining, of a version of the input that corresponds to the bit shifting of the input by an amount dependent on the position of the relevant hashing control-code bit in that code. It should be appreciated that the aforesaid different versions (25) of the input may not be explicitly formed as separate bit strings but can be derived as needed (in whole or in part) by appropriately offset accesses to the original input (that is, by appropriate wiring into a register holding the input). Step (b) may involve combining all or only portions of said different versions of the input to produce the transformed value with multiple bits. Depending on the size of the transformed value, step (c) may involve selecting bits from the transformed value to produce the output.
The hashing method is advantageously used as part of a memory look-up method for accessing stored items which are held in respective locations in memory and which are associated with particular values of a multi-bit input. In one application of the invention, the memory look-up method is used in the course of execution of an instruction to retrieve an element from memory in dependence on the value of particular input data subject of the currently-executing instruction. The hashing method of the invention is suited to such an application because the identity of the element to be retrieved will often occupy a search space that varies in size and degree of population between different instructions, and it is therefore very useful to be able to change the hash function from one instruction to another to match the search space characteristics. When the search space characteristics are the same for different instructions, then the memory look-up method can advantageously be invoked as a called routine returning a value that is indicative of the location of the item to be retrieved in the context of the current instruction.
According to another aspect of the present invention, there is provided apparatus for hashing a multi-bit input to a shorter bit-length output, the apparatus comprising:
means for receiving said input and a hash control code,
means for forming a transformed value by combining together at least corresponding portions of multiple bit-shifted versions of said input, the number of such versions and their respective degrees of bit-shifting being determined by said hash control code, and
means for using said transformed value to provide said output.
Preferably, the different versions of the input are bit-shifted versions of the input, the number of these different versions and their respective degrees of bit shifting being determined by the hash control code; in this case, the means for forming a transformed value may comprise:
first register means for storing the input, the first register means being accessible over its length so as to permit access to said versions by appropriately offset access into the first register means;
second register means for storing said hash control code;
exclusive-OR means; and
gate means connected to said first and second register means for controlling the passing to said exclusive-OR means of bits of said different versions of the input in dependence on the value of the hash control code, the output of said exclusive-OR means constituting said transformed value.