A hash function is a function that receives input data of an arbitrary bit length and generates an output of a fixed bit length, where the length of the output is user-definable. Hash functions are useful for generating message authentication codes and for constructing Bloom filters, which are used to determine whether a data element is a member of a set.
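As an illustration of the fixed, user-definable output length described above, the following is a minimal sketch that uses BLAKE2b from the Python standard library. The function name `fixed_length_hash` and the default of 16 bytes are assumptions chosen for this example only; they are not taken from any of the references discussed in this specification.

```python
import hashlib


def fixed_length_hash(data: bytes, out_bytes: int = 16) -> bytes:
    """Hash input of any length to exactly out_bytes bytes.

    BLAKE2b accepts a digest_size between 1 and 64 bytes, so the
    output length is chosen by the caller, not by the input.
    """
    return hashlib.blake2b(data, digest_size=out_bytes).digest()


# Inputs of very different lengths yield outputs of the same length.
short_digest = fixed_length_hash(b"a")
long_digest = fixed_length_hash(b"a" * 10_000)
```

Both digests are 16 bytes long even though one input is 10,000 times longer than the other.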
Since a hash function maps data of a certain length to data of a shorter length, there are fewer possible outputs than there are inputs. Therefore, some inputs will map to the same output. Such a mapping is commonly referred to as a collision. Knowing which inputs to a hash function cause collisions could provide a person with information that would help that person compromise a cryptographic algorithm that uses the hash function.
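The counting argument above can be demonstrated with a deliberately tiny hash. The sketch below assumes a hypothetical 8-bit hash, `tiny_hash`, formed from the first byte of SHA-256. Because it has only 256 possible outputs, any 257 distinct inputs must contain at least one collision.

```python
import hashlib


def tiny_hash(data: bytes) -> int:
    """Hypothetical 8-bit hash: the first byte of SHA-256."""
    return hashlib.sha256(data).digest()[0]


def find_collision(max_inputs: int = 257):
    """Hash distinct 2-byte inputs until two share an output.

    Returns (first_input, second_input, shared_output), or None if
    no collision occurs within max_inputs inputs. With 257 or more
    inputs and 256 possible outputs, a collision is guaranteed.
    """
    seen = {}  # output value -> first input that produced it
    for i in range(max_inputs):
        msg = i.to_bytes(2, "big")
        h = tiny_hash(msg)
        if h in seen:
            return seen[h], msg, h
        seen[h] = msg
    return None


collision = find_collision()
```

A practical hash function has a far larger output space, so collisions still exist but cannot be found by such a short exhaustive search.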
Since collisions are inherent in any hash function that receives a larger input than the output it produces, one cannot totally eliminate collisions. However, one may make it more time consuming to find collisions by ensuring that collisions occur only for inputs that differ from each other in more than a trivial number of bit locations, so that one must spend more time searching for inputs that cause collisions. Therefore, there is a need for a hash function that does not produce collisions for inputs that are near matches of each other, where a near match is one where the number of bit locations that differ is small and, therefore, could more easily be found than if the inputs were not near matches.
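To make the notion of a near match concrete, the following sketch counts differing bit locations (the Hamming distance) and enumerates every value within a small number of bit flips of a given value. The function names are hypothetical and are introduced only for this illustration. The number of near matches grows as a sum of binomial coefficients, which is small when few bit flips are allowed; this is why near-match collisions would be comparatively easy to search for.

```python
from itertools import combinations
from math import comb


def hamming_distance(a: bytes, b: bytes) -> int:
    """Number of bit locations in which two equal-length inputs differ."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def near_matches(value: int, n_bits: int, max_flips: int):
    """Yield every n_bits-wide value that differs from value in
    between 1 and max_flips bit locations."""
    for k in range(1, max_flips + 1):
        for positions in combinations(range(n_bits), k):
            flipped = value
            for p in positions:
                flipped ^= 1 << p
            yield flipped


def near_match_count(n_bits: int, max_flips: int) -> int:
    """Size of the near-match search space: sum of C(n_bits, k)."""
    return sum(comb(n_bits, k) for k in range(1, max_flips + 1))
```

For a 16-bit input and at most two flipped bits, `near_match_count(16, 2)` gives 136 candidates, compared with 65,535 other 16-bit values in total.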
U.S. Pat. No. 7,382,876, entitled “HASH FUNCTION CONSTRUCTION FROM EXPANDER GRAPHS,” discloses a hash function in which it is difficult to find collisions. The hash function divides an input into segments, walks an expander graph based on the respective input segments, determines a label of the last vertex walked, and outputs the label as the result of the hash function. U.S. Pat. No. 7,382,876 is hereby incorporated by reference into the present specification.
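The general shape of a graph-walk hash (segment the input, let each segment select an edge, output the label of the final vertex) can be sketched as follows. This is an illustrative toy only and is not the construction of U.S. Pat. No. 7,382,876: the graph on the integers modulo a small prime, the 2-bit segment size, the start vertex, and all names are assumptions made for this example, and the toy graph is neither claimed to be an expander nor to resist collisions.

```python
P = 65521            # assumed small prime; vertices are labelled 0..P-1
INV2 = (P + 1) // 2  # multiplicative inverse of 2 modulo P


def neighbours(v: int) -> list:
    """Four edges out of vertex v in the toy graph (an assumption)."""
    return [(v + 1) % P, (v - 1) % P, (2 * v) % P, (v * INV2) % P]


def segments(data: bytes):
    """Split the input into 2-bit segments, most significant first."""
    for byte in data:
        for shift in (6, 4, 2, 0):
            yield (byte >> shift) & 0b11


def walk_hash(data: bytes, start: int = 1) -> int:
    """Walk the graph, one edge per segment, and return the label
    of the last vertex reached."""
    v = start
    for s in segments(data):
        v = neighbours(v)[s]
    return v
```

In this toy graph a step along edge 0 followed by a step along edge 1 returns to the same vertex, so collisions are trivial to construct. A construction intended to make collisions hard to find depends on the properties of the graph that is walked.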
U.S. Pat. Appl. No. 20070291934, entitled “METHODS, SYSTEMS AND COMPUTER PROGRAM FOR POLYNOMIAL BASED HASHING AND MESSAGE AUTHENTICATION CODING WITH SEPARATE GENERATION OF SPECTRUMS,” discloses a hash function that represents an initial sequence of bits as a specially constructed set of polynomials, transforms the set by masking, partitions the transformed set into a plurality of classes, forms a bit string during partitioning, factors each of the polynomials in each class, collects the factors, wraps the factors, organizes the wrappings, and performs an exponentiation of the organizations to obtain a hash value. U.S. Pat. Appl. No. 20070291934 is hereby incorporated by reference into the present specification.
U.S. Pat. Appl. No. 20090067620, entitled “CRYPTOGRAPHIC HASHING DEVICE AND METHOD,” discloses a hash function that forms a sequence of data m-tuples from a message, where m is a positive integer, iteratively calculates successive output p-tuples, where p is a positive integer, corresponding to the sequence of data m-tuples as a function of at least one set of multivariate polynomials defined over a finite field, and determines a hash value as a function of the last p-tuple output. U.S. Pat. Appl. No. 20090067620 is hereby incorporated by reference into the present specification.
U.S. Pat. Appl. No. 20090085780, entitled “METHOD FOR PREVENTING AND DETECTING HASH COLLISIONS OF DATA DURING DATA TRANSMISSION,” discloses a means for avoiding hash collisions by pre-processing a message to increase randomness and reduce redundancy in a manner that includes a bit shuffler, a compression T-function, and a linear feedback shift register. U.S. Pat. Appl. No. 20090085780 is hereby incorporated by reference into the present specification.