1. Field of the Invention
The present invention relates to data manipulation; more specifically, the invention relates to an efficient technique for representing long strings of data as shorter strings of data.
2. Description of the Related Art
Hashing is a technique for representing longer lengths of data as shorter lengths of data. Hashing techniques are designed so that there is a relatively small probability that two different longer lengths of data will be represented as identical shorter lengths of data. This probability is called the probability of collision.
$$\Pr\bigl(h(m_1) = h(m_2)\bigr) \le \varepsilon, \qquad \varepsilon \ge \frac{1}{2^{l}} \tag{1}$$
The probability of collision is represented by Equation (1), which indicates that the probability of the result of a hashing function “h” performed on a string m1 being equal to the result of the same hashing function “h” performed on a different string m2 is less than or equal to ε, where ε is at least $1/2^{l}$.
The number of bits contained in the longer unhashed string is “n”. The number of bits in the shorter or hashed string is “l”. A hashing function that satisfies Equation (1) is often referred to as ε universal.
$$\Pr\bigl(h(m_1) - h(m_2) = \Delta\bigr) \le \varepsilon, \qquad \varepsilon \ge \frac{1}{2^{l}} \tag{2}$$
Another property typically associated with hashing functions is represented by Equation (2), which indicates that the probability of the difference between the output of a hashing function “h” on string m1 and the output of the same hashing function on string m2 being equal to some preselected number Δ is less than or equal to ε, where ε is at least $1/2^{l}$. Hashing functions that satisfy Equation (2) are typically referred to as εΔ universal hash functions.
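As an illustration of Equations (1) and (2), the following sketch exhaustively measures the worst-case difference probability for one small hash family. It assumes a family of the form h(m) = (m·a) mod p with a prime modulus p, a key a drawn uniformly from 0 to p−1, and the difference taken modulo p; the function names and the choice p = 11 are hypothetical and serve only to keep the search small. Because the output here ranges over p values rather than $2^{l}$ values, the bound to compare against is 1/p.

```python
from fractions import Fraction


def linear_hash(m, a, p):
    """Return (m * a) mod p for message m, key a and prime modulus p."""
    return (m * a) % p


def max_difference_probability(p):
    """Worst-case fraction of keys a with h(m1) - h(m2) == delta (mod p).

    The maximum is taken over all distinct m1, m2 in [0, p) and all
    delta in [0, p). The case delta == 0 is the collision event of
    Equation (1); the other values of delta correspond to Equation (2).
    """
    worst = Fraction(0)
    for m1 in range(p):
        for m2 in range(p):
            if m1 == m2:
                continue
            for delta in range(p):
                hits = sum(
                    1
                    for a in range(p)
                    if (linear_hash(m1, a, p) - linear_hash(m2, a, p)) % p == delta
                )
                worst = max(worst, Fraction(hits, p))
    return worst


P = 11  # small prime, assumed here only to keep the exhaustive search fast
print(max_difference_probability(P))  # prints 1/11
```

For each pair of distinct messages and each Δ, exactly one key satisfies the condition, so under these assumptions the family meets Equations (1) and (2) with ε = 1/p.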
$$\Pr\bigl(h(m_1) = c_1,\; h(m_2) = c_2\bigr) \le \frac{\varepsilon}{2^{l}}, \qquad \varepsilon \ge \frac{1}{2^{l}} \tag{3}$$
Some hash functions also have a third property illustrated by Equation (3). Equation (3) shows that the joint probability of the output of hashing function “h” for input string m1 being equal to a predetermined number c1 and the output of hashing function “h” for input string m2 being equal to a predetermined number c2 is less than or equal to $\varepsilon/2^{l}$, where ε is at least $1/2^{l}$. A hashing function that satisfies Equation (3) is referred to as ε strongly universal. Hashing functions that satisfy Equation (3) automatically satisfy Equations (1) and (2).
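The joint-probability property of Equation (3) can be checked the same way. The sketch below uses an affine family h(m) = (a·m + b) mod p with a key pair (a, b) drawn uniformly, which is a standard textbook example of a strongly universal family and is not taken from this description; the function names and the prime p = 5 are likewise assumptions made for illustration. With an output range of p values, the bound corresponding to $\varepsilon/2^{l}$ becomes 1/p².

```python
from fractions import Fraction
from itertools import product


def affine_hash(m, a, b, p):
    """Return (a * m + b) mod p for message m and key pair (a, b)."""
    return (a * m + b) % p


def max_joint_probability(p):
    """Worst-case fraction of keys with h(m1) == c1 and h(m2) == c2.

    The maximum is taken over all distinct m1, m2 and all targets c1, c2,
    with every value ranging over [0, p).
    """
    keys = list(product(range(p), repeat=2))  # all (a, b) pairs
    worst = Fraction(0)
    for m1, m2 in product(range(p), repeat=2):
        if m1 == m2:
            continue
        for c1, c2 in product(range(p), repeat=2):
            hits = sum(
                1
                for a, b in keys
                if affine_hash(m1, a, b, p) == c1
                and affine_hash(m2, a, b, p) == c2
            )
            worst = max(worst, Fraction(hits, len(keys)))
    return worst


P = 5  # small prime, assumed only so the exhaustive search stays fast
print(max_joint_probability(P))  # prints 1/25
```

For distinct m1 and m2 the two conditions form a linear system with exactly one solution (a, b), so each target pair is hit by one key out of p².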
Hashing functions are used in many applications, one of which is to simplify searching for text strings. When used for searching for text strings, the hashing function is used to reduce the size of the stored information, and the same hashing function is then used to reduce the size of the search criteria. The shortened search criteria are then compared against the shortened stored information to locate a desired piece of information more efficiently. Once the desired piece of information has been located, the unhashed or full-length text associated with the shortened text can be provided.
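The search procedure described above can be sketched as follows. This is a minimal illustration, not a description of any particular system: the helper names are hypothetical, and BLAKE2b truncated to two bytes is used only as a convenient stand-in for whatever short hash an application would choose. Each bucket stores the full-length text, so a final comparison discards entries that merely collide with the query.

```python
import hashlib
from collections import defaultdict


def short_hash(text, digest_bytes=2):
    """Reduce a text string to a short fixed-length digest."""
    return hashlib.blake2b(text.encode("utf-8"), digest_size=digest_bytes).digest()


def build_index(records):
    """Map each short hash to the full-length records that produce it."""
    index = defaultdict(list)
    for record in records:
        index[short_hash(record)].append(record)
    return index


def lookup(index, query):
    """Hash the query with the same function, then return matching full text."""
    bucket = index.get(short_hash(query), [])
    # Compare full strings so that colliding records are not returned.
    return [record for record in bucket if record == query]


index = build_index(["alpha", "beta", "gamma"])
print(lookup(index, "beta"))   # prints ['beta']
print(lookup(index, "delta"))  # prints []
```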
Hashing functions are also used in wireless communications for message authentication. A message is authenticated by sending a message string along with a tag, calculated by performing a cryptographic function on the message. Forming a tag of a message string is computationally intensive. Hash functions are used to shorten the message so that the cryptographic processing required to form the tag is less intensive.
$$h(m) = (m \cdot a) \bmod p \tag{4}$$
$$h(m_1, \ldots, m_k) = \left( \sum_{i=1}^{k} m_i a_i \right) \bmod p \tag{5}$$
Techniques such as linear hashing, illustrated by Equation (4), and MMH hashing, illustrated by Equation (5), are now used to represent longer strings of data or text as shorter strings, where the probability of two different long strings producing the same short string is relatively small. These hashing functions require multiplying a key that is “w” words long by a message or text, also “w” words long, that is to be hashed. As a result, $w^{2}$ operations are required to perform a hashing of a particular string of data or text. For large strings of data or text having many words, this results in a computationally intensive operation.
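Equations (4) and (5) and the $w^{2}$ operation count can be made concrete with a short sketch. The function names, the 32-bit word size, the little-endian word order and the small modulus in the usage lines are assumptions chosen for illustration; the schoolbook multiplication is shown only to count word-level products and is not presented as how any specific implementation performs the multiplication.

```python
def linear_hash(m, a, p):
    """Equation (4): h(m) = (m * a) mod p."""
    return (m * a) % p


def mmh_hash(message_words, key_words, p):
    """Equation (5): h(m_1..m_k) = (sum of m_i * a_i) mod p."""
    if len(message_words) != len(key_words):
        raise ValueError("message and key must have the same number of words")
    return sum(m_i * a_i for m_i, a_i in zip(message_words, key_words)) % p


def words_to_int(words, word_bits=32):
    """Combine little-endian words into one integer."""
    return sum(word << (word_bits * i) for i, word in enumerate(words))


def schoolbook_multiply(x_words, y_words, word_bits=32):
    """Multiply two multi-word integers one word pair at a time.

    Returns the product and the number of word-by-word multiplications,
    which is len(x_words) * len(y_words), i.e. w * w for two w-word inputs.
    """
    product = 0
    multiplications = 0
    for i, x in enumerate(x_words):
        for j, y in enumerate(y_words):
            product += (x * y) << (word_bits * (i + j))
            multiplications += 1
    return product, multiplications


print(linear_hash(1234, 56, 101))             # prints 20
print(mmh_hash([3, 5, 7], [2, 4, 6], 101))    # prints 68
_, count = schoolbook_multiply([1, 2, 3, 4], [5, 6, 7, 8])
print(count)                                  # prints 16
```

With w = 4 words in both the key and the message, the word-level multiplication performs 16 products, matching the $w^{2}$ growth described above.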