Secure hash functions are often used to authenticate messages. Digital signatures, for instance, rely on secure hash functions. A person can digitally sign a document by signing a hash of the document computed using a secure hash function. Later, the digital signature can be authenticated by computing, with the same hash function, a hash of a document purported to be the document that the person signed. If the first hash and the second hash are identical, the documents are deemed identical. If the documents are identical, the digital signature is authentic.
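The comparison step described above can be sketched in a few lines. This is a minimal illustration, not a signature scheme: it assumes SHA-1 as the 160-bit hash function (the text does not name one), the helper names `digest` and `deemed_identical` are hypothetical, and the signing of the hash itself is left out.

```python
import hashlib


def digest(document: bytes) -> str:
    """Return the 160-bit SHA-1 hash of a document as 40 hex characters."""
    return hashlib.sha1(document).hexdigest()


def deemed_identical(signed_hash: str, purported_document: bytes) -> bool:
    """Recompute the hash of the purported document and compare it with the signed hash."""
    return digest(purported_document) == signed_hash


signed_document = b"I agree to pay 100 dollars."
signed_hash = digest(signed_document)  # the value a signer would actually sign

print(deemed_identical(signed_hash, b"I agree to pay 100 dollars."))  # True
print(deemed_identical(signed_hash, b"I agree to pay 900 dollars."))  # False
```

Note that the check compares hashes only; it never compares the documents themselves.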
The two documents are deemed identical—rather than known to be identical—because it may be possible for hashes of two different documents to be the same. This is called a “collision”.
An example of a collision can be shown in mathematical terms. Assume that a secure hash function “H(M)” operates on an arbitrary-length message “M” and returns a fixed-length hash “h”. Thus, “h=H(M)”, where “h” has a fixed length. This leaves open the possibility, however, that if the message “M1” is larger than the fixed-length hash “h”, two different messages “M1” and “M2” can have the same hash “h”, such that “H(M1)=H(M2)”. If “H(M1)=H(M2)”, a collision has occurred.
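Why collisions must exist can be seen with a deliberately tiny hash. The sketch below assumes a toy 8-bit hash (the first byte of SHA-1, an illustrative choice not taken from the text) and hypothetical names `tiny_hash` and `find_collision`. Because only 256 hash values are possible, any 257 distinct messages must contain two with “H(M1)=H(M2)”.

```python
import hashlib


def tiny_hash(message: bytes) -> int:
    """Toy 8-bit hash: the first byte of SHA-1. Deliberately weak, for illustration only."""
    return hashlib.sha1(message).digest()[0]


def find_collision():
    """Hash distinct messages until two of them share a hash value."""
    seen = {}  # hash value -> first message that produced it
    for i in range(257):  # 257 distinct messages, but only 256 possible hashes
        message = b"message-%d" % i
        h = tiny_hash(message)
        if h in seen:
            return seen[h], message, h
        seen[h] = message
    raise AssertionError("unreachable: 257 messages cannot have 257 distinct 8-bit hashes")


m1, m2, h = find_collision()
print(m1, m2, h)
```

The same counting argument applies to a 160-bit hash; only the size of the search changes.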
The probability of a collision is important to ascertaining the probability that any particular message is authentic. For a secure hash function generating a hash of 160 bits, for instance, the probability that two random messages have identical hashes is one in 2^160. But a group of random messages in which some two messages have the same hash need not be nearly as large as one might expect. For it to be likely that some two messages of a group will have identical hashes, the group will only need to have about 2^80 messages for a 160-bit hash.
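The group-size figure follows from the standard birthday approximation, which the text does not state explicitly: among n random messages and N = 2^160 possible hashes, the chance of at least one shared hash is roughly 1 - exp(-n(n-1)/(2N)). The sketch below evaluates that approximation; the function name `collision_probability` is hypothetical.

```python
import math


def collision_probability(num_messages: int, hash_bits: int) -> float:
    """Approximate chance that at least two of num_messages random messages share a hash.

    Uses the birthday approximation 1 - exp(-n(n-1) / (2N)), with N = 2**hash_bits.
    """
    space = 2 ** hash_bits
    pairs = num_messages * (num_messages - 1) // 2  # number of message pairs
    return -math.expm1(-pairs / space)  # expm1 keeps precision for tiny probabilities


print(collision_probability(2, 160))        # a single pair: about 2**-160
print(collision_probability(2 ** 40, 160))  # still negligible
print(collision_probability(2 ** 80, 160))  # about 0.39
```

Under this approximation the probability is already about 39% at 2^80 messages and passes 50% shortly after, which is why the work needed scales with the square root of the hash space rather than with the hash space itself.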
For this reason, a person attempting to cause a collision, i.e., to cause two messages to have identical hashes, can produce such a collision in about 2^80 or fewer attempts with hashes of 160 bits. Assume, for example, that Willy wants to swindle George. Willy could write two contracts, one that is favorable to George and one that is very favorable to Willy. Willy can make subtle changes (like adding a space) to each document and compute a hash value for each variant. Willy can continue to do so until a hash value for one of the pro-Willy contracts matches a hash value for one of the pro-George contracts. By so doing, Willy can create a collision between the pro-George contract, “Mg”, and the pro-Willy contract, “Mw”, so that “H(Mg)=H(Mw)”. Once Willy has done so, he gets George to sign the pro-George contract using a protocol in which George signs the hash value “h”. At some time in the future, Willy replaces the pro-George contract that George signed with the pro-Willy contract that George did not sign. Now Willy can convince an adjudicator (e.g., a judge in a court of law) that George signed the pro-Willy contract because a hash of the pro-Willy contract will match the hash value “h” of George's signature for the pro-George contract.
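Willy's search can be sketched at toy scale. The example below is an illustration under stated assumptions, not a practical attack: the hash is truncated to 20 bits so the search finishes quickly (a full 160-bit hash would need on the order of 2^80 variants), the contract texts are invented, and the names `toy_hash`, `variant`, and `find_contract_collision` are hypothetical. The subtle change used here is trailing whitespace.

```python
import hashlib

HASH_BITS = 20  # toy hash size (assumption); a real 160-bit hash is far out of reach


def toy_hash(message: bytes) -> int:
    """Top HASH_BITS bits of SHA-1, standing in for the full hash."""
    return int.from_bytes(hashlib.sha1(message).digest(), "big") >> (160 - HASH_BITS)


def variant(contract: bytes, i: int) -> bytes:
    """Append trailing whitespace that encodes i (space = 0 bit, tab = 1 bit)."""
    return contract + bytes(9 if bit == "1" else 32 for bit in format(i, "b"))


def find_contract_collision(pro_george: bytes, pro_willy: bytes, max_variants: int = 1 << 16):
    """Search for one variant of each contract such that both hash to the same value."""
    george_seen = {}  # hash -> pro-George variant
    willy_seen = {}   # hash -> pro-Willy variant
    for i in range(max_variants):
        mg, mw = variant(pro_george, i), variant(pro_willy, i)
        hg, hw = toy_hash(mg), toy_hash(mw)
        george_seen.setdefault(hg, mg)
        willy_seen.setdefault(hw, mw)
        if hg in willy_seen:  # new pro-George variant matches an earlier pro-Willy one
            return george_seen[hg], willy_seen[hg], hg
        if hw in george_seen:  # new pro-Willy variant matches an earlier pro-George one
            return george_seen[hw], willy_seen[hw], hw
    return None  # no collision found within the variant budget


result = find_contract_collision(b"Willy pays George 1000 dollars.",
                                 b"George pays Willy 1000 dollars.")
print(result)
```

Keeping every hash seen so far for both contracts is what makes this a birthday-style search: each new variant is compared against all earlier variants of the other contract, not against a single fixed target.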
Causing such a collision with large hashes, like 160-bit hashes, was until recently considered very difficult. Altering and computing hashes for 2^80 messages would take hundreds of computers hundreds or thousands of years at current processing speeds. Recently, however, some have argued that a collision is possible with only 2^69 messages by making small, controlled changes to bits of a message. If this is true, a few hundred computers may be able to cause a collision in a few months at current processing speeds; in five or ten years, perhaps one computer might be able to cause a collision in less than a year.
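The gap between the two estimates is a matter of simple scaling, which can be checked as follows. The sketch assumes running time is proportional to the number of hash computations on a fixed set of machines; the 1000-year baseline is a hypothetical value picked from the range given above, and `scaled_time` is a hypothetical helper.

```python
def scaled_time(baseline_time: float, baseline_log2_work: int, new_log2_work: int) -> float:
    """Scale a running time in proportion to the number of hash computations required."""
    return baseline_time * 2.0 ** (new_log2_work - baseline_log2_work)


speedup = 2 ** (80 - 69)
print(speedup)                      # 2048
print(scaled_time(1000.0, 80, 69))  # 0.48828125 (years, for a 1000-year baseline)
```

A reduction from 2^80 to 2^69 attempts is a factor of 2048, which is enough to turn a task measured in centuries into one measured in months on the same hardware.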
This particular possibility, as well as other attacks that make creating collisions potentially easier than is ideal, has caused people to doubt the security and usefulness of some secure hash functions.