A cryptographic hash function is a hash function, i.e. an algorithm that takes an arbitrary block of data and returns a fixed-size bit string, the (cryptographic) hash value, such that an (accidental or intentional) change to the data will (with very high probability) change the hash value. The data to be encoded are often called the “message,” and the hash value is sometimes called the message digest or “digest.”
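The two properties described above, a fixed-size output and sensitivity to any change in the message, can be illustrated with a minimal sketch. It assumes SHA-256 from Python's standard hashlib module as the example algorithm; the helper names digest_hex and differing_bits and the sample messages are hypothetical, chosen only for illustration.

```python
import hashlib


def digest_hex(message: bytes) -> str:
    """Return the SHA-256 digest of `message` as a hexadecimal string."""
    return hashlib.sha256(message).hexdigest()


def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions at which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


if __name__ == "__main__":
    # Fixed-size output: a 3-byte and a 1,000,000-byte message both
    # produce a 32-byte (256-bit) digest.
    for message in (b"abc", b"a" * 1_000_000):
        print(len(message), "bytes ->", len(hashlib.sha256(message).digest()), "byte digest")

    # Sensitivity to change: the two messages differ in a single character,
    # yet the digests differ in many bit positions.
    original = b"transfer 100 to alice"
    tampered = b"transfer 900 to alice"
    print(digest_hex(original))
    print(digest_hex(tampered))
    print("differing digest bits:",
          differing_bits(hashlib.sha256(original).digest(),
                         hashlib.sha256(tampered).digest()))
```

The sketch prints the bit difference rather than asserting a particular value, since the exact count depends on the messages chosen.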
Cryptographic hash functions have many information security applications, notably in digital signatures, message authentication codes (MACs), and other forms of authentication. They can also be used as ordinary hash functions, to index data in hash tables, for fingerprinting, to detect duplicate data or uniquely identify files, and as checksums to detect accidental data corruption. In information security contexts, cryptographic hash values are sometimes called (digital) fingerprints, checksums, or just hash values, even though all these terms stand for functions with rather different properties and purposes.
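As one example of the authentication uses mentioned above, a message authentication code can be built from a hash function with the HMAC construction. The following is a minimal sketch using Python's standard hmac and hashlib modules; the choice of SHA-256, the function names make_tag and verify_tag, and the key and message values are assumptions made for illustration only.

```python
import hashlib
import hmac


def make_tag(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA-256 authentication tag for `message` under `key`."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify_tag(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time with the received one."""
    return hmac.compare_digest(make_tag(key, message), tag)


if __name__ == "__main__":
    key = b"hypothetical shared secret"      # illustrative value, not a real key
    message = b"meter reading: 4217"
    tag = make_tag(key, message)

    print(verify_tag(key, message, tag))                  # unmodified message
    print(verify_tag(key, b"meter reading: 9217", tag))   # altered message
    print(verify_tag(b"wrong key", message, tag))         # wrong key
```

Unlike a plain checksum, the tag cannot be recomputed by a party that does not hold the key, which is one reason the terms listed above are not interchangeable.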
Two major tradeoffs in cryptographic hash function design, as visible to a programmer, are: (1) complexity of calculation (too simple and the hash is easily broken, too complex and the hash takes too long to calculate); and (2) size of output (too small and brute-force attacks are too easy, too big and the cost of storing and sending the hash value is too large). One of the most famous cryptographic hash functions is MD5 (Message-Digest algorithm 5), developed by Ronald Rivest. Other common algorithms are SHA-1 (Secure Hash Algorithm 1) as well as the SHA-2 and SHA-3 families, published by the National Institute of Standards and Technology (NIST) as U.S. Federal Information Processing Standards (FIPS). Another cryptographic hash algorithm of interest is SM3, which was invented by Xiaoyun Wang et al., published by the Chinese Commercial Cryptography Administration Office, and submitted as an Internet-Draft to the Internet Engineering Task Force (IETF) for use in electronic authentication service systems.
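The output-size tradeoff can be made concrete with a short sketch that lists the digest size of several of the algorithms named above, together with the generic brute-force work each size implies: roughly 2^n hash evaluations to find a preimage of an n-bit digest and roughly 2^(n/2) to find a collision by the birthday bound. The sketch assumes a Python build whose hashlib provides all of the listed algorithm names (some restricted builds disable MD5), and the helper names are hypothetical. SM3 is omitted because its availability in hashlib depends on the underlying library.

```python
import hashlib

# Algorithm names as accepted by hashlib.new(); assumed to be available.
ALGORITHMS = ["md5", "sha1", "sha256", "sha512", "sha3_256"]


def output_bits(name: str) -> int:
    """Return the digest size of the named algorithm in bits."""
    return hashlib.new(name).digest_size * 8


def generic_collision_log2(name: str) -> int:
    """Base-2 log of the generic (birthday-bound) collision search cost."""
    return output_bits(name) // 2


if __name__ == "__main__":
    # These are generic bounds implied by output size alone. They say nothing
    # about algorithm-specific weaknesses, which can lower the real cost.
    for name in ALGORITHMS:
        n = output_bits(name)
        print(f"{name:9s} {n:4d}-bit digest  "
              f"preimage ~2^{n}  collision ~2^{generic_collision_log2(name)}")
```

A larger digest raises both bounds but also increases the number of bytes that must be stored and transmitted with every message.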
Hardware acceleration for hash algorithms is commonly not required, because the algorithms are not designed to be especially computationally demanding. However, at least one special-purpose dynamic password enciphering chip, produced by Shenzhen Tongfang Electronic Equipment Co., Ltd. of China, implements the SM3 cryptographic hash algorithm in hardware.
Typical straightforward hardware implementations using lookup memories, truth tables, binary decision diagrams or field-programmable gate arrays (FPGAs) are costly in terms of circuit area. Alternative approaches using finite fields isomorphic to GF(256) may be efficient in area but may also be slower than the straightforward hardware implementations.
One drawback of a complete hardware approach is that it does not fit easily into the standard execution pipeline of a modern microprocessor without special consideration for such things as the handling of interrupts or the concurrent superscalar execution of other instructions. Another mismatch with standard execution pipelines is the latency required to execute an entire hashing algorithm.
Modern processors often include instructions for operations that are computationally intensive but offer a high level of data parallelism, which an efficient implementation can exploit using various data storage devices such as single instruction multiple data (SIMD) vector registers. The central processing unit (CPU) may then provide parallel hardware to support processing vectors. A vector is a data structure that holds a number of consecutive data elements. A vector register of size M may contain N vector elements of size O, where N = M/O. For instance, a 64-byte vector register may be partitioned into (a) 64 vector elements, with each element holding a data item that occupies 1 byte, (b) 32 vector elements to hold data items that occupy 2 bytes (or one “word”) each, (c) 16 vector elements to hold data items that occupy 4 bytes (or one “doubleword”) each, or (d) 8 vector elements to hold data items that occupy 8 bytes (or one “quadword”) each. The nature of the parallelism in SIMD vector registers could be well suited for the handling of secure hashing algorithms.
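The relationship N = M/O and the four partitionings of a 64-byte register can be sketched in software as follows. This is only a model of the data layout, not of any particular processor's instruction set: the register is represented as a Python bytes object, little-endian element order is an assumption, and the names lane_count, split_register, and lanewise_add32 are hypothetical. The last function models one lane-wise operation, an independent 32-bit addition with wraparound in each doubleword element.

```python
import struct

REGISTER_BYTES = 64  # M: size of the modeled vector register in bytes

# struct format codes for unsigned elements of 1, 2, 4, and 8 bytes
# (byte, word, doubleword, quadword in the terminology above).
ELEMENT_FORMATS = {1: "B", 2: "H", 4: "I", 8: "Q"}


def lane_count(register_bytes: int, element_bytes: int) -> int:
    """Return N = M / O, the number of elements of size O in a register of size M."""
    if register_bytes % element_bytes != 0:
        raise ValueError("element size must divide the register size")
    return register_bytes // element_bytes


def split_register(register: bytes, element_bytes: int) -> tuple:
    """View the register contents as N unsigned little-endian elements."""
    n = lane_count(len(register), element_bytes)
    return struct.unpack("<" + str(n) + ELEMENT_FORMATS[element_bytes], register)


def lanewise_add32(a: bytes, b: bytes) -> bytes:
    """Add two registers as independent 32-bit elements, wrapping modulo 2**32."""
    sums = [(x + y) & 0xFFFFFFFF
            for x, y in zip(split_register(a, 4), split_register(b, 4))]
    return struct.pack("<" + str(len(sums)) + "I", *sums)


if __name__ == "__main__":
    for element_bytes in (1, 2, 4, 8):
        print(element_bytes, "byte elements ->",
              lane_count(REGISTER_BYTES, element_bytes), "elements")
```

In this model, a single call to lanewise_add32 performs 16 independent additions, which mirrors the way one vector instruction would operate on every element of a register at once.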
To date, potential solutions to such complexities, mismatches, performance limiting issues, and other bottlenecks have not been adequately explored.