1. Field of the Invention
This invention relates in general to the field of microelectronics, and more particularly to an apparatus and method for performing hash functions on one or more message blocks to generate a message digest.
2. Description of the Related Art
An early computer system operated independently of other computer systems in the sense that all of the input data required by an application program executing on the early computer system was either resident on that computer system or was provided by an application programmer at run time. The application program generated output data as a result of being executed and the output data was generally in the form of a paper printout or a file which was written to a magnetic tape drive, disk drive, or other type of mass storage device that was part of the computer system. The output file could then be used as an input file to a subsequent application program that was executed on the same computer system or, if the output data was previously stored as a file to a removable or transportable mass storage device, it could then be provided to a different, yet compatible, computer system to be employed by application programs thereon. On these early systems, the need for protecting sensitive information was recognized and, among other information security measures, message digest generation application programs were developed and employed to protect the sensitive information from unauthorized disclosure. These application programs are also referred to as one-way hash functions, hash functions, compression applications, contraction functions, fingerprints, cryptographic checksums, message integrity checksums, and manipulation detection code. By whatever name, these applications typically take a variable length input string called a message or pre-image, and convert it to a fixed-length and generally smaller size output string called a hash or message digest.
Message digest generation functions have been employed by application programs in the information security area for many years and are used to verify the contents of a given string of data, or a file, or of many files stored on, say, a hard disk or magnetic tape. For example, consider sending a file to someone else over the Internet. If that file contains financial, contractual, legal, or any other type of data for which it is important that both sender and receiver know with high probability that it has not been tampered with, then the sender would perform a hash of the file and would send the message digest to the recipient along with the file itself. If the file has been changed in any way during transmission, when the recipient performs the same hash (i.e., executes the same hash function as the sender performed) of the file upon receipt, then the message digest generated upon receipt will not match that which was sent and thus, it is known that the contents of the file have changed since they were sent. Of course, it is possible for the file to be attacked in such a manner as to change both the message and the hash so that the altered hash matches the altered message. In such a case, the attack would be successful. This is why information security protocols utilize, in addition to message digest generation functions, other techniques to protect information such as encryption, secure authentication, and the like. A detailed discussion of these techniques, however, is beyond the scope of this application.
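The sender and recipient exchange described above can be sketched in software as follows. This is a minimal illustration only, assuming SHA-256 as provided by the Python standard library; the function names message_digest and verify and the sample message contents are hypothetical and are not part of any standard.

```python
import hashlib


def message_digest(data: bytes) -> str:
    """Return the SHA-256 message digest of data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


def verify(received: bytes, sent_digest: str) -> bool:
    """Recompute the digest upon receipt and compare it with the digest sent."""
    return message_digest(received) == sent_digest


# Sender: hash the file contents and transmit both file and digest.
original = b"Pay 100 dollars to account 12345"
digest = message_digest(original)

# Recipient: the same hash function is executed on whatever arrived.
tampered = b"Pay 900 dollars to account 12345"
unchanged_ok = verify(original, digest)   # contents unchanged, digests match
tampered_ok = verify(tampered, digest)    # contents altered, digests differ
print(unchanged_ok, tampered_ok)
```

As noted above, this check does not by itself defeat an attacker who can replace both the file and the digest; it only detects changes to the file relative to the digest that was received.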
Hash functions are very useful because they are one-way functions. No cryptographic key is required for their use and the output (“message digest” or “hash”) is not dependent upon the input (“message” or “pre-image”) in any discernable way. Bruce Schneier notes in his work Applied Cryptography: Protocols, Algorithms, and Source Code in C [1996. John Wiley & Sons: New York], that “[a] single bit change in the pre-image changes, on the average, half of the bits in the hash value. Given a hash value, it is computationally infeasible to find a pre-image that hashes to that value.”
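The single-bit-change property quoted above can be observed with a short experiment. The following sketch assumes SHA-256 from the Python standard library and an arbitrary sample message; the helper bit_difference is a hypothetical name introduced for illustration. The exact count printed depends on the message chosen, so no particular value is asserted here; for a well-behaved hash it is expected to lie near half of the 256 digest bits.

```python
import hashlib


def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bit positions at which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


message = bytearray(b"The quick brown fox jumps over the lazy dog")
flipped = bytearray(message)
flipped[0] ^= 0x01  # change exactly one bit of the pre-image

d1 = hashlib.sha256(bytes(message)).digest()
d2 = hashlib.sha256(bytes(flipped)).digest()

changed = bit_difference(d1, d2)
print(f"{changed} of {len(d1) * 8} digest bits changed")
```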
With the advent of computer networks and more advanced data transmission protocols, the probability for unauthorized access of sensitive files has dramatically increased. In fact, today's network architectures, operating systems, and data transmission protocols have evolved to the extent that the ability to access shared data is not only supported, but is prominently featured. For example, it is commonplace today for a user of a computer workstation to access files on a different workstation or network file server, or to utilize the Internet to obtain news and other information, or to transmit and receive electronic messages (i.e., email) to and from hundreds of other computers, or to connect with a vendor's computer system and to provide credit card or banking information in order to purchase products from that vendor, or to utilize a wireless network at a restaurant, airport, or other public setting to perform any of the aforementioned activities. Therefore, the need to protect sensitive data and transmissions from unauthorized tampering has grown dramatically. The number of instances during a given computer session where a user is obliged to validate or verify his or her sensitive data has substantially increased. Current news headlines regularly bring computer information security issues such as spam, spyware, adware, hacking, identity theft, spoofing, and credit card fraud to the forefront of public concern. And since the motivation for these invasions of privacy ranges all the way from innocent mistakes to premeditated cyber terrorism, responsible agencies have responded with new laws, stringent enforcement, and public education programs. Yet, none of these responses has proved to be effective at stemming the tide of computer information compromise.
Consequently, what was once the exclusive concern of governments, financial institutions, the military, and spies has now become a significant issue for the average citizen who reads their email or accesses their checking account transactions from their home computer. On the business front, one skilled in the art will appreciate that corporations from small to large presently devote a remarkable portion of their resources to the validation and verification of proprietary information.
Within the field of cryptography, several procedures and protocols have been developed that allow for users to perform hash operations without requiring great knowledge or effort and for those users to be able to transmit or otherwise provide their information products along with a corresponding message digest to different users. One skilled in the art will appreciate that these procedures and protocols generally take the form of mathematical algorithms which application programs specifically implement to accomplish a hash of sensitive information.
Several algorithms are currently used to perform digital hash functions. These include, but are not limited to, the Secure Hash Algorithm (SHA), N-Hash, Snefru, MD2, MD4, MD5, RIPE-MD, Haval, and one-way hash functions that employ symmetric key or public-key algorithms such as CBC-MAC, which uses the Cipher Block Chaining mode of the Advanced Encryption Standard (AES) as its hash function. As noted, there are a number of hash functions which are readily available for use in the public sector, but only one of these algorithms, SHA, has seen extensive use. This is primarily because the U.S. Government has adopted SHA as the standard hash algorithm for use across all U.S. government agencies. This standard hash algorithm is specified in the Federal Information Processing Standards Publication 180-2, dated Aug. 1, 2002, and entitled Secure Hash Standard, which is herein incorporated by reference for all intents and purposes. This standard is available from the U.S. Department of Commerce, National Institute of Standards and Technology, Washington, D.C. Currently, SHA comprises four hash modes: SHA-1, SHA-256, SHA-384, and SHA-512.
According to SHA, a message (i.e., “input text”) is divided into blocks of a specified size for purposes of performing a hash function. For example, a SHA-1 hash is performed on message blocks which are 512 bits in size, using a 32-bit word size, and which generates a 160-bit message digest. A SHA-256 hash is performed on message blocks which are 512 bits in size, using a 32-bit word size, and generates a 256-bit message digest. A SHA-384 hash is performed on message blocks which are 1024 bits in size, using a 64-bit word size, and generates a 384-bit message digest. And a SHA-512 hash is performed on message blocks which are 1024 bits in size, using a 64-bit word size, and generates a 512-bit message digest. In all cases, an initial hash value is set and is modified after processing each message block. This modified hash value is known as an intermediate hash value. The value of the hash following processing of the last message block is the message digest.
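The block sizes and digest sizes listed above can be checked against a software implementation. The following sketch assumes the hashlib module of the Python standard library, whose hash objects report block_size and digest_size in bytes; the names params and hash_in_blocks are hypothetical. Feeding a message one block at a time loosely mirrors the per-block update of the hash value described above, although hashlib does not expose the intermediate hash values themselves and performs message padding internally.

```python
import hashlib

# Collect (message block size, message digest size) in bits for each SHA mode.
params = {}
for name in ("sha1", "sha256", "sha384", "sha512"):
    h = hashlib.new(name)
    params[name] = (h.block_size * 8, h.digest_size * 8)
    print(name, params[name])


def hash_in_blocks(name: str, message: bytes) -> bytes:
    """Hash a message by supplying it one message block at a time."""
    h = hashlib.new(name)
    step = h.block_size  # block size in bytes for the selected mode
    for i in range(0, len(message), step):
        h.update(message[i:i + step])  # internal state is modified per block
    return h.digest()  # value after the last block is the message digest


message = bytes(range(256)) * 5  # arbitrary 1280-byte sample message
same = hash_in_blocks("sha256", message) == hashlib.sha256(message).digest()
print(same)
```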
All of the SHA modes utilize the same type of sub-operations to perform a hash of a message block, such as bitwise logical word operations (AND, OR, NOT, Exclusive-OR), modulo addition, bit shift operations, and bit rotate operations (i.e., circular shift). Different combinations of these operations are employed to generate the intermediate hash values according to the different SHA modes. Other hash algorithms utilize slightly different sub-operations and combinations of sub-operations, yet the sub-operations themselves are substantially similar to those of SHA because they are employed in a similar fashion to transform one or more message blocks into a corresponding message digest.
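The sub-operations named above can be illustrated for the 32-bit word size used by SHA-1 and SHA-256. The following is a sketch only, not a complete hash implementation. The Python function names are hypothetical; the Ch, Maj, and upper-case Sigma-0 combinations follow the definitions given for SHA-256 in FIPS 180-2, and rotate amounts are assumed to lie between 1 and 31.

```python
MASK32 = 0xFFFFFFFF  # keeps results within a 32-bit word


def rotl(x: int, n: int) -> int:
    """Rotate a 32-bit word left by n bits (circular shift)."""
    return ((x << n) | (x >> (32 - n))) & MASK32


def rotr(x: int, n: int) -> int:
    """Rotate a 32-bit word right by n bits (circular shift)."""
    return ((x >> n) | (x << (32 - n))) & MASK32


def add32(*words: int) -> int:
    """Add words modulo 2**32."""
    return sum(words) & MASK32


def ch(x: int, y: int, z: int) -> int:
    """Choose: each bit of x selects the corresponding bit of y or of z."""
    return (x & y) ^ (~x & MASK32 & z)


def maj(x: int, y: int, z: int) -> int:
    """Majority: each result bit is the majority of the three input bits."""
    return (x & y) ^ (x & z) ^ (y & z)


def big_sigma0(x: int) -> int:
    """SHA-256 Sigma-0: Exclusive-OR of three rotations of the same word."""
    return rotr(x, 2) ^ rotr(x, 13) ^ rotr(x, 22)


print(hex(rotl(0x80000000, 1)), hex(add32(0xFFFFFFFF, 1)))
```

Each function reduces to a handful of logical, shift, and add operations on a word, which is why a software hash of a single message block expands into a large number of ordinary instructions.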
One skilled in the art will appreciate that there are numerous application programs available for execution on a computer system that can perform hash operations, and a great number are available for performing hashes according to SHA. In fact, some operating systems (e.g., MICROSOFT® WINDOWS XP®, LINUX®) provide direct message digest generation services in the form of hash primitives, hash application program interfaces, and the like. The present inventors, however, have observed that present day computer hash techniques are deficient in several respects. Thus, the reader's attention is directed to FIG. 1, whereby these deficiencies are highlighted and discussed below.
FIG. 1 is a block diagram 100 illustrating present day computer message digest applications. The block diagram 100 depicts a first computer workstation 101 connected to a local area network 105. Also connected to the network 105 is a second computer workstation 102, a network file storage device 106, a first router 107 or other form of interface to a wide area network (WAN) 110 such as the Internet, and a wireless network router 108 such as one of those compliant with IEEE Standard 802.11. A laptop computer 104 interfaces to the wireless router 108 over a wireless network 109. At another point on the wide area network 110, a second router 111 provides interface for a third computer workstation 103.
As alluded to above, a present day user is confronted with the issue of computer information security many times during a work session. For example, under the control of a present day multi-tasking operating system, a user of workstation 101 can be performing several simultaneous tasks, each of which requires hash operations. The user of workstation 101 is required to run a hash application 112 (either provided as part of the operating system or invoked by the operating system) to generate a message digest for a local file which is then stored on the network file storage device 106. Concurrent with the file storage, the user can transmit a file and corresponding message digest to a second user at workstation 102, which also requires executing an instance of the hash application 112. In addition, the user can be accessing or providing his/her financial data (e.g., credit card numbers, financial transactions, etc.) or other forms of sensitive data over the WAN 110 from workstation 103, which requires additional instances of the hash application 112. Workstation 103 could also represent a home office or other remote computer 103 that the user of workstation 101 employs when out of the office to access any of the shared resources 101, 102, 106 on local area network 105. Each of these aforementioned activities requires that a corresponding instance of the hash application 112 be invoked. Furthermore, wireless networks 109 are now being routinely provided in coffee shops, airports, schools, and other public venues, thus prompting a need for a user of laptop 104 to hash not only his/her files (or other forms of data) to/from other users, but to employ hash functions for data that is transmitted over the wireless network 109 to the wireless router 108.
One skilled in the art will therefore appreciate that along with each activity that requires hash operations at a given workstation 101-104, there is a corresponding requirement to invoke an instance of the hash application 112. Hence, a computer 101-104 in the near future could potentially be performing hundreds of concurrent hash operations.
The present inventors have noted several limitations to the above approach of performing hash operations by invoking one or more instances of a hash application 112 on a computing system 101-104. For example, performing a prescribed function via programmed software is exceedingly slow compared to performing that same function via dedicated hardware. Each time the hash application 112 is required, a current task executing on a computer 101-104 must be suspended from execution, and parameters of the hash operation (i.e., message, hash algorithm, hash mode, etc.) must be passed through the operating system to the instance of the hash application 112, which is invoked for accomplishment of the hash operation. And because hash algorithms necessarily involve the execution of numerous sub-operations on a particular block of data (i.e., message block), execution of the hash applications 112 involves the execution of numerous computer instructions to the extent that overall system processing speed is disadvantageously affected.
In addition, current techniques are limited because of the delays associated with operating system intervention. Most application programs do not provide integral message digest generation components; they employ components of the operating system or plug-in applications to accomplish these tasks. And operating systems are otherwise distracted by interrupts and the demands of other currently executing application programs.
Furthermore, the present inventors have noted that the accomplishment of hash operations on a present day computer system 101-104 is very much analogous to the accomplishment of floating point mathematical operations prior to the advent of dedicated floating point units within microprocessors. Early floating point operations were performed via software and hence, they executed very slowly. Like floating point operations, hash operations performed via software are disagreeably slow. As floating point technology evolved further, floating point instructions were provided for execution on floating point co-processors. These floating point co-processors executed floating point operations much faster than software implementations, yet they added cost to a system. Likewise, message digest co-processors or cores exist today in the form of add-on boards or external devices that interface to a host processor via parallel ports or other interface buses (e.g., USB). These co-processors certainly enable the accomplishment of hash operations much faster than pure software implementations. But hash co-processors add cost to a system configuration, require extra power, and decrease the overall reliability of a system. In addition, hash co-processor implementations are vulnerable to snooping because the data channel is not on the same die as the host microprocessor.
Therefore, the present inventors recognize a need for dedicated hash hardware within a present day microprocessor such that an application program that requires a hash operation can direct the microprocessor to perform the hash operation via a single, atomic, hash instruction. The present inventors also recognize that such a capability should be provided so as to preclude requirements for operating system intervention and management. Also, it is desirable that the hash instruction be available for use at an application program's privilege level and that the dedicated hash hardware comport with prevailing architectures of present day microprocessors. There is also a need to provide the hash hardware and associated hash instruction in a manner that supports compatibility with legacy operating systems and applications. It is moreover desirable to provide an apparatus and method for performing hash operations that are resistant to unauthorized observation, that can support and are programmable with respect to multiple hash algorithms, that support verification and testing of the particular hash algorithm that is embodied thereon, that are self-padding, that support multiple message block sizes, and that provide for programmable hash algorithm mode such as SHA-1, SHA-256, SHA-384, and SHA-512, for example.
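To illustrate what padding entails when it is left to software, the following sketch pads a message for the 512-bit block size of SHA-1 and SHA-256 according to the rule in FIPS 180-2: a single 1 bit is appended, followed by zero bits, followed by the original message length in bits as a 64-bit big-endian value, so that the result is a whole number of message blocks. The function name pad_512 is hypothetical, and only byte-aligned messages are handled; this is not a description of how the dedicated hardware performs padding.

```python
def pad_512(message: bytes) -> bytes:
    """Pad a byte-aligned message to a multiple of 512 bits (64 bytes)."""
    bit_length = len(message) * 8
    padded = message + b"\x80"  # the appended 1 bit, then seven 0 bits
    # Zero-fill so that exactly 8 bytes remain in the final block.
    padded += b"\x00" * ((56 - len(padded) % 64) % 64)
    # Final 8 bytes carry the original length in bits, big-endian.
    return padded + bit_length.to_bytes(8, "big")


one_block = pad_512(b"abc")        # short message fits in a single block
two_blocks = pad_512(b"a" * 56)    # length field no longer fits, so a second block is added
print(len(one_block), len(two_blocks))
```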