This application is related to co-pending and commonly assigned application Ser. No. 08/828,368, entitled "High-Speed Modular Exponentiator," by Gregory A. Powell, Mark W. Wilson, Kevin Q. Truong, and Christopher P. Curren, filed Mar. 28, 1997, now U.S. Pat. No. 6,282,290, which application is hereby incorporated by reference herein.
This application is also related to co-pending and commonly assigned application Ser. No. 09/050,573, entitled "High Speed Montgomery Value Calculation," by Matthew S. McGregor, filed Mar. 30, 1998, now U.S. Pat. No. 6,240,436, which application is also hereby incorporated by reference herein.
1. Field of the Invention
The present invention relates to cryptographic systems, and more particularly, to a highly efficient multiplier for performing modular reduction operations integral to cryptographic key calculations.
2. Description of Related Art
Cryptographic systems are commonly used to restrict unauthorized access to messages communicated over otherwise insecure channels. In general, cryptographic systems use a unique key, such as a series of numbers, to control an algorithm used to encrypt a message before it is transmitted over an insecure communication channel to a receiver. The receiver must have access to the same key in order to decode the encrypted message. Thus, it is essential that the key be communicated in advance by the sender to the receiver over a secure channel in order to maintain the security of the cryptographic system; however, secure communication of the key is hampered by the unavailability and expense of secure communication channels. Moreover, the spontaneity of most business communications is impeded by the need to communicate the key in advance.
In view of the difficulty and inconvenience of communicating the key over a secure channel, so-called public key cryptographic systems are proposed in which a key may be communicated over an insecure channel without jeopardizing the security of the system. A public key cryptographic system utilizes a pair of keys in which one is publicly communicated, i.e., the public key, and the other is kept secret by the receiver, i.e., the private key. While the private key is mathematically related to the public key, it is practically impossible to derive the private key from the public key alone. In this way, the public key is used to encrypt a message, and the private key is used to decrypt the message.
Such cryptographic systems often require computation of modular exponentiations of the form y = b^e mod n, in which the base b, exponent e and modulus n are extremely large numbers, e.g., having a length of 1,024 binary digits or bits. If, for example, the exponent e were transmitted as a public key, and the base b and modulus n were known to the receiver in advance, a private key y could be derived by computing the modular exponentiation. Deriving the private key y from the exponent e without knowledge of the base b and modulus n would require so much computing power and time that unauthorized access to the decrypted message is virtually precluded as a practical matter.
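For illustration only, the modular exponentiation y = b^e mod n can be sketched with the textbook binary (square-and-multiply) method. The function name mod_exp is hypothetical, and this sketch is not the exponentiator of the present invention; it merely shows that the computation reduces to a long chain of modular multiplications, one or two per exponent bit.

```python
def mod_exp(b, e, n):
    """Compute b**e mod n by right-to-left binary exponentiation (textbook method)."""
    if n == 1:
        return 0
    result = 1
    b %= n
    while e > 0:
        if e & 1:                    # current exponent bit is set
            result = (result * b) % n
        b = (b * b) % n              # square the base for the next bit
        e >>= 1
    return result
```

For a 1,024-bit exponent, the loop above performs on the order of one to two thousand modular multiplications of 1,024-bit operands, which is why the speed of the underlying multiplier matters.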
A drawback of such cryptographic systems is that calculation of the modular exponentiation remains a daunting mathematical task even to an authorized receiver using a high speed computer. With the prevalence of public computer networks used to transmit confidential data for personal, business and governmental purposes, it is anticipated that most computer users will want cryptographic systems to control access to their data. Despite the increased security, the difficulty of the modular exponentiation calculation will substantially drain computer resources and degrade data throughput rates, and thus represents a major impediment to the widespread adoption of commercial cryptographic systems.
Accordingly, a critical need exists for a high speed modular exponentiation method and apparatus to provide a sufficient level of communication security while minimizing the impact to computer system performance and data throughput rates.
In accordance with the teachings of the present invention, a highly efficient method and apparatus is disclosed for performing operations required for modular exponentiation. The apparatus is especially well suited for implementing multiplications using the Montgomery algorithm.
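As background, a textbook form of Montgomery multiplication may be sketched as follows. This is the general algorithm as commonly published, not the specific hardware method disclosed herein. The names montgomery_multiply, a_bar, b_bar and r_bits are hypothetical, the modulus n is assumed to be odd, and the three-argument pow with a negative exponent assumes Python 3.8 or later.

```python
def montgomery_multiply(a_bar, b_bar, n, r_bits):
    """Return a_bar * b_bar * R^-1 mod n, where R = 2**r_bits > n and n is odd."""
    r = 1 << r_bits
    mask = r - 1
    n_prime = (-pow(n, -1, r)) & mask    # satisfies n * n_prime == -1 (mod R)
    t = a_bar * b_bar                    # full double-width product
    m = ((t & mask) * n_prime) & mask    # chosen so that t + m*n is divisible by R
    u = (t + m * n) >> r_bits            # exact division by R is just a shift
    return u - n if u >= n else u        # single conditional subtraction
```

Operands are first converted to Montgomery form (a_bar = a * R mod n); multiplying a result by 1 with the same routine converts it back. The sketch shows that each reduction step is itself built from wide multiplications, which is the operation a fast multiplier accelerates.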
The efficient multiplier architecture uses a preload register, coupled to a multiplier at a second input port via a KN bit bus to load the value of the "a" multiplicand in the multiplier in a single clock pulse. The "b" multiplicand (which is also KN bits long) is supplied to the multiplier N bits at a time from a memory via an N bit bus coupled to a multiplier. The multiplier multiplies the N bits of the "b" multiplicand by the KN bits of the "a" multiplicand and provides that product at a multiplier output N bits at a time, where it can be supplied to the memory.
The efficient multiplication method using the foregoing architecture is also described. The method begins by providing KN bits of the multiplicand "a" from a preload register to a second multiplier input port in a single clock pulse. Then, N bits of the multiplicand "b" are provided to a first multiplier input port, also in a single clock pulse. The KN bits of the number "a" are multiplied by the N bits of the number "b", and these steps are repeated until all of the KN bits of the "b" multiplicand have been provided to the first multiplier input port and multiplied by the KN bits of the "a" multiplicand. When completed, these operations result in an output number, which is then transmitted to the memory, where it can be made available for further processing.
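The data flow of the foregoing method can be modeled in software roughly as follows. This is a behavioral sketch under stated assumptions, not a description of the actual circuit: the names word_serial_multiply and words_to_int are hypothetical, each loop iteration stands in for one step in which N bits of "b" arrive, and the draining of the upper half of the product after the last word of "b" is an assumption of this model.

```python
def word_serial_multiply(a, b, n_bits, k_words):
    """Multiply two (k_words * n_bits)-bit integers, consuming b one word per step."""
    mask = (1 << n_bits) - 1
    acc = 0            # running partial sum held inside the multiplier
    out_words = []     # N-bit output words, least significant first
    for i in range(k_words):
        b_word = (b >> (i * n_bits)) & mask   # next N bits of "b" from memory
        acc += a * b_word                     # all KN bits of "a" times N bits of "b"
        out_words.append(acc & mask)          # low N bits are final, so emit them
        acc >>= n_bits                        # carry the rest into the next step
    for _ in range(k_words):                  # drain the upper half of the product
        out_words.append(acc & mask)
        acc >>= n_bits
    return out_words


def words_to_int(words, n_bits):
    """Reassemble N-bit words (least significant first) into one integer."""
    return sum(w << (i * n_bits) for i, w in enumerate(words))
```

For example, words_to_int(word_serial_multiply(a, b, 8, 4), 8) reproduces the ordinary product a * b for any two 32-bit values, while the output leaves the model one 8-bit word at a time.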
In accordance with the deterministic behavior of the Montgomery algorithm, one embodiment of the present invention loads a predicted (future) value for multiplicand "a" into the preload register while multiplication operations on the current "a" and "b" multiplicands are being performed. This technique further reduces the clock cycles necessary to load and multiply the parameters.
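The ordering of the preload operation can be illustrated with a simple double-buffering model. The class PreloadedMultiplier and its methods stage, latch and multiply are hypothetical names introduced for this sketch. Sequential software cannot show the true concurrency of the hardware; the model only shows that the next "a" is already staged when the current multiplication finishes, so that only a single-step transfer remains.

```python
class PreloadedMultiplier:
    """Toy model of a multiplier whose "a" operand is fed from a preload register."""

    def __init__(self):
        self.preload = None   # staged (predicted) value of the next "a"
        self.active = None    # value of "a" currently held by the multiplier

    def stage(self, next_a):
        """Write the next "a" into the preload register."""
        self.preload = next_a

    def latch(self):
        """Transfer the staged value into the multiplier (one clock pulse in hardware)."""
        self.active = self.preload

    def multiply(self, b, next_a=None):
        """Multiply the active "a" by b; optionally stage the next "a" meanwhile."""
        if next_a is not None:
            self.stage(next_a)   # in hardware this overlaps with the multiplication
        return self.active * b
```

A typical sequence would be stage(3), latch(), multiply(5, next_a=7), latch(), multiply(2), yielding 15 and then 14 without a separate load phase between the two multiplications.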
A more complete understanding of the computationally efficient multiplier will be afforded to those skilled in the art, as well as a realization of additional advantages and objects thereof, by a consideration of the following detailed description of the preferred embodiment. Reference will be made to the appended sheets of drawings which will first be described briefly.