Numerous information exchange applications require implementation of cryptographic methods for data security. Typical examples include so-called smart cards and hand-held devices that require compact, inexpensive, and efficient cryptographic hardware and software. These applications can be based on cryptographic methods such as those using the RSA algorithm, the Diffie-Hellman key exchange algorithm, the Digital Signature Standard (DSS), elliptic curves, or other algorithms.
One characteristic of typical cryptographic systems is that very large numbers are used. For example, 128-bit to 256-bit numbers are commonly used in elliptic curve systems, and 1024-bit to 2048-bit numbers are commonly used in systems based on exponentiation. Cryptographic hardware for modular multiplication is typically configured for use with numbers of a fixed precision, and operands cannot exceed this precision. Increasing the bit length of cryptographic parameters to increase security therefore requires additional hardware.
While fixed precision cryptographic hardware can be selected to support large numbers, such hardware is generally selected based on a so-called area-time tradeoff. While shorter execution times are generally preferable, the circuit area required to implement hardware capable of fast execution can lead to unacceptably high hardware costs. Therefore, selection of a particular design is generally made based on the circuit areas and execution times associated with several candidate designs. Area-time tradeoffs are further complicated by the need to select a fixed precision that accommodates potential increases in cryptographic parameter bit length.
Many important cryptographic systems require modular multiplication and modular exponentiation operations. The Montgomery multiplication (MM) method described in P. L. Montgomery, “Modular multiplication without trial division,” Mathematics of Computation, 44:519–521 (1985) provides a number of implementation advantages, and both software and hardware designs have been developed based on the MM method.
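By way of illustration only, the following Python sketch shows one common bit-serial (radix-2) formulation of Montgomery multiplication. It is a minimal software model under stated assumptions and is not taken from the cited reference or from any particular hardware design. The names `mont_mul` and `mod_mul`, and the choice of n as the bit length of the modulus, are assumptions introduced for this example. The modulus m is assumed to be odd and the operands are assumed to be smaller than m.

```python
def mont_mul(x, y, m, n):
    """Radix-2 Montgomery product: returns x * y * 2**(-n) mod m.

    Assumes m is odd, m < 2**n, and 0 <= x, y < m.
    """
    s = 0
    for i in range(n):
        # Add y when bit i of x is set.
        if (x >> i) & 1:
            s += y
        # Make the partial sum even by adding the (odd) modulus if needed,
        # so the division by 2 below is exact. No trial division is used.
        if s & 1:
            s += m
        s >>= 1
    # The loop keeps s below 2*m, so one conditional subtraction suffices.
    if s >= m:
        s -= m
    return s


def mod_mul(a, b, m):
    """Ordinary a * b mod m computed via Montgomery products (illustrative)."""
    n = m.bit_length()
    r2 = pow(2, 2 * n, m)               # R**2 mod m, with R = 2**n
    a_bar = mont_mul(a, r2, m, n)       # a * R mod m
    b_bar = mont_mul(b, r2, m, n)       # b * R mod m
    c_bar = mont_mul(a_bar, b_bar, m, n)  # a * b * R mod m
    return mont_mul(c_bar, 1, m, n)     # strip the factor R
```

For example, `mod_mul(5, 7, 97)` is expected to agree with `(5 * 7) % 97`. The conversions into and out of the Montgomery domain are worthwhile mainly when many multiplications share one modulus, as in modular exponentiation.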
Scalable Montgomery multipliers are described in A. F. Tenca and C. K. Koc, “A scalable architecture for Montgomery multiplication,” in C. K. Koc and C. Paar, eds., Cryptographic Hardware and Embedded Systems: Lecture Notes in Computer Science 1717:94–108 (Springer, Berlin, 1999) and U.S. patent application Ser. No. 09/621,020, both of which are incorporated herein by reference. Such scalable multipliers can accommodate operands of various precisions and can be adjusted to occupy a selected circuit area. Scalable multipliers process long-precision numbers using lower precision operations. A hardware core of a fixed precision, usually at most 32 or 64 bits, is reused as directed by a multiplier controller. While scalable multipliers can use operands of arbitrary precision, the processing speeds associated with such multipliers are limited by the need to broadcast signals to multiplier components. Accordingly, improved multiplication methods and apparatus are needed.
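The idea of processing a long-precision operand with repeated lower precision operations can be modeled in software as follows. This Python sketch is offered only as an illustration and is not the architecture of the cited references. It assumes a hypothetical word size w (8 bits by default), and the names `to_words`, `from_words`, and `mont_mul_words` are introduced here for the example. The inner loop stands in for a fixed-width core that is reused once per word, while the outer loop scans the bits of one operand.

```python
def to_words(v, w, e):
    """Split integer v into e words of w bits, least significant first."""
    mask = (1 << w) - 1
    return [(v >> (w * j)) & mask for j in range(e)]


def from_words(words, w):
    """Reassemble an integer from w-bit words, least significant first."""
    return sum(word << (w * j) for j, word in enumerate(words))


def mont_mul_words(x, y, m, n, w=8):
    """Word-serial radix-2 Montgomery product: x * y * 2**(-n) mod m.

    Assumes m is odd, m < 2**n, and 0 <= x, y < m. Only w-bit words of
    y, m, and the running sum are touched in each inner-loop step.
    """
    # Enough words to hold the sum before halving, which stays below 4*m.
    e = (n + 2 + w - 1) // w
    mask = (1 << w) - 1
    Y = to_words(y, w, e)
    M = to_words(m, w, e)
    S = [0] * e
    for i in range(n):
        xi = (x >> i) & 1
        # Decide from the lowest word alone whether the modulus is added.
        q = (S[0] + xi * Y[0]) & 1
        carry = 0
        for j in range(e):
            t = S[j] + xi * Y[j] + q * M[j] + carry
            carry = t >> w
            t &= mask
            # Right shift by one bit across word boundaries: the low bit
            # of word j becomes the high bit of word j-1.
            if j > 0:
                S[j - 1] |= (t & 1) << (w - 1)
            S[j] = t >> 1
    s = from_words(S, w)
    return s - m if s >= m else s
```

Because the word count e is derived from the operand length n, the same routine handles a 61-bit or a 1024-bit modulus without changing the word size; only the number of inner-loop passes grows. This mirrors, in a simplified way, how a fixed-precision core can be reused for operands of different precisions.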