Performing mathematical operations on large numbers can be time-consuming and resource-intensive. One method of handling large numbers involves dividing each number into smaller units, or words, of a fixed length. Numbers divided in this manner are termed “multi-precision” numbers. In the field of digital circuits, for instance, the binary representation of a large number can be stored in multiple words, wherein each word has a fixed length of n bits determined by the word size supported by the associated hardware or software. Although multi-precision numbers can be added and subtracted relatively efficiently, multi-precision multiplication is much more complex and creates a significant bottleneck in applications using multi-precision arithmetic.
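The contrast between addition and multiplication can be illustrated with a minimal sketch. It assumes a word size of n = 32 bits and little-endian word order, and uses the textbook "schoolbook" multiplication method; the function names (to_words, from_words, mp_add, mp_mul) are hypothetical and chosen only for illustration. Addition touches each word once, whereas schoolbook multiplication forms a partial product for every pair of words.

```python
WORD_BITS = 32  # assumed word size n; real hardware may differ


def to_words(x, n=WORD_BITS):
    """Split a non-negative integer into n-bit words, least significant first."""
    mask = (1 << n) - 1
    words = [x & mask]
    x >>= n
    while x:
        words.append(x & mask)
        x >>= n
    return words


def from_words(words, n=WORD_BITS):
    """Reassemble an integer from little-endian n-bit words."""
    return sum(w << (n * i) for i, w in enumerate(words))


def mp_add(a, b, n=WORD_BITS):
    """Word-by-word addition with carry: one pass over the words."""
    mask = (1 << n) - 1
    out, carry = [], 0
    for i in range(max(len(a), len(b))):
        s = (a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0) + carry
        out.append(s & mask)
        carry = s >> n
    if carry:
        out.append(carry)
    return out


def mp_mul(a, b, n=WORD_BITS):
    """Schoolbook multiplication: len(a) * len(b) word products."""
    mask = (1 << n) - 1
    out = [0] * (len(a) + len(b))
    for i, ai in enumerate(a):
        carry = 0
        for j, bj in enumerate(b):
            t = out[i + j] + ai * bj + carry
            out[i + j] = t & mask
            carry = t >> n
        out[i + len(b)] = carry
    return out
```

For two k-word operands, mp_add performs k word additions while mp_mul performs k*k word multiplications, which is why multiplication dominates the cost as operand sizes grow.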
One area that is affected by the complexity of multi-precision multiplication is cryptography. Many cryptographic algorithms, including the Diffie-Hellman key exchange algorithm, elliptic curve cryptography, and the Elliptic Curve Digital Signature Algorithm (ECDSA), involve the multi-precision multiplication of very large numbers. For example, elliptic curve systems perform multi-precision arithmetic on 128- to 256-bit numbers, while systems based on exponentiation may employ 1024- to 2048-bit numbers.
Many cryptographic applications use finite field arithmetic. For example, elliptic curve cryptography typically operates in the finite field GF(2^m), which contains 2^m elements, wherein m is a positive integer. The multiplication operation in finite-field applications can be particularly slow and inefficient. Several techniques have been proposed to perform fast arithmetic operations over GF(2^m). One technique, for example, uses an optimal normal basis representation. See R. Mullin et al., Optimal Normal Bases in GF(p^n), Discrete Applied Mathematics, Vol. 22, pp. 149-161 (1988). Although optimal normal basis multiplication is efficient in hardware, it is not efficient in software, and an optimal normal basis representation does not exist for all field sizes. Another technique involves embedding GF(2^m) in a larger ring R_p where the arithmetic operations can be performed efficiently. See J. H. Silverman, Fast Multiplication in Finite Field GF(2^N), Cryptographic Hardware and Embedded Systems, pp. 122-134 (1999). This method, however, works only when m+1 is prime and 2 is a primitive root modulo m+1. Another technique involves using a standard basis with coefficients in a subfield GF(2^r). See E. De Win et al., A Fast Software Implementation for Arithmetic Operations in GF(2^n), Advances in Cryptology - ASIACRYPT 96, pp. 65-76 (1996); J. Guajardo and C. Paar, Fast Efficient Algorithms for Elliptic Curve Cryptosystems, Advances in Cryptology - CRYPTO 97, pp. 342-356 (1997); and C. Paar and P. Soria-Rodriguez, Fast Arithmetic Architectures for Public-Key Algorithms Over Galois Fields GF((2^n)^m), Advances in Cryptology - EUROCRYPT 97, pp. 363-378 (1997). In this method, however, the field size m must be a multiple of r, and look-up tables are required to perform the calculations in GF(2^r). Still another technique involves adapting Montgomery multiplication for the fields GF(2^m). See C. Koc and T. Acar, Montgomery Multiplication in GF(2^k), Designs, Codes and Cryptography, 14(1):57-69 (April 1998).
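For orientation, the baseline that these techniques aim to improve upon can be sketched as a bit-serial multiplication in a standard (polynomial) basis. This is a generic textbook method, not any of the cited techniques. Elements are assumed to be stored as integers whose bit i is the coefficient of x^i, and the function name gf2m_mul and its parameters are hypothetical. The example field, GF(2^8) with reduction polynomial x^8 + x^4 + x^3 + x + 1, is chosen only because its products are easy to check by hand.

```python
def gf2m_mul(a, b, m, poly):
    """Multiply a and b in GF(2^m) using shift-and-add in a polynomial basis.

    a, b : field elements as integers below 2**m (bit i = coefficient of x^i)
    m    : extension degree of the field
    poly : irreducible reduction polynomial of degree m, including the x^m term
    """
    result = 0
    while b:
        if b & 1:
            result ^= a      # addition in GF(2^m) is bitwise XOR
        b >>= 1
        a <<= 1              # multiply a by x
        if (a >> m) & 1:
            a ^= poly        # reduce modulo the field polynomial
    return result


# Example in GF(2^8) with x^8 + x^4 + x^3 + x + 1 (0x11B)
product = gf2m_mul(0x57, 0x83, 8, 0x11B)  # 0xC1
```

The loop runs once for every bit of b up to its highest set bit, so the cost grows with the field size m. When m exceeds the machine word size, each shift and XOR itself becomes a multi-word operation.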
To improve the performance of these and other cryptographic systems, improved multi-precision multiplication methods and apparatus are needed.