The basic arithmetic operations (i.e., addition, multiplication, and inversion) in prime and binary extension fields, GF(p) and GF(2^m), respectively, have numerous applications in cryptography. For example, RSA-based cryptography, Diffie-Hellman key exchange, elliptic curve cryptography, and the Digital Signature Standard (including the Elliptic Curve Digital Signature Algorithm) all use arithmetic operations in finite fields. These applications are described in, for example, W. Diffie and M. E. Hellman, “New Directions in Cryptography,” IEEE Trans. on Information Theory, 22:644-654 (1976); N. Koblitz, “Elliptic Curve Cryptosystems,” Mathematics of Computation, 48:203-209 (1987); A. J. Menezes, Elliptic Curve Public Key Cryptosystems, Kluwer Academic Publishers, Boston, Mass. (1993); J. J. Quisquater and C. Couvreur, “Fast Decipherment Algorithm for RSA Public-key Cryptosystem,” Electronics Letters 18:905-907 (1982); and Digital Signature Standard (DSS), National Institute of Standards and Technology, FIPS PUB 186-2, January 2000. For most applications, implementation of the field multiplication operation is a significant design issue because field multiplication generally requires complex and expensive hardware or software.
The Montgomery multiplication algorithm described in, for example, P. L. Montgomery, “Modular Multiplication Without Trial Division,” Mathematics of Computation, 44:519-521 (1985), is an efficient method for modular multiplication with an odd modulus and is useful in fast software implementations of the multiplication operation in prime fields GF(p). The Montgomery multiplication algorithm substitutes simple bit-shift operations for the more complex division operations used in other methods of determining modular products. These bit-shift operations are readily implemented with general-purpose computers.
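By way of illustration only, the substitution of bit shifts for division can be sketched in software. The following Python sketch is a bit-serial (radix-2) form of Montgomery multiplication; the names mont_mul, to_mont, and from_mont and their parameters are hypothetical and are not taken from the cited reference. It assumes an odd modulus n with n < 2^k and operands 0 <= a, b < n, and it returns a*b*2^(-k) mod n.

```python
def mont_mul(a, b, n, k):
    """Return a * b * 2^(-k) mod n, assuming n is odd, n < 2^k, 0 <= a, b < n."""
    if n % 2 == 0:
        raise ValueError("modulus must be odd")
    t = 0
    for i in range(k):
        # Accumulate b when bit i of a is set.
        if (a >> i) & 1:
            t += b
        # If t is odd, adding the odd modulus makes it even without
        # changing its value modulo n.
        if t & 1:
            t += n
        # The division by 2 is now exact, so a one-bit right shift suffices.
        t >>= 1
    # t stays below 2n throughout, so one conditional subtraction finishes.
    if t >= n:
        t -= n
    return t


def to_mont(x, n, k):
    """Map x to its Montgomery form x * 2^k mod n (one-time conversion)."""
    return (x << k) % n


def from_mont(x, n, k):
    """Map a Montgomery-form value back to an ordinary residue."""
    return mont_mul(x, 1, n, k)
```

In this sketch, a product of ordinary residues a and b would be obtained as from_mont(mont_mul(to_mont(a, n, k), to_mont(b, n, k), n, k), n, k). The conversion into Montgomery form still uses an ordinary reduction, but it is performed once per operand, whereas a chain of multiplications (as in modular exponentiation) uses only the additions and shifts in mont_mul.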
Montgomery multiplication has also been used to perform multiplication in the finite field GF(2^m), as described in Ç. K. Koç and T. Acar, “Montgomery Multiplication in GF(2^k),” Designs, Codes and Cryptography, 14:57-69 (1998). Efficient software implementations of Montgomery multiplication in GF(2^m) are possible, particularly if an irreducible polynomial generating the finite field is chosen arbitrarily.
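As a rough illustration of the binary-field case, the same shift-based procedure can be written with exclusive-OR in place of integer addition, so that no carries arise. In the Python sketch below, a field element is an integer whose bits are the polynomial coefficients, f is an irreducible polynomial of degree m (its constant term is therefore 1), and the result is a*b*x^(-m) mod f. The function names and the bit-packed representation are assumptions made for this example and are not drawn from the cited reference. An ordinary shift-and-reduce multiplication is included only for comparison.

```python
def gf2_mont_mul(a, b, f, m):
    """Return a * b * x^(-m) mod f in GF(2^m); a and b have degree < m."""
    c = 0
    for i in range(m):
        # Polynomial addition over GF(2) is XOR; add b when coefficient i of a is 1.
        if (a >> i) & 1:
            c ^= b
        # If the constant term is 1, adding f (constant term 1) clears it.
        if c & 1:
            c ^= f
        # Division by x is now exact: shift right by one position.
        c >>= 1
    # The degree of c never exceeds m - 1 after the shift, so no final reduction.
    return c


def gf2_mul_mod(a, b, f, m):
    """Reference only: ordinary a * b mod f by shift-and-reduce (Horner form)."""
    c = 0
    for i in reversed(range(m)):
        c <<= 1
        # Reduce as soon as the degree reaches m.
        if (c >> m) & 1:
            c ^= f
        if (a >> i) & 1:
            c ^= b
    return c
```

For example, with f = 0b10011 (x^4 + x + 1) and m = 4, multiplying the output of gf2_mont_mul(a, b, f, m) by x^m mod f using gf2_mul_mod reproduces the ordinary product of a and b.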
Several variants of the Montgomery multiplication algorithm have been suggested for efficient software implementations with specific processors. See, for example, H. Orup, “Simplifying Quotient Determination in High-radix Modular Multiplication,” in S. Knowles and W. H. McAllister, eds., Proceedings, 12th Symposium on Computer Arithmetic, pp. 193-199, Bath, England, Jul. 19-21, 1995; and Ç. K. Koç, T. Acar, and B. S. Kaliski Jr., “Analyzing and Comparing Montgomery Multiplication Algorithms,” IEEE Micro 16:26-33 (1996).
Improved hardware implementations of Montgomery multiplication for limited precision operands have also been disclosed. See, for example, A. Bernal and A. Guyot, “Design of a Modular Multiplier Based on Montgomery's Algorithm,” in 13th Conference on Design of Circuits and Integrated Systems, pp. 680-685, Madrid, Spain, Nov. 17-20, 1998. Implementations using high-radix modular multipliers have also been suggested. See, for example, P. Kornerup, “High-radix Modular Multiplication for Cryptosystems,” in E. Swartzlander, Jr. et al., eds., Proceedings, 11th Symposium on Computer Arithmetic, pp. 277-283, Windsor, Ontario, Jun. 29-Jul. 2, 1993; and A. Royo et al., “Design and Implementation of a Coprocessor for Cryptography Applications,” in European Design and Test Conference, pp. 213-217, Paris, France, Mar. 17-20, 1997. Because high-radix Montgomery multipliers introduce long critical paths and complex circuitry, such designs are generally unattractive for hardware implementation.
Scalable Montgomery multiplier designs for the finite field GF(p) are disclosed in Ç. K. Koç and A. F. Tenca, U.S. patent application Ser. No. 09/621,020, filed Jun. 21, 2000, and A. F. Tenca and Ç. K. Koç, “A Scalable Architecture for Montgomery Multiplication,” in Ç. K. Koç and C. Paar, eds., Cryptographic Hardware and Embedded Systems, Lecture Notes in Computer Science 1717:94-108, Springer Verlag, Berlin, Germany (1999), both of which are incorporated herein by reference. These scalable multipliers permit a fixed-area modular multiplication circuit (i.e., a circuit having a fixed precision) to be adapted to perform multiplication of operands of arbitrary precision.
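The idea of applying a fixed-width datapath to operands of arbitrary precision can be loosely modeled in software. The Python sketch below is not the circuit of the cited references; it is a simplified word-serial model, written under the assumption that the operands are split into w-bit words and that a single w-bit adder with a carry is reused for every word. The function name word_serial_mont_mul, the word width w, and the choice of one extra word of headroom are assumptions made for this example. It computes x*y*2^(-k) mod n for an odd modulus n < 2^k.

```python
def word_serial_mont_mul(x, y, n, k, w=8):
    """Return x * y * 2^(-k) mod n using only w-bit word operations.

    Assumes n is odd, n < 2^k, 0 <= x, y < n, and w >= 2.
    """
    if n % 2 == 0 or w < 2:
        raise ValueError("need an odd modulus and a word width of at least 2")
    mask = (1 << w) - 1
    # One extra word gives headroom for the intermediate sum (below 4n).
    e = (k + w - 1) // w + 1
    yw = [(y >> (w * j)) & mask for j in range(e)]
    nw = [(n >> (w * j)) & mask for j in range(e)]
    s = [0] * e  # running sum, least significant word first
    for i in range(k):
        xi = (x >> i) & 1
        # The lowest word alone decides whether n must be added to make
        # the sum even.
        q = (s[0] + xi * yw[0]) & 1
        carry = 0
        for j in range(e):
            # One pass of the w-bit datapath: add words and propagate carry.
            t = s[j] + xi * yw[j] + q * nw[j] + carry
            carry = t >> w
            word = t & mask
            # Right shift by one bit across words: bit 0 of this word
            # becomes the top bit of the previous word.
            if j > 0:
                s[j - 1] |= (word & 1) << (w - 1)
            s[j] = word >> 1
    result = sum(s[j] << (w * j) for j in range(e))
    if result >= n:
        result -= n
    return result
```

In this model the word width w plays the role of the fixed circuit precision: a longer operand changes only the number of words e, and hence the number of passes through the inner loop, not the width of any single addition.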
Because of the importance of finite field multiplication in cryptographic systems, improved methods and apparatus for finite field multiplication are needed.