1. Field
The embodiments relate to cryptography, and in particular to cryptographic devices and processes using incremental modular multiplication with modular reduction without use of an integer multiplier.
2. Description of the Related Art
The Rivest, Shamir and Adleman (RSA) algorithm for public key encryption incurs a significant processing cost at session establishment time because it involves time-consuming modular exponentiation operations. Modular exponentiation is the process of deriving the remainder from the division of a power of the input by a specified divisor. Modular exponentiation is time consuming in RSA implementations because the input, the power and the divisor are large numbers (i.e., they are expressed using many bits). For example, the input, the divisor and the power can each be 512 bits long. To accelerate the calculation of modular exponents, RSA implementations reduce the calculation of modular exponents to the calculation of modular products and modular squares.
The RSA algorithm involves the calculation of a modular exponent in both the encryption and decryption processes. For example, on the decrypt side a plaintext P is derived from a ciphertext C as:
$$P = C^{d} \bmod N$$
The divisor N is the product of two prime numbers p and q and the decryption exponent d is the multiplicative inverse of the encryption exponent e mod (p−1)(q−1). Using the Chinese remainder theorem (see, e.g., Wagon, S. “The Chinese Remainder Theorem.” §8.4 in Mathematica in Action. New York: W. H. Freeman, pp. 260-263, 1991) one can show that the decryption process can be reduced to the calculation of two smaller modular exponents:
$$P = \left[ (q^{-1} \bmod p) \cdot \left( C^{d_p} \bmod p - C^{d_q} \bmod q \right) \bmod p \right] \cdot q + C^{d_q} \bmod q$$
where:
$$d_p = e^{-1} \bmod (p-1)$$
and
$$d_q = e^{-1} \bmod (q-1)$$
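The decomposition above can be illustrated with a minimal Python sketch. The function name `rsa_crt_decrypt` and the toy parameters (p = 61, q = 53, e = 17) are assumptions chosen only for illustration and do not come from the embodiments; real keys are far larger. The sketch relies on Python 3.8 or later, where the built-in `pow` accepts an exponent of -1 to compute a modular inverse.

```python
def rsa_crt_decrypt(c, p, q, e):
    """Recover the plaintext from ciphertext c using two half-size exponentiations."""
    d_p = pow(e, -1, p - 1)      # d_p = e^-1 mod (p-1)
    d_q = pow(e, -1, q - 1)      # d_q = e^-1 mod (q-1)
    q_inv = pow(q, -1, p)        # q^-1 mod p

    m_p = pow(c, d_p, p)         # C^d_p mod p
    m_q = pow(c, d_q, q)         # C^d_q mod q

    # Recombine the two partial results; Python's % always returns a
    # non-negative value, so the difference m_p - m_q may be negative here.
    h = (q_inv * (m_p - m_q)) % p
    return h * q + m_q


# Toy parameters, for illustration only.
p, q, e = 61, 53, 17
n = p * q
message = 65
ciphertext = pow(message, e, n)
recovered = rsa_crt_decrypt(ciphertext, p, q, e)
```

Each of the two exponentiations works with operands roughly half the length of N, which is why this form is commonly preferred over computing the full exponent directly.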
The calculation of each of the two modular exponents on the decrypt side and of the modular exponent on the encrypt side can be reduced to the calculation of a number of modular products and modular squares, using the ‘square-and-multiply’ technique. Suppose that $d = [d_k\, d_{k-1} \ldots d_1]$.
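As an illustration of the square-and-multiply technique, the following Python sketch scans the bits of the exponent from the most significant bit to the least significant bit, performing one modular square per bit and one additional modular product for each bit that is set. The function name `square_and_multiply` is hypothetical, and the left-to-right scan order is one common variant assumed here, not necessarily the order used in any particular implementation.

```python
def square_and_multiply(c, d, n):
    """Compute c**d mod n using only modular squares and modular products."""
    result = 1 % n
    for bit in bin(d)[2:]:               # bits of d, most significant first
        result = (result * result) % n   # modular square for every bit
        if bit == "1":
            result = (result * c) % n    # modular product when the bit is set
    return result


# Toy values, for illustration only.
value = square_and_multiply(4, 13, 497)
```

For an exponent of k bits this performs k modular squares and at most k modular products, so the cost of each individual modular product dominates the cost of the exponentiation.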
To calculate a modular product or a modular square, most RSA implementations use the popular Montgomery algorithm (P. L. Montgomery, Modular Multiplication Without Trial Division, Math. Computation, 44: 519-521, 1985). The Montgomery algorithm is slow, however, because it visits every bit of its input twice and performs 3-4 long operations (i.e., input-wide operations) for every bit of the input. The Montgomery algorithm is also slow because it must first create the mathematical structure from which the remainder can be derived easily. Specifically, the Montgomery algorithm adds the divisor to the input product as many times as needed in order for the least significant half of its input to be zero. In this way the final remainder can be computed only after two passes on the input are complete.
The Montgomery algorithm accepts as input two numbers X and Y, each of length k in bits, and a divisor N, and returns the number $Z = X \cdot Y \cdot 2^{-k} \bmod N$. In order for the algorithm to work, the numbers N and $2^{k}$ must be relatively prime. For the derivation of the modular product $W = X \cdot Y \bmod N$ two Montgomery passes are needed: one for calculating the intermediate number $Z = X \cdot Y \cdot 2^{-k} \bmod N$ and one for calculating the final product W as $W = Z \cdot 2^{2k} \cdot 2^{-k} \bmod N$.
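The two passes can be sketched in Python as follows. This is a simplified bit-serial variant written for readability, assuming an odd divisor N smaller than 2^k and inputs smaller than N; the function names `montgomery_pass` and `montgomery_modmul` are hypothetical, and practical implementations typically process a word at a time rather than a bit at a time.

```python
def montgomery_pass(x, y, n, k):
    """Return x * y * 2**(-k) mod n, for odd n < 2**k and 0 <= x, y < n."""
    z = x * y
    for _ in range(k):
        if z & 1:
            z += n      # add the divisor so that the lowest bit becomes zero
        z >>= 1         # exact division by 2, since the lowest bit is now zero
    # After k steps z is below 2n, so one conditional subtraction suffices.
    return z - n if z >= n else z


def montgomery_modmul(x, y, n, k):
    """Return x * y mod n using two Montgomery passes."""
    r2 = pow(2, 2 * k, n)                 # 2**(2k) mod n, normally precomputed
    z = montgomery_pass(x, y, n, k)       # first pass:  Z = X*Y*2**(-k) mod N
    return montgomery_pass(z, r2, n, k)   # second pass: W = Z*2**(2k)*2**(-k) mod N


# Toy values, for illustration only.
n, k = 239, 8
x, y = 123, 201
w = montgomery_modmul(x, y, n, k)
```

The second pass exists only to cancel the factor of 2^(-k) introduced by the first, which is the source of the two-pass cost described above.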
For modular reduction, many cryptographic processes use Barrett's algorithm (P. D. Barrett. “Implementing the Rivest Shamir and Adleman public key encryption algorithm on a standard digital signal processor” Advances in Cryptology, Proceedings of Crypto '86, LNCS 263, A. M. Odlyzko, Ed. Springer-Verlag, 1987, pp. 311-323). Modular exponentiation involves repeatedly performing the modular reduction operation, which is a very costly operation as it requires integer multiplication.
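To make the multiplication cost concrete, the following Python sketch shows a simplified form of Barrett reduction. It is a textbook-style variant assumed for illustration, not the exact procedure of the cited paper; the function names `barrett_setup` and `barrett_reduce` and the symbol `mu` for the precomputed constant are hypothetical. The input x must satisfy 0 <= x < 2^(2k), where k is the bit length of the divisor.

```python
def barrett_setup(n):
    """Precompute k (bit length of n) and mu = floor(2**(2k) / n)."""
    k = n.bit_length()
    mu = (1 << (2 * k)) // n
    return k, mu


def barrett_reduce(x, n, k, mu):
    """Return x mod n for 0 <= x < 2**(2k), without dividing by n."""
    q = (x * mu) >> (2 * k)   # first integer multiplication: estimate the quotient
    r = x - q * n             # second integer multiplication: remove q multiples of n
    while r >= n:             # the estimate never exceeds the true quotient,
        r -= n                # so only a small correction is needed
    return r


# Toy values, for illustration only.
n = 3233
k, mu = barrett_setup(n)
x = 3000 * 3100
r = barrett_reduce(x, n, k, mu)
```

Each reduction replaces a division by the divisor with two input-wide integer multiplications and a shift, and it is these multiplications that are repeated for every modular product in an exponentiation.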