The present invention relates in general to high-speed cryptographic processing systems and, more specifically, to a method and system for performing modulo arithmetic using a full-adder post processor implemented, for example, in a security coprocessor integrated circuit coupled to a host or network processor.
Modular arithmetic is a branch of mathematics that has application in cryptography. In modular arithmetic, the operation A mod N yields the residual, or remainder, of A divided by N, such that the residual is between 0 and N−1. As an example, 16 is equal to (3×5)+1, so 16 mod 5 has a residual of 1. The foregoing operation is known as modular or modulo reduction.
Modular arithmetic has similarities to regular arithmetic. For example, there is modular addition:
(7+4) mod 5 = 11 mod 5 = 1
There is also modular multiplication:
(7×6) mod 5 = 42 mod 5 = 2
Other mathematical functions, such as modular subtraction, (A−B) mod N, and modular exponentiation, A^B mod N, can be defined.
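The operations named above can be checked with a short Python sketch. The function names are hypothetical and chosen only for illustration; they rely on Python's built-in `%` operator and three-argument `pow`, and are not the hardware method discussed in this document.

```python
def mod_reduce(a, n):
    """Modulo reduction: the residual of a divided by n, in 0..n-1."""
    return a % n


def mod_add(a, b, n):
    """Modular addition: (a + b) mod n."""
    return (a + b) % n


def mod_sub(a, b, n):
    """Modular subtraction: (a - b) mod n.

    For a positive n, Python's % already returns a non-negative residual.
    """
    return (a - b) % n


def mod_mul(a, b, n):
    """Modular multiplication: (a * b) mod n."""
    return (a * b) % n


def mod_exp(a, b, n):
    """Modular exponentiation: a**b mod n, via the built-in pow."""
    return pow(a, b, n)


if __name__ == "__main__":
    print(mod_reduce(16, 5))  # 1
    print(mod_add(7, 4, 5))   # 1
    print(mod_mul(7, 6, 5))   # 2
    print(mod_sub(4, 7, 5))   # 2
    print(mod_exp(7, 3, 5))   # 3
```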
Modular arithmetic has important uses in the field of cryptography. As growing use of the Internet and fiber-optic based networks increases the flow of confidential information, the need to secure such communications grows as well. One popular encryption and decryption scheme is the Rivest-Shamir-Adleman (RSA) algorithm, which is used in public key cryptography systems and requires the use of modular arithmetic.
One drawback to the RSA algorithm and other encryption algorithms is that the processing time needed to encrypt or decrypt a message is significant, especially when the algorithms are used with larger keys. Thus, significant demands are placed on a host system's central processing unit. For example, the capacity of a web server handling thousands of on-line secured commercial transactions using a public key approach may be limited by the server's ability to perform modular arithmetic. One way to increase the speed of such algorithms would be to increase the speed of the modular arithmetic used in the algorithm, such as modular exponentiation, through hardware acceleration. Such hardware would desirably include a security coprocessor, coupled to a host or network processor, for handling modular arithmetic.
The modular exponentiation mathematics of the RSA algorithm can be more efficiently computed in a hardware multiplier using the known Montgomery's method for modular reduction. Montgomery's method implements the modular exponentiation (A^E mod N) required in the RSA algorithm by using modular multiplication (A×B mod N). When doing modular multiplication in Montgomery's method, it is necessary to perform the modulo reduction A mod N and the modulo addition (A+B) mod N, where the modulus N has a typical length of 512 or 1,024 bits. Also, prior to performing Montgomery multiplication, it is necessary to calculate the value of A×r^(2(n+8)) mod N (where r>N and n is the size in bits of the value N).
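To illustrate how modular exponentiation can be built from Montgomery multiplication, the following is a minimal software sketch of the textbook formulation. It assumes an odd modulus N and a radix R = 2^k with R > N, where k is the bit length of N; the precomputed constant here is the conventional R^2 mod N, not the specific precomputed value named above. All function names are hypothetical, and `pow(n, -1, r)` requires Python 3.8 or later. It is not a description of the hardware circuit.

```python
def montgomery_setup(n):
    """Precompute constants for an odd modulus n (assumed radix R = 2**k > n)."""
    if n < 3 or n % 2 == 0:
        raise ValueError("modulus must be odd and at least 3")
    k = n.bit_length()
    r = 1 << k
    n_prime = (-pow(n, -1, r)) % r   # n * n_prime == -1 (mod R)
    r2 = (r * r) % n                 # R^2 mod n, used to enter Montgomery form
    return k, n_prime, r2


def montgomery_multiply(a, b, n, k, n_prime):
    """Return a * b * R^-1 mod n for 0 <= a, b < n, using only shifts and masks."""
    mask = (1 << k) - 1
    t = a * b
    m = ((t & mask) * n_prime) & mask  # chosen so that t + m*n is divisible by R
    u = (t + m * n) >> k               # exact division by R; u < 2n
    return u - n if u >= n else u      # single conditional subtraction


def montgomery_pow(a, e, n):
    """Compute a**e mod n by left-to-right square-and-multiply in Montgomery form."""
    k, n_prime, r2 = montgomery_setup(n)
    a_bar = montgomery_multiply(a % n, r2, n, k, n_prime)  # a * R mod n
    x_bar = montgomery_multiply(1, r2, n, k, n_prime)      # 1 * R mod n
    for bit in bin(e)[2:]:
        x_bar = montgomery_multiply(x_bar, x_bar, n, k, n_prime)
        if bit == "1":
            x_bar = montgomery_multiply(x_bar, a_bar, n, k, n_prime)
    return montgomery_multiply(x_bar, 1, n, k, n_prime)    # leave Montgomery form


if __name__ == "__main__":
    print(montgomery_pow(7, 3, 5))  # 3
```

The conditional subtraction at the end of `montgomery_multiply` is the only reduction step; no trial division by N is needed inside the loop, which is why the method suits hardware multipliers.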
Prior modular cryptographic systems typically use a 32-by-32 bit multiplier followed by division using the well-known restoring division or non-restoring division algorithms to compute a final result. However, computation using a 32-by-32 bit multiplier can require millions of clock cycles when handling larger RSA keys (e.g., 1,024-bit keys). It would be desirable to have an improved modular cryptographic system that can handle larger key sizes at high speeds.
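For comparison, the restoring-division approach mentioned above can be sketched as a bit-serial loop. This is a simplified illustration with a hypothetical function name, not a model of any particular prior system; it shows that one trial subtraction is needed per bit of the dividend, which is why the cost grows with operand size.

```python
def restoring_mod(a, n):
    """Compute a mod n by bit-serial restoring division (remainder only)."""
    if a < 0 or n <= 0:
        raise ValueError("expected a >= 0 and n > 0")
    remainder = 0
    # Process the dividend one bit at a time, most significant bit first.
    for i in range(a.bit_length() - 1, -1, -1):
        remainder = (remainder << 1) | ((a >> i) & 1)
        trial = remainder - n
        if trial >= 0:
            remainder = trial  # subtraction succeeded: keep the new value
        # otherwise "restore": discard trial and keep the previous remainder
    return remainder


if __name__ == "__main__":
    print(restoring_mod(16, 5))  # 1
    print(restoring_mod(42, 5))  # 2
```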
In light of the foregoing, it would be advantageous to have an improved modular exponentiation and multiplication system that achieves high performance, low cost, and low power for implementation in an integrated circuit. Thus, there is a need for an improved post processor that does high-speed modulo reduction and addition in such a system. There is a further need for such a processor that can be provided as a high-performance security coprocessor for use with host or network processors.