The present invention is directed to extended greatest common divisor algorithms and in particular to an improved Jebelean greatest common divisor algorithm that efficiently calculates arithmetic inverses. (The extended greatest common divisor algorithm calculates 3 integers from two positive integers as follows: [xcex1,xcex2,d]=GCD(u,v) such that d|u, d|v, ∀e greater than d: (e|u and e|v) and xcex1xc2x7u+xcex2xc2x7v=d.)
Modular arithmetic is used in many applications such as encryption algorithms and the generation of error correcting codes. When the modulus is a prime number (p), the operations of addition, subtraction, multiplication and division with nonzero elements as well as the associative, commutative and distributive properties are defined over the set of numbers from zero to p. This set of numbers defines a Finite field modulo p, FP. They are often referred as xe2x80x9cPrime Fieldsxe2x80x9d.
Extended GCD algorithms are commonly used to find inverses in large Finite fields, which are of interest for encryption purposes. One type of encryption algorithm encrypts data using exponentiation over a large finite field, relying on the inherent difficulty of the inverse of exponentiation, the discrete logarithm problem, to hold the data secure. Encryption performed on a large finite field (having more elements) is more secure than encryption performed on a small field. One problem with using large finite fields, however, is the difficulty in performing even simple arithmetic operations on the large numbers in the field. Typical numbers used in data encryption have hundreds of bits. These numbers are too large to be easily handled by commonly available microprocessors that are limited to 32 or 64-bit arithmetic. This is especially true of exponentiation where a 100-bit number is raised to the power of second 100-bit number and the result is determined modulo a third 100-bit number. As described below, calculations using these large numbers are typically handled using multiprecision arithmetic.
Another type of encryption algorithm uses multiplication by an integer number within an elliptic curve group, where the group operation is symbolized by addition. (It is the same as exponentiation in groups, where the group operation is denoted by multiplication.) An elliptic curve group is defined on ordered pairs of points of a grid that lie on an elliptic curve defined by an equation such as equation (1).
Y2=(X3+Axc2x7X+B)modulo Pxe2x80x83xe2x80x83(1) 
where P is a prime number equal to the number of rows and the number of columns in the grid together with a special point ∘, called the point at infinity. In elliptic curve cryptography, an encryption key is generated by multiplying a generator point P by itself k times. (i.e. Q=kP, where Q is the encryption key).
Multiplication by an integer in the elliptic curve group is modeled as repeated addition of the group elements to themselves. Addition of a group element to itself in an elliptic curve group, however, is not as simple as integer addition. Because points in the elliptic curve group are ordered pairs, addition may be represented as, (X1,Y1)+(X2,Y2)=(X3,Y3) where X3, Y3 are defined by equations (2) and (3) if neither of the points is the point at infinity (in which case the definition states that (X1,Y1)+∘=(X1,Y1)). L a variable used in equations (2) and (3) is defined by equation (4).
X3=L2xe2x88x92X1xe2x88x92X2modulo Pxe2x80x83xe2x80x83(2) 
Y3=L(X1xe2x88x92X3)xe2x88x92Y1modulo Pxe2x80x83xe2x80x83(3) 
L=(Y2xe2x88x92Y1)/(X2xe2x88x92X1)modulo Pxe2x80x83xe2x80x83(4) 
If X1=X2 and Y1=Y2, X3 and Y3 are defined by equations (5) and (6).
X3=L2xe2x88x922X1modulo Pxe2x80x83xe2x80x83(5) 
Y3=L(X1xe2x88x92X3)xe2x88x92Y1modulo Pxe2x80x83xe2x80x83(6) 
xe2x80x83L=(3X12+A)/2Y1xe2x80x83xe2x80x83(7)
Where A is the coefficient of X in equation (1).
Thus, addition of two members of the elliptic curve group involves a modular integer division operation. In modular arithmetic, division of a value N by a value D means multiplication of N by the arithmetic inverse of D, Dxe2x88x921. It is known that an arithmetic inverse of a number in a finite field may be calculated using the extended greatest common divisor algorithm.
FIG. 1 is a flow chart diagram, which illustrates an extended greatest common divisor (GCD) algorithm known as Euclid""s or Euclidean algorithm. The algorithm shown in FIG. 1 calculates the greatest common divisor of U and V where U is greater than V. This algorithm relies on the property that if U and V have a common divisor D so does Uxe2x88x92V, Uxe2x88x922V and so on. Thus, using only subtraction, one can calculate the GCD of U and V. GCD algorithms run faster when they can combine multiple subtractions in a single step. In general, GCD algorithms operate by successively reducing the values of U and V, combining multiple subtraction operations where possible, while maintaining the equations (7), (8) and (9)
U1xc2x7U+U2xc2x7V=U3xe2x80x83xe2x80x83(7) 
V1xc2x7U+V2xc2x7V=V3xe2x80x83xe2x80x83(8) 
T1xc2x7U+T2xc2x7V=T3xe2x80x83xe2x80x83(9) 
Where Uxe2x89xa7V and (U1, U2, U3) and (V1, V2, V3) are initially assigned the values of (1, 0, U) and (0, 1, V), respectively. If the algorithm is used to calculate the greatest common divisor of a prime number P and a value X, then, upon termination, U3=GCD(P, X)=1 and U2=Xxe2x88x921 MOD P. In general terms, GCD algorithms operate by repetitively reducing the number of bits in the larger value, U, and switching the two values. Thus, the algorithm successively reduces the values of U3 and V3 while maintaining the equations. Because it also maintains the values U2 and V2, the algorithm shown in FIG. 1 not only calculates the greatest common divisor of U and V but also calculates the inverse of V modulo U (assuming U is a prime). That is to say, Vxe2x88x921 where Vxc2x7Vxe2x88x921=1 modulo U. Furthermore, it is noted that the variables U1, V1 and T1 do not need to be maintained because they can be determined from the other variables, for example, U1 can be determined from U2 and U3 by the identity U1=(U3xe2x88x92U2xc2x7V)/U. As described below, when U is a prime number, this inverse may be used for division operations performed on the Finite field FU.
The algorithm shown in FIG. 1 begins at step 110 by obtaining the values U and V. Next, at step 112 a temporary variable U3 is set equal to U and a temporary variable V3 is set equal to V. Also in step 112 the variable U2 is set to zero and V2 is set one. Next, at step 114 the variable V3 is tested to determine if it is equal to zero. When V3 is equal to zero, U3 contains the GCD of U and V and U2 contains the arithmetic inverse of V modulo U (a prime). At step 116, the values U3 and U2 are returned.
If V3 does not equal zero at step 114, step 118 is executed. This step calculates a value Q which is equal to the greatest integer less than or equal to U3/V3. Next, it determines a new value for U3 by storing the quantity, U3xe2x88x92Qxc2x7V3 into a temporary variable T3. Also at step 118, the process stores the value U2xe2x88x92Qxc2x7V2 into a temporary variable T2. This value is the new value for the variable U2. The variable T3 holds the new value of U3. Because U3 has been reduced but V3 has not, U3 is less than V3. Consequently, the values of U3 and V3 are switched and the values U2 and V2 are also switched by the instructions that set U3=V3, V3=T3, U2=V2 and V2=T2. After step 118, control returns to step 114 where V3 is once again compared to determine if it is equal to zero. When V3 is equal to zero the process returns the values U3 and U2. The value U3 is the greatest common divisor of U and V while the value U2 is the inverse of V modulo U (i.e. Vxe2x88x921 MOD U), when U is a prime.
Because the Euclidean algorithm calculates the greatest integer less than or equal to U3/V3, it is not efficient for very large numbers that are processed using multiprecision arithmetic. Other GCD algorithms exist which are much more efficient for large numbers. One such algorithm is Jebelean""s algorithm which is described in an article entitled xe2x80x9cA Generalization of the Binary GCD Algorithm,xe2x80x9d ACM Symposium on Symbolic and Algebraic Computation (ISSAC) July, 1993, pp 111-116. This algorithm gains efficiency by calculating factors XF and YF that zero the N low-order bits of U3 and V3 when U3 and V3 are multiplied by these factors. The bit-reduced value for U3 is then calculated as (XFxc2x7U3+YFxc2x7V3)/2N, where division by 2N is implemented as a right-shift by N bits. When U3 and V3 are multiplied by XF and YF, respectively, the number of bits in the new value of U3 increases but by less than N bits. Thus, the bit shift produces a net reduction in the number of bits. One problem with algorithms of this type is that they calculate GCD (Fxc2x7U3, V3), where F represents the extra factors, instead of GCD (U3, V3). Consequently, when V3 equals zero, the returned value of the GCD may be greater than 1 even though U and V are relatively prime. One solution to this problem is to call a separate GCD routine, such as the Euclidean GCD routine or the Lehmer-Euclid routine, to remove these extra factors, as the GCD(U, V)=GCD(U, GCD(V, GCDxe2x80x2)), where GCDxe2x80x2 is the GCD having the spurious factors. While this may seem to be an expensive step, the inventors have determined that these spurious factors are almost always less than eight-bits in size when the actual GCD is one. Consequently, even invoking the GCD routine recursively is relatively inexpensive.
A second problem with algorithms of this type concerns the calculation of the inverse. The Jebelean algorithm uses a multi-precision addition/subtraction and shift operation to maintain the value of V2 for each iteration in which V2 is odd. This greatly adds to the execution time of the algorithm. This problem requires a relatively expensive fix: to maintain equation (3) one would initially calculate T1U+T2V=T3/KX, (in the Jebelean algorithm, K is 22M) where it is relatively expensive to correct T2 when accounting for the KX term. To maintain equation (2) one must transform Uxe2x80x21U+Uxe2x80x22V=GCDxe2x80x2 into U1U+U2V=GCD where GCD | GCDxe2x80x2but A=(GCDxe2x80x2/GCD) need not divide Uxe2x80x21 or Uxe2x80x22. Thus, the calculation of the inverse is not a straightforward process using the Jebelean GCD algorithm.
Modular division may be used in other types of encryption algorithms than the elliptic curve algorithm. As described above, an exemplary, simplified encryption algorithm encrypts a message M by raising a large number, representing the message M, to a power E, and determines the result modulo P. In this example the values M, E and P are each hundreds of bits in length. It is noted that this is a very unsecure, poor encryption scheme, here it is used only for illustration. There are, however, many real encryption methods, which are more complex, that use exponentiation. Operations on numbers of this size are difficult because common computer processing hardware supports at most sixty-four bit mathematical operations. Values greater than 64 bits in length are typically calculated using multiprecision arithmetic in which the numbers are broken down into, for example, 64-bit sections and the sections are processed separately. A number a raised to an exponent e can always be calculated by multiplying that number by itself the number of time represented by the exponent, or in mathematical terms:
ae=axc2x7axc2x7a . . . e number of times. 
Another method, which is significantly faster, is the multiply chain algorithm. In this case, let e=enxe2x88x922enxe2x88x922 . . . e1e0 be an n-bit exponent ei∈{0,1}, 0xe2x89xa6ixe2x89xa6nxe2x88x921 and enxe2x88x921=1. The algorithm starts with p1=a, then
pi+1=pi2 if enxe2x88x921xe2x88x92i=0 or axc2x7pi2 if enxe2x88x921xe2x88x92i=1, where 1xe2x89xa6ixe2x89xa6nxe2x88x922. 
Several methods are known in the art to reduce the number of multiplications used to produce efficient exponentiation of the base value.
One method known to reduce the number of multiplication is the signed digit algorithm. Using this method, the exponent is represented as a string of bits comprising the values 0 and 1. Within the bit string, sequences (or xe2x80x9crunsxe2x80x9d) of 1""s are replaced by 0""s, with a 1 being placed in the next higher bit position to the most significant bit (MSB) position of the run, and xe2x80x9cxe2x88x921xe2x80x9d being inserted in the least significant bit (LSB) position of the run. The xe2x80x9cxe2x88x921xe2x80x9d means multiplication by axe2x88x921. Consequently, for this method to work well, it is desirable to be able to compute axe2x88x921 efficiently. By thus efficiently recoding the exponent bit string, the expected number of multiplication operations is reduced from n/2 to n/3.
Although the signed digit algorithm reduces the number of multiplication operations needed to calculated the exponential value ME modulo P, it replaces some of the axc2x7pi multiplication operations by axe2x88x921xc2x7pi2 operations (i.e. pi2/a). As described above, division may be implemented as the multiplication by the arithmetic inverse and the arithmetic inverse may be calculated using an extended GCD algorithm.
FIG. 2 is a flow chart diagram, which illustrates an exemplary method for calculating ME modulo P. The first step in this algorithm, step 210 gets the values P, M and E. P is a large prime number, M is the message to be encoded and E is an exponent value to be used to encode the message. The next step in the process is to determine the inverse of M modulo P (i.e. Mxe2x88x921 MOD P). This is accomplished by calculating the greatest common divisor of P and M at step 212. As P is a prime number by definition, the value U2 returned by the GCD algorithm is equal to Mxe2x88x921 MOD P. Next, at step 214, the signed digit algorithm is applied to the exponent E. Finally, at step 216, the multiply-square algorithm is applied to calculate ME modulo P. As described above, the multiply-square algorithm both multiplies and divides by M. The value Mxe2x88x921 MOD P is used to convert the division operation into a multiplication operation.
The Euclidean algorithm is not efficient for large numbers that require multiprecision arithmetic. The Jebelean algorithm, while efficient in calculating the greatest common divisors, is not efficient for calculating arithmetic inverses. Thus, there is a need for an efficient GCD algorithm that calculates inverses, and operates efficiently on numbers that require multiprecision arithmetic.
The present invention is embodied in a method for calculating greatest common divisors and arithmetic inverses using the Jebelean algorithm. The algorithm keeps track of the number of times that U3 and V3 have been divided by two in the process of calculating the greatest common divisor and correct the arithmetic inverse for these divisions in a single step, after the greatest common divisor has been calculated.
According to one aspect of the invention, the method modifies the equations to be U1xc2x7U+U2xc2x7V=2jxc2x7U3 and V1xc2x7U+V2xc2x7V=2ixc2x7V3 where i and j are the number of times that U3 and V3 have been divided by 2, respectively. The arithmetic inverse of V, U2, is corrected according to the equation Vxe2x88x921=U2xc2x7(2j)xe2x88x921.
According to another aspect of the invention, shifting of the binary values representing U3 and U2 is accomplished by changing the position of respective pointers to bit positions in the binary values.
According to yet another aspect of the invention, spurious factors are removed using a pre-calculated table of arithmetic inverse values. A value corresponding to a factor is selected and multiplied by the GCD value containing the spurious factor to remove the factor.