1. Field of the Invention
The present invention relates to mathematical algorithms for cryptographic applications and, in particular, to calculating exponentiations as used, for example, in the RSA crypto-algorithm.
2. Description of Prior Art
The RSA cryptosystem, which is named after its inventors R. Rivest, A. Shamir and L. Adleman, is one of the most frequently used public key cryptosystems. This method is described in section 8.2 of “Handbook of Applied Cryptography,” Menezes, van Oorschot, Vanstone, CRC Press, 1996. The RSA cryptosystem can be used both to perform encryptions and to generate digital signatures. Its security is based on the difficulty of the integer factorization problem. For both an RSA encryption and an RSA decryption, a modular exponentiation of the following form must be performed:
E = B^d mod N
Here, B is the base, d is the exponent and N is the modulus.
In RSA encryption, the exponent d is part of the public key. In RSA decryption, however, the exponent d is part of the private key which has to be protected from spying.
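By way of illustration only, the modular exponentiation E = B^d mod N can be tried out with small numbers. The following sketch uses Python's built-in three-argument pow function; the values of the modulus and of the two exponents are toy values assumed for this example (they are not taken from the present description) and are far too small for any real use.

```python
# Toy values assumed for illustration only; real RSA moduli are far larger.
modulus = 3233             # N = 61 * 53
public_exponent = 17       # exponent used for encryption
private_exponent = 2753    # exponent used for decryption, to be kept secret

message = 65               # base B for the encryption

# Encryption: E = B^d mod N with d taken from the public key.
ciphertext = pow(message, public_exponent, modulus)

# Decryption: the same form of operation with d taken from the private key.
recovered = pow(ciphertext, private_exponent, modulus)

print(ciphertext, recovered)
```

Both directions use the same operation; only the exponent differs, which is why the secrecy of the decryption exponent is the critical point.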
It is the task of cryptography circuits to calculate this modular exponentiation securely on the one hand and quickly and efficiently on the other hand. Cryptography circuits are frequently used in applications in which calculating and storage resources are limited. For example, it is not possible to provide large storage or calculating resources on a smartcard which is used for identification purposes or in connection with money transactions.
The exponentiation is typically calculated by means of the so-called “square and multiply” algorithm, irrespective of whether it is modular or not. For this, reference is made to FIG. 3. At first, the exponentiation without a modular reduction is described. Then it will be explained how the algorithm can be carried out in the residue class system of the modulus N.
The object is to calculate the result E of the exponentiation B^d, as is illustrated in block 30 of FIG. 3. The exponent d is a binary exponent and consists of several bits ranging from a most significant bit (msb) to a least significant bit (lsb). At first, the numbers B and d are provided, as is illustrated in block 32 in FIG. 3. Then the resulting value E is initialized to a value of 1, as is illustrated by block 34.
Subsequently, the exponent d is examined, or scanned, digit by digit, one digit of the exponent being referred to as d_i. If the digit currently examined, i.e. in the binary case the bit of the exponent, has a value of 1, as is determined by a decision block 36, the left branch in FIG. 3 will be taken. If, however, the examined bit of the exponent has a value of 0, the right branch of FIG. 3 will be taken.
If it is determined by the decision block 36 that the examined bit of the exponent has a value of 1, a square step 38 is performed first, i.e. the current resulting value is squared. Then, in a multiply step 40, the current value of the resulting value E, that is the result of step 38, is multiplied by the base B. In a further decision block 42, it is then examined whether there are further digits of the exponent. If this is the case, a return via a loop 46 is performed and it is examined whether the next digit d_i of the exponent is a 1 or a 0 (block 36). If the next digit examined equals 0, square step 38′ of the right branch of FIG. 3 is performed. In contrast to the left branch, however, no multiply operation corresponding to block 40 of the left branch of FIG. 3 is performed when the examined digit of the exponent equals 0.
The procedure described above is repeated, starting from the most significant bit of the exponent d, until the least significant bit has been reached. After processing the least significant bit, block 42 will establish that there are no further digits d_i. The current value of the resulting value E is then the overall result E of the exponentiation, which is output in block 30.
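The flow described above with reference to FIG. 3 can be sketched as follows. This is a minimal illustration only; the function name and the use of Python's arbitrary-precision integers are assumptions made for the example, and the block numbers in the comments refer to FIG. 3.

```python
def square_and_multiply(base: int, exponent: int) -> int:
    """Left-to-right binary exponentiation without modular reduction."""
    if exponent < 0:
        raise ValueError("exponent must be non-negative")

    result = 1  # block 34: the resulting value E is initialized to 1

    # bin() yields the bits from the most significant to the least
    # significant one, which is the scanning order described for FIG. 3.
    for bit in bin(exponent)[2:]:
        result = result * result        # square step (block 38 or 38')
        if bit == "1":                  # decision block 36
            result = result * base      # multiply step (block 40)
        # for a 0 bit, no multiplication follows the squaring

    return result                       # block 30: overall result E


print(square_and_multiply(3, 13) == 3 ** 13)
```

For the exponent 13 (binary 1101), the loop performs four squarings and three multiplications, one multiplication for each bit having the value 1.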
In order to make the exponentiation described in FIG. 3 a modular exponentiation, the modulus N is input in block 32 in addition to the base B and the exponent d. Additionally, a modular reduction takes place in both branches (block 44 in the left branch and block 44′ in the right branch) so that, generally speaking, a modular reduction is performed after each multiplication such that the resulting value E at the end of processing for each digit of the exponent is in the residue class of the modulus N.
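A minimal sketch of the modular variant is given below. It assumes, for simplicity, that a reduction is carried out directly after the squaring and after the multiplication; the function name is hypothetical, and Python's built-in pow is used only as a reference to check the result.

```python
def modular_square_and_multiply(base: int, exponent: int, modulus: int) -> int:
    """Left-to-right square and multiply with a reduction after each product."""
    if exponent < 0 or modulus <= 0:
        raise ValueError("exponent must be >= 0 and modulus must be > 0")

    result = 1 % modulus  # E = 1, already reduced (gives 0 for modulus 1)

    for bit in bin(exponent)[2:]:                  # msb first
        result = (result * result) % modulus       # square and reduce
        if bit == "1":
            result = (result * base) % modulus     # multiply and reduce

    return result


# Check against the built-in modular exponentiation.
print(modular_square_and_multiply(65, 17, 3233) == pow(65, 17, 3233))
```

Because every intermediate value is reduced, no intermediate value grows beyond the square of the modulus, which is what makes the method practicable for the long numbers used in cryptography.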
It is to be noted that the multiplication and the modular reduction do not necessarily have to be separated into two subsequent steps. In the art, combined multiplication look-ahead and reduction look-ahead methods allowing an efficient calculation of a modular multiplication are known. The so-called ZDN algorithm is to be pointed out here in particular.
The square and multiply algorithm shown in FIG. 3, in its simplest form, is problematic for two reasons.
First, when comparing the two branches, an operation is missing in the right branch, that is when a digit of the exponent equals 0. The two branches in FIG. 3 are asymmetrical in that a multiplication (block 40) will be executed if a digit of the exponent equals 1, while there is no corresponding operation in the right branch. This means that the square and multiply algorithm is vulnerable to so-called timing attacks and power analysis attacks. In order to bring about a homogenization of both time and current, i.e. to make the time and power consumption of the circuit constant irrespective of whether a 0 or a 1 is in the exponent, a dummy multiplication 40′ can be introduced in the right branch. The result of the dummy multiplication, however, is not used; only the result of block 38′, i.e. of the square step, is used.
The dummy multiplication results in a time and current homogenization of both branches but requires calculating resources. The dummy multiplication thus leads to increased security at the expense of the overall performance of the circuit.
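The trade-off between the asymmetrical branches and the dummy multiplication can be made visible by counting operations. The following sketch is an illustration under stated assumptions: the function name and the dummy_multiply flag are hypothetical, and the counter merely stands in for the time and current consumption of a circuit. Python code of this kind is not itself protected against timing or power analysis.

```python
def modexp_with_count(base: int, exponent: int, modulus: int,
                      dummy_multiply: bool) -> tuple[int, int]:
    """Return (result, number of modular multiplications, squarings included)."""
    result = 1 % modulus
    operations = 0

    for bit in bin(exponent)[2:]:                  # msb first
        result = (result * result) % modulus       # square step 38 / 38'
        operations += 1
        if bit == "1":
            result = (result * base) % modulus     # multiply step 40
            operations += 1
        elif dummy_multiply:
            _unused = (result * base) % modulus    # dummy multiplication 40'
            operations += 1                        # costs time, result discarded

    return result, operations


# Two 8-bit exponents with very different numbers of 1 bits.
sparse, dense = 0b10000001, 0b11111111
for flag in (False, True):
    counts = [modexp_with_count(5, d, 3233, flag)[1] for d in (sparse, dense)]
    print("dummy multiply:", flag, "operation counts:", counts)
```

Without the dummy multiplication, the counts are 10 and 16, so the number of operations reveals how many bits of the exponent equal 1. With the dummy multiplication, both exponents require 16 operations, which shows both the homogenization and the additional calculating effort.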
A further disadvantage of the square and multiply algorithm described in FIG. 3 is the fact that this algorithm is not suitable for parallel execution. When, for example, the left branch is considered, it is not possible to calculate blocks 38 and 40 in parallel since the calculation in block 40 depends on the result of block 38. Thus, a calculating unit first has to calculate block 38 and, only when the result of the square operation is present, can perform the calculation of block 40, i.e. the multiplication of the result of block 38 by the base.