[Overview of Cryptographic Methods and Tamper Proof Technologies]
Cryptography is a technique of protecting information from tapping or falsification. Cryptographic methods that have spread recently can roughly be classified into two categories, public-key cryptography and symmetric-key cryptography. Public-key cryptography uses different keys respectively for encryption and decryption, and publishes the key for encryption (public key) to the public while handling, as secret information, the key used for decrypting the encrypted text (secret key) so that only the receiver can access the secret key in order to ensure security. By contrast, symmetric-key cryptography uses the same key (secret key) for encryption and decryption, and prevents a third person, i.e., a person who is neither the transmitter nor the receiver, from accessing the key in order to ensure security.
Among the techniques in the field of cryptography, there is a technique called tamper proof technology. Tamper proof technology is used for preventing secret information such as a secret key or the like from being leaked due to attacks made by malicious third parties. ISO15408, one of the international standards for security, requires that a smart card included in a device such as an embedded device having a cryptographic function be based on a tamper proof technology.
Tamper proof technology targets various attacking methods. A particularly important attacking method is a method called a fault attack, which was made public by Boneh, DeMillo, and Lipton in 1997. A fault attack applies various stresses (such as abnormal clocks, excessive voltage, high temperature or the like) to the circuit of the cryptographic processor in an embedded device for a smart card or the like so as to cause abnormal values in the internal data in the circuit in order to crack the key information in the cryptographic processor. Non-Patent Document 1 should be referred to for more detailed information on the above described fault attack.
It is known that the use of a fault attack enables the cracking of the secret key in a cryptographic processor regardless of whether the information is encrypted by public-key cryptography or symmetric-key cryptography. In particular, the fault attack on RSA cryptography, a form of public-key cryptography, described in Non-Patent Document 1 is considered to be a great threat because it is easy for attackers to implement. As another fault attack on public-key cryptography, a fault attack on elliptic curve cryptography (ECC) is also known (see Non-Patent Document 2 for detailed information). Countermeasures against these fault attacks on public-key cryptography are an important problem in the field of tamper proof technology.
[Public-Key Cryptography]
Hereinbelow, public-key cryptography will be explained as the conventional technique on which the present invention is based.
[a. Binary Method]
RSA and ECC are known as representative algorithms for public-key cryptography. In a processor employing RSA cryptography, a process called a modular exponent operation is executed. A modular exponent operation is a process in which calculations are performed in order to obtain a v that satisfies v = a^d (mod n), where d represents an exponent, a represents a message, and n represents a public key. The binary method described below as algorithm 1 is known as an algorithm for executing this process efficiently.
<Algorithm 1: Exponent Remainder Operation Using Binary Method>
Input
d: exponent
u: bit length of d
a: base (the message)
n: modulus
Output
v = a^d (mod n): result of modular exponent operation
Algorithm
a101  v = a;
a102  for (i = u-2; i >= 0; i = i-1) {  /* extract top k-bit of d */
a103    v = v × v (mod n);
a104    if (d_i == 1) v = v × a (mod n);
a105  }
a106  return v;
In algorithm 1, v is initialized, i.e., v = a in step a101, the first step, and the bits of the u-bit value d = (d_{u-1}, d_{u-2}, ..., d_0)_2 are scanned in the order d_{u-1}, d_{u-2}, ..., d_0. The expression ( )_2 represents a binary number, for example, (101)_2 = 5 and (1101)_2 = 13. Also, d_{u-1} = 1 is satisfied, so the top bit is already accounted for by the initialization v = a.
This scan is repeated in the "for loop" over steps a102 through a105. In the for loop, the calculation process of v = v×v (mod n) is executed in step a103, and processes are switched according to bit value d_i in step a104. When d_i = 0, no process is performed, and when d_i = 1, the operation of v = v×a (mod n) is performed. In other words, when d_i = 0, only a squaring is executed, and when d_i = 1, a squaring and a multiplication are executed. This is repeated for each of d_{u-2}, ..., d_0 in order to perform a modular exponent operation process. In step a106, v is output as a calculation result.
FIG. 1 schematically illustrates the flow of the process in algorithm 1.
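The binary method of algorithm 1 can be sketched in Python as follows. This is a minimal illustration, not an implementation taken from the source: the function name modexp_binary is hypothetical, the bit length u is derived from d rather than passed in, and v is reduced modulo n at initialization as a precaution.

```python
def modexp_binary(a: int, d: int, n: int) -> int:
    """Left-to-right binary method for a^d mod n, following steps a101-a106."""
    if d < 1:
        raise ValueError("d must be a positive integer")
    u = d.bit_length()                 # bit length of d, so that d_{u-1} = 1
    v = a % n                          # a101: v = a (reduced mod n as a precaution)
    for i in range(u - 2, -1, -1):     # a102: scan d_{u-2}, ..., d_0
        v = (v * v) % n                # a103: squaring
        if (d >> i) & 1:               # a104: test bit d_i
            v = (v * a) % n            #       multiplication only when d_i = 1
    return v                           # a106


result = modexp_binary(7, 13, 33)      # 13 = (1101)_2
```

The number of multiplications depends on the bits of d, which is why the bit pattern of the exponent matters to the later discussion of attacks.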
A binary method is used not only in RSA cryptography, but also in a process called a point scalar multiplication in ECC. Point scalar multiplication is a process of calculating a point V that satisfies V=dA where d represents a scalar value, and A represents point A on an elliptic curve. Algorithm 2 is described below as an algorithm for performing point scalar multiplication V=dA using a binary method.
<Algorithm 2: Scalar Multiplication Using Binary Method>
Input
d: scalar value
u: bit length of d
A: point on elliptic curve
Output
V = dA: scalar multiple of A by d
Algorithm
a201  V = A;
a202  for (i = u-2; i >= 0; i = i-1) {
a203    V = 2V;
a204    if (d_i == 1) V = V + A;
a205  }
a206  return V;
In algorithm 2, similarly to algorithm 1, V is initialized, i.e., V = A in step a201, and in the for loop starting at step a202, the bits of the u-bit value d = (d_{u-1}, d_{u-2}, ..., d_0)_2, where d_{u-1} = 1, are scanned in the order d_{u-1}, d_{u-2}, ..., d_0. In the for loop, the calculation process of V = 2V is performed in step a203, and then the process proceeds to step a204, where no process is executed when the bit value d_i = 0 (only point doubling is executed) and the process of V = V+A is executed when d_i = 1 (point doubling and point addition are executed). This is repeated for each of d_{u-2}, ..., d_0 so as to execute a process of point scalar multiplication. In a206, the last step, V is output as a calculation result.
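Algorithm 2 can be sketched in the same way. The sketch below assumes a toy curve y^2 = x^3 + 2x + 3 over GF(97) with base point (3, 6); these parameters, the affine point formulas, and the names ec_add and scalar_mul_binary are illustrative assumptions and do not come from the source. The value None stands for the point at infinity.

```python
from typing import Optional, Tuple

Point = Optional[Tuple[int, int]]      # None represents the point at infinity

# Toy curve y^2 = x^3 + 2x + 3 over GF(97); illustrative parameters only.
P_MOD, COEF_A = 97, 2


def ec_add(p1: Point, p2: Point) -> Point:
    """Affine point addition (covers doubling) on the toy curve."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None                    # P + (-P) is the point at infinity
    if p1 == p2:
        lam = (3 * x1 * x1 + COEF_A) * pow((2 * y1) % P_MOD, -1, P_MOD) % P_MOD
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD) % P_MOD
    x3 = (lam * lam - x1 - x2) % P_MOD
    y3 = (lam * (x1 - x3) - y1) % P_MOD
    return (x3, y3)


def scalar_mul_binary(d: int, A: Point) -> Point:
    """Binary method for V = dA, following steps a201-a206 (d >= 1)."""
    V = A                                          # a201
    for i in range(d.bit_length() - 2, -1, -1):    # a202
        V = ec_add(V, V)                           # a203: point doubling
        if (d >> i) & 1:                           # a204: point addition when d_i = 1
            V = ec_add(V, A)
    return V                                       # a206


A = (3, 6)                             # 6^2 = 36 = 3^3 + 2*3 + 3 (mod 97)
V = scalar_mul_binary(13, A)
```

The loop has the same shape as in algorithm 1, with squaring replaced by point doubling and multiplication replaced by point addition.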
[b. Montgomery Multiplication Remainder]
As an algorithm for executing the above algorithms efficiently, a method named the Montgomery multiplication remainder (operation) is known (see Non-Patent Document 3 for more detailed information).
The Montgomery multiplication remainder is an algorithm for performing, at high speed, a multiplication remainder operation expressed by s×t (mod n) where s, t, and n represent integers. The use of the Montgomery multiplication remainder enables the calculation of REDC(s, t, n) = s×t×r^-1 (mod n), where r = 2^f is satisfied, s and t represent f-bit integers, and n represents an f-bit odd number (0 ≤ s×t ≤ r×n). In the above expression, r^-1 (mod n) represents the inverse of r modulo n, and satisfies r×r^-1 = 1 (mod n).
The use of this algorithm enables the execution, at high speed, of the processes of squaring (v = v×v (mod n)), multiplication (v = v×a (mod n)), point doubling (V = 2V), and point addition (V = V+A) in algorithms 1 and 2 above, so that the entire calculation of algorithms 1 and 2 can be executed efficiently.
A basic algorithm of the Montgomery multiplication remainder is described below as algorithm 3.
<Algorithm 3: Basic Algorithm of Montgomery Multiplication Remainder>
Input
s: f-bit integer
t: f-bit integer
n: modulus of remainder, f-bit odd value
Output
v = s×t×r^-1 (mod n): result of Montgomery multiplication remainder (r = 2^f)
Algorithm
a301  n' = -n^-1 (mod 2^f);
a302  h = s×t×n' (mod 2^f);
a303  u = ((s×t)+(h×n))/2^f;  /* the lower f bits of (s×t)+(h×n) are all zero */
a304  if (u ≥ n) u = u-n;
a305  return u;
To this algorithm 3, s, t, and n are input to calculate an n' that satisfies n' = -n^-1 (mod 2^f) in step a301, and an h that satisfies h = s×t×n' (mod 2^f) is calculated in step a302. In step a303, the above h, s, t, and n are used to calculate u = ((s×t)+(h×n))/2^f. It is mathematically proven that (s×t)+(h×n) in that calculation is always a multiple of 2^f (in other words, the lower f bits are all zero), and therefore it is assured that the result u of the division in step a303 is always an integer (see the calculation example presented below). In step a304, u and n are compared in magnitude, and the calculation of u = u-n is performed when u is equal to or greater than n. In step a305, u is output as a calculation result.
Below is a calculation example of algorithm 3. Note that numbers starting with “0x” are values expressed in hexadecimal.
When s = 0x1234, t = 0x5678, n = 0xffff, and f = 16:
a301  n' = -n^-1 (mod 2^16) = 0x0001
a302  h = s×t×n' (mod 2^16) = (0x1234)×(0x5678)×(0x0001) (mod 2^16) = 0x0060
a303  u = ((s×t)+(h×n))/2^16 = (0x06260060 + 0x005fffa0)/2^16 = 0x06860000/2^16 = 0x0686  /* the lower f bits of (s×t)+(h×n) in a303 are all zero */
a304  as the condition u ≥ n is not met, the calculation u = u-n is not performed
a305  return 0x0686
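The calculation example above can be reproduced with a short Python sketch of algorithm 3. The function name montgomery_redc is hypothetical, and Python's arbitrary-precision integers stand in for the f-bit arithmetic; the three-argument pow with exponent -1 (Python 3.8 or later) computes the modular inverse.

```python
def montgomery_redc(s: int, t: int, n: int, f: int) -> int:
    """Basic Montgomery multiplication remainder: s*t*r^-1 mod n with r = 2^f."""
    r = 1 << f
    n_prime = (-pow(n, -1, r)) % r     # a301: n' = -n^-1 (mod 2^f); n must be odd
    h = (s * t * n_prime) % r          # a302
    total = s * t + h * n
    if total % r != 0:                 # the lower f bits are expected to be zero
        raise ArithmeticError("lower f bits are not zero")
    u = total >> f                     # a303: exact division by 2^f
    if u >= n:                         # a304
        u -= n
    return u                           # a305


u = montgomery_redc(0x1234, 0x5678, 0xFFFF, 16)
```

With the inputs of the worked example, u evaluates to 0x0686, which also equals s×t×r^-1 (mod n) computed directly.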
In the Montgomery modular multiplication in algorithm 3, an f-bit variable is used for the multiplication. However, in calculations for cryptography, the value of f usually tends to be so great as to exceed the bit width that can be calculated using commonly used multipliers. For example, the maximum bit width that can be calculated by commonly used multipliers is 64 bits×64 bits while f is 1024 in RSA cryptography and f is 160 in ECC. Accordingly, algorithm 3 above cannot be executed directly by commonly used hardware, which is problematic.
In order to solve this problem, an algorithm that makes algorithm 3 executable on a w-bit by w-bit multiplier (w < f) is used. There are various types of implementation algorithms for the Montgomery modular multiplication, and an example thereof is presented below as algorithm 4.
<Algorithm 4: Example of Implementation Algorithm of Montgomery Modular Multiplication>
Input
s: f-bit integer
t: f-bit integer
r: r = 2^f
n: modulus of multiplication remainder, f-bit odd value
Output
REDC(s, t) = s×t×r^-1 (mod n)
Multi-Precision Integers
s = (s_{g-1}, ..., s_0)
t = (t_{g-1}, ..., t_0)
n = (n_{g-1}, ..., n_0)
y = (y_g, ..., y_0)
n' = (n'_{g-1}, ..., n'_0)
W-Bit Word Data, where r = 2^w.
tmp, c1, c2, h′
Algorithm
a401  y = 0;
a402  for (j = 0; j <= g-1; j = j+1) {
a403    (c1, tmp) = y_0 + s_0 × t_j;
a404    h' = tmp × n'_0 (mod 2^w);
a405    (c2, yNULL) = tmp + h' × n_0;  /* yNULL is always zero */
a406    for (i = 1; i <= g-1; i = i+1) {
a407      (c1, tmp) = y_i + c1 + s_i × t_j;
a408      (c2, y_{i-1}) = tmp + c2 + h' × n_i;
a409    }
a410    (c2, c1) = c1 + c2 + y_g;
a411    y_{g-1} = c1;
a412    y_g = c2;
a413  }
a414  if (y >= n) y = y-n;
a415  return y;
In algorithm 4, the data of an f-bit integer is expressed using a plurality of w-bit variables. For example, an f-bit variable s is expressed using g w-bit variables s_i (i = 0, 1, ..., g-1) (the smaller i is, the lower the bits of the value are) such as in s = (s_{g-1}, s_{g-2}, ..., s_0). Here, w×g ≥ f is satisfied. Integer data expressed using plural w-bit variables in this way is called a multiple-precision integer. For a multiple-precision integer, data is expressed on a w-bit basis, and a w-bit data value is referred to as "one word".
Algorithm 4 is an example of an implementation algorithm that enables the calculation of algorithm 3 by dividing data into units of one word and combining w-bit×w-bit multiplications.
In order to calculate the sum of the results of the f-bit×f-bit multiplications, i.e., (s×t)+(h×n), algorithm 4 consists of: (1) the loop of a402 through a413 (j = 0, 1, ..., g-1) for calculating the sum of the f-bit×w-bit multiplications, i.e., (s×t_j)+(n×h_j), where h_j denotes the value of h' in the j-th iteration; and (2) the loop of a406 through a409 for performing this f-bit×w-bit multiplication by accumulating the results of w-bit×w-bit multiplications.
The lower f bits of (s×t)+(h×n) calculated in a303 in algorithm 3 are all zero, and therefore, it is mathematically assured that the lower w bits of (s×t_j)+(n×h_j) calculated in algorithm 4 are all zero too. In algorithm 4, a value obtained by dividing (s×t_j)+(n×h_j) by 2^w is stored in y = (y_g, ..., y_1, y_0) (the division by 2^w is performed by the shift in steps a407 and a408), and the values of the lower w bits that are always all zero are stored in yNULL in step a405.
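The word-by-word structure of algorithm 4 can be sketched in Python as below. This is an illustration under assumptions: the function name montgomery_words and the helper split are hypothetical, s and t are assumed to be smaller than n, and the result is s×t×2^(-w×g) (mod n), i.e., r is taken as 2^(w×g). Each pair (carry, low word) is obtained by a shift and a mask.

```python
def montgomery_words(s: int, t: int, n: int, w: int, g: int) -> int:
    """Word-wise Montgomery multiplication following steps a401-a415."""
    mask = (1 << w) - 1

    def split(x: int) -> list:
        # Little-endian list of g words of w bits each.
        return [(x >> (w * i)) & mask for i in range(g)]

    sw, tw, nw = split(s), split(t), split(n)
    n0_prime = (-pow(n, -1, 1 << w)) & mask        # n'_0 = -n^-1 (mod 2^w)
    y = [0] * (g + 1)                              # a401
    for j in range(g):                             # a402
        acc = y[0] + sw[0] * tw[j]                 # a403
        c1, tmp = acc >> w, acc & mask
        h = (tmp * n0_prime) & mask                # a404
        acc = tmp + h * nw[0]                      # a405: low word (yNULL) is zero
        c2 = acc >> w
        for i in range(1, g):                      # a406
            acc = y[i] + c1 + sw[i] * tw[j]        # a407
            c1, tmp = acc >> w, acc & mask
            acc = tmp + c2 + h * nw[i]             # a408: store shifted down one word
            c2, y[i - 1] = acc >> w, acc & mask
        acc = c1 + c2 + y[g]                       # a410
        y[g - 1] = acc & mask                      # a411
        y[g] = acc >> w                            # a412
    result = sum(word << (w * i) for i, word in enumerate(y))
    if result >= n:                                # a414
        result -= n
    return result                                  # a415


single_word = montgomery_words(0x1234, 0x5678, 0xFFFF, 16, 1)
```

With w = 16 and g = 1 the sketch reduces to the calculation example of algorithm 3; larger g exercises the inner loop of a406 through a409.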
[c. RSA Cryptography with CRT]
An algorithm based on the Chinese Remainder Theorem (CRT) is known as an operation method for performing, at high speed, a decryption process in RSA cryptography. The CRT algorithm is a method for accelerating a decryption process m = c^d (mod n) in RSA. The use of a CRT algorithm can make a decryption process in RSA about four times faster, and thus it is used very often.
A decryption process in RSA using a CRT algorithm is described below as algorithm 5.
<Algorithm 5: Decryption Process in RSA Using a CRT Algorithm>
input: c, dp (= d mod (p-1)), dq (= d mod (q-1)), p, q, u (= p^-1 (mod q))
output: m (= c^d mod n)
Algorithm
a501  cp = c (mod p), cq = c (mod q)
a502  mp = cp^dp (mod p)
a503  mq = cq^dq (mod q)
a504  m = ((mq - mp) × u (mod q)) × p + mp
a505  return m
The reason that a CRT algorithm can accelerate the process will be explained. In RSA cryptography, a modulus n is expressed as n = p×q where p and q are prime numbers, while in a CRT algorithm, the modular exponent operations are performed on moduli p and q instead of a modulus n. The respective bit lengths of p and q are each half the bit length of n, and reduction of the bit length to half reduces the calculation amount of one exponent remainder operation to 1/8 (the calculation amount of an exponent remainder operation is proportional to the cube of the bit length, and (1/2)^3 = 1/8). An exponent remainder operation in which the bit length is reduced to half has to be executed two times, i.e., one time each for a modulus p and a modulus q. Therefore, the total amount is 2×(1/8) = 1/4, which means the process is four times faster.
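Algorithm 5 can be sketched in Python as follows. The toy key (p = 61, q = 53, e = 17) and the function name rsa_crt_decrypt are assumptions chosen for illustration; real keys are far larger, and the built-in pow stands in for the modular exponent operation.

```python
def rsa_crt_decrypt(c: int, dp: int, dq: int, p: int, q: int, u: int) -> int:
    """RSA decryption with CRT, following steps a501-a505."""
    cp, cq = c % p, c % q                  # a501
    mp = pow(cp, dp, p)                    # a502: half-size exponentiation mod p
    mq = pow(cq, dq, q)                    # a503: half-size exponentiation mod q
    return ((mq - mp) * u % q) * p + mp    # a504: recombination


# Toy key material (illustrative values only).
p, q, e = 61, 53, 17
n = p * q
d = pow(e, -1, (p - 1) * (q - 1))          # private exponent
dp, dq, u = d % (p - 1), d % (q - 1), pow(p, -1, q)

message = 65
c = pow(message, e, n)                     # encryption
m = rsa_crt_decrypt(c, dp, dq, p, q, u)    # equals c^d mod n
```

The two exponentiations in a502 and a503 each work with operands of half the bit length of n, which is the source of the speedup described above.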
[d. Fault Attack]
A fault attack is, as was already described, a method made public by Boneh, DeMillo, and Lipton in 1997, by which various stresses (such as abnormal clocks, excessive voltage, high temperature or the like) are applied to the circuit of the cryptographic processor in an embedded device for a smart card or the like so that the internal data in the circuit involves abnormal values in order to decipher the key information in the cryptographic processor.
FIG. 2 illustrates the general procedure of a fault attack.
First in step S201, an attacker inputs data into a smart card to start the cryptographic process. In step S202, the attacker applies stress to the circuit of the cryptographic processor that is executing the cryptographic process in order to make an abnormal value occur in the internal data (occurrence of fault). In step S203, the attacker obtains the abnormal data value output as the result of the cryptographic process. The abnormal value thus obtained is used for an analysis calculation so that the attacker can decipher the value of the secret key.
A fault attack is effective on all types of smart cards that have a public-key cryptography function. Particularly, a fault attack on a CRT algorithm is known as a great threat because it is easy for attackers to implement this attack.
A procedure for making an attack on a CRT algorithm is illustrated below as Algorithm 6. Note that GCD (a, b) represents the greatest common divisor of a and b. For example, GCD (10, 25) = 5.
<Algorithm 6: Fault Attack Using CRT Algorithm>
a601: The attacker gives an encrypted text c to a smart card to execute a CRT algorithm process without making a fault occur, so that he or she can obtain a normal decryption result m. It is to be noted that dp, dq, p, q, u are not values input from the external environment but are secret keys stored in the card.
a602: The attacker gives the encrypted text c to the smart card to execute a CRT algorithm and makes a fault occur in order to obtain an abnormal decryption result m′. At this moment, m′≠m.
a603: The attacker calculates GCD (m−m′, n) to obtain a prime number p = GCD (m−m′, n). It is known that all the secret keys dp, dq, p, q, u can be easily calculated from this prime number p. Thus, the attacker can decipher all the secret keys.
The occurrence of a fault in a602 will be illustrated in FIG. 3 in detail. In this explanation, the step numbers used in algorithm 5 above are used.
First, in the process in step a503, the attacker makes a fault occur so as to cause an abnormal value as a result of an exponent remainder operation of the modulus q. In other words, an abnormal value mq′ ≠ cq^dq (mod q) is obtained instead of a normal value mq = cq^dq (mod q).
As the calculation result in step a503 is an abnormal value, the value calculated in step a504 is also an abnormal value m′ ≠ c^d (mod n). Thereby, this abnormal value m′ is output in step a505.
In a typical fault attack, a fault is useful only when it affects specific data (for example, the third bit value in the data has to be inverted), and a fault has to be made to occur 50 to 200 times. By contrast, in a fault attack on a CRT algorithm, there is no limitation on data in making a fault occur in step a503, and any bit value may be inverted as long as mq′ ≠ mq is satisfied. Further, a single success in making a fault occur enables the deciphering of secret keys.
Thus, it is known that a fault attack on a CRT algorithm is easy for attackers to implement, and it is a great threat to smart cards.
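The attack of algorithm 6 can be simulated with a short Python sketch. The fault model is an assumption made for illustration: the parameter fault_mask, which flips bits of the result of step a503, is hypothetical, as are the toy key values and the function name rsa_crt_decrypt. Because a fault in mq leaves m′ congruent to m modulo p but not modulo q, m−m′ is a multiple of p only, and the GCD reveals p.

```python
from math import gcd


def rsa_crt_decrypt(c, dp, dq, p, q, u, fault_mask=0):
    """CRT decryption; fault_mask models an injected fault in step a503."""
    cp, cq = c % p, c % q                  # a501
    mp = pow(cp, dp, p)                    # a502
    mq = pow(cq, dq, q) ^ fault_mask       # a503 (bits flipped when a fault is injected)
    return ((mq - mp) * u % q) * p + mp    # a504


# Toy key material held inside the "card" (illustrative values only).
p, q, e = 61, 53, 17
n = p * q
d = pow(e, -1, (p - 1) * (q - 1))
dp, dq, u = d % (p - 1), d % (q - 1), pow(p, -1, q)

c = pow(65, e, n)
m = rsa_crt_decrypt(c, dp, dq, p, q, u)                       # a601: normal result
m_faulty = rsa_crt_decrypt(c, dp, dq, p, q, u, fault_mask=1)  # a602: faulty result
recovered_p = gcd(m - m_faulty, n)                            # a603
recovered_q = n // recovered_p
```

Only the public modulus n and the two outputs m and m′ are used in step a603; a single faulty output is sufficient in this sketch.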
[e. Conventional Countermeasure Against Fault Attack]
A method called fault detection is known as a countermeasure against the above described fault attack. This is a method in which whether or not a calculation result of a CRT algorithm is an abnormal value is detected in a smart card. In this method, data as a calculation result is not output when it is detected in a smart card that the calculation result of a CRT algorithm is an abnormal value m′. Thereby, the implementation of a procedure for attacking in step a602 becomes impossible and thus fault attacks can be prevented.
Doubling of a CRT algorithm is known as a common method of performing fault detection. The outline of an algorithm for performing this method is described below as algorithm 7.
<Algorithm 7: Fault Detection by Doubling of CRT Algorithm>
a701: calculate m1 from c, dp, dq, p, q, u by CRT algorithm
a702: calculate m2 from c, dp, dq, p, q, u by CRT algorithm
a703: when m1≠m2, fault is determined to have occurred, and when m1=m2, fault is determined to have not occurred.
As described above, a calculation by a CRT algorithm is performed twice, and a fault is determined to have occurred when the results of the calculations are different from each other, while a fault is determined to have not occurred when the results are identical.
[Problem with Conventional Countermeasure Against Fault Attack]
The fault detection method described as algorithm 7 executes the detection of the occurrence of a fault only in step a703. In other words, a fault cannot be detected at the moment when it occurs, and the fault remains undetected until a703, the last step.
This means that a smart card using algorithm 7 cannot detect a fault at the moment it occurs, so that a particular procedure can lead to success in a fault attack, which is problematic. FIG. 4 illustrates this problem in the fault detection method in algorithm 7. Phase numbers such as 1, 2, and 3 used in the RSA computations in FIG. 4 correspond to steps a501, a502, a503, and a504 in algorithm 5, respectively.
As has already been explained, the detection method in algorithm 7 executes the computations of RSA twice, and compares the computation results m1 and m2 in order to determine whether or not a fault has occurred. This detection method is capable of detecting the occurrence of a fault when the attack makes a fault occur only once, because in that case only one of m1 and m2 is an abnormal value (m1′ or m2′), so that m1′≠m2 or m1≠m2′ is satisfied. However, it is not capable of detecting a fault attack that makes the same fault occur twice under the same conditions. In that case, m1′=m2′ is satisfied despite the fact that both results are abnormal values, and accordingly the occurrence of a fault cannot be detected.
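The weakness described above can be illustrated with a Python sketch of algorithm 7. The fault model is a simplifying assumption: the hypothetical parameters fault1 and fault2 flip bits in the result of step a503 of the first and second computation respectively, and the toy key and function names are illustrative only.

```python
from math import gcd


def rsa_crt(c, key, fault_mask=0):
    """CRT decryption; fault_mask models an injected fault in step a503."""
    dp, dq, p, q, u = key
    mp = pow(c % p, dp, p)
    mq = pow(c % q, dq, q) ^ fault_mask
    return ((mq - mp) * u % q) * p + mp


def decrypt_with_doubling(c, key, fault1=0, fault2=0):
    """Algorithm 7: compute twice, output only if both results agree."""
    m1 = rsa_crt(c, key, fault1)           # a701
    m2 = rsa_crt(c, key, fault2)           # a702
    if m1 != m2:                           # a703: fault detected, suppress output
        return None
    return m1


p, q, e = 61, 53, 17
n = p * q
d = pow(e, -1, (p - 1) * (q - 1))
key = (d % (p - 1), d % (q - 1), p, q, pow(p, -1, q))
c = pow(65, e, n)

normal = decrypt_with_doubling(c, key)                             # no fault
single = decrypt_with_doubling(c, key, fault1=1)                   # detected: None
repeated = decrypt_with_doubling(c, key, fault1=1, fault2=1)       # same fault twice
leaked_p = gcd(normal - repeated, n)                               # attack still works
```

In this sketch a single fault is caught at a703, while the same fault applied to both computations passes the comparison, and the released abnormal value again yields p through the GCD calculation of algorithm 6.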
Therefore, the conventional fault detection method using the doubling of a CRT algorithm involves a problem wherein it fails to detect a fault at the moment the fault has occurred.
Those who have a technique of attacking can easily make the same fault occur plural times, and thus this conventional method is greatly problematic in view of security.
Non-Patent Document 1:
D. Boneh, R. A. DeMillo, and R. J. Lipton, “On the importance of checking cryptographic protocols for faults,” In W. Fumy, editor, Advances in Cryptology - EUROCRYPT '97, volume 1233 of Lecture Notes in Computer Science, pages 37-51, Springer-Verlag, 1997.
Non-Patent Document 2:
Ingrid Biehl, Bernd Meyer, and Volker Müller, “Differential Fault Attacks on Elliptic Curve Cryptosystems,” Advances in Cryptology - CRYPTO 2000, volume 1880 of Lecture Notes in Computer Science, pages 131-146, Springer-Verlag, 2000.
Non-Patent Document 3:
S. R. Dusse and B. S. Kaliski Jr., “A Cryptographic Library for the Motorola DSP56000,” Advances in Cryptology - EUROCRYPT '90, volume 473 of Lecture Notes in Computer Science, pages 230-244, 1990.
Non-Patent Document 4:
Mukaida, Takenaka, Masui, Torii, “Designing of High-speed Montgomery Multiply-accumulation remainder circuit” issued in symposium on cryptography and information security, SCIS2004, 2A3-2