The cryptosystem is roughly divided into a common key cryptosystem and a public key cryptosystem. In the system designated as the common key cryptosystem, the same key (secret key) is used for encryption and decryption, and the security is retained by keeping the secret key as information unknown to a third party other than a transmitter and a receiver. In the public key cryptosystem, different keys are used for encryption and decryption, and the security is retained by keeping a key (private key) used for decryption of a ciphertext as secret information of a receiver alone while a key (public key) used for encryption is open to the public.
One field of cryptography is cryptanalysis. Cryptanalysis is a technique of guessing secret information such as a secret key on the basis of available information such as a ciphertext, and there are various methods of cryptanalysis. One method that has recently attracted attention is the power analysis attack (hereinafter referred to as “PA”). The PA is a method developed by Paul Kocher in 1998, in which power consumption data, obtained by providing various input data to an encryption device included in a smartcard or the like, is collected and analyzed so as to guess key information stored in the encryption device. It is known that a secret key of either the common key cryptosystem or the public key cryptosystem may be guessed from an encryption device by employing the PA.
There are two kinds of PA, that is, simple power analysis (hereinafter referred to as “SPA”) and differential power analysis (hereinafter referred to as “DPA”). The SPA is a method for guessing a secret key on the basis of the features of a single piece of power consumption data of an encryption device, and the DPA is a method for guessing a secret key by analyzing differences among a large number of pieces of power consumption data.
At this point, an RSA cryptosystem will be described. The RSA cryptosystem security is based on difficulty of prime factorization. Although it is easy to calculate a composite number n=p×q on the basis of two prime numbers p and q of 1024 bits each, it is difficult to obtain the prime factors p and q on the basis of the composite number n alone (i.e., prime factorization is difficult), which is the premise of the security of the RSA cryptosystem. The RSA cryptosystem has two functions of encryption and decryption. Two kinds of decryptions are known: one is decryption not using Chinese remainder theorem (hereinafter referred to as “CRT”) (i.e., decryption without the CRT) and the other is decryption using the CRT (i.e., decryption with the CRT). The encryption, the decryption without the CRT and the decryption with the CRT are respectively illustrated in FIGS. 13, 14 and 15.
The encryption process and the decryption process without the CRT, illustrated in FIGS. 13 and 14 respectively, are very simple. In the encryption process, a ciphertext c is output through the modular exponentiation c := m^e (mod n) modulo a composite number n, wherein the base is a plaintext m and the exponent is a public key e. In the decryption without the CRT, a plaintext m is output through the modular exponentiation m := c^d (mod n) modulo the composite number n, wherein the base is a ciphertext c and the exponent is a private key d. The private key d satisfies the relationship e×d = 1 (mod (p−1)(q−1)) with the public key e. For modular exponentiation, a plurality of calculation algorithms are known, including a binary method and a window method, and resistance to the SPA or the DPA depends upon the algorithm employed. The decryption with the CRT is a faster algorithm that reduces the amount of computation of the decryption without the CRT. In general, the amount of computation of modular exponentiation is proportional to (bit length of exponent)×(bit length of modulus)×(bit length of modulus). For example, in the RSA cryptosystem wherein each of the prime factors p and q is a 1024-bit value and the composite number n is a 2048-bit value, the bit length of the private key d is 2048 bits. This is because e×d = 1 (mod (p−1)(q−1)), namely d = e^(−1) (mod (p−1)(q−1)), so that the private key d satisfies 0<d<(p−1)×(q−1), and the bit length of d is therefore generally equal to that of (p−1)(q−1), namely 2048 bits. In this case, the amount of computation of the modular exponentiation is proportional to 2048×2048×2048 = 8589934592. In general, the bit length of the exponent is substantially the same as the bit length of the modulus in the RSA decryption without the CRT.
In other words, the amount of computation of the RSA decryption without the CRT is in proportion to the third power of the bit length of a modulus.
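The encryption and the decryption without the CRT described above can be sketched as follows. This is a minimal illustration only: the tiny primes p = 61 and q = 53, the public exponent e = 17, and the function names are assumptions chosen for readability and do not appear in FIGS. 13 and 14.

```python
# Toy RSA without the CRT. The parameters are illustrative assumptions;
# real keys use primes of 1024 bits or more.
p, q = 61, 53
n = p * q                       # composite modulus n = p x q
phi = (p - 1) * (q - 1)         # (p-1)(q-1)
e = 17                          # public exponent
d = pow(e, -1, phi)             # private exponent, e*d = 1 (mod phi); Python 3.8+


def encrypt(m: int) -> int:
    """c := m^e (mod n)."""
    return pow(m, e, n)


def decrypt_without_crt(c: int) -> int:
    """m := c^d (mod n)."""
    return pow(c, d, n)


plaintext = 65
ciphertext = encrypt(plaintext)
recovered = decrypt_without_crt(ciphertext)
```

The built-in three-argument `pow` hides which exponentiation algorithm is used, so this sketch says nothing about resistance to the SPA or the DPA.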
In contrast, the decryption with the CRT illustrated in FIG. 15 is known to reduce the amount of computation to ¼ of that of the decryption without the CRT. The decryption with the CRT includes the following three stages of CRT-1, CRT-2 and CRT-3:
CRT-1: Modular reduction of a ciphertext c modulo p or q (steps 301 and 302 of FIG. 15)
CRT-2: Modular exponentiation modulo p or q (steps 303 and 304 of FIG. 15)
CRT-3: Calculation of a result of modular exponentiation modulo n based on the result of the modular exponentiation modulo p and q (CRT composition) (step 305 of FIG. 15)
Most (95% or more) of the decryption with the CRT corresponds to the modular exponentiation of the stage CRT-2, which is modular exponentiation modulo a prime number p or q wherein the base is cp = c (mod p) or cq = c (mod q) and the exponent is a private key dp = d (mod (p−1)) or dq = d (mod (q−1)). The bit length of the modulus p or q is half of that of the composite number n, namely, 1024 bits, and the bit length of the exponent dp or dq is also half of that of the private key d, namely, 1024 bits. Accordingly, the amount of computation of the modular exponentiation performed at step 303 or 304 is proportional to 1024×1024×1024 = 1073741824, which is ⅛ of the amount of computation of the modular exponentiation for the bit length of 2048 bits. Since the processing with the ⅛ amount of computation is performed twice, the amount of computation of the decryption with the CRT is ⅛×2 = ¼ of the amount of computation without the CRT.
When the decryption with the CRT is employed, the amount of computation is one fourth of that of the decryption without the CRT; namely, an operation speed four times as high as that of the decryption without the CRT may be realized. On the other hand, the decryption with the CRT has the disadvantage that it includes a large number of operations using the prime number p or q, as illustrated in FIG. 15. Since the security of the RSA cryptosystem is based on the difficulty of the prime factorization of n = p×q, the RSA cryptosystem loses its security if the value of the prime number p or q is revealed to an attacker. Since the power consumption tends to be correlated with the prime number p or q in operation processing that uses the prime number p or q, there is a problem that the prime number p or q is easily revealed through the PA.
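The three stages CRT-1 to CRT-3 and the ¼ cost estimate can be sketched as follows. The toy parameters are assumptions, and the coefficient u is assumed here to be p^(−1) (mod q), which is the value that makes the composition formula m = ((u×(mq−mp)) (mod q))×p + mp correct; FIG. 15 itself is not reproduced.

```python
# Toy RSA decryption with the CRT (illustrative parameters only).
p, q = 61, 53
n = p * q
e = 17
d = pow(e, -1, (p - 1) * (q - 1))
dp, dq = d % (p - 1), d % (q - 1)     # half-length private exponents
u = pow(p, -1, q)                     # assumed CRT coefficient: p^-1 (mod q)


def decrypt_with_crt(c: int) -> int:
    cp, cq = c % p, c % q                           # CRT-1: reduce c modulo p and q
    mp, mq = pow(cp, dp, p), pow(cq, dq, q)         # CRT-2: two half-size exponentiations
    return ((u * (mq - mp)) % q) * p + mp           # CRT-3: CRT composition


# Cost model from the text: (exponent bits) x (modulus bits)^2.
cost_without_crt = 2048 * 2048 * 2048
cost_with_crt = 2 * (1024 * 1024 * 1024)
ratio = cost_with_crt / cost_without_crt            # 0.25
```

Note that every stage of `decrypt_with_crt` touches p or q directly, which is the exposure the text goes on to discuss.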
The PA is known as means for an attacker to attack an encryption device implementing the decryption without the CRT of FIG. 14 or the decryption with the CRT of FIG. 15, that is, processing using a private key, for obtaining a private key d, dp, dq, p, or q. Now, conventionally known SPA or DPA attack against the decryption of FIG. 14 or 15 will be described.
(Power Analysis Attack)
(Outline of SPA)
At this point, the outline of the SPA will be described. The SPA is an attack made for guessing a private key used in an encryption device by using information obtained through observation of a single power waveform. This is an effective attack against an encryption device in which there is correlation between the content of the encryption and the shape of a power consumption waveform.
(Power Analysis Attack 1 using SPA (targeting decryption with CRT): Attack 1)
Now, power analysis attack using the SPA targeting the decryption with the CRT (hereinafter referred to as the attack 1) will be described.
An SPA attack targeting the decryption with the CRT illustrated in FIG. 15 is disclosed in Japanese Patent No. 4086503. The disclosed attack targets the remainder processing with the prime number p or q performed at step 301 or 302. Whether or not the attack succeeds depends upon the implementation form of the remainder processing of step 301 or 302. In the implementation form in which the attack succeeds, when Z = X mod Y is to be calculated, X and Y are compared with each other; when X<Y, X is output as the remainder result Z without any further operation, and only when X≧Y is the modular reduction Z = X (mod Y) calculated and output, as described below. As the premise of the attack disclosed in Patent No. 4086503, the encryption device must perform the decryption with the CRT by employing this implementation. Specifically, the following processing is performed in the operation of Z = X (mod Y) in this method:
    if (X<Y) then output X as Z
    if (X≧Y) then calculate Z = X (mod Y) and output Z
(this processing is hereinafter designated as “processing MOD_ALG”).
In the processing MOD_ALG, the input X and the modulus Y are compared with each other, and the modular reduction is not executed when X<Y but is executed only when X≧Y. In other words, whether or not the modular reduction is executed is determined by the relationship in magnitude between X and Y. If the attacker can observe the execution of the modular reduction through the power consumption, the relationship in magnitude between X and Y, that is, internal data of the encryption device, becomes known from the power consumption. When this property is applied to step 301 or 302 of FIG. 15, the attacker can recover the prime number p or q. At step 301 or 302, the remainder processing with the prime number p or q is performed on the input ciphertext c. It is noted that, in an implementation in an encryption device such as a smartcard, the private key (dp, dq, p, q or u) is a value that is held within the device and cannot be externally input, whereas the ciphertext c is a value that may be externally input by a third party. In other words, the attacker can determine whether c<p or c≧p with respect to the controllable ciphertext c by observing the power consumption in the remainder processing of step 301 or 302. Once such determination is possible, the prime number p can be easily obtained by using the dichotomizing search illustrated in FIG. 16.
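The leak in the processing MOD_ALG can be modeled with a short sketch. The boolean flag standing in for "a reduction is visible in the power waveform", the value of `SECRET_P`, and the function names are all hypothetical; a real attack would infer the flag from a measured trace.

```python
def mod_alg(x: int, y: int) -> tuple[int, bool]:
    """Processing MOD_ALG: returns (Z, reduction_executed).

    reduction_executed models what an SPA observer is assumed to see:
    whether the device actually ran the modular reduction.
    """
    if x < y:
        return x, False          # X < Y: X is output as Z, no reduction
    return x % y, True           # X >= Y: Z = X (mod Y) is calculated


SECRET_P = 1000003               # hypothetical prime held inside the device


def spa_says_c_less_than_p(c: int) -> bool:
    """Attacker's view of step 301: True iff no reduction was observed, i.e. c < p."""
    _, reduction_executed = mod_alg(c, SECRET_P)
    return not reduction_executed
```

Each query with a chosen ciphertext c therefore yields one comparison of c against the secret prime.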
FIG. 16 illustrates an algorithm for narrowing candidate values for the prime number p by repeatedly halving a difference between the maximum value and the minimum value of p−ε with the minimum value of p−ε held as pmin and the maximum value of p−ε held as pmax.
In the above, ε is a parameter corresponding to the maximum value of a decision error occurring in the power analysis, and ε≧0. The magnitude of the parameter ε depends upon the attacking method employed, that is, upon the means for determining at step 404 whether or not pmid+ε<p. When this determination is made by executing the SPA against the decryption with the CRT with CRT_DEC(pmid) input, so as to determine whether pmid<p or pmid≧p, ε=0. When the DPA described below is employed, the parameter ε is approximately 1000.
As illustrated at step 401, pmin is initialized to 0 and pmax is initialized to 2^α (wherein α is the bit length of the prime number p). Thereafter, in a loop of steps 402 through 407, processing for narrowing the range of the prime number p by halving the difference between pmin and pmax is performed. This narrowing processing is performed by calculating the median value pmid of pmax and pmin and determining the relationship in magnitude between pmid and p through the attack using the power consumption.
As illustrated at step 403, the median value pmid of pmax and pmin is given as pmid=(pmax+pmin)/2. It is determined whether or not pmid+ε<p with respect to the thus given value pmid by the attack using the power consumption.
When pmid+ε<p is true, it means that pmid<p−ε<pmax. Therefore, while keeping pmax as the maximum value, pmid is set as a new minimum value of p−ε, and hence, processing of pmin:=pmid is performed at step 405.
When pmid+ε<p is false, it means that pmin≦p−ε≦pmid. Therefore, while keeping pmin as the minimum value, pmid is set as a new maximum value of p−ε, and hence, processing of pmax:=pmid is performed at step 406 (where the symbol “:=” means that the result of the right side is substituted in the left side).
By repeating the above-described processing, the processing for halving a difference between the maximum value pmax and the minimum value pmin of the prime number p is repeated, and when the difference is as small as pmax−pmin≦π as illustrated at step 402, it is determined that the range of the prime number p has been sufficiently narrowed, and candidate values of the prime number p are output.
At step 408, processing for determining the maximum value and the minimum value of the prime number p on the basis of the sufficiently narrowed range of p−ε (narrowed to a difference of not more than π) is performed. Since pmin≦p−ε≦pmax and ε≧0, the minimum value of the prime number p is pmin and the maximum value is pmax+ε, and therefore, the processing of pmax:=pmax+ε is executed with respect to the maximum value pmax of the prime number p. As a result of the processing, pmax−pmin≦π+ε.
At step 409, [pmin, pmin+1, . . . , pmax] are output as the candidate values of the prime number p and the processing is terminated. Since the number of candidate values is halved every time the loop of steps 402 through 407 is executed, the loop terminates after a number of iterations proportional to α. For example, when the prime number p has a bit length of 1024 bits, the loop is repeated 1024 times at most, and thus the prime number p can be obtained very efficiently.
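The dichotomizing search described for FIG. 16 can be sketched as below. The decision function is simulated in software against a hypothetical 20-bit prime; in an actual attack it would be replaced by the power-analysis decision "pmid+ε<p". The function and parameter names are assumptions, and `pi` stands for the threshold π of step 402.

```python
from typing import Callable, Tuple


def narrow_prime(decide: Callable[[int], bool], alpha: int,
                 eps: int = 0, pi: int = 1) -> Tuple[int, int, int]:
    """Returns (pmin, pmax, queries) such that pmin <= p <= pmax.

    decide(x) must answer whether x + eps < p.
    """
    if pi < 1:
        raise ValueError("pi must be at least 1 so that the loop terminates")
    pmin, pmax = 0, 1 << alpha            # step 401: pmin := 0, pmax := 2^alpha
    queries = 0
    while pmax - pmin > pi:               # step 402: stop once the range is small
        pmid = (pmax + pmin) // 2         # step 403: median value
        queries += 1
        if decide(pmid):                  # step 404: is pmid + eps < p ?
            pmin = pmid                   # true branch: raise the minimum
        else:
            pmax = pmid                   # false branch: lower the maximum
    pmax += eps                           # step 408: widen by the decision error
    return pmin, pmax, queries


SECRET_P = 1000003                        # hypothetical secret prime, 20 bits

# SPA-style decision (eps = 0) and DPA-style decision (eps = 1000), both simulated.
spa_result = narrow_prime(lambda x: x < SECRET_P, alpha=20)
dpa_result = narrow_prime(lambda x: x + 1000 < SECRET_P, alpha=20, eps=1000)
```

With ε = 0 the search pins p down to a range of width π; with ε = 1000 about π+ε candidates remain and must be tested by other means.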
(Outline of DPA)
Next, the DPA will be described. The DPA is an attack for guessing a private key used in an encryption device by observing a plurality of power waveforms and obtaining differences among the plural power waveforms. The DPA is effective in an environment where there is correlation between data read/written in an encryption device and power consumed in the read/write. It is known in general that power consumption has a property to be increased in proportion to the number of one's (1's) of binary data included in data read/written in an encryption device. In the DPA, this property is used for obtaining a private key.
(Power Analysis Attack 1 using DPA (targeting decryption without CRT): Attack 2)
Now, the power analysis attack using the DPA targeting the decryption without the CRT (hereinafter designated as the attack 2) will be described. Among attacks using the DPA against the RSA cryptosystem, the most widely known method is an attack for obtaining the exponent d by measuring power consumption during execution of the modular exponentiation c^d (mod n). This attack is effective against the decryption without the CRT illustrated in FIG. 14. When the private key d is revealed to an attacker, an arbitrary ciphertext can be decrypted, and hence, the security of the RSA cryptosystem cannot be retained. In other words, the private key d is, like the prime numbers p and q, an important asset to be protected from the attack by the SPA or the DPA.
In order for this attack to succeed, the attacker needs to know which modular exponentiation algorithm is executed within the encryption device. Modular exponentiation algorithms are roughly divided into the binary method and the window method, so the types are very limited. Even if the attacker tries every conceivable attacking method for each type, the effort increases only several times, and hence this requirement is not a serious obstacle for the attacker.
Assuming that the modular exponentiation algorithm implemented in an encryption device is the window method and that an attacker knows it, the attacking method for obtaining the exponent d on the basis of the power consumption in the modular exponentiation c^d (mod n) will be described. Although the window method is employed as an example in the following description, the DPA is also effective against other modular exponentiation algorithms such as the binary method.
At this point, an operation by the window method and the DPA attack against the window method will be described. The modular exponentiation is a process for calculating m satisfying m = c^d (mod n) from an exponent d, a base c and a modulus n. The window method is known as an algorithm for efficiently performing this process. Assuming that the binary expression of the exponent d is d = (d_{u−1}, d_{u−2}, . . . , d_0)_2, FIG. 17 illustrates an algorithm of the modular exponentiation for calculating m = c^d (mod n) by the window method. The outline of the operation performed in FIG. 17 is illustrated in FIG. 18.
The operation of FIG. 17 will now be described. First, a table w satisfying w[x] = c^x (mod n) is created for 0<x<2^k. After the table is created, the u-bit exponent d = (d_{u−1}, d_{u−2}, . . . , d_0)_2 is divided every k bits into u/k blocks bi (i=0, 1, . . . ), namely, blocks bi = (d_{ik+k−1}, . . . , d_{ik})_2. Table lookup and multiplication using each block bi (m := m×w[bi] (mod n)) and raising to the 2^k-th power (m := m^(2^k) (mod n)) are repeated to calculate m = c^d (mod n).
Now, a method in which an attacker guesses an exponent d used within an encryption device employing the window method by using the DPA will be described. In the RSA cryptosystem, the exponent d is a private key and is an important asset to be protected from an attacker. Since the exponent d generally has a value of 1024 or more bits, obtaining the value by a brute force approach would take 2^1024 trials and is hence impossible. In the DPA, however, attention is paid to the processing of the window method that divides the exponent d every k bits. For example, in the processing illustrated in FIG. 18, the exponent d is divided into blocks bi of 4 bits each, and intermediate data for each block bi, that is, m := m×w[bi] (mod n), is calculated. Since the value of m := m×w[bi] is read/written as internal data of the encryption device, the attacker can obtain information on the block bi by measuring the power consumption in reading/writing the calculation result m. The block bi is data as small as k bits (4 bits in the exemplary case of FIG. 18), and therefore, when the brute force approach to the k-bit value bi is repeated over all the blocks of the exponent d, the attacker can efficiently obtain the value of the exponent d. For example, when k=2 and d is a 2048-bit value, the exponent d is divided into 1024 2-bit blocks bi. The attacker need not execute the brute force approach over all 2048 bits at once; instead, a brute force approach over 2 bits, namely four candidates, is repeated 1024 times, so that the number of necessary trials is only 4×1024 = 4096, and thus the value of the exponent d can be obtained efficiently.
In the brute force approach with respect to every k bits, the attacker needs to select the correct value out of the 2^k candidate values of bi by the DPA, and the method for selecting the correct value will now be described.
For example, when k=2 and d = (d5, d4, d3, d2, d1, d0)_2, the divided blocks are b2 = (d5, d4)_2, b1 = (d3, d2)_2 and b0 = (d1, d0)_2, and in the modular exponentiation by the window method illustrated in FIG. 17, m = c^d (mod n) is calculated through the following processing 1 through processing 5:
Processing 1: m = 1×w[b2] (mod n) = c^(b2) (mod n)
Processing 2: m = (w[b2])^4 (mod n) = c^(4·b2) (mod n)
Processing 3: m = ((w[b2])^4)×w[b1] (mod n) = c^(4·b2) × c^(b1) (mod n)
Processing 4: m = (((w[b2])^4)×w[b1])^4 (mod n) = c^(16·b2) × c^(4·b1) (mod n)
Processing 5: m = (((w[b2])^4)×w[b1])^4×w[b0] (mod n) = c^(16·b2) × c^(4·b1) × c^(b0) (mod n) = c^d (mod n)
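A minimal sketch of a k-bit window method is given below. It is written from the prose description only, not from FIG. 17, so details such as the loop order and the function name `window_modexp` are assumptions.

```python
def window_modexp(c: int, d: int, n: int, k: int = 4) -> int:
    """Computes c^d (mod n) with a fixed k-bit window (left-to-right)."""
    # Table creation: w[x] = c^x (mod n) for 0 <= x < 2^k.
    w = [1] * (1 << k)
    for x in range(1, 1 << k):
        w[x] = (w[x - 1] * c) % n

    # Number of k-bit blocks needed to cover the exponent.
    u = max(d.bit_length(), 1)
    blocks = -(-u // k)                        # ceiling division

    m = 1
    for i in reversed(range(blocks)):          # most significant block first
        for _ in range(k):                     # m := m^(2^k) by k squarings
            m = (m * m) % n
        b_i = (d >> (i * k)) & ((1 << k) - 1)  # block b_i of the exponent
        m = (m * w[b_i]) % n                   # m := m x w[b_i] (mod n)
    return m
```

The multiplication `m * w[b_i]` is the intermediate value whose reads and writes the DPA discussed in the text is aimed at.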
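The five processing steps for k=2 can be checked numerically. The modulus, base, and block values below are arbitrary assumptions used only to confirm that the chain ends in c^d (mod n).

```python
# Worked check of processing 1 to 5 for k = 2 (illustrative values only).
n, c, k = 3233, 42, 2
b2, b1, b0 = 0b10, 0b01, 0b11            # d = (10 01 11)_2
d = (b2 << 4) | (b1 << 2) | b0           # d = 16*b2 + 4*b1 + b0 = 39

w = [pow(c, x, n) for x in range(1 << k)]  # table w[x] = c^x (mod n)

m = (1 * w[b2]) % n      # processing 1: c^(b2)
m = pow(m, 4, n)         # processing 2: c^(4*b2)
m = (m * w[b1]) % n      # processing 3: c^(4*b2 + b1)
m = pow(m, 4, n)         # processing 4: c^(16*b2 + 4*b1)
m = (m * w[b0]) % n      # processing 5: c^(16*b2 + 4*b1 + b0) = c^d
```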
If the attacker knows that the encryption device implements the window method, the attacker also knows that the aforementioned processing 1 through 5 are performed in the encryption device. Therefore, values of b2, b1 and b0, that is, candidate values of bi, are guessed through the DPA performed as follows, so as to guess the value of the exponent d:
501: The encryption device is provided with N values ai (wherein i is 1, 2, . . . and N) as bases so as to cause it to calculate ai^d (mod n). Data of power consumed in the device at this point, i.e., power consumption data P(ai, time), is measured with respect to each value of i.
502: A 2-bit value b2 is predicted as a value b′2, and the following procedures (1) and (2) are repeated until it is determined that b2=b′2:
(1) With attention paid to the intermediate data m of the processing 1, the value m = ai^(b′2) (mod n) is simulated on the basis of the predicted value b′2, and the data P(ai, time) (wherein i=1, 2, . . . and N) is classified into two sets G1 and G0:
G1 = [P(ai, time) | least significant bit of ai^(b′2) (mod n) = 1]
G0 = [P(ai, time) | least significant bit of ai^(b′2) (mod n) = 0]
(2) A power difference curve Δ expressed as Δ=(average power of G1)−(average power of G0) is created on the basis of the sets G1 and G0. As a result, for example, in a time-power curve as illustrated in FIG. 19(A), when a spike as illustrated in FIG. 19(B) appears, it is determined that b2=b′2 (namely, b2 is successfully guessed), and when a substantially even curve as illustrated in FIG. 19(C) is obtained, it is determined that b2≠b′2.
503: A 2-bit value b1 is predicted as a value b′1, and the following procedures (1) and (2) are repeated until it is determined that b1=b′1:
(1) With attention paid to the intermediate data m of the processing 3, the value m = ai^(4·b2) × ai^(b′1) (mod n) is simulated on the basis of the previously guessed value b2 and the predicted value b′1, and the data P(ai, time) (wherein i=1, 2, . . . and N) is classified into two sets G1 and G0:
G1 = [P(ai, time) | least significant bit of ai^(4·b2) × ai^(b′1) (mod n) = 1]
G0 = [P(ai, time) | least significant bit of ai^(4·b2) × ai^(b′1) (mod n) = 0]
(2) A power difference curve Δ expressed as Δ=(average power of G1)−(average power of G0) is created on the basis of the sets G1 and G0. As a result, when a spike as illustrated in FIG. 19(B) appears, it is determined that b1=b′1 (namely, b1 is successfully guessed), and when a substantially even curve as illustrated in FIG. 19(C) is obtained, it is determined that b1≠b′1.
504: A 2-bit value b0 is predicted as a value b′0, and the following procedures (1) and (2) are repeated until it is determined that b0=b′0:
(1) With attention paid to the intermediate data m of the processing 5, the value m = ai^(16·b2) × ai^(4·b1) × ai^(b′0) (mod n) is simulated on the basis of the previously guessed values b2 and b1 and the predicted value b′0, and the data P(ai, time) (wherein i=1, 2, . . . and N) is classified into two sets G1 and G0:
G1 = [P(ai, time) | least significant bit of ai^(16·b2) × ai^(4·b1) × ai^(b′0) (mod n) = 1]
G0 = [P(ai, time) | least significant bit of ai^(16·b2) × ai^(4·b1) × ai^(b′0) (mod n) = 0]
(2) A power difference curve Δ expressed as Δ=(average power of G1)−(average power of G0) is created on the basis of the sets G1 and G0. As a result, when a spike as illustrated in FIG. 19(B) appears, it is determined that b0=b′0 (namely, b0 is successfully guessed), and when a substantially even curve as illustrated in FIG. 19(C) is obtained, it is determined that b0≠b′0.
When bi is correctly predicted, the value of m simulated by the attacker is also generated and read/written in the encryption device. The sets G1 and G0 are then separated according to an actual bit of m, so the numbers of zero's (0's) and one's (1's) in the values read/written are strongly biased between the two sets. This produces a difference in the power consumption, which is observed as a spike waveform as illustrated in FIG. 19(B).
When bi is incorrectly predicted, the value of m simulated by the attacker is not generated in the encryption device, and a completely different value is read/written. Therefore, even though the sets G1 and G0 are classified so that the numbers of zero's (0's) and one's (1's) in the simulated value m are strongly biased between them, no spike waveform is obtained. In this case, the sets G1 and G0 amount to a random division of the whole set G of the data P(ai, time) (wherein i=1, 2, . . . , N) into two groups, and therefore the average power consumption is substantially equal between G1 and G0, resulting in a substantially flat differential waveform as illustrated in FIG. 19(C).
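The difference-of-means decision of steps 502 through 504 can be illustrated with a software simulation. Everything about the measurement is an assumption here: each trace is reduced to a single sample, the leak is modeled as the Hamming weight of the intermediate value m plus Gaussian noise, and the modulus is an arbitrary odd number. The sketch covers only the first block (processing 1) with k=2.

```python
import random

N_MOD = 1000003 * 998244353      # arbitrary odd modulus for the simulation


def hamming_weight(v: int) -> int:
    return bin(v).count("1")


def simulate_traces(secret_block: int, count: int, rng: random.Random):
    """Device model: leaks HW(a^b mod n) plus noise for each random base a."""
    traces = []
    for _ in range(count):
        a = rng.randrange(2, N_MOD)
        m = pow(a, secret_block, N_MOD)              # processing 1 inside the device
        traces.append((a, hamming_weight(m) + rng.gauss(0.0, 1.0)))
    return traces


def difference_of_means(traces, guess: int) -> float:
    """Delta = (average power of G1) - (average power of G0) for one guess."""
    g1 = [pw for a, pw in traces if pow(a, guess, N_MOD) & 1]
    g0 = [pw for a, pw in traces if not pow(a, guess, N_MOD) & 1]
    return sum(g1) / len(g1) - sum(g0) / len(g0)


def recover_block(traces, k: int = 2):
    # Guess 0 is skipped: a^0 = 1 always has LSB 1, so G0 would be empty.
    scores = {g: abs(difference_of_means(traces, g)) for g in range(1, 1 << k)}
    return max(scores, key=scores.get), scores


rng = random.Random(2024)
traces = simulate_traces(secret_block=3, count=4000, rng=rng)
best_guess, scores = recover_block(traces)
```

In this model the correct guess yields a difference close to one Hamming-weight unit, playing the role of the spike of FIG. 19(B), while wrong guesses give differences near zero as in FIG. 19(C).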
(Power Analysis Attack 2 using DPA (targeting decryption with CRT): Attack 3)
Next, the power analysis attack using the DPA targeting the decryption with the CRT (hereinafter designated as the attack 3) will be described. The attack using the SPA against the stage CRT-1 of the decryption with the CRT, namely, the modular reduction of a ciphertext (base) c by the prime numbers p and q, has already been described. The DPA is also applicable to this processing. In the attack using the SPA, with respect to the base c controlled by an attacker and input to the encryption device, it is determined whether c≧p or c<p by using a single power consumption waveform. In contrast, in the attack using the DPA, with respect to a base c input to the encryption device, it is determined whether or not c+ε<p by using differences among a plurality of power consumption waveforms, where ε is an error parameter. When it can be determined whether or not c+ε<p, candidate values of the prime number p can be narrowed by using the dichotomizing search illustrated in FIG. 16. Even when the search illustrated in FIG. 16 is employed, however, the number of candidate values of the prime number p cannot be reduced to ε+π or smaller. Nevertheless, when the number of candidate values of the prime number p is sufficiently small for the brute force approach (for example, ε+π<2^40), the value ε+π does not pose a serious problem for narrowing down the value of the prime number p.
The SPA attack against the stage CRT-1 described above assumes that the modular reduction Z = X (mod Y) is performed in accordance with the processing MOD_ALG, namely, that an algorithm switching the processing in accordance with the relationship in magnitude between X and Y is implemented. In contrast, the DPA attack described below is effective against an encryption device that always executes the operation Z = X (mod Y) regardless of the relationship in magnitude between X and Y.
FIG. 20 illustrates an algorithm for determining, with respect to a parameter x controllable by an attacker, whether or not x+ε<p by using the DPA. Differently from the attack using the SPA, this determination is made not for obtaining accurate decision but for determining whether or not x+ε<p with respect to the error parameter ε. When the error parameter ε is too small, there is a possibility that accurate determination cannot be made depending upon the power consumption characteristic of the encryption device. This is because of the difference between the SPA where the determination is made by using a single power waveform and the DPA where the determination is made by using differences among a plurality of waveforms, and the error parameter ε is in proportion to the number of waveforms necessary for successfully performing the DPA. It is known in general that the DPA is successfully carried out by using differences among approximately 1000 pieces of data, and therefore, the error parameter ε has a value also as small as approximately 1000.
The principle by which the attack algorithm illustrated in FIG. 20 succeeds will be described. When X<Y, the result of the modular reduction Z = X (mod Y) is always Z = X regardless of the algorithm implementing the modular reduction. Specifically, the value Z, that is, the output result to be read or written in the encryption device, is X (i.e., Z = X) when X<Y. In the above described sets G1 and G0, with respect to all bases ai satisfying x≦ai<x+ε, when ai<p, namely, when x+ε<p, the value calculated as ai (mod p) is always ai, and this value is read/written in a memory within the encryption device. The numbers of zero's (0's) and one's (1's) included in the sets G1,j and G0,j as all the operation results of ai (mod p) are greatly biased with respect to all difference curves with j = 0, 1, . . . , (log_2 ε)−1, and therefore, a spike as illustrated in FIG. 19(B) appears on the power difference curves obtained as G1,j−G0,j with respect to all values of j. In contrast, when ai≧p with respect to all bases ai = x, x+1, . . . , x+ε, namely, when x≧p, the operation result of ai (mod p) is always ai−λi×p, wherein λi is a positive integer. When the error parameter ε is sufficiently smaller than the prime number p, the integer λi is highly likely to be a constant λ regardless of the value of i, and therefore, the operation result of ai (mod p) is ai−λp. Whether the 0th, 1st, . . . , or ((log_2 ε)−1)th bit from the least significant bit of ai−λp is the same as the corresponding bit of ai depends upon the propagation of the borrow in the subtraction of λp. Specifically, these bits of ai−λp are not always the same as the corresponding bits of ai and vary depending upon the values of ai and λp.
In other words, a spike does not always appear on all the power difference curves obtained as G1,j−G0,j, but no spike appears or merely a spike with a small height appears depending upon the value of j, and a sufficiently high spike cannot be obtained with respect to all the values of j.
The same is true when, with respect to all bases ai represented as ai=x, x+1, . . . , x+ε, some ai satisfy ai≧p and the other ai satisfy ai<p, and also in this case, a spike does not appear with respect to all the values of j.
Accordingly, when a sufficiently high spike as illustrated in FIG. 19(B) appears on a power difference curve obtained as G1,j−G0,j, it can be determined that x+ε<p.
(Countermeasure against Power Analysis Attack)
Against the RSA cryptosystems illustrated in FIGS. 14 and 15, the attacking methods by the SPA or the DPA described as the attack 1, the attack 2 and the attack 3 above are known. Also, countermeasures against these attacks are known. Now, conventionally known two types of countermeasures (i.e., a countermeasure 1 and a countermeasure 2) against the attacks 1, 2 and 3 will be described.
(Countermeasure 1)
The countermeasure 1 is illustrated in FIG. 21. In FIG. 21, steps 1101 and 1102 correspond to the stage CRT-1, steps 1103, 1104, 1105 and 1106 correspond to the stage CRT-2, and steps 1107 and 1108 correspond to the stage CRT-3.
Constants R, Rp and Rq used in FIG. 21 are constants stored in an encryption device and have values not open to the public. Through the processing using these constants, the attacks 1 and 3 can be prevented.
Differently from the decryption method of FIG. 15, at steps 1101 and 1102, with respect to a new base c×R, which is obtained by multiplying a constant R satisfying R>p and R>q by c, modular exponentiation of c′p:=c×R (mod p) and c′q:=c×R (mod q) is executed. At steps 1103 and 1104, exponential modular exponentiations modulo p and q wherein bases are these c′p and c′q thus corrected by R and exponents are dp and dq are executed, and the result is stored as m′p and m′q. The resultant calculated values are m′p=(c×R)dp(mod p)=Rdp×cdp(mod p) and m′q=(c×R)dq(mod q)=Rdq×cdq(mod q). When these values are compared with mp=cdp(mod p) and mq=cdq(mod q), which are calculated through the modular exponentiation performed at steps 303 and 304 of FIG. 15, there is a difference derived from the constant Rdpor Rdq. Processing for correcting this difference for calculating cdp(mod p) and cdq(mod q) is executed at steps 1105 and 1106. This processing is executed by using previously calculated constants Rp=R−dp(mod p) and Rq=R−dq(mod q) through calculation of mp:=m′p×Rp(mod p)=cdp×Rdp×R−dp(mod p)=cdp(mod p) and mq:=m′q×Rq(mod q)=cdq×Rdq×R−dq(mod q)=cdq(mod q). The correction for mp=cdp(mod p) and mq=cdq(mod q) is processing to be performed for CRT composition performed at step 1107. When these values are provided as inputs for the CRT composition of step 1107, m :=((u×(mq−mp))(mod q))×p+mp=cd(mod n) is calculated to be output.
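The flow of the countermeasure 1 can be sketched as follows, using the same toy key as an assumption. The secret constant R = 101, the coefficient u = p^(−1) (mod q), and the mapping of code lines to step numbers are illustrative guesses based on the prose, since FIG. 21 is not reproduced here.

```python
# Toy sketch of countermeasure 1 (illustrative parameters only).
p, q = 61, 53
n = p * q
e = 17
d = pow(e, -1, (p - 1) * (q - 1))
dp, dq = d % (p - 1), d % (q - 1)
u = pow(p, -1, q)                 # assumed CRT coefficient: p^-1 (mod q)

R = 101                           # hypothetical secret constant, R > p and R > q
Rp = pow(R, -dp, p)               # precomputed R^(-dp) (mod p); Python 3.8+
Rq = pow(R, -dq, q)               # precomputed R^(-dq) (mod q)


def decrypt_countermeasure1(c: int) -> int:
    cp_blind = (c * R) % p                    # step 1101: c'p := c x R (mod p)
    cq_blind = (c * R) % q                    # step 1102: c'q := c x R (mod q)
    mp_blind = pow(cp_blind, dp, p)           # step 1103: m'p = (c x R)^dp (mod p)
    mq_blind = pow(cq_blind, dq, q)           # step 1104: m'q = (c x R)^dq (mod q)
    mp = (mp_blind * Rp) % p                  # step 1105: remove the factor R^dp
    mq = (mq_blind * Rq) % q                  # step 1106: remove the factor R^dq
    return ((u * (mq - mp)) % q) * p + mp     # step 1107: CRT composition
```

For every nonzero c the product c×R is at least p and at least q, so a MOD_ALG-style reduction would always take the X≧Y branch.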
In the countermeasure 1 illustrated in FIG. 21, the modular reduction by p and q is performed at steps 1101 and 1102 after multiplication by the constant R, which realizes the countermeasure against the attack 1. Since R is a constant satisfying R>p and R>q, the relationships c×R≧p and c×R≧q always hold except for the special input c=0. Hence, in the calculation of Z=X (mod Y) in the processing (MOD_ALG), only the branch for X≧Y is ever taken, and the attacker cannot obtain effective information. Only when c=0 is the branch for X<Y taken, but this merely reveals the obvious information 0<p.
Accordingly, when the countermeasure 1 illustrated in FIG. 21 is employed, the attacker cannot obtain effective information about p through the branching processing of MOD_ALG, and thus, the attack 1 can be prevented.
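The branching argument can be checked with a short Python sketch. The shape of MOD_ALG used here (returning X unchanged when X<Y and performing the reduction otherwise) is an assumption made for illustration, and the names mod_alg and branches as well as the toy values of p and R are hypothetical.

```python
def mod_alg(x, y):
    """Assumed shape of MOD_ALG: the reduction is skipped when x < y."""
    if x < y:
        return x, "X<Y"             # no reduction performed
    return x % y, "X>=Y"            # reduction performed


def branches(inputs, modulus, mask=1):
    """Collect the set of branches taken over the given inputs."""
    return {mod_alg(c * mask, modulus)[1] for c in inputs}


p, R = 61, 101                      # toy values; R > p as the countermeasure requires
inputs = range(1, 200)              # inputs c chosen by the attacker, excluding c = 0

unmasked = branches(inputs, p)          # both branches occur, depending on c versus p
masked = branches(inputs, p, mask=R)    # only the X>=Y branch occurs
```

Without the multiplication by R, the branch taken depends on whether c is smaller than p, which is the information exploited in the attack 1. With the multiplication, every nonzero input takes the same branch.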
Furthermore, the countermeasure 1 illustrated in FIG. 21 also has the effect of preventing the attack 3, for the following reason. Since c×R (mod p) and c×R (mod q) are calculated at steps 1101 and 1102 by using the constant R unknown to an attacker, the attacker cannot guess the value of c×R from c and hence cannot guess the value of c×R (mod p) either. If the value of R were known to the attacker, a similar attack could be executed by performing the attack 3 with c=g×R^(−1) (mod n) input instead of c. This is because the value calculated at step 1101 is then c×R (mod p) = (g×R^(−1))×R (mod p) = g (mod p), so that the modular exponentiation is executed with respect to g, which can be controlled by the attacker, and the attacker attains a situation similar to that in the attack 3. The relation that R^(−1) (mod n) is congruent to R^(−1) modulo p is used here; it follows from the generally known property that, for n=p×q and an arbitrary integer a coprime to n, a^(−1) (mod n) is congruent to a^(−1) modulo p and modulo q. When R is an unknown constant, however, the attacker cannot calculate g×R^(−1) (mod n) from g, and hence the countermeasure 1 attains security against the attack 3.
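The reason why a revealed R defeats the countermeasure can be illustrated with the following Python sketch. The function name forge_input and the toy parameters are hypothetical, and Python 3.8 or later is assumed for the modular inverse computed by pow.

```python
def forge_input(g, R, n):
    """Input c = g * R^(-1) (mod n) that an attacker who knows R could submit."""
    return (g * pow(R, -1, n)) % n


# Toy parameters for illustration only.
p, q = 61, 53
n = p * q
R = 101                             # constant assumed to have leaked to the attacker

g = 1234                            # base the attacker wants the device to process
c = forge_input(g, R, n)

# The value computed at step 1101 collapses to g (mod p), so the constant R
# no longer hides the base of the modular exponentiation.
assert (c * R) % p == g % p
```

The attacker needs only n, which is public, and R. The primes p and q are used above solely to verify the claim.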
In other words, the security of the countermeasure 1 is attained on the assumption that the constants R, Rp and Rq have values unknown to an attacker. As long as these constants are unknown to the attacker, the security is retained, but there is a potential risk as follows: when common constants are used in all individual units of the encryption device, revealing these constants from one unit may endanger the security of all the units. Furthermore, when the countermeasure 1 is employed, the constants R, Rp and Rq must be stored within the device, and hence an additional memory cost for recording these values is incurred. Since the constant R satisfies the relationships R>p and R>q, a memory area of at least the bit length of p or q is necessary. Assuming that the bit length of p or q is a half of the bit length of n, the memory area necessary for the constant R is log2(n)/2 bits. The memory area necessary for each of Rp and Rq is the same as that of p or q, namely log2(n)/2 bits. In total, the memory area necessary for storing the constants R, Rp and Rq is 3×log2(n)/2 bits. In a typical RSA cryptosystem, a value of not less than 1024 bits is used as n, and therefore a memory area of 1536 bits or more is necessary. The additional cost of the amount of computation consists of the multiplication by R performed at steps 1101 and 1102 and the multiplication by Rp and Rq performed at steps 1105 and 1106, but this additional cost occupies a very small proportion of the whole amount of computation and is negligible.
In summary, the countermeasure 1 can prevent the attacks 1 and 3. The additional cost necessary for the countermeasure 1 is the memory area for storing the constants R, Rp and Rq, and the necessary memory area is evaluated as 3×log2(n)/2 bits (i.e., at least 1536 bits). Moreover, as a potential risk, when the constants R, Rp and Rq are commonly used in all individual units of the encryption device, the security of all the units may be endangered if these constants are revealed from one unit.
(Countermeasure 2)
A variety of countermeasures are known as methods for preventing the attack 2. All of these countermeasures have in common the processing of generating a random number within an encryption device when executing the calculation of c^d (mod n) and randomizing, by using the random number, intermediate data generated in the middle of the calculation of c^d (mod n).
In the attack 2, an attacker simulates intermediate data created in the middle of the calculation of c^d (mod n) based on the input c and creates the difference curve represented by G1−G0 on the basis of the simulation. Therefore, the simulation performed in the attack 2 is made invalid by randomizing the intermediate data obtained in the middle of the calculation, so as to prevent the attack 2. Although the intermediate data generated in the middle of the calculation of the modular exponentiation of c^d (mod n) is randomized in this method, it is necessary to ultimately output the same value c^d (mod n) as in the general modular exponentiation, and therefore, it is also necessary to release the randomization. As a countermeasure against the attack 2 through the randomization of the intermediate data, a variety of methods are known, which are different from one another in the method of randomizing and the method of releasing the randomization. Additional cost of the amount of computation and the memory necessary for the countermeasure depends upon the difference in these methods.
As a typical countermeasure against the attack 2, randomization of an exponent will now be described (as a countermeasure 2).
FIG. 22 illustrates a countermeasure against the attack 2 through the randomization of an exponent (i.e., the countermeasure 2). The basic idea of this countermeasure is to randomize the exponent used in the modular exponentiation. The randomization is performed by using a randomized exponent d′=d+r×φ(n) instead of the exponent d, where r is a random number of 20 bits and φ(x) is the order with respect to a modulus x. The order has the property a^φ(x) (mod x) = 1 for an arbitrary integer a coprime to x. When n=p×q holds for prime numbers p and q, it is known that φ(n)=(p−1)(q−1), φ(p)=p−1 and φ(q)=q−1.
Since the bit string of the exponent d+r×φ(n) varies randomly with the 20-bit random number r, the intermediate data obtained in the middle of the calculation of the modular exponentiation is randomized, but the ultimately calculated value is always equal to c^d (mod n) (see FIG. 23). The ultimately calculated value is always equal to c^d (mod n) because c^(d+r×φ(n)) = c^d × (c^φ(n))^r (mod n), and owing to the property of the order, c^φ(n) = 1 (mod n) holds with respect to an arbitrary integer c coprime to n, and therefore, c^(d+r×φ(n)) = c^d × (c^φ(n))^r (mod n) = c^d × (1)^r (mod n) = c^d (mod n) holds with respect to an arbitrary random number r.
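The exponent randomization can be sketched in Python as follows. The function names randomized_exponent and decrypt_cm2 and the toy parameters are hypothetical, and the use of the secrets module as the random number source is an assumption made for illustration; the source does not specify how r is generated.

```python
import secrets


def randomized_exponent(d, phi_n, r_bits=20):
    """Return d' = d + r * phi(n) for a fresh random r of r_bits bits."""
    r = secrets.randbits(r_bits)
    return d + r * phi_n


def decrypt_cm2(c, d, n, phi_n):
    """Modular exponentiation with a randomized exponent (sketch)."""
    return pow(c, randomized_exponent(d, phi_n), n)


# Toy parameters for illustration only.
p, q = 61, 53
n = p * q
phi_n = (p - 1) * (q - 1)
d = 2753

c = 855
# The exponent differs on every call, but the result does not.
assert decrypt_cm2(c, d, n, phi_n) == pow(c, d, n)

# d' is at most 20 bits longer than the modulus.
assert randomized_exponent(d, phi_n).bit_length() <= n.bit_length() + 20
```

Because r is drawn afresh for each decryption, the sequence of intermediate values differs between executions even for the same input c.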
The additional cost of computation time accompanying the countermeasure 2 is caused because d′=d+r×φ(n) is used instead of the exponent d. While the bit length of the exponent d is log2(n), the bit length of d′=d+r×φ(n) is log2(n)+20. The processing time necessary for the modular exponentiation is proportional to (bit length of modulus)×(bit length of modulus)×(bit length of exponent). When the countermeasure 2 is employed, the bit length of the exponent is increased from log2(n) to log2(n)+20, and therefore, the computation time is (log2(n)+20)/log2(n) times as long as the computation time when the countermeasure 2 is not employed. When log2(n)=1024, 1044/1024≈1.02, and therefore, the computation time is slightly increased as the additional cost, but this increase occupies a very small proportion of the whole computation time. Therefore, the countermeasure 2 is known as an efficient countermeasure. As the additional cost of a memory area, a 20-bit area for storing the random number r and a log2(n)-bit area for storing the order φ(n), which is not used in the decryption without the CRT illustrated in FIG. 14, are necessary.
In summary, the countermeasure 2 can prevent the attack 2. The additional cost of the amount of computation necessary for the countermeasure 2 arises from using the exponent d′=d+r×φ(n) instead of the exponent d, and the amount of computation is (log2(n)+20)/log2(n) times as large as that of the processing of FIG. 14, which does not employ the countermeasure. When n has a 1024-bit value, however, the increase in the amount of computation is as small as about 2%. As the additional cost of the memory area, a memory area of (20+log2(n)) bits in total is necessary for the random number r and the order φ(n). Since n is generally a value of 1024 or more bits, an additional memory of 1044 bits or more is necessary.
(Summary of Countermeasure 1 and Countermeasure 2)
At this point, features of the conventionally known countermeasures 1 and 2 will be summarized. The countermeasure 1 (namely, the countermeasure for the decryption method of FIG. 15) is effective against the attacks 1 and 3; its amount of computation is substantially the same as that of FIG. 15, and the additional cost of the memory is 3×log2(n)/2 bits (≧1536 bits). Incidentally, when the constants R, Rp and Rq are commonly used in all individual units of the encryption device, the countermeasure 1 has a problem that all the units may be made vulnerable if the constants R, Rp and Rq are revealed. On the other hand, the countermeasure 2 (namely, the countermeasure for the decryption method of FIG. 14) is effective against the attack 2; its amount of computation is (log2(n)+20)/log2(n) times as large as that of FIG. 14, and the additional cost of the memory is (20+log2(n)) bits (≧1044 bits).
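The cost figures summarized above can be reproduced with the following Python sketch. The function names are hypothetical, and the formulas are simply those stated in the text, with the bit length of p and q assumed to be a half of the bit length of n.

```python
def cm1_memory_bits(n_bits):
    """Memory for R, Rp and Rq: three values of n_bits / 2 bits each."""
    return 3 * n_bits // 2


def cm2_memory_bits(n_bits, r_bits=20):
    """Memory for the random number r and the order phi(n)."""
    return r_bits + n_bits


def cm2_time_ratio(n_bits, r_bits=20):
    """Computation time relative to FIG. 14: ratio of the exponent bit lengths."""
    return (n_bits + r_bits) / n_bits


n_bits = 1024
mem1 = cm1_memory_bits(n_bits)      # 1536 bits
mem2 = cm2_memory_bits(n_bits)      # 1044 bits
ratio = cm2_time_ratio(n_bits)      # about 1.02, i.e. an increase of about 2%
```

For a 2048-bit modulus, the same formulas give 3072 bits and 2068 bits of additional memory, and the relative increase in computation time falls to about 1%.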
(Problems of Countermeasures 1 and 2)
As described so far, the attacks 1, 2 and 3 are known against the RSA decryption methods illustrated in FIGS. 14 and 15, and these attacks can be prevented by the conventionally known countermeasures 1 and 2.
Incidentally, methods of guessing a secret key by using the SPA or the DPA against a common key cryptosystem such as DES or AES, and against the RSA cryptosystem or another public key cryptosystem such as an elliptic curve cryptosystem, are disclosed in the documents mentioned below. A decryption method highly secure against a side channel attack is also disclosed in the documents mentioned below.
International Publication WO00/59157 pamphlet
Paul Kocher, Joshua Jaffe, and Benjamin Jun, “Differential Power Analysis”, in proceedings of Advances in Cryptology-CRYPTO '99, Lecture Notes in Computer Science vol. 1666, Springer-Verlag, 1999, pp. 388-397
Thomas S. Messerges, Ezzy A. Dabbish and Robert H. Sloan, “Power Analysis Attacks of Modular Exponentiation in Smartcards”, Cryptographic Hardware and Embedded Systems (CHES'99), Lecture Notes in Computer Science vol. 1717, Springer-Verlag, pp. 144-157
Jean-Sébastien Coron, “Resistance against Differential Power Analysis for Elliptic Curve Cryptosystems”, Cryptographic Hardware and Embedded Systems (CHES'99), Lecture Notes in Computer Science vol. 1717, Springer-Verlag, pp. 292-302, 1999
Alfred J. Menezes et al., “HANDBOOK OF APPLIED CRYPTOGRAPHY” (CRC press), p. 615