An encryption system generally includes a public key cryptosystem and a common key cryptosystem. The common key cryptosystem uses the same secret key for both encryption and decryption. Ciphertext can be securely transmitted by sharing the secret key between the user of a transmitter and the user of a receiver and by keeping the key secret from others. FIG. 1 shows an example of encryption with a common secret key in a smart card 10. In FIG. 1, the smart card 10 encrypts input plaintext with a common secret key in its encryption unit, in a well known manner, to provide ciphertext.
Analysis for decryption estimates secret information, including a secret key, from available information such as ciphertext. A power analysis attack, which is one form of such analysis, is described in Paul Kocher, Joshua Jaffe, and Benjamin Jun, “Differential Power Analysis” in proceedings of Advances in Cryptology-CRYPTO'99, Springer-Verlag, 1999, pp. 388-397. The power analysis attack collects and analyzes the electric power dissipated when different input data is provided to an on-board encryption processor of a device, such as a smart card, to estimate the key information in the encryption processor. This power analysis attack can be applied to both public key encryption and secret key encryption.
The power analysis attack includes a simple power analysis (SPA) and a differential power analysis (DPA). The SPA estimates a secret key from the characteristics of a set of dissipated electric power data in the encryption processor. The DPA estimates a secret key by analyzing the differences between a number of sets of power data. Generally, the DPA is the more powerful attack.
For example, the DPA for public key encryption such as the RSA is described in Thomas S. Messerges, Ezzy A. Dabbish and Robert H. Sloan, “Power Analysis Attacks of Modular Exponentiation in Smartcards”, Cryptographic Hardware and Embedded Systems (CHES'99), Springer-Verlag, pp. 144-157. The SPA and DPA for the DES (data encryption standard), which is the current standard of the common key cryptosystem, are described in Paul Kocher, Joshua Jaffe, and Benjamin Jun, “Differential Power Analysis”, in proceedings of Advances in Cryptology-CRYPTO'99, Springer-Verlag, 1999, pp. 388-397. The DPA against the Rijndael method, which can be a new standard of the common key cryptosystem, is described in, for example, S. Chari, C. Jutla, J. R. Rao, P. Rohatgi, “A Cautionary Note Regarding Evaluation of AES Candidates on Smart-Cards”, Second Advanced Encryption Standard Candidate Conference, March 1999.
Thus, the DPA has attracted interest as a particularly effective power analysis attack, and different DPA methods for secret key analysis have been developed. At the same time, techniques for protection against the DPA for secret key analysis have also been developed.
Described below is a conventional typical configuration for the common key encryption to which the DPA can be applied. FIGS. 2, 3, and 4 show a key XOR (exclusive OR), a linear transform and a nonlinear transform, respectively, which are operations used in the typical common key encryption.
In FIG. 2, the key XOR provides a resultant output Zi of XORing input data Xi with key information Ki. (The operator “XOR” is represented by a symbol of a combination of “◯” and “+” in the attached drawings and mathematical formulas and equations herein.) In FIG. 3, the linear transform L provides a linear transformed output Zi=L(Xi) for input data Xi, where L(x XOR y)=L(x) XOR L(y) for arbitrary x and y. The linear transform includes bit permutation, a matrix operation and the like. In FIG. 4, the nonlinear transform W nonlinearly transforms the input data Xi to provide an output Zi=W(Xi), where, in general, W(x XOR y)≠W(x) XOR W(y). A typical nonlinear transform employs nonlinear transform tables called SBoxes. It divides an input X into u elements as X={xu−1, . . . x1, x0} (where u is a natural number), uses the SBoxes wi (i=0, 1, . . . u−1) to perform each operation zi=wi(xi), and produces an output Z as a combined value Z=(zu−1 . . . z1 z0).
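The three operations above can be sketched on single bytes as follows. This is only an illustration: the rotation used as the linear transform and the randomly shuffled table used as the SBox are assumptions chosen for the example, not the transforms of any particular cipher, and the helper names are hypothetical.

```python
import random

def key_xor(x: int, k: int) -> int:
    """Key XOR of FIG. 2: Z = X XOR K (8-bit values)."""
    return x ^ k

def linear_rotl8(x: int, n: int = 3) -> int:
    """A bit permutation (rotate left by n bits), one example of a linear transform L."""
    n %= 8
    return ((x << n) | (x >> (8 - n))) & 0xFF

# Hypothetical 8-bit SBox: a fixed pseudo-random permutation of 0..255.
_rng = random.Random(0)
SBOX = list(range(256))
_rng.shuffle(SBOX)

def nonlinear(x: int) -> int:
    """Nonlinear transform of FIG. 4 for one element: z = w(x), a table lookup."""
    return SBOX[x]

def is_linear(f) -> bool:
    """True if f(x XOR y) == f(x) XOR f(y) for every pair of bytes."""
    return all(f(x ^ y) == f(x) ^ f(y) for x in range(256) for y in range(256))
```

Under these assumptions, `is_linear(linear_rotl8)` holds while `is_linear(nonlinear)` does not, which is the distinction between FIG. 3 and FIG. 4.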
In the typical common key encryption, each round function is configured by an appropriate combination of these key XOR, linear transform and nonlinear transform, and the round function is sequentially repeated for a plurality of rounds.
Described below is the technique of analysis for decryption in accordance with the DPA. The DPA includes a step of measuring dissipated power data and a step of analyzing a key based on the difference of dissipated power data. In measuring the dissipated power data, input plaintext containing a sequence of different codes is serially provided to an encryption device such as a smart card and the like, and change of dissipated electric power with time in its encryption processor in response to the input plaintext is measured by using an oscilloscope and the like, to thereby obtain a dissipated power curve. FIG. 7A shows an example of such a dissipated power curve. The measuring is performed for different plaintext inputs to collect a statistically sufficient number of dissipated power curves. A set G is defined herein as a set of dissipated power curves obtained by the measurement.
Described below is the analysis of a key using the dissipated power curves. FIG. 5 shows an example of encryption which is formed by a combination of the key XOR (FIG. 2) and the nonlinear transform (FIG. 4) in series connection. The DPA for the encryption is described below. FIG. 6 shows elements relevant to an arbitrary nonlinear transform element wi shown in FIG. 5. In FIG. 6, a value mi indicates a known multi-bit value within arbitrary input plaintext, a value ki (an element in K={ku−1, . . . k1, k0}) indicates an element value of an unknown key K, a function wi indicates an element transform function in a known SBox table, and a value zi(=wi(mi XOR ki)) indicates an output. For the DPA, the element value of the key used in the processor is assumed as an arbitrary value ki′. An operation zi′=wi(mi XOR ki′) is performed in accordance with the known mi and wi, and the assumed ki′, and the set G(ki′) for the assumed ki′ is divided into the following subsets G0(ki′) and G1(ki′).
G0(ki′)={G| an e-th bit value in zi′=wi(mi⊕ki′) is 0}  (1),
G1(ki′)={G| an e-th bit value in zi′=wi(mi⊕ki′) is 1}  (2),
where “e” is a natural number indicating the e-th least significant bit.
Then, the following difference DG(ki′) between the dissipated power curves for the assumed ki′ is generated.
DG(ki′)=(average dissipated power curve ∈G1)−(average dissipated power curve ∈G0)  (3)
FIG. 7A shows an example of an average dissipated power curve obtained from the dissipated power curves which belong to the set G1. FIG. 7B shows an example of an average dissipated power curve obtained from the dissipated power curves which belong to G0. If a value of the assumed key element is equal to a value of a corresponding true key element, i.e. ki′=ki, then a spike appears in the difference dissipated power curve as shown in FIG. 7C which represents the difference between the curves of FIG. 7A and FIG. 7B. If a value of the assumed key element is not equal to a value of a corresponding true key element, i.e. ki′≠ki, then the difference dissipated power curve as shown in FIG. 7D which represents the difference between the curves of FIG. 7A and FIG. 7B becomes a generally flat curve. Therefore, the key ki can be estimated from the difference dissipated power curve with the spike which is generated in accordance with the assumed ki′. By generating the difference dissipated power curves for the assumed ki′ for all i's, the key K can be successfully analyzed or ultimately determined.
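The partition-and-difference procedure of equations (1) to (3) can be sketched as a small simulation. Everything concrete here is an assumption made for illustration: the SBox is a pseudo-random permutation, each "curve" is reduced to a single noise-free sample equal to the Hamming weight of the SBox output (the leakage model discussed in the text), and the names `TRUE_KEY`, `difference_of_means` and `recover_key` are hypothetical.

```python
import random

def hamming_weight(v: int) -> int:
    return bin(v).count("1")

# Hypothetical known SBox w_i: a fixed pseudo-random permutation of 0..255.
_rng = random.Random(1)
SBOX = list(range(256))
_rng.shuffle(SBOX)

TRUE_KEY = 0x5A   # the unknown key element k_i inside the simulated device
E = 0             # bit position e used for the partition

# Simulated measurement: one power sample per plaintext byte m_i, taken at the
# instant the SBox output is loaded (assumed proportional to its Hamming weight).
plaintexts = list(range(256))
traces = [hamming_weight(SBOX[m ^ TRUE_KEY]) for m in plaintexts]

def difference_of_means(k_guess: int) -> float:
    """Equation (3) for an assumed key element k_i'."""
    g0, g1 = [], []
    for m, power in zip(plaintexts, traces):
        z_guess = SBOX[m ^ k_guess]            # z_i' = w_i(m_i XOR k_i')
        (g1 if (z_guess >> E) & 1 else g0).append(power)
    return sum(g1) / len(g1) - sum(g0) / len(g0)

def recover_key() -> int:
    """Pick the guess whose difference shows the largest spike."""
    return max(range(256), key=difference_of_means)
```

With real measurements each trace is a full curve with noise, so many more traces are needed and the difference is computed per time sample; the single-sample model only shows why the correct guess stands out.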
How a spike appears in the power difference curve DG(ki′) in the case of ki′=ki is described below. If ki′=ki, then the assumed zi′=wi(mi XOR ki′) matches the corresponding actual zi=wi(mi XOR ki) in the processor for all mi's. Thus, when the set G(ki′) is divided into the subsets G0(ki′) and G1(ki′) in accordance with the equations (1) and (2), the following equation (4) can be obtained using the Hamming weight HW of zi, where the Hamming weight is defined as the number of bits having a value of one in a binary value which represents a certain numerical value. For example, the Hamming weight HW of a binary 4-bit value (1101)2 is 3.
(average HW of zi's for zi∈G1)−(average HW of zi's for zi∈G0)=1  (4)
On the other hand, if ki′≠ki, the assumed zi′=wi(mi XOR ki′) has no correlation with the corresponding actual zi=wi(mi XOR ki) in the processor. Thus, even if the set G(ki′) for all mi's is divided into the subsets G0(ki′) and G1(ki′) in accordance with the equations (1) and (2) for the assumed zi′, it is actually divided into the two subsets at random for the respective actual zi's (i.e., the actual zi which has been assumed as zi′), and the following equation (5) is established.
(average HW of zi's for zi∈G1)−(average HW of zi's for zi∈G0)≈0  (5)
When the equation (4) is established, there is a significant difference in average Hamming weights of the load values zi's between G0(ki′) and G1(ki′). When the equation (5) is established, there is no significant difference in average Hamming weights of the load values, zi's, between G0(ki′) and G1(ki′).
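Equation (4) can be checked by direct enumeration. The sketch below assumes that zi takes every value of the given bit width equally often, which holds when the SBox is a permutation and all inputs are used; the function names are illustrative.

```python
def hamming_weight(v: int) -> int:
    """Number of one bits in v, e.g. (1101)2 has weight 3."""
    return bin(v).count("1")

def average_hw_gap(bits: int = 8, e: int = 0) -> float:
    """Left-hand side of equation (4) when z ranges uniformly over all values."""
    values = range(1 << bits)
    g1 = [hamming_weight(z) for z in values if (z >> e) & 1]
    g0 = [hamming_weight(z) for z in values if not (z >> e) & 1]
    return sum(g1) / len(g1) - sum(g0) / len(g0)
```

The gap is exactly 1 because the partition fixes bit e while the remaining bits are equally distributed in both subsets, so only the e-th bit contributes to the difference.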
The transform wi represented by zi=wi(xi) is performed by reading the output value zi of the transform table SBox from a memory such as a ROM or a RAM within the encryption device in accordance with a load instruction. It is generally assumed that power proportional to the Hamming weight of the load value is dissipated when the load instruction is executed. An experimental result supporting this assumption is described in T. S. Messerges, Ezzy A. Dabbish and Robert H. Sloan, “Investigations of Power Attacks on Smartcards”, Proceedings of USENIX Workshop on Smartcard Technology, March 1999.
Thus, if ki′=ki, then the equation (4) is satisfied, and hence the significant difference of the dissipated power appears in the form of a spike in the difference power curve. In the case of the equation (5), however, the difference power curve has no spike and has a generally flat curve. It is known that the DPA can be also applied to an encryption device which has a configuration in which the linear transform L of FIG. 3 is incorporated into the device of FIG. 4.
FIG. 8 shows an encryption device having a configuration in which two linear transforms are added before and after the encryption device of FIG. 4. When L1 and L2 are assumed to be permutation functions and wi is assumed to be an SBox of the DES, the configuration of FIG. 8 is equivalent to the F function of the DES. For the specification of the DES, refer to FIPS 46, “Data encryption standard” Federal Information Processing Standards Publication 46, U.S. Department of Commerce/National Bureau of Standards, National Technical Information Service, Springfield, Va., 1977. The process in FIG. 8 can be converted to a process similar to the one as shown in FIG. 6, and hence the DPA can be applied to estimate a key K, similarly.
In the technique as described above, the DPA is applied to the SBox output in the process of nonlinear transform. There are further techniques of applying the DPA to a value of an XOR (an output of the key XOR) of the input mi with the key ki, and to the input value xi provided to the SBox. In a particular processor, the dissipated power can be represented as a function of the bits of a load value in the adjacent bit model, as expressed by the following equation (6), to thereby obtain an effective analysis. This is reported in M. Akkar, R. Bevan, P. Dischamp, and D. Moyart, “Power Analysis, What Is Now Possible . . . ” Asiacrypt 2000.
V(z)=a′+a0z0+a1z1+ . . . +a7z7+a0,1z0z1+a1,2z1z2+ . . . +a6,7z6z7  (6)
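Equation (6) can be written out as a short function. The coefficient values are not given in the text, so they are passed in as parameters; the parameter names `a_const`, `a_single` and `a_pair` are hypothetical labels for a′, the ai and the ai,i+1.

```python
def adjacent_bit_power(z, a_const, a_single, a_pair):
    """Adjacent bit model of equation (6) for an 8-bit load value z.

    a_const  : the constant term a'
    a_single : eight coefficients a0..a7, one per bit z0..z7
    a_pair   : seven coefficients a0,1 .. a6,7, one per adjacent bit pair
    """
    bits = [(z >> i) & 1 for i in range(8)]          # bits[0] is z0 (LSB)
    v = a_const
    v += sum(a_single[i] * bits[i] for i in range(8))
    v += sum(a_pair[i] * bits[i] * bits[i + 1] for i in range(7))
    return v
```

With all single-bit coefficients equal to 1 and all pair coefficients equal to 0, the model reduces to the Hamming weight model used earlier in the text.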
In accordance with the techniques described above, a secret key K is determined by the DPA in three cases or conditions 1-3 as described below. FIG. 9 shows measured points A, B and C for measuring the dissipated power curves in the encryption device of FIG. 5.
    1. A case in which an input M is known and can be arbitrarily selected or controlled by an attacker, a key K has an unknown fixed value, and transforms of Sboxes, wi's, are known. In this case, the dissipated power curve is measured at predetermined timing at the measured point A (at the output of the SBox wi) shown in FIG. 9.
    2. A case in which the input M is known and controllable, and the key K has an unknown fixed value. In this case, the dissipated power curve is measured at predetermined timing at the measured point B (at the output of the key XOR) shown in FIG. 9.
    3. A case in which the input M is known and controllable, and the key K has an unknown fixed value. In this case, the dissipated power curve is measured at predetermined timing at the measured point C (at the load input for indexing an SBox, wi) shown in FIG. 9.
Conventional Protection Against DPA
Conventional countermeasure protection against the DPA includes, for example, a technique of reducing the measurement precision of the dissipated power by providing a noise generator in a smart card, and a technique of providing protection in an encryption algorithm. The technique of reducing the measurement precision can be easily implemented, but it is not a drastic measure because the analysis can be achieved by increasing the number of times of measurements. On the other hand, it may not be easy to provide protection in the encryption algorithm, which, however, can be a drastic measure. A typical technique of providing protection in the encryption algorithm is described in Thomas S. Messerges, “Securing the AES Finalists Against Power Analysis Attacks,” in proceedings of Fast Software Encryption Workshop 2000, Springer-Verlag, April 2000, which is called “a masking method”. The masking method performs each of encryption processes on a value M′ expressed by M′=M XOR R for an input value M and a random number R as a mask rather than on the input value M per se. Since the random number R is generated for each process of encryption, this method is referred to as a “random mask value method” hereinafter.
Described below is the random mask value method. FIG. 10 shows a schematic block diagram of the process in accordance with the random mask value method. This process includes an upper encrypting unit, a lower mask value generating unit, and a random number generator as shown in the figure.
When the conventional encrypting process in which the conventional key XOR function, the linear function, and the nonlinear function as shown in FIGS. 2, 3 and 4 are used is changed to the encrypting process shown in FIG. 10, they are replaced with a key XOR function, a linear function, and a nonlinear function as shown in FIGS. 11, 12 and 13A, respectively, in accordance with the random mask value method.
In the random mask value method, the computation of the conventional intermediate data Xi in the encryption is replaced with the computation of the Xi′ and the random number Ri which satisfy the exclusive OR, Xi=Xi′ XOR Ri. The encrypting unit computes Xi′, and the mask value generating unit computes Ri. The following equations (7) are established for Xi, Xi′, Zi, Zi′, Ri, and ROi in the operations shown in FIGS. 2 and 11, FIGS. 3 and 12, and FIGS. 4 and 13A.
Xi=Xi′⊕Ri
Zi=Zi′⊕ROi  (7)
In FIG. 2, the XOR operation, Zi=Xi XOR Ki, is performed on the input value Xi with the key Ki. On the other hand, in FIG. 11, after the random number RKi is generated by the random number generator in the encrypting process, the double XOR operation, Zi′=Xi′ XOR Ki XOR RKi, is performed on the input value Xi′ and the key Ki. The XOR operation, ROi=Ri XOR RKi, is performed on the Ri with RKi in the mask value generating process.
In FIG. 3, the linear transform, Zi=L(Xi), is performed. On the other hand, the transform, Zi′=L(Xi′), is performed in the encrypting process shown in FIG. 12, and the transform, ROi=L(Ri), is performed in the mask value generating process.
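The masked key XOR and the masked linear transform can be sketched on single bytes to confirm that the relations of equations (7) are preserved. The rotation used as L and the function names are assumptions made for the example; random values come from Python's `secrets` module as a stand-in for the device's random number generator.

```python
import secrets

def rotl8(x: int, n: int = 3) -> int:
    """Example linear transform L: rotate an 8-bit value left by n bits."""
    return ((x << n) | (x >> (8 - n))) & 0xFF

def masked_key_xor(x_masked: int, r: int, k: int):
    """Masked key XOR: returns (Zi', ROi) from Xi', Ri and Ki."""
    rk = secrets.randbits(8)            # RKi, fresh for every call
    z_masked = x_masked ^ (k ^ rk)      # encrypting unit: Zi' = Xi' XOR Ki XOR RKi
    ro = r ^ rk                         # mask unit:       ROi = Ri XOR RKi
    return z_masked, ro

def masked_linear(x_masked: int, r: int, linear=rotl8):
    """Masked linear transform: Zi' = L(Xi'), ROi = L(Ri)."""
    return linear(x_masked), linear(r)

def check_masked_ops(trials: int = 1000) -> bool:
    """Verify Zi = Zi' XOR ROi for both operations on random inputs."""
    for _ in range(trials):
        x, r, k = (secrets.randbits(8) for _ in range(3))
        x_masked = x ^ r                               # Xi = Xi' XOR Ri
        z_masked, ro = masked_key_xor(x_masked, r, k)
        if z_masked ^ ro != x ^ k:
            return False
        z_masked, ro = masked_linear(x_masked, r)
        if z_masked ^ ro != rotl8(x):
            return False
    return True
```

The linear case works only because L distributes over XOR; this is why the nonlinear transform needs the separate table-regeneration step.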
In FIG. 4, a nonlinear transform is performed using the number, u, of SBoxes expressed by w0, w1, . . . wu−1. In the encrypting process shown in FIG. 13A, a new set of SBoxes expressed by wi′0, wi′1, . . . wi′u−1 are generated and stored in the RAM area by the process using a NewSBox unit, as shown in FIG. 13A, and a nonlinear transform is performed using these new SBoxes. In the mask value generating process shown in FIG. 13A, the process is performed using the NewSBox unit, and each of wi′0, wi′1, . . . wi′u−1 is generated in accordance with the Ri and the internally generated random number ROi, to provide outputs wi′0, wi′1, . . . wi′u−1 and ROi. FIG. 13B shows a detailed configuration of the NewSBox unit. The NewSBox generates ROi in accordance with the internal random number generator, generates wi′j for j=0, 1, . . . u−1 which satisfies wi′j(x)=wj(x XOR rij) XOR roij, in accordance with Ri=riu−1 . . . ri1 ri0, ROi=roiu−1 . . . roi1 roi0, and the SBoxes, w0, w1, . . . wu−1, used in FIG. 13B, to provide outputs ROi and wi′j.
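The table regeneration performed by the NewSBox unit can be sketched as follows. The sketch assumes 8-bit elements and that the j-th new table is derived from the j-th base table; the base table contents, the function names and the use of the `secrets` module are illustrative assumptions.

```python
import random
import secrets

def new_sbox(base_tables, r_in):
    """Build masked tables w'_j(x) = w_j(x XOR r_j) XOR ro_j.

    base_tables : list of u tables, each a list of 256 byte values
    r_in        : list of u input mask bytes (the elements of Ri)
    Returns (new_tables, r_out), where r_out holds the elements of ROi.
    """
    r_out = [secrets.randbits(8) for _ in base_tables]
    new_tables = [
        [w[x ^ r] ^ ro for x in range(256)]
        for w, r, ro in zip(base_tables, r_in, r_out)
    ]
    return new_tables, r_out

def check_new_sbox(u: int = 4) -> bool:
    """Check that w'_j(x XOR r_j) XOR ro_j == w_j(x) for all j and x."""
    rng = random.Random(2)
    base = []
    for _ in range(u):
        table = list(range(256))
        rng.shuffle(table)              # hypothetical base SBox
        base.append(table)
    r_in = [secrets.randbits(8) for _ in range(u)]
    tables, r_out = new_sbox(base, r_in)
    return all(
        tables[j][x ^ r_in[j]] ^ r_out[j] == base[j][x]
        for j in range(u) for x in range(256)
    )
```

Each call writes u tables of 256 entries, which is the per-encryption computation and RAM cost that the text later identifies as the main drawback of the method.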
Described briefly below is the security of the random mask value method. In the random mask value method, the Sbox, wi′, of FIGS. 13A and 13B in each round shown in FIG. 10 and in FIG. 19 as described later changes in accordance with a random number. Thus, the content of the SBox cannot be known by the DPA. That is, since the condition of the case 1 above that the SBox is known is not satisfied, the dissipated power curves measured at the predetermined timing at the measured point A shown in FIG. 9 cannot be divided into G0 and G1 in accordance with the equations (1) and (2). Thus, the encryption device employing the random mask value method is secure against the DPA. Similarly, with respect to the conditions of the cases 2 and 3 above, the random element which changes each time in the measuring is combined at the measured point B at the output of the key XOR function and at the measured point C at the input to an Sbox. Thus, the condition that the key K is fixed is not satisfied, and the device is also secure against the DPA in these cases.
Described below is the Rijndael method as an example of the encryption employing the random mask value method. FIG. 14 shows a general configuration of a conventional N-round Rijndael type process without protection against the DPA. Each round of the N-round Rijndael process contains operations of an XOR, a Subbyte (substitute byte), a Shift or shifter and a Mixedcolumn. The last round includes another XOR, but does not include a Mixedcolumn. In FIG. 14, the number N is determined in accordance with the number of bits of the secret key Ksec. If the Ksec has 128 bits, N is determined to be 10 (N=10). If it has 192 bits, N is determined to be 12 (N=12). If it has 256 bits, N is determined to be 14 (N=14). Ki is called a sub-key. FIG. 15 shows a sub-key generator for generating N+1 128-bit sub-keys, K0, K1, . . . KN, from the 128/192/256-bit secret key Ksec in the Rijndael method. The method for generating sub-keys from a secret key is described in the specification of the Rijndael method accessible at http://www.nist.gov/aes/.
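The relation between the key length, the round count N and the number of sub-keys stated above can be captured in a few lines; the function names are illustrative.

```python
# Round counts as stated in the text for 128-, 192- and 256-bit keys.
ROUNDS_BY_KEY_BITS = {128: 10, 192: 12, 256: 14}

def rijndael_rounds(key_bits: int) -> int:
    """Return N for a secret key Ksec of the given length in bits."""
    try:
        return ROUNDS_BY_KEY_BITS[key_bits]
    except KeyError:
        raise ValueError(f"unsupported key length: {key_bits} bits") from None

def subkey_count(key_bits: int) -> int:
    """The sub-key generator produces N+1 sub-keys K0 .. KN of 128 bits each."""
    return rijndael_rounds(key_bits) + 1
```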
FIG. 16 shows a configuration of the Subbyte. This process performs a 128-bit-to-128-bit nonlinear transform using S's, each of which represents an 8-bit-to-8-bit transform SBox. FIG. 17 shows a configuration of the Shift. This process rearranges or reshuffles bytes in terms of byte positions. FIG. 18 shows a configuration of the Mixedcolumn. This process performs a matrix operation over the field GF(2^8).
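A matrix operation over GF(2^8) can be sketched for one 4-byte column. The text does not give the matrix or the reduction polynomial; the values below are those of the published AES standard (FIPS 197) and are brought in here as an assumption, and the function names are illustrative.

```python
def xtime(a: int) -> int:
    """Multiply by x in GF(2^8), reducing by x^8 + x^4 + x^3 + x + 1 (0x11B)."""
    a <<= 1
    if a & 0x100:
        a ^= 0x11B
    return a & 0xFF

def gf_mul(a: int, b: int) -> int:
    """Multiply two field elements by shift-and-add."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a = xtime(a)
        b >>= 1
    return result

# Circulant matrix used by AES for the column mixing step (assumed from FIPS 197).
MIX = [
    [2, 3, 1, 1],
    [1, 2, 3, 1],
    [1, 1, 2, 3],
    [3, 1, 1, 2],
]

def mix_column(col):
    """Multiply one 4-byte column by MIX over GF(2^8)."""
    out = []
    for row in MIX:
        acc = 0
        for coeff, byte in zip(row, col):
            acc ^= gf_mul(coeff, byte)   # addition in GF(2^8) is XOR
        out.append(acc)
    return out
```

Because addition in the field is XOR, `mix_column` is linear in the sense of FIG. 3, which is what allows the mask to be propagated through it unchanged in form.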
FIG. 19 illustrates the N-round Rijndael method employing the random mask value method as opposed to the conventional N-round Rijndael method illustrated in FIG. 14. The N-round Rijndael method illustrated in FIG. 19 includes an upper N-round encryption unit, a lower N-round mask value generation unit, and a random number generator, as shown. Ki represents the sub-key of the i-th round in the Rijndael method. RKi represents a random mask value for each sub-key. The Subbyte performs a 128-bit-to-128-bit nonlinear transform using sixteen Sboxes, Si,0, Si,1, . . . Si,15, in the form as shown in FIG. 16. Si,0, Si,1, . . . Si,15 represent SBoxes generated by a new SBox unit “NewSBox” in the i-th round. FIG. 20 shows a configuration of the NewSBox, which generates sixteen different Sboxes, Si,0(x), Si,1(x), . . . Si,15(x), in response to an input value Rini in accordance with the internally generated random number Routi, to provide the random number Routi. The Shift and Mixedcolumn are linear transforms shown in FIGS. 17 and 18, similarly to those used in the process of the conventional Rijndael method.
The flow of the process of FIG. 19 is described below in Steps [1101] to [1110] for the encryption unit, and Steps [1201] to [1209] for the mask value generation unit as follows:
    [1101] Set i=0.
    [1102] Generate a random mask value Rin, and XOR the plaintext with Rin.
    [1103] XOR the operated plaintext with (Ki XOR RKi). Provide a mask value RKi to the mask value generation to generate sixteen SBoxes: Si,j(x) (j=0, 1, . . . 15).
    [1104] Perform the Subbyte on it, using the Si,j(x) generated at Step [1103].
    [1105] Perform the Shift and Mixedcolumn on it.
    [1106] i:=i+1
    [1107] If i<N−1, then return to Step [1103]. Otherwise, proceed to the next step.
    [1108] XOR it with KN−1, and provide RKN−1 to the mask value generation to generate sixteen SBoxes: SN−1,j(x) (j=0, 1, . . . 15).
    [1109] Perform the Subbyte using SN−1,j(x), the Shift, and the XOR with KN on it.
    [1110] XOR the operation output from Step [1109] with the output Rout from the mask value generation, and a resultant ciphertext is provided as an ultimate output.
The flow of the mask value generation:
    [1201] Set i=0 and Mask=Rin, where Rin is a random mask value generated at Step [1102].
    [1202] Perform the operation of Mask XOR RKi on the RKi received from the encryption, to generate a new Mask.
    [1203] Produce sixteen Sboxes, Si,j(x) (j=0, 1, . . . 15), and the random number Routi, by providing the new Mask generated at Step [1202] to the NewSBox, to set Routi as a new Mask. The Si,j(x) is used in the Subbyte in the i-th round of the encryption.
    [1204] The Mask is provided to the Shift and Mixedcolumn, and the output from these operations is set to be a new Mask.
    [1205] Set i:=i+1. If i<N−1, then return to Step [1202].
    [1206] Perform the operation of Mask XOR RKN−1 on the input RKN−1 from the encryption. Then, set the operated result to be a new Mask.
    [1207] By providing RinN−1 to the NewSBox, produce sixteen Sboxes, SN−1,j(x) (j=0, 1, . . . 15), and a random number RoutN−1. Then, set RoutN−1 to be a new Mask. SN−1,j(x) is used in the Subbyte in the (N−1)th round of the encryption.
    [1208] Provide the Mask to the Shift. Then, the operated result is set to be a new Mask.
    [1209] Perform the operation of Mask XOR RKN on the input RKN provided from the encryption, and provide the XOR output to thereby end the process.
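The two flows can be followed end to end on a toy cipher. The sketch below is not Rijndael: it assumes a 4-byte state, 4 rounds, a pseudo-random SBox, and simple stand-ins for the Shift and Mixedcolumn. It also assumes that every sub-key, including the last two, is applied as (Ki XOR RKi), since the mask flow XORs RKN−1 and RKN into the Mask. All names are hypothetical.

```python
import random
import secrets

N = 4        # toy round count (the text uses 10, 12 or 14)
WIDTH = 4    # toy state of 4 bytes (Rijndael uses 16)

_rng = random.Random(7)
SBOX = list(range(256))
_rng.shuffle(SBOX)                       # hypothetical base SBox S

def xor_bytes(a, b):
    return [p ^ q for p, q in zip(a, b)]

def shift(s):
    """Stand-in for the Shift: a byte permutation."""
    return s[1:] + s[:1]

def mix(s):
    """Stand-in for the Mixedcolumn: an XOR-linear map on the state."""
    return [s[i] ^ s[(i + 1) % WIDTH] ^ s[(i + 2) % WIDTH] for i in range(WIDTH)]

def random_state():
    return [secrets.randbits(8) for _ in range(WIDTH)]

def new_sbox(mask):
    """NewSBox: tables S_j(x) = S(x XOR mask_j) XOR rout_j, plus rout."""
    r_out = random_state()
    tables = [[SBOX[x ^ r] ^ ro for x in range(256)] for r, ro in zip(mask, r_out)]
    return tables, r_out

def encrypt_plain(pt, keys):
    """Unprotected reference flow (structure of FIG. 14)."""
    s = list(pt)
    for i in range(N - 1):
        s = mix(shift([SBOX[b] for b in xor_bytes(s, keys[i])]))
    s = shift([SBOX[b] for b in xor_bytes(s, keys[N - 1])])
    return xor_bytes(s, keys[N])

def encrypt_masked(pt, keys):
    """Encryption unit and mask value generation unit run side by side."""
    mask = random_state()                              # Rin, [1102] / [1201]
    s = xor_bytes(pt, mask)
    for i in range(N):
        rk = random_state()                            # RKi
        s = xor_bytes(s, xor_bytes(keys[i], rk))       # [1103] / [1108]
        mask = xor_bytes(mask, rk)                     # [1202] / [1206]
        tables, mask_after = new_sbox(mask)            # [1203] / [1207]
        s = [t[b] for t, b in zip(tables, s)]          # Subbyte with new tables
        mask = mask_after                              # Routi becomes the Mask
        s, mask = shift(s), shift(mask)
        if i < N - 1:                                  # last round has no Mixedcolumn
            s, mask = mix(s), mix(mask)
    rk = random_state()                                # RKN
    s = xor_bytes(s, xor_bytes(keys[N], rk))           # final key XOR
    mask = xor_bytes(mask, rk)                         # [1209]
    return xor_bytes(s, mask)                          # [1110]: remove the mask
```

Under these assumptions the masked flow returns the same ciphertext as the unprotected one for any plaintext and key, while every intermediate value in the encryption unit is blinded by a mask that changes on each run.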
Although it is known that the random mask value method has high security against the DPA, the encryption employing the random mask value method has drawbacks in that its encrypting speed is a few tenths lower than that of the conventional encryption and it requires a very large RAM area.
The encrypting speed is low as described above, because, in the XOR for example in the encrypting process, two intermediate values x and y are used to perform the operation z=x XOR y in the conventional implementation, while it is necessary in the random mask value method to derive the intermediate values x′ and y′ satisfying x′=x XOR Rx and y′=y XOR Ry to perform the operation z′=x′ XOR y′, and to perform the additional operation Rz=Rx XOR Ry for generating the new mask value related to the z′. For the nonlinear transform, nonlinear transform tables, called Sboxes, are held in a ROM in the conventional method, while nonlinear transform tables must be generated each time in accordance with a new mask value in the random mask value method, which requires a large amount of computations.
A large RAM area is required because, in the random mask value method, the new Sboxes generated for the nonlinear transform must be stored in the RAM in each encryption process, while the nonlinear transform tables are held in the ROM in the conventional method. For example, in the Rijndael method, which uses an SBox for 8-bit-to-8-bit transform, a RAM area of at least 2^8=256 bytes is required to implement the random mask value method as protection against the DPA. However, since a chip for a low cost smart card, such as an ST 16 (manufactured by STMicroelectronics), has a RAM area of only about 128 bytes, it is practically impossible to implement the random mask value method on such a chip.
It has been proposed to provide improvement of an apparent processing speed, reduction of the required RAM area and the like, by sharing mask values and by generating mask values between an encrypting process and the next encrypting process. However, since the masking with a random value is first performed in the entire process, it is impossible to achieve the improvement of the processing speed of the entire process and the reduction of the required RAM area.
The present inventors have recognized that it is advantageous to improve the processing speed and reduce the required RAM area by performing the masking with fixed values rather than the random values. The masking method using the fixed values is hereinafter referred to as a fixed mask value method.
An object of the present invention is to provide efficient protection of an encryption processor for encrypting data with a common key from analysis for decryption.
Another object of the present invention is to make it difficult to estimate a secret key, and to raise the security of the encryption processor.