Cryptographic functions dealing with secret keys, such as, for example, block ciphers or message authentication codes, can be implemented either in software or in hardware on microelectronic data-processing devices such as, for example, Integrated Circuit (IC) chip cards (sometimes also referred to as “smart cards”).
During the execution of a generic cryptographic function, sensitive data depending on the secret key(s) are processed, sent over the internal links of the IC, and stored in the internal memories of the data-processing device.
In an attempt to prevent unauthorized people from fraudulently obtaining the cryptographic secret key(s) or other sensitive information by tampering, tamper-resistant IC chips are produced: such IC chips are provided with special physical countermeasures, for example protective layers and various sensors, detectors, and filters, in order to protect the underlying IC against tampering.
However, even in tamper-resistant IC chips, sensitive information may leak out through various side channels, for example through signal timings, power consumption, and radiated electromagnetic energy, or through signals monitored by microprobing.
This leakage of information poses a serious problem: a good cryptographic function generally satisfies the requirement that it should be computationally infeasible to reconstruct the secret key from the knowledge of input/output data, but this requirement is not necessarily satisfied if intermediate sensitive data, generated during the execution of the cryptographic function, are revealed.
Recovering the secret key from intermediate sensitive data that may leak out through any possible side channel during the execution of the cryptographic function is the objective of various cryptanalytic attacks, which are referred to as side-channel attacks. Therefore, there is a need to protect the intermediate sensitive data generated during the execution of the cryptographic function, since such data, when leaking out, may enable unauthorized third parties to fraudulently reconstruct the secret key.
The side-channel attacks do not change the functionality of the device that implements the cryptographic process, and are typically not invasive. Power analysis attacks (proposed for example in P. Kocher et al., “Differential power analysis,” Advances in Cryptology - Crypto '99, Lecture Notes in Computer Science, vol. 1666, pp. 388-397, 1999) are very powerful, as they do not require expensive resources; moreover, most implementations of cryptographic functions, especially in software, are vulnerable to such attacks, unless specific countermeasures are incorporated.
In particular, in the class of power analysis attacks, the so-called (first-order) Differential Power Analysis (DPA) attacks are of particular practical importance, as they use a relatively simple statistical technique that is almost independent of the implementation of the cryptographic algorithm. They require measuring the power consumption of the cryptographic algorithm for a number of known inputs (or known outputs). Other, more sophisticated statistical analyses of power consumption curves may also be feasible.
Power analysis attacks are based on elementary computations, performed within the device used to implement the cryptographic function (the cryptographic device), which depend on the secret key information and on the known input and/or output information. If, in addition, the power consumption corresponding to these elementary computations depends on the values being computed, then the cryptographic device's power consumption curves contain information about the secret key, and such information may be extracted by statistical techniques, so as to reconstruct the secret key.
Software implementations of cryptographic functions, in which the operations are synchronized by the clock of the data processing unit, usually a microprocessor, running the algorithm that implements the cryptographic function, are especially vulnerable to power analysis attacks.
Hardware implementations of cryptographic functions are also potentially vulnerable to power analysis attacks, although a higher sampling frequency may be required for obtaining the power consumption curves.
A general algorithmic strategy to counteract power analysis attacks is to randomize the computations that depend on the secret key, by masking the original data with random masks, and by modifying the computations accordingly. This can be done for software or hardware implementations. An approach of this type, given in L. Goubin and J. Patarin, “DES and differential power analysis - The duplication method,” Cryptographic Hardware and Embedded Systems - CHES '99, Lecture Notes in Computer Science, vol. 1717, pp. 158-172, 1999, proposes a data splitting technique to protect implementations of DES and other block ciphers against DPA attacks, where the input message as well as all intermediate data are each split into two parts, so that the original data can be recovered by the bitwise XOR or some other appropriate operation. The nonlinear parts of the algorithm, such as the S-boxes, are implemented by appropriate lookup tables of increased size (in Read Only Memory, ROM).
US patent application No. US 2001/0053220 A1 contains a similar proposal, except that the data parts can also be bit-permuted. The nonlinear parts of the algorithm, such as the S-boxes, can be implemented as lookup tables that are updated accordingly (in Random Access Memory, RAM).
The Applicant points out that the data splitting technique is essentially equivalent to the random masking technique investigated in T. Messerges, “Securing the AES finalists against power analysis attacks,” Fast Software Encryption - FSE 2000, Lecture Notes in Computer Science, vol. 1978, pp. 150-164, 2001, except that in the latter, instead of performing duplicate computations on data shares, one performs a modified computation involving the original data with the random masks applied. All three mentioned approaches are primarily intended for software implementations.
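By way of illustration only, the two-share splitting described above can be sketched in Python as follows. The function names (split, recombine, xor_shares) and the 8-bit word size are assumptions introduced for this sketch and are not taken from the cited references; only the linear (XOR) part of a cipher is shown, and the nonlinear parts such as S-boxes are not covered.

```python
import secrets

N_BITS = 8  # assumed word size, chosen only for this illustration


def split(x):
    """Split x into two shares (x XOR r, r); neither share alone reveals x."""
    r = secrets.randbits(N_BITS)  # fresh random mask for every split
    return (x ^ r, r)


def recombine(shares):
    """Recover the original word as the bitwise XOR of the two shares."""
    return shares[0] ^ shares[1]


def xor_shares(a, b):
    """XOR two shared words share by share; the unshared values never appear."""
    return (a[0] ^ b[0], a[1] ^ b[1])
```

For example, recombine(xor_shares(split(x), split(k))) yields x XOR k although neither x nor k is handled in the clear during the share-wise computation.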
An alternative way of dealing with power analysis attacks is making use of a special encoding of data that tends to balance the power consumption, for example, by representing the data through binary vectors with a fixed number of ones, such as the dual-rail encoding. In particular, US patent application No. US 2003/0140240 A1 describes a technique for protecting hardware implementations of cryptographic algorithms against power analysis attacks on the logic gate level, where the power consumption is balanced by encoding of all data by binary vectors with a fixed number of ones and by balancing the logic gate transitions.
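A minimal sketch of one such fixed-weight encoding is given below, assuming the simplest dual-rail scheme in which each bit b is represented by the pair (b, 1 - b), so that every n-bit word maps to a 2n-bit vector containing exactly n ones. The function names are hypothetical, and the sketch does not model the gate-level balancing of transitions described in the cited application.

```python
def dual_rail_encode(x, n_bits):
    """Encode each bit b of x as the pair (b, 1 - b), least significant bit first."""
    rails = []
    for i in range(n_bits):
        b = (x >> i) & 1
        rails.extend([b, 1 - b])
    return rails


def dual_rail_decode(rails):
    """Recover the word from the first rail of every pair."""
    x = 0
    for i in range(0, len(rails), 2):
        x |= rails[i] << (i // 2)
    return x
```

Since the number of ones in the encoded vector is the same for every data value, a power model that depends only on the Hamming weight of the stored value no longer distinguishes between data values.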
In order to prevent the DPA attack on a microelectronic device implementing a cryptographic algorithm by means of a digital IC, it is sufficient to ensure that every elementary computation involving the secret information and performed by a logic gate is randomized. More precisely, the general condition to be satisfied is that the output value of each logic gate in the protected hardware design should have the same probability distribution for each given, fixed value of the secret key and input data. In other words, the output value of each logic gate in the digital IC that implements the cryptographic algorithm should be statistically independent of the secret key and input data. Here and throughout the present description this mathematical condition is referred to as the “secure computation condition”; it was first explicitly proposed in J. Golić, “DeKaRT: A new paradigm for key-dependent reversible circuits,” Cryptographic Hardware and Embedded Systems - CHES 2003, Lecture Notes in Computer Science, vol. 2779, pp. 98-112, 2003. The necessary uncertainty is provided by using purely random masks, preferably produced by a fast random number generator implemented in hardware, and integrated in the IC chip.
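For small gates, the secure computation condition can be checked exhaustively. The following Python sketch, given under the assumption that all mask bits are independent and uniformly distributed, enumerates every mask assignment and compares the resulting output distributions across all values of the secret bits. The helper names and the three example gates are hypothetical and serve only to illustrate the condition.

```python
from collections import Counter
from itertools import product


def output_distribution(gate, secret_bits, n_mask_bits):
    """Count the gate outputs over all equally likely mask assignments."""
    return Counter(
        gate(*secret_bits, *mask_bits)
        for mask_bits in product((0, 1), repeat=n_mask_bits)
    )


def satisfies_condition(gate, n_secret_bits, n_mask_bits):
    """True if the output distribution is identical for every secret value."""
    dists = [
        output_distribution(gate, secret_bits, n_mask_bits)
        for secret_bits in product((0, 1), repeat=n_secret_bits)
    ]
    return all(d == dists[0] for d in dists)


def masked_wire(x, r):
    """A wire carrying the XOR-masked bit: uniform whatever x is."""
    return x ^ r


def and_of_masked(x, y, rx, ry):
    """AND gate fed with two independently masked bits."""
    return (x ^ rx) & (y ^ ry)


def leaky_gate(x, r):
    """AND of a masked bit with its own mask: the output depends on x."""
    return (x ^ r) & r
```

In this sketch, masked_wire and and_of_masked satisfy the condition, whereas leaky_gate does not: its output is always 0 when x = 1, but equals the mask bit when x = 0.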
It is observed that a secure computation on the word level, in software, in general does not imply a secure computation on the bit level, in hardware, although the word-level security may provide more resistance to more sophisticated power analysis attacks such as, for example, the higher-order DPA attacks. In practice, the secure computation condition on the bit level is necessary for providing resistance to (the first-order) DPA attacks and is also likely to be sufficient, although individual logic gates do not achieve their final (random) values simultaneously and, in the transition stage, their output values may vary (randomly) and may depend on their previous inputs. This effect is also present in software implementations and, in fact, generally makes the power analysis of non-masked implementations more difficult, especially so for logic circuit implementations in hardware.
The masking operation that combines the data (input, output, and intermediate) with a random mask is typically adapted to the nature of the mathematical operations used in the cryptographic algorithm, because in this way the required modifications in the computations are minimized.
More precisely, let it be assumed that in some elementary computation in the cryptographic algorithm, x and y form the inputs to a logic gate, which combines these inputs into an output z by using a group operation & according to x & y = z (a group operation being, according to group theory, an operation, defined on a set, that is associative, has an identity element and is such that every element of the set has an inverse element). Using any group operation for masking is sufficient to perfectly randomize the data; thus, let it be assumed that the inputs x and y are randomized by the same group operation & and by using the random masks rx and ry, respectively. Then, provided that & is also commutative (as is the case for the operations considered below), in view of (x & rx) & (y & ry) = (x & y) & (rx & ry) = z & (rx & ry), the resulting output z is automatically randomized by the mask rz = rx & ry, so that the computation does not need to be modified.
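The identity above can be verified exhaustively for small word sizes. The following Python sketch does so for two commutative group operations, the bitwise XOR and the addition modulo 2^n, with an assumed word size of n = 4 bits; the function names are introduced here for illustration only.

```python
N = 4            # assumed small word size so that exhaustive checking is cheap
MOD = 1 << N


def xor(a, b):
    """Bitwise XOR of two N-bit words."""
    return a ^ b


def add_mod(a, b):
    """Addition modulo 2^N."""
    return (a + b) % MOD


def mask_propagates(op, x, y, rx, ry):
    """Check (x op rx) op (y op ry) == (x op y) op (rx op ry)."""
    return op(op(x, rx), op(y, ry)) == op(op(x, y), op(rx, ry))


def holds_for_all_words(op):
    """Exhaustively check the identity over all N-bit x, y, rx, ry."""
    words = range(MOD)
    return all(
        mask_propagates(op, x, y, rx, ry)
        for x in words for y in words for rx in words for ry in words
    )
```

The rearrangement of the masks relies on both associativity and commutativity, which is why the check succeeds for these two operations.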
More generally, if z=ƒ(x,y), for a given function ƒ (not necessarily a group operation), and if it is desired to obtain a masked output z & rz from masked inputs x & rx and y & ry, then the function ƒ (and thus the computations) has to be modified into a new function h determined by h(x,y)=ƒ(x & rx, y & ry) & rz, and the problem is how to compute this function h securely.
Consequently, in the masked cryptographic algorithm, only the elementary computations different from the underlying group operation &, which is exploited for the masking process, have to be modified.
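As a purely functional illustration of the modified function h, the following Python sketch builds h for single-bit XOR masking, where each mask is its own inverse, taking the AND gate as an example of ƒ. The names make_h and bit_and are hypothetical. The sketch only specifies the input/output behaviour of h: evaluating it literally, as written here, removes the masks internally and therefore is not a secure way to compute it.

```python
from itertools import product


def make_h(f, rx, ry, rz):
    """Return h acting on XOR-masked bits: h(x', y') = f(x' ^ rx, y' ^ ry) ^ rz."""
    def h(x_masked, y_masked):
        # Specification only: the unmasked operands appear inside this call.
        return f(x_masked ^ rx, y_masked ^ ry) ^ rz
    return h


def bit_and(x, y):
    """Example of a function f that is not the masking operation."""
    return x & y


def h_is_consistent(f):
    """Check that h maps masked inputs to the correctly masked output."""
    for x, y, rx, ry, rz in product((0, 1), repeat=5):
        h = make_h(f, rx, ry, rz)
        if h(x ^ rx, y ^ ry) != f(x, y) ^ rz:
            return False
    return True
```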
In general, the group operations on binary words most frequently used in cryptographic algorithms are the bitwise eXclusive OR (XOR) and the addition modulo an integer which is a power of 2. In view of the fact that the XOR of two binary values actually corresponds to their addition modulo 2, the bitwise XOR and the addition modulo 2^n of two n-bit words x and y are henceforth denoted as x +_2 y and x +_n y, respectively.
The above-cited paper by J. Golić describes a general theoretical framework for the protection against DPA attacks by XOR random masking on the logic gate (i.e., hardware) level. A hardware technique, that is, a logic circuit for XOR random masking of a 2-input multiplexer (MUX) gate, with a control input selecting which one of the two data inputs is to be taken to the output, is proposed in U.S. Pat. No. 6,295,606 B1 and is meant to be used for masking lookup table implementations of Boolean functions to be used in cryptographic algorithms. However, the secure computation condition as defined above cannot be found in U.S. Pat. No. 6,295,606 B1. Another hardware technique for random masking of logic gates is proposed in German patent No. DE 10201449 C1, but the Applicant observes that the technique is flawed as it does not satisfy the secure computation condition defined above.
In many algorithms, the x +_2 y and x +_n y operations, along with other Boolean and integer operations, are combined together for the cryptographic security. The best-known examples are the widely used cryptographic hash function SHA-1 (National Institute of Standards and Technology, FIPS Publication 180-1, Secure Hash Standard, 1994), the block cipher IDEA (X. Lai and J. Massey, “A proposal for a new block encryption standard,” Advances in Cryptology - Eurocrypt '90, Lecture Notes in Computer Science, vol. 473, pp. 398-404, 1991), and the block cipher RC6 (R. L. Rivest et al., “The RC6 block cipher,” v.1.1, August 1998, available at http://www.rsasecurity.com/rsalabs/rc6). The SHA-1 incorporates a secret key if it is used for message authentication, for example, in the so-called HMAC construction. In such algorithms, it is convenient to use both of the above-mentioned group operations (x +_2 y and x +_n y) for random masking.
The random masking based on the addition modulo 2^n is commonly called “arithmetic masking”, whereas the random masking based on the addition modulo 2 (or bitwise XOR) is commonly called “Boolean masking”.
Therefore, there is a need to convert between the two corresponding masks in a computationally secure way, that is, in a way secure against power analysis attacks such as DPA. Namely, given an n-bit data word x and an n-bit random masking word (random mask) r, the problem is to compute securely x +_n r starting from x +_2 r, and vice versa.
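To make the problem concrete, the following Python sketch shows the two kinds of masking for an assumed word size of n = 8 bits, together with the direct conversion from Boolean to arithmetic masking. The function names are hypothetical. The direct conversion is functionally correct, but it is exactly what has to be avoided: its intermediate value is the unmasked word x, which is returned here only to make the leak visible.

```python
N = 8                   # assumed word size for this illustration
MASK = (1 << N) - 1     # reduces results modulo 2^N


def boolean_mask(x, r):
    """Boolean masking: bitwise XOR of x with the random mask r."""
    return x ^ r


def arithmetic_mask(x, r):
    """Arithmetic masking: addition of x and r modulo 2^N."""
    return (x + r) & MASK


def naive_boolean_to_arithmetic(x_boolean, r):
    """Direct conversion: correct result, but NOT secure."""
    exposed = x_boolean ^ r            # the unmasked x appears here
    converted = (exposed + r) & MASK   # re-mask arithmetically
    return converted, exposed
```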
Previously proposed solutions to the mask conversion problem are essentially software-oriented rather than hardware-oriented, meaning that the elementary computations considered are based on words rather than individual bits. According to them, it appears that the conversion from arithmetic masking to Boolean masking is inherently more difficult than the conversion from Boolean masking to arithmetic masking. More precisely, in L. Goubin, “A sound method for switching between Boolean and arithmetic masking,” Cryptographic Hardware and Embedded Systems - CHES 2001, Lecture Notes in Computer Science, vol. 2162, pp. 3-15, 2001, two solutions are proposed: one for the conversion from Boolean to arithmetic masking, and the other for the conversion from arithmetic to Boolean masking. The first solution requires seven n-bit word operations and an auxiliary n-bit random masking word, namely, five bitwise XOR operations and two subtractions modulo 2^n. The second solution is much less efficient and requires 5(n+1) n-bit word operations and an auxiliary n-bit random masking word. For comparison, note that the direct conversion of the masks can be achieved by one +_2 and one +_n n-bit word operation, but is not computationally secure.
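The first of these solutions can be sketched in Python as follows. The sketch is a reconstruction of the word-level procedure as commonly presented for the cited paper by L. Goubin, not a verbatim transcription, and it uses five XOR operations and two subtractions as stated above. It assumes the convention in which the arithmetic share A satisfies x = A + r modulo 2^n (that is, A = x - r), which differs in sign from the x +_n r notation used in the present description. The function name, the 8-bit word size, and the optional gamma parameter (added only to allow deterministic testing) are assumptions.

```python
import secrets

N = 8                   # assumed word size for this illustration
MASK = (1 << N) - 1     # reduces results modulo 2^N


def boolean_to_arithmetic(x_masked, r, gamma=None):
    """Convert x' = x XOR r into A = x - r (mod 2^N) without computing x.

    gamma is the auxiliary random word; it is drawn afresh unless supplied.
    """
    if gamma is None:
        gamma = secrets.randbits(N)
    t = x_masked ^ gamma        # XOR 1
    t = (t - gamma) & MASK      # subtraction 1
    t ^= x_masked               # XOR 2
    gamma ^= r                  # XOR 3
    a = x_masked ^ gamma        # XOR 4
    a = (a - gamma) & MASK      # subtraction 2
    a ^= t                      # XOR 5
    return a
```

The procedure relies on the fact that, for fixed x', the map r -> (x' XOR r) - r is affine over GF(2), so that it can be evaluated on the randomized arguments gamma and gamma XOR r and the two results combined by XOR. This is a word-level sketch only and makes no claim about security on the logic gate level.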
Another software-oriented solution for the conversion from arithmetic masking to Boolean masking is proposed in J.-S. Coron and A. Tchulkine, “A new algorithm for switching from arithmetic to Boolean masking,” Cryptographic Hardware and Embedded Systems - CHES 2003, Lecture Notes in Computer Science, vol. 2779, pp. 89-97, 2003. The proposed solution requires certain precomputation and storage and some auxiliary random masking bits, but can be more efficient than the solution described above, depending on the processor word size.