This invention relates to a method for executing a cryptographic computation employing a cryptographic key, which is protected against spying out of the key via side-channel attacks.
Cryptographic computations are executed e.g. by general processors (CPUs), alternatively often by crypto-coprocessors, which are special processors associated with the general processors. In particular, chip cards for funds transfer applications or mobile radio applications frequently have processors (CPUs) with crypto-coprocessors. Many chip cards for funds transfer applications or mobile radio applications have crypto-coprocessors specifically designed for DES or AES (see next paragraph).
By a cryptographic computation, input data are processed to output data employing a secret key, e.g. plaintext data (input data) are encrypted to cipher data (output data) with a key or, conversely, cipher data (input data) decrypted to plaintext data (output data) with a key. Examples of symmetric (encryption key=decryption key) cryptographic computations are the algorithms DES (Data Encryption Standard) and AES (Advanced Encryption Standard).
In AES, which is subdivided into several rounds (e.g. 10, 12 or 14), the input data and the key are subdivided into blocks and processed block-wise. Within each round the data are processed byte-wise, so that, in detail, an input data byte (plaintext byte in encryption, or ciphertext byte in decryption) is respectively processed with a key byte.
Thus, e.g. in an AES encryption, which is shown schematically in extracts in FIG. 2, a plaintext P is encrypted to a ciphertext C with a key K. The plaintext P=pp . . . ppp consists of a sequence of plaintext bytes p, and the key K=kk . . . kkk of a sequence of key bytes k. At the input of each round (shown in extracts in FIG. 2 are the first two rounds R=1 and R=2) a sub-computation is executed, with the aid of which a DPA attack will be described later by way of example. In the sub-computation a plaintext byte p is XORed (p⊕k) with a key byte k, the result of the XORing inserted into a (substitution) table S (S-box), and an intermediate value x=S[p⊕k] thereby computed through a table access to the table S. The intermediate value x is fed within the round to further computation steps such as ShiftRow, MixColumn, which are not shown in FIG. 2. Key expansions and selection of key bits that are executed in AES are likewise not shown in FIG. 2. Table accesses are particularly endangered by DPA attacks.
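The sub-computation x=S[p⊕k] described above can be sketched in Python as follows. This is a minimal illustration, not the implementation shown in FIG. 2: the helper names (gf_mul, gf_inv, build_sbox, sub_computation) are assumptions introduced here, and the S-box is regenerated from its published definition (inversion in GF(2^8) followed by an affine transformation) instead of being stored as a table.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo the AES polynomial 0x11B."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def gf_inv(a):
    """Brute-force multiplicative inverse; 0 is mapped to 0 as in AES."""
    if a == 0:
        return 0
    return next(b for b in range(1, 256) if gf_mul(a, b) == 1)


def rotl8(value, shift):
    """Rotate a byte left by `shift` bits."""
    return ((value << shift) | (value >> (8 - shift))) & 0xFF


def build_sbox():
    """Build the 256-entry AES S-box: inversion, then affine transformation."""
    sbox = []
    for a in range(256):
        b = gf_inv(a)
        sbox.append(b ^ rotl8(b, 1) ^ rotl8(b, 2) ^ rotl8(b, 3) ^ rotl8(b, 4) ^ 0x63)
    return sbox


SBOX = build_sbox()


def sub_computation(p, k):
    """Intermediate value x = S[p XOR k] for one plaintext byte and one key byte."""
    return SBOX[p ^ k]
```

With the first plaintext byte 0x32 and first key byte 0x2b of the worked example in the AES standard, sub_computation returns 0xd4.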
Since the power consumption of a processor (e.g. in a chip card) depends on the processed data, cryptographic computations implemented on processors are susceptible to side-channel attacks, in which the time-resolved power consumption of the processor during the execution of the computation is measured. Usually, the power consumption is more precisely dependent on the Hamming weight of the data, i.e. on the number of ones in the data in the binary representation. The power consumption of the processor during the execution of the computation, plotted against the time elapsed during computation, is designated a power curve. Power curves of a processor for a computation are recorded for example by means of an oscilloscope.
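The Hamming-weight dependence can be illustrated with a toy leakage model. The sketch below assumes that one sample of the power curve equals the Hamming weight of the processed byte plus Gaussian noise; the function names and the noise model are illustrative assumptions, not measured behavior of any processor.

```python
import random


def hamming_weight(value):
    """Number of ones in the binary representation of value."""
    return bin(value).count("1")


def simulated_power_sample(value, noise_sigma, rng):
    """Toy model of one power-curve sample: Hamming weight plus Gaussian noise."""
    return hamming_weight(value) + rng.gauss(0.0, noise_sigma)
```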
In a DPA (differential power analysis) attack, sometimes also called a correlation power attack (CPA), a plurality of power curves (e.g. about 1000) are always recorded and synchronized for the same computation. In the computation, output data are computed with known input data and a secret key. For each possible value of the key a time-resolved correlation curve is computed between the synchronized power curves and the Hamming weight HW of the output data achieved with the respective key. The correlation curves for wrong keys consist of a more or less uniform noise, similar to the correlation curve shown in FIG. 1b. The correlation curve for the key with which the computation was executed has at a previously unknown time a statistical inconsistency in the form of a peak, similar to the correlation curve shown in FIG. 1a. 
In a DPA attack of a higher (second, third, fourth, . . . ) order, power curves are recorded at several (two, three, four, . . . ) times in the time course of the computation. [PRB10] describes a second-order DPA attack.
In a DPA attack on the sub-computation of AES from FIG. 2, a plaintext byte p of the known plaintext P is XORed (p⊕k) with a key byte k of the secret key K, the result of the XORing inserted into the 256-byte substitution table S (S-box), and the intermediate value x=S[p⊕k] thereby computed through table access. The attacker records about 1000 power curves of the sub-computation, synchronizes them and computes for each possible value 0, . . . , 255 of the key byte k the correlation curve between the synchronized measured power curves and the Hamming weight HW(x) of the intermediate value x computed with the respective key byte k. The correlation curves for wrong key bytes consist of a more or less uniform noise, similar to the correlation curve shown in FIG. 1b. The correlation curve for the right key byte with which the computation was executed has a peak at a previously unknown time, similar to the correlation curve shown in FIG. 1a. This method is executed with each key byte k of the key K until the key K is reconstructed byte by byte.
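The key-byte search described above can be sketched as a small simulation. The following is an illustration under loudly stated assumptions: each power curve is reduced to a single sample (so no synchronization is needed), the leakage is the Hamming weight of the intermediate value plus Gaussian noise, and a seeded random permutation (STAND_IN_SBOX) stands in for the AES S-box. All names are hypothetical.

```python
import math
import random


def hamming_weight(value):
    return bin(value).count("1")


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def recover_key_byte(plaintext_bytes, leakages, sbox):
    """Try all 256 key-byte guesses and keep the one with the highest correlation."""
    best_guess, best_corr = None, -2.0
    for guess in range(256):
        predicted = [hamming_weight(sbox[p ^ guess]) for p in plaintext_bytes]
        corr = pearson(predicted, leakages)
        if corr > best_corr:
            best_guess, best_corr = guess, corr
    return best_guess, best_corr


rng = random.Random(1)
STAND_IN_SBOX = list(range(256))  # assumption: random permutation, not the AES S-box
rng.shuffle(STAND_IN_SBOX)
SECRET_KEY_BYTE = 0x3C            # known only to the simulated device

# About 1000 simulated measurements with known plaintext bytes.
plaintext_bytes = [rng.randrange(256) for _ in range(1000)]
leakages = [
    hamming_weight(STAND_IN_SBOX[p ^ SECRET_KEY_BYTE]) + rng.gauss(0.0, 1.0)
    for p in plaintext_bytes
]

recovered, peak = recover_key_byte(plaintext_bytes, leakages, STAND_IN_SBOX)
```

In a real attack the correlation is computed separately for every sample time of the power curves, and the peak appears at the time at which the table access is executed.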
According to in-house unpublished prior art, “00/ff masking” is provided as a countermeasure against DPA attacks. Here, an intermediate result of a sub-computation is randomly computed in the computation either directly, so that the intermediate result itself is generated, or in a complemented manner, so that the one's complement of the intermediate result is generated. As needed, the input data (plaintext or ciphertext) and the key must in so doing be complemented, and the output data (ciphertext or plaintext) be complemented at the output of the computation. For example, the computation is randomly so executed that either a sub-computation x=S[p⊕k] is executed, or a complementing sub-computation x̄=S′[p⊕k] is executed, with a one's complemented S-box S′[x]=S[x]⊕0xff and one's complemented plaintext bytes and key bytes p̄, k̄, the overbar denoting the one's complement. If about 1000 executions of the cryptographic computation are executed for an attack, computing is done statistically in respectively one half with the intermediate value x and with the one's complement x̄. The Hamming weight HW(x̄) of the complementing sub-computation x̄=S′[p⊕k] is subject to the formula HW(x̄)=8−HW(x), in which the Hamming weight HW(x) of the non-complemented computation occurs with a negative sign. Therefore, correlations between the power curves and the computational result achieved with the key cancel each other out on average upon randomization. There thus results for randomized computing with the intermediate value x=S[p⊕k] or the complemented intermediate value x̄=S′[p⊕k], even upon computing with the right key byte, a correlation curve like that shown in FIG. 1b, as arises without countermeasures only for a wrongly guessed key byte. Execution of the computation randomly with the intermediate value x=S[p⊕k] or the complemented intermediate value x̄=S′[p⊕k] is thus resistant to the above-described DPA side-channel attack with correlation computation. Moreover, 00/ff masking is very memory-saving.
The one's complement of a value is electively computed by XORing the value with hexadecimal FF (0xff). In this case the value itself is electively made available by XORing with 0 at the same place in the course of the computation at which, for the one's complement, the XORing with 0xff is done. XORing with FF complements a value; XORing with zero leaves the value unchanged. Because an XORing is executed in both cases, it is concealed when the value is employed and when the one's complement of the value is employed. This way of executing the randomization results in the name “00/ff masking”.
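The 00/ff mechanism and its effect on the Hamming weight can be checked with a few lines of Python. This is a sketch assuming byte-sized values and a noise-free Hamming-weight leakage; mask_00ff and mean_leakage are hypothetical names introduced for illustration.

```python
import random


def hamming_weight(value):
    return bin(value).count("1")


def mask_00ff(value, rng):
    """XOR the byte with a randomly chosen mask 0x00 (unchanged) or 0xff (complemented)."""
    mask = rng.choice((0x00, 0xFF))
    return value ^ mask, mask


def mean_leakage(value):
    """Average Hamming weight over both equally likely masks."""
    return (hamming_weight(value ^ 0x00) + hamming_weight(value ^ 0xFF)) / 2
```

Because the Hamming weight of the complement of a byte is 8 minus the Hamming weight of the byte, mean_leakage yields the same value 4 for every byte, so on average the leakage carries no linear dependence on the intermediate value for a correlation to pick up.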
In a varied DPA attack, another statistic is computed instead of the correlation, for example the variance or a one-dimensional Kolmogorov-Smirnov statistic. With such a statistic, the statistical inconsistencies do not average out upon employment of the right key byte, even when computing is done randomly with the intermediate value x=S[p⊕k] or the complemented intermediate value x̄=S′[p⊕k]. Thus, it is nevertheless recognizable when computing is done with the right key.
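Why the variance still reveals the intermediate value can be seen in a noise-free toy model. The sketch below assumes, as an illustration only, that the leakage equals the Hamming weight and that the value and its complement occur with equal probability; leakage_variance is a hypothetical helper.

```python
def hamming_weight(value):
    return bin(value).count("1")


def leakage_variance(value):
    """Variance of the Hamming-weight leakage when value or its complement is processed."""
    samples = (hamming_weight(value), hamming_weight(value ^ 0xFF))
    mean = sum(samples) / 2
    return sum((s - mean) ** 2 for s in samples) / 2
```

In this model the variance equals (HW(x)−4)², for example 16 for x=0x00, 9 for x=0x01 and 0 for x=0x0f. It therefore depends on the intermediate value even though the mean does not.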
A further countermeasure against DPA attacks consists in XOR masking, which is described for example in DE 198 22 217 A1, and wherein input data E′=E⊕r XORed with a random number r are processed instead of input data E.
In XOR masking it is important that all intermediate values are always masked. If masking is done carelessly this is not the case. If in AES an encryption of a masked plaintext byte p′=p⊕r is executed with a masked key byte k′=k⊕r in a sub-computation, e.g. that from FIG. 2, this yields (p⊕r)⊕(k⊕r)=p⊕k=x, i.e. an unmasked encryption result or intermediate result x. Masking input data and key differently, for example, ensures that all intermediate results are masked and unmasked intermediate results never occur. For example, the input data (plaintext or ciphertext) are masked with two random numbers r, s, the key is masked with only one random number r, and a compensation masking is executed. For example, the masking is executed according to (((p⊕r)⊕s)⊕(k⊕r))⊕(r⊕s)=p⊕k⊕r=x⊕r=xXOR. The plaintext byte p is thus masked with r and s, the key byte is masked only with r, and moreover the compensation masking is executed with both random numbers r, s. Therefore, an intermediate result xXOR=x⊕r masked with r appears instead of the intermediate result x, cf. FIG. 3. A computation with careful XOR masking is resistant to the above-described DPA attack, regardless of the employed statistic (e.g. correlation, variance, one-dimensional Kolmogorov-Smirnov statistic). The intermediate result of the sub-computation from FIG. 2 is computed according to S′[x]=S[x⊕r]⊕r. It is evident from this that a separate substitution table (S-box) S′ is required for each value of the random number r. Electively, a substitution table (S-box) S′ masked with r is stored (e.g. in the chip card or the processor) for every possible value of the random number r, e.g. 256 different 256-byte tables for a 256-byte substitution table (S-box). Alternatively, the substitution table (S-box) is only computed after the specification of the random number r, preferably in a working memory RAM associated with the processor.
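The compensation masking and the masked table S′[x]=S[x⊕r]⊕r can be sketched as follows. This is an illustration under assumptions: a seeded random permutation (STAND_IN_SBOX) stands in for the S-box, since the masking identity holds for any 256-entry table, and the function names are hypothetical.

```python
import random

rng = random.Random(7)
STAND_IN_SBOX = list(range(256))  # assumption: random permutation, not the AES S-box
rng.shuffle(STAND_IN_SBOX)


def masked_table(sbox, r):
    """Build S'[x] = S[x XOR r] XOR r for one value of the random number r."""
    return [sbox[x ^ r] ^ r for x in range(256)]


def masked_index(p, k, r, s):
    """Compute p XOR k XOR r in the order of the formula, never forming p XOR k."""
    masked_p = (p ^ r) ^ s   # plaintext byte masked with r and s
    masked_k = k ^ r         # key byte masked with r only
    return (masked_p ^ masked_k) ^ (r ^ s)   # compensation masking


def masked_sub_computation(p, k, r, s, table):
    """Table access with a masked index; returns S[p XOR k] XOR r."""
    return table[masked_index(p, k, r, s)]
```

For a given r the table is built once with masked_table(STAND_IN_SBOX, r); removing the mask from the result with a final XOR by r yields the unmasked table value S[p⊕k].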
However, this costs computing time and possibly memory space in the working memory RAM. In contrast, in the course of the execution of the algorithm, e.g. of the AES, once the random number is specified the same substitution table is always employed. In the example from FIG. 2 the same substitution table is thus employed for all plaintext bytes p of the plaintext P and key bytes k of the key K. Therefore, in the sub-computation from FIG. 2, executed for different plaintext bytes p, p′ and key bytes k, k′, i.e. at different times t, t′ in AES, the computed intermediate values xXOR=S[p⊕k]⊕r and x′XOR=S[p′⊕k′]⊕r are masked by the same random number r, which is a weak point.
The fact that in XOR masking the same table (e.g. AES S-box) is employed in all table calls can be utilized for a higher-order DPA attack, as is described for AES e.g. in [PRB10]. In the second-order DPA attack from [PRB10] the power consumption is measured at two times t, t′ and normalized, yielding the normalized values j(t), j(t′). The correlation between the product j(t)·j(t′) and the Hamming weight of the value x⊕x′=S[p⊕k]⊕S[p′⊕k′] is computed for every possible key pair k, k′ and every pair of times t, t′. This value does not depend on the random number r, since xXOR⊕x′XOR=x⊕x′. For the right key there results a correlation curve like that shown in FIG. 1a, with the significant peak at one time. The computation effort for the second-order attack from [PRB10] rises quadratically with the number of times and keys to be taken into consideration and is thus considerable.
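The weak point exploited by the second-order attack, namely that the common mask r drops out of the XOR of two masked intermediate values, can be checked directly. The sketch below is illustrative: it uses a seeded random permutation as a stand-in S-box, takes the normalization to be a mean subtraction (an assumption, not a statement about [PRB10]), and all names are hypothetical.

```python
import random


def hamming_weight(value):
    return bin(value).count("1")


rng = random.Random(3)
STAND_IN_SBOX = list(range(256))  # assumption: random permutation, not the AES S-box
rng.shuffle(STAND_IN_SBOX)


def masked_value(p, k, r, sbox):
    """Masked intermediate value S[p XOR k] XOR r."""
    return sbox[p ^ k] ^ r


def second_order_target(p, k, p2, k2, sbox):
    """Hamming weight of x XOR x', predictable from a key-pair guess without r."""
    return hamming_weight(sbox[p ^ k] ^ sbox[p2 ^ k2])


def centered_product(samples_t, samples_t2):
    """Combine the samples at two times into products of mean-free values."""
    mean_t = sum(samples_t) / len(samples_t)
    mean_t2 = sum(samples_t2) / len(samples_t2)
    return [(a - mean_t) * (b - mean_t2) for a, b in zip(samples_t, samples_t2)]
```

The attacker correlates the combined samples with second_order_target for every guess of the key pair k, k′, which is what makes the effort grow quadratically.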
A further countermeasure against DPA attacks, which is described e.g. in WO 01/31422 A2, is affine masking, which is a variation of XOR masking and aims to protect a cryptographic computation against higher-order DPA attacks. In affine masking according to WO 01/31422 A2 the masking of data is varied: data already masked with random data are masked, before each utilization, with further random data that are randomly selected for each utilization. As data there can be provided for example secret data (e.g. keys) or intermediate data. The sub-computation from FIG. 2 with affine masking delivers instead of an intermediate result x an intermediate result xα=a·x⊕r.
EP 2 302 552 A1 describes a method for executing a cryptographic algorithm, which is protected by an affine masking, with the aim of warding off higher-order (higher than 1) DPA attacks. In so doing, secret data or intermediate results x of the algorithm are masked employing an invertible binary random matrix R and a random number r according to m(x)=R·x⊕r.
Computations with affine masking are also subject to DPA attacks. The latter are based on the connection that if x=x′ for two intermediate values x, x′ of the computation, a·x⊕r=a·x′⊕r as well, and vice versa. By forming a correlation or by other statistical methods over power curves that were recorded on an affinely masked computation it can be established whether or not two masked intermediate values xα, x′α match. The unmasked intermediate values x, x′ then accordingly match or do not.
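The property that affine masking preserves equality of intermediate values can be illustrated as follows. The sketch assumes that the product a·x is a multiplication in GF(2^8) with the AES reduction polynomial and that a is non-zero; both are assumptions made for illustration and are not taken from WO 01/31422 A2 or EP 2 302 552 A1. The names are hypothetical.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo the AES polynomial 0x11B (assumed field)."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def affine_mask(x, a, r):
    """Affinely masked value a*x XOR r; a must be non-zero so the mask can be removed."""
    if a == 0:
        raise ValueError("multiplicative mask a must be non-zero")
    return gf_mul(a, x) ^ r


def masked_values_match(x1, x2, a, r):
    """True exactly when the two unmasked values x1 and x2 are equal."""
    return affine_mask(x1, a, r) == affine_mask(x2, a, r)
```

Since multiplication by a non-zero a is a bijection, two masked values coincide exactly when the unmasked values coincide, whatever a and r are.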
A conceivable, comparatively secure measure against higher-order DPA attacks is to decompose each intermediate value into several parts. An example of this is second-order XOR masking, in which for each intermediate value x two random numbers r, s are specified and computations are always executed only with x⊕r⊕s, r and s, but not with the unmasked value x or simply masked values x⊕r, x⊕s, r⊕s. Such a decomposition considerably increases the effort for implementing the computation and the computing time of the computation.
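The decomposition into three parts can be sketched as follows. This is an illustration with hypothetical function names; note that the splitting step necessarily touches the unmasked value x once, whereas a protected implementation would keep all values in decomposed form from the outset.

```python
import random


def split_second_order(x, rng):
    """Decompose the byte x into the three parts (x XOR r XOR s, r, s)."""
    r = rng.randrange(256)
    s = rng.randrange(256)
    return (x ^ r ^ s, r, s)


def recombine(parts):
    """Recover x by XORing all three parts (for checking only)."""
    first, r, s = parts
    return first ^ r ^ s


def xor_constant(parts, constant):
    """XOR a constant into the decomposed value by touching only one part."""
    first, r, s = parts
    return (first ^ constant, r, s)
```

Linear steps such as xor_constant act on a single part, whereas non-linear steps such as table accesses need considerably more work on all parts, which is where the additional effort arises.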
For a computation in which masked tables (e.g. XOR-masked ones) are employed, for example AES with the S-boxes as tables, there are generic measures against DPA attacks, whereby table accesses are concealed. For example, before each table call from a table the table can be newly generated in the working memory (RAM) of the computing processor, with a new (XOR) masking with a new masking parameter, e.g. with a new random number. In so doing, at each individual table call a new masking operation must be executed with which e.g. a table like that shown in FIG. 4 is converted to a table like that shown in FIG. 10. In a conventional XOR masking, as stated hereinabove, an XOR-masked table is computed at the beginning of the computation (e.g. AES), and the same table is employed for all table calls within the computation (e.g. AES). Computing a new XOR-masked table for each table call, however, increases the computation effort to such an extent that the runtime of the computation becomes unacceptable for e.g. chip cards.
Alternatively, for each table call, to establish a single table value, all table entries can be evaluated and only the right one be kept. The computing time of the computation can increase to about the hundredfold value through such generic measures, which is unacceptable e.g. for chip cards.
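Evaluating all table entries and keeping only the right one can be sketched as below. The code is an illustration only: scan_lookup is a hypothetical name, byte-sized tables of 256 entries are assumed, and Python gives no guarantee of constant execution time, so the sketch shows the access pattern rather than a hardened implementation.

```python
def scan_lookup(table, index):
    """Read every entry of a 256-entry byte table and keep only table[index]."""
    result = 0
    for position, entry in enumerate(table):
        difference = position ^ index
        # is_match is 1 only when difference == 0 (difference - 1 is then negative).
        is_match = ((difference - 1) >> 8) & 1
        # Select mask 0xff for the matching entry and 0x00 for all others.
        result |= entry & (-is_match & 0xFF)
    return result
```

Each table call thus costs 256 entry evaluations instead of one, which illustrates why such generic measures multiply the computing time.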
The article [SP06] describes maskings for tables against higher-order DPA attacks, which are suitable for AES tables and other tables. The article [RDP08] describes measures for protecting tables against higher-order side-channel attacks. For the sake of completeness, the articles [GPQ11] and [KHL11] are mentioned, which describe special methods for concealing table accesses in AES. [GPQ11] describes inversion by means of an exponentiation employing the equality x^(−1)=x^254 in F256^×. [KHL11] describes transforming an XOR masking into a multiplicative masking before inversion and retransforming after inversion employing (a·x)^(−1)=a^(−1)·x^(−1), as well as reducing the inversion in F256 to the subfield F16 and the inversion of 2×2 matrices over F16.
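The two equalities mentioned for inversion can be checked numerically. The sketch below assumes that F256 is represented with the AES reduction polynomial and uses plain square-and-multiply; it is not the method of [GPQ11] or [KHL11], only a check of the identities, and the function names are hypothetical.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo the AES polynomial 0x11B (assumed field)."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def gf_pow(x, exponent):
    """Square-and-multiply exponentiation in GF(2^8)."""
    result = 1
    while exponent:
        if exponent & 1:
            result = gf_mul(result, x)
        x = gf_mul(x, x)
        exponent >>= 1
    return result


def gf_inverse(x):
    """Inverse as x^254; maps 0 to 0, as the AES S-box requires."""
    return gf_pow(x, 254)
```

The second equality is what allows a multiplicative mask a to pass through the inversion: inverting a·x gives the inverse of x masked with the inverse of a.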