The present invention relates to a countermeasure method in an electronic component using a secret-key cryptographic algorithm. Such components are used in applications where access to services or data is strictly controlled. They have an architecture formed around a microprocessor and memories, including a program memory which contains the secret key.
These components are notably used in smart cards, for certain applications thereof. Examples are applications concerning access to certain databanks, banking applications, and remote payment applications, for example for television, petrol dispensing or passing through motorway tolls.
These components or cards therefore use a secret-key cryptographic algorithm, the best known of which is the DES (Data Encryption Standard) algorithm. Other secret-key algorithms exist, such as the RC5 algorithm or the COMP128 algorithm. This list is of course not exhaustive.
In general terms, the function of these algorithms is to calculate an enciphered message from a message applied as an input (to the card) by a host system (server, bank dispenser, etc.) and from the secret key contained in the card, and to supply this enciphered message in return to the host system. This enables the host system, for example, to authenticate the component or the card, to exchange data, etc.
However, it has become clear that these components or cards are vulnerable to attacks which consist of a differential analysis of current consumption and which enable ill-intentioned third parties to find the secret key. These attacks are known as DPA attacks, DPA being the acronym for Differential Power Analysis.
The principle of these DPA attacks is based on the fact that the current consumption of the microprocessor executing instructions varies according to the data being manipulated.
Notably, an instruction of the microprocessor manipulating a data bit generates two different current profiles depending on whether the bit is equal to “1” or “0”. Typically, if the instruction is manipulating a “0”, there is, at this moment of execution, a first magnitude of the current consumed and, if the instruction is manipulating a “1”, there is a second magnitude of the current consumed, different from the first.
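By way of purely illustrative example, the data dependence described above can be sketched with a toy model. The magnitudes `PROFILE0` and `PROFILE1`, the noise level `NOISE_STD` and the function `current_sample` are assumptions introduced for this sketch and do not come from any real component. The sketch suggests that, even when the noise on a single reading exceeds the gap between the two magnitudes, averaging many readings makes the gap visible.

```python
import random
import statistics

# Hypothetical current magnitudes (arbitrary units), chosen only for illustration.
PROFILE0 = 1.00   # instruction manipulating a "0"
PROFILE1 = 1.20   # instruction manipulating a "1"
NOISE_STD = 0.50  # assumed measurement noise, deliberately larger than the gap


def current_sample(bit, rng):
    """Return one noisy current reading for an instruction manipulating `bit`."""
    base = PROFILE1 if bit else PROFILE0
    return base + rng.gauss(0.0, NOISE_STD)


rng = random.Random(1)
mean0 = statistics.fmean(current_sample(0, rng) for _ in range(20000))
mean1 = statistics.fmean(current_sample(1, rng) for _ in range(20000))
print(f"mean for bit 0: {mean0:.3f}, mean for bit 1: {mean1:.3f}")
```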
The characteristics of the cryptographic algorithms are known: calculations made, parameters used. The only unknown is the secret key contained in the program memory. This cannot be deduced solely from knowledge of the message applied as an input and the enciphered message supplied in return.
However, in a cryptographic algorithm, certain calculated data depend solely on the message applied in clear at the input to the card and on the secret key contained in the card. Likewise, other data calculated in the algorithm can be recalculated solely from the enciphered message (generally supplied in clear at the output from the card to the host system) and the secret key contained in the card. More precisely, each bit of these particular data can be determined from the input or output message and from a limited number of particular bits of the key.
Thus, to each bit of a particular data item there corresponds a subkey formed by a particular group of bits of the key.
The bits of these particular data which can be predicted are hereinafter referred to as target bits.
The basic idea of the DPA attack is thus to use the difference in current consumption profile of an instruction depending on whether it is manipulating a “1” or “0” and the possibility of calculating a target bit through the instructions of the algorithm from a known input or output message and a hypothesis on the corresponding subkey.
The principle of the DPA attack is therefore to test a given subkey hypothesis by applying a Boolean selection function to a large number of current measurement curves, each relating to an input message known to the attacker. This function depends on the subkey hypothesis and is defined, for each curve, by the value predicted for a target bit.
By forming a hypothesis on the subkey concerned, it is in fact possible to predict the value “0” or “1” which this target bit will take for a given input or output message.
It is then possible to apply, as a Boolean selection function, the value “0” or “1” predicted for the target bit under the subkey hypothesis in question, in order to sort these curves into two packets: a first packet contains the curves which have seen the manipulation of the target bit at “0” and a second packet contains the curves which have seen the manipulation of the target bit at “1” according to the subkey hypothesis. By taking the average current consumption in each packet, a mean consumption curve M0(t) for the first packet and a mean consumption curve M1(t) for the second packet are obtained.
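The sorting and averaging just described can be sketched as follows. This is a minimal sketch under stated assumptions: curves are taken to be equal-length lists of samples, the helper names `mean_curve` and `sort_and_average` are hypothetical, and the four toy curves contain made-up values.

```python
def mean_curve(curves):
    """Point-by-point average of a list of equal-length curves."""
    count = len(curves)
    return [sum(samples) / count for samples in zip(*curves)]


def sort_and_average(curves, predicted_bits):
    """Split curves by the predicted target bit and return (M0, M1)."""
    packet0 = [c for c, b in zip(curves, predicted_bits) if b == 0]
    packet1 = [c for c, b in zip(curves, predicted_bits) if b == 1]
    if not packet0 or not packet1:
        raise ValueError("both packets need at least one curve")
    return mean_curve(packet0), mean_curve(packet1)


# Four toy curves of three samples each; the values are made up.
curves = [
    [1.0, 1.0, 2.0],
    [1.0, 3.0, 2.0],
    [3.0, 2.0, 4.0],
    [1.0, 4.0, 2.0],
]
predicted_bits = [0, 1, 0, 1]  # Boolean selection function, one value per curve
m0, m1 = sort_and_average(curves, predicted_bits)
print(m0)  # [2.0, 1.5, 3.0]
print(m1)  # [1.0, 3.5, 2.0]
```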
If the subkey hypothesis is correct, the first packet actually contains all the curves amongst the N curves which have seen the manipulation of the target bit at “0” and the second packet actually contains all the curves amongst the N curves which have seen the manipulation of the target bit at “1”. The mean consumption curve M0(t) of the first packet then takes a mean value everywhere except at the moments of execution of the critical instructions, where it has a current consumption profile characteristic of the manipulation of the target bit at “0” (profile0). In other words, for all these curves all the bits manipulated have had as many chances of equalling “0” as of equalling “1”, except for the target bit, which has always had the value “0”. This can be written:

$$M_0(t) = \left[\frac{\text{profile0} + \text{profile1}}{2}\right]_{t \neq t_{ci}} + \left[\text{profile0}\right]_{t_{ci}}$$

that is to say

$$M_0(t) = \left[V_m\right]_{t \neq t_{ci}} + \left[\text{profile0}\right]_{t_{ci}}$$

where $t_{ci}$ (tci) represents the critical moments, at which a critical instruction has been executed, and $V_m$ (Vm) is the mean value (profile0+profile1)/2.
Likewise, the mean consumption curve M1(t) of the second packet corresponds to a mean consumption everywhere except at the moments of execution of the critical instructions, with a current consumption profile characteristic of the manipulation of the target bit at “1” (profile1). It is possible to write:

$$M_1(t) = \left[\frac{\text{profile0} + \text{profile1}}{2}\right]_{t \neq t_{ci}} + \left[\text{profile1}\right]_{t_{ci}}$$

that is to say

$$M_1(t) = \left[V_m\right]_{t \neq t_{ci}} + \left[\text{profile1}\right]_{t_{ci}}$$
It has been seen that the two profiles profile0 and profile1 are not equal. The difference between the curves M0(t) and M1(t) then gives a signal DPA(t) whose magnitude is equal to profile0-profile1 at the critical moments tci of execution of the critical instructions manipulating this bit, that is to say, in the example depicted in FIG. 1, at the points tc0 to tc6 and whose magnitude is approximately equal to zero outside the critical moments.
If the subkey hypothesis is wrong, the sorting does not correspond to reality. Statistically, there are then in each packet as many curves which have actually seen the manipulation of the target bit at “0” as there are curves which have seen the manipulation of the target bit at “1”. The resulting mean curve M0(t) is then situated around a mean value given by (profile0+profile1)/2=Vm, since, for each of the curves, all the bits manipulated, including the target bit, have as many chances of equalling “0” as of equalling “1”.
The same reasoning on the second packet leads to a mean current consumption curve M1(t) whose magnitude is situated around a mean value given by (profile0+profile1)/2=Vm.
The signal DPA(t) supplied by the difference M0(t)−M1(t) is in this case substantially equal to zero. The signal DPA(t) in the case of a wrong subkey hypothesis is depicted in FIG. 2.
Thus the DPA attack exploits the difference in the current consumption profile during the execution of an instruction according to the value of the bit manipulated, in order to effect a sorting of current consumption curves according to a Boolean selection function for a given subkey hypothesis. By effecting a differential analysis of the mean current consumption between the two packets of curves obtained, an information signal DPA(t) is obtained.
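The two cases discussed above (a sorting which corresponds to reality and one which does not) can be compared on simulated curves. The following is only a sketch under assumptions: the magnitudes, the noise level, the number of samples `T`, the position `CRITICAL` of the single critical instant and all function names are hypothetical. A wrong hypothesis is modelled here, as in the reasoning above, by a selection that is statistically unrelated to the bit actually manipulated.

```python
import random

PROFILE0, PROFILE1 = 1.00, 1.20  # hypothetical magnitudes, arbitrary units
T, CRITICAL, N = 8, 5, 4000      # samples per curve, critical instant, number of curves

rng = random.Random(7)


def level(bit):
    """Noise-free current magnitude for an instruction manipulating `bit`."""
    return PROFILE1 if bit else PROFILE0


def make_curve(target_bit):
    """One noisy curve: random bits everywhere except at the critical instant."""
    curve = [level(rng.getrandbits(1)) + rng.gauss(0, 0.1) for _ in range(T)]
    curve[CRITICAL] = level(target_bit) + rng.gauss(0, 0.1)
    return curve


def dpa_signal(curves, selection):
    """DPA(t) = M0(t) - M1(t) for curves sorted by the Boolean selection values."""
    sums = [[0.0] * T, [0.0] * T]
    counts = [0, 0]
    for curve, bit in zip(curves, selection):
        counts[bit] += 1
        for t, sample in enumerate(curve):
            sums[bit][t] += sample
    return [sums[0][t] / counts[0] - sums[1][t] / counts[1] for t in range(T)]


true_bits = [rng.getrandbits(1) for _ in range(N)]
curves = [make_curve(b) for b in true_bits]

dpa_correct = dpa_signal(curves, true_bits)          # sorting matches reality
wrong_bits = [rng.getrandbits(1) for _ in range(N)]  # sorting unrelated to reality
dpa_wrong = dpa_signal(curves, wrong_bits)

print([round(v, 2) for v in dpa_correct])
print([round(v, 2) for v in dpa_wrong])
```

In this sketch the first list should stay close to zero except at index `CRITICAL`, where it approaches profile0-profile1, while the second list should stay close to zero throughout.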
The execution of a DPA attack then consists overall of:
a—drawing N random messages (for example N equal to 1000);
b—making the card execute the algorithm for each of the N random messages, reading the current consumption curve on each occasion (measured on the supply terminal of the component);
c—forming a hypothesis on a subkey;
d—predicting, for each of the random messages, the value taken by one of the target bits whose value depends only on the bits of the message (input or output) and on the subkey taken as a hypothesis, in order to obtain the Boolean selection function;
e—sorting the curves according to this Boolean selection function (that is to say according to the value “0” or “1” predicted for this target bit for each curve under the subkey hypothesis);
f—calculating in each packet the resulting mean current consumption curve;
g—effecting the difference of these mean curves, in order to obtain the signal DPA(t).
If the hypothesis on the subkey is correct, the Boolean selection function is correct and the curves of the first packet actually correspond to the curves for which the message applied as an input or output has given a target bit at “0” in the card and the curves of the second packet actually correspond to the curves for which the message applied as an input or output has given a target bit at “1” in the card.
The case in FIG. 1 applies: the signal DPA(t) is therefore not zero at instants tc0 to tc6 corresponding to the execution of the critical instructions (those which manipulate the target bit). It suffices for there to be at least one critical instant in the acquisition period.
It should be noted that the attacker does not need to know precisely the critical instants.
If the subkey hypothesis is not correct, the sorting does not correspond to reality and there are therefore in each packet as many curves corresponding in reality to a target bit at “0” as there are curves corresponding to a target bit at “1”. The signal DPA(t) is substantially zero throughout (the case depicted in FIG. 2). It is then necessary to return to step c and to form a new hypothesis on the subkey.
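Steps a to g, together with the return to step c for each new hypothesis, can be sketched end to end on a simulated card. Everything specific in this sketch is an assumption made for illustration: the 4-bit subkey, the arbitrary substitution table `SBOX`, the choice of target bit, the current magnitudes and the function names. It is not an attack on DES or on any real component. Note also that, with such a small toy function, some wrong hypotheses predict bits that are partly correlated with the true target bit and so leave a smaller residual peak; the sketch therefore keeps the hypothesis with the largest peak.

```python
import random

# Toy stand-ins, all hypothetical: a 4-bit subkey and an arbitrary 4-bit
# substitution table play the role of the algorithm's first operations.
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
PROFILE0, PROFILE1 = 1.00, 1.20  # assumed current magnitudes, arbitrary units
T, CRITICAL = 6, 3               # samples per curve, instant of the critical instruction


def target_bit(message, subkey):
    """Target bit predicted from a known message and a subkey hypothesis."""
    return (SBOX[(message ^ subkey) & 0xF] >> 1) & 1


def card_execute(message, secret_subkey, rng):
    """Simulated card: returns one noisy current consumption curve."""
    curve = [(PROFILE1 if rng.getrandbits(1) else PROFILE0) + rng.gauss(0, 0.1)
             for _ in range(T)]
    bit = target_bit(message, secret_subkey)
    curve[CRITICAL] = (PROFILE1 if bit else PROFILE0) + rng.gauss(0, 0.1)
    return curve


def dpa_signal(curves, selection):
    """Steps e to g: sort the curves, average each packet, take M0(t) - M1(t)."""
    packets = ([], [])
    for curve, bit in zip(curves, selection):
        packets[bit].append(curve)
    means = [[sum(col) / len(p) for col in zip(*p)] for p in packets]
    return [m0 - m1 for m0, m1 in zip(*means)]


def attack(messages, curves):
    """Steps c and d for every subkey hypothesis; keep the largest DPA peak."""
    peaks = {}
    for hypothesis in range(16):
        selection = [target_bit(m, hypothesis) for m in messages]
        peaks[hypothesis] = max(abs(v) for v in dpa_signal(curves, selection))
    return max(peaks, key=peaks.get), peaks


rng = random.Random(2024)
secret = 0xB                                               # unknown to the attacker
messages = [rng.randrange(16) for _ in range(2000)]        # step a
curves = [card_execute(m, secret, rng) for m in messages]  # step b
best, peaks = attack(messages, curves)
print(f"recovered subkey: {best:#x}, peak: {peaks[best]:.3f}")
```

The function `attack` never reads `secret`; it uses only the known messages and the measured curves, which mirrors the position of the attacker described above.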
If the hypothesis proves correct, the attacker can pass on to the evaluation of other subkeys, until the key has been reconstituted to the maximum possible extent. For example, the DES algorithm uses a 64-bit key, of which only 56 bits are useful. With a DPA attack, it is possible to reconstitute at least 48 of the 56 useful bits.
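The figures given for the DES algorithm can be checked with a few lines of arithmetic. The sketch below relies on the standard DES key format, in which one bit per byte is a parity bit. The final line is an added observation, not a statement of the text: it simply counts the combinations left for whatever useful bits are not reconstituted.

```python
KEY_BITS = 64        # DES key length as stored
PARITY_BITS = 8      # one parity bit per byte (standard DES key format)
USEFUL_BITS = KEY_BITS - PARITY_BITS

RECOVERED_BITS = 48  # lower bound stated in the text
remaining_bits = USEFUL_BITS - RECOVERED_BITS
remaining_candidates = 2 ** remaining_bits  # combinations of the bits not reconstituted

print(USEFUL_BITS, remaining_bits, remaining_candidates)  # 56 8 256
```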