The present invention relates to a countermeasure method in an electronic component using a secret key cryptography algorithm. Such components are used in applications where access to services or data is strictly controlled. They have an architecture formed around a microprocessor and memories, including a program memory which contains the secret key.
These components are notably used in chip cards, for certain applications thereof. These are for example applications involving access to certain data banks, banking applications, remote payment applications, for example for television, petrol dispensing or passing through motorway tolls.
These components or cards therefore use a secret key cryptography algorithm, the best known of which is the DES (Data Encryption Standard) algorithm. Other secret key algorithms exist, such as the RC5 algorithm or the COMP128 algorithm. This list is of course not exhaustive.
In general terms and briefly, the function of these algorithms is to calculate an encoded message from a message applied as an input (to the card) by a host system (server, banking dispenser, etc.) and from the secret key contained in the card, and to supply this encoded message in return to the host system. This enables the host system, for example, to authenticate the component or card, to exchange data, etc.
However, it has become clear that these components or cards are vulnerable to attacks which consist of a differential analysis of the current consumption and which enable ill-intentioned third parties to find the secret key. These attacks are referred to as DPA attacks, DPA being the English acronym for Differential Power Analysis.
The principle of these DPA attacks is based on the fact that the current consumption of the microprocessor executing the instructions varies according to the data being manipulated.
Notably, an instruction from the microprocessor manipulating a data bit generates two different current profiles depending on whether this bit is “1” or “0”. Typically, if the instruction is manipulating a “0”, there is at this time of execution a first amplitude of the current consumed and if the instruction is manipulating a “1”, there is a second amplitude of the consumed current, different from the first.
The characteristics of the cryptography algorithms are known: the calculations made, the parameters used. The only unknown is the secret key contained in the program memory. This cannot be derived solely from knowledge of the message applied as an input and the encoded message supplied in return.
However, in a cryptography algorithm, some calculated data depend only on the message applied in clear to the input of the card and the secret key contained in the card. Other data calculated in the algorithm can also be recalculated solely from the encoded message (generally supplied in clear at the output of the card to the host system) and the secret key contained in the card. More precisely, each bit of these particular data can be determined from the input or output message, and a limited number of particular bits of the key.
Thus, to each bit of a particular data item, there corresponds a sub-key formed by a particular group of bits of the key.
The bits of these particular data which can be predicted are hereinafter referred to as target bits.
The basic idea of the DPA attack is thus to use the difference in current consumption profile of an instruction depending on whether it is manipulating a “1” or a “0” and the possibility of calculating a target bit by means of the instructions of the algorithm using a known input or output message and a hypothesis on the corresponding sub-key.
The principle of the DPA attack is therefore to test a given sub-key hypothesis by applying a Boolean selection function to a large number of current measurement curves, each relating to an input message known to the attacker. This selection function depends on the sub-key hypothesis and is defined, for each curve, by the value predicted for a target bit.
By making an assumption on the sub-key concerned, it is in fact possible to predict the value “0” or “1” which this target bit will take for a given input or output message.
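The prediction of a target bit from a known message and a sub-key hypothesis can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: `TOY_SBOX` is a toy 4-bit substitution table (not a DES S-box), the message and the sub-key are reduced to 4 bits each, and `predict_target_bit` is a hypothetical helper, not part of any algorithm described in this document.

```python
# Toy 4-bit substitution table standing in for a non-linear step of the
# algorithm. Hypothetical: chosen for illustration, it is not a DES S-box.
TOY_SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
            0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def predict_target_bit(message: int, subkey_guess: int, bit_index: int = 3) -> int:
    """Return the value (0 or 1) predicted for the target bit.

    The prediction depends only on the known 4-bit message and on the
    4-bit sub-key hypothesis, never on the rest of the key.
    """
    intermediate = TOY_SBOX[(message ^ subkey_guess) & 0xF]
    return (intermediate >> bit_index) & 1
```

For a given message, two different sub-key hypotheses will in general predict different values, which is what makes the hypothesis testable.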
It is then possible to apply, as a Boolean selection function, the value, “0” or “1”, predicted for the target bit under the sub-key hypothesis in question, in order to sort these curves into two packets: a first packet contains the curves which have seen the manipulation of the target bit at “0” and a second packet contains the curves which have seen the manipulation of the target bit at “1” according to the sub-key hypothesis. By taking the mean of the current consumption in each packet, a mean consumption curve M0(t) is obtained for the first packet and a mean consumption curve M1(t) for the second packet.
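The sorting into two packets and the averaging of each packet can be sketched as follows. This is a minimal illustration under stated assumptions: each curve is assumed to be a list holding the same number of current samples, and the names `mean_curves`, `curves` and `selection_bits` are hypothetical.

```python
from typing import List, Sequence, Tuple


def mean_curves(curves: Sequence[Sequence[float]],
                selection_bits: Sequence[int]) -> Tuple[List[float], List[float]]:
    """Sort curves by the predicted target bit and average each packet.

    selection_bits[i] is the value (0 or 1) predicted for the target bit
    of curves[i] under the sub-key hypothesis being tested.
    Returns the two mean curves (M0, M1), sample by sample.
    """
    packet0 = [c for c, b in zip(curves, selection_bits) if b == 0]
    packet1 = [c for c, b in zip(curves, selection_bits) if b == 1]

    def mean_curve(packet: List[Sequence[float]]) -> List[float]:
        if not packet:
            raise ValueError("a packet is empty; more curves are needed")
        # zip(*packet) groups the samples taken at the same instant t.
        return [sum(samples) / len(packet) for samples in zip(*packet)]

    return mean_curve(packet0), mean_curve(packet1)
```

For example, `mean_curves([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]], [0, 1, 0, 1])` averages the first and third curves into M0 and the second and fourth into M1.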
If the sub-key hypothesis is correct, the first packet actually contains all the curves amongst the N curves which have seen the manipulation of the target bit at “0” and the second packet actually contains all the curves amongst the N curves which have seen the manipulation of the target bit at “1”. The mean consumption curve M0(t) of the first packet will then have a mean consumption everywhere except at the times of execution of the critical instructions, with a current consumption profile characteristic of the manipulation of the target bit at “0” (profile0). In other words, for all these curves, all the manipulated bits have had as many chances of equalling “0” as of equalling “1”, except the target bit, which has always had the value “0”. This can be written:

$$M_0(t) = \left[\frac{\mathrm{profile0} + \mathrm{profile1}}{2}\right]_{t \neq t_{ci}} + \left[\mathrm{profile0}\right]_{t_{ci}}$$

that is to say

$$M_0(t) = \left[V_m\right]_{t \neq t_{ci}} + \left[\mathrm{profile0}\right]_{t_{ci}}$$

where tci represents the critical instants, at which a critical instruction has been executed.
Likewise, the mean consumption curve M1(t) of the second packet corresponds to a mean consumption everywhere except at the times of execution of the critical instructions, with a current consumption profile characteristic of the manipulation of the target bit at “1” (profile1). It is possible to write:

$$M_1(t) = \left[\frac{\mathrm{profile0} + \mathrm{profile1}}{2}\right]_{t \neq t_{ci}} + \left[\mathrm{profile1}\right]_{t_{ci}}$$

that is to say

$$M_1(t) = \left[V_m\right]_{t \neq t_{ci}} + \left[\mathrm{profile1}\right]_{t_{ci}}$$

It has been seen that the two profiles, profile0 and profile1, are not equal. The difference between the curves M0(t) and M1(t) then gives a signal DPA(t), whose amplitude is equal to profile0-profile1 at the critical instants tci of execution of the critical instructions manipulating this bit, that is to say, in the example depicted in FIG. 1, at the places tc0 to tc6, and whose amplitude is approximately equal to zero outside the critical instants.
If the sub-key hypothesis is false, the sorting does not correspond to reality. Statistically, there are then in each packet as many curves which have actually seen the manipulation of the target bit at “0” as there are curves which have seen the manipulation of the target bit at “1”. The resulting mean curve M0(t) is then situated around a mean value given by (profile0+profile1)/2=Vm, since, for each of the curves, all the bits manipulated, including the target bit, have as many chances of equalling “0” as of equalling “1”.
The same reasoning on the second packet leads to a mean current consumption curve M1(t) whose amplitude is situated around a mean value given by (profile0+profile1)/2=Vm.
The signal DPA(t) supplied by the difference M0(t)−M1(t) is in this case substantially equal to zero. The signal DPA(t) in the case of a false sub-key hypothesis is shown in FIG. 2.
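Subtracting the two mean curves term by term makes both cases explicit. The following is a worked restatement using only the quantities already defined (profile0, profile1, Vm and the critical instants tci); the approximation sign reflects the fact that the means are taken over a finite number N of curves.

$$
\begin{aligned}
\text{correct hypothesis:}\quad \mathrm{DPA}(t) &= M_0(t) - M_1(t) \\
&= \left[V_m - V_m\right]_{t \neq t_{ci}} + \left[\mathrm{profile0} - \mathrm{profile1}\right]_{t_{ci}} \\
&= \left[0\right]_{t \neq t_{ci}} + \left[\mathrm{profile0} - \mathrm{profile1}\right]_{t_{ci}} \\
\text{false hypothesis:}\quad \mathrm{DPA}(t) &= M_0(t) - M_1(t) \approx V_m - V_m = 0 \quad \text{for all } t
\end{aligned}
$$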
Thus the DPA attack exploits the difference in the current consumption profile during the execution of an instruction depending on the value of the bit manipulated, in order to effect a sorting of current consumption curves according to a Boolean selection function for a given sub-key hypothesis. By effecting a differential analysis of the mean current consumption between the two packets of curves obtained, an information signal DPA(t) is obtained.
A DPA attack then consists overall in:
a—drawing N random messages (for example N equal to 1000);
b—having the algorithm executed by the card for each of the N random messages, reading the current consumption curve each time (measured on the supply terminal of the component);
c—making an assumption on a sub-key;
d—predicting, for each of the random messages, the value taken by one of the target bits whose value depends only on the bits of the message (input or output) and on the sub-key taken as a hypothesis, in order to obtain the Boolean selection function;
e—sorting the curves according to this Boolean selection function (that is to say according to the value “0” or “1” predicted for this target bit for each curve under the sub-key hypothesis);
f—calculating, in each packet, the resulting mean current consumption curve;
g—taking the difference between these mean curves, in order to obtain the signal DPA(t).
If the hypothesis on the sub-key is correct, the Boolean selection function is correct and the curves of the first packet actually correspond to the curves for which the message supplied as an input or output gave a target bit at “0” in the card and the curves in the second packet actually correspond to the curves for which the message applied as an input or output gave a target bit at “1” in the card.
Take the case in FIG. 1: the signal DPA(t) is therefore not zero at times tc0 to tc6 corresponding to the execution of the critical instructions (those which manipulate the target bit).
It should be noted that the attacker does not need to know precisely the critical instants. It suffices for there to have been at least one critical instant in the period of acquisition.
If the sub-key hypothesis is not correct, the sorting does not correspond to reality and there are then in each packet as many curves corresponding in reality to a target bit at “0” as there are curves corresponding to a target bit at “1”. The signal DPA(t) is substantially zero everywhere (the case shown in FIG. 2). It is necessary to return to step c— and to make a new assumption on the sub-key.
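Steps a to g, together with the return to step c for each new hypothesis, can be sketched end to end on simulated data. This is a minimal sketch under loudly stated assumptions, not a description of a real attack bench: the card is replaced by a simulation, the sub-key and the messages are reduced to 4 bits, `TOY_SBOX` is a toy table (not a DES S-box), and the amplitudes `PROFILE0` and `PROFILE1`, the noise level and all function names are hypothetical.

```python
import random

# Toy 4-bit substitution table standing in for the cipher's non-linear step.
# Hypothetical: chosen for illustration only, it is not a DES S-box.
TOY_SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
            0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
TARGET_BIT = 3                  # output bit used as target bit (assumed)
PROFILE0, PROFILE1 = 1.0, 1.5   # assumed amplitudes for a "0" and for a "1"
TRACE_LEN, T_CRITICAL = 10, 5   # samples per curve, index of the critical instant


def target_bit(message, subkey):
    """Target bit as a function of a 4-bit message and a 4-bit sub-key."""
    return (TOY_SBOX[message ^ subkey] >> TARGET_BIT) & 1


def simulate_curve(message, secret_subkey, rng):
    """Step b: one simulated consumption curve (replaces a real measurement)."""
    vm = (PROFILE0 + PROFILE1) / 2
    curve = [rng.gauss(vm, 0.05) for _ in range(TRACE_LEN)]
    bit = target_bit(message, secret_subkey)
    curve[T_CRITICAL] = (PROFILE1 if bit else PROFILE0) + rng.gauss(0.0, 0.05)
    return curve


def dpa_signal(curves, messages, subkey_guess):
    """Steps d to g: sort the curves, average each packet, take the difference."""
    sums = [[0.0] * TRACE_LEN, [0.0] * TRACE_LEN]
    counts = [0, 0]
    for curve, message in zip(curves, messages):
        predicted = target_bit(message, subkey_guess)   # step d
        counts[predicted] += 1                          # step e
        for t, sample in enumerate(curve):
            sums[predicted][t] += sample
    m0 = [s / counts[0] for s in sums[0]]               # step f
    m1 = [s / counts[1] for s in sums[1]]
    return [a - b for a, b in zip(m0, m1)]              # step g


def dpa_attack(n_messages=1000, secret_subkey=0xB, seed=1):
    """Try every sub-key hypothesis and keep the one with the largest peak."""
    rng = random.Random(seed)
    messages = [rng.randrange(16) for _ in range(n_messages)]           # step a
    curves = [simulate_curve(m, secret_subkey, rng) for m in messages]  # step b
    peaks = {}
    for guess in range(16):                                             # step c
        signal = dpa_signal(curves, messages, guess)
        peaks[guess] = max(abs(v) for v in signal)
    return max(peaks, key=peaks.get), peaks
```

Calling `dpa_attack()` should return the simulated secret sub-key as the best hypothesis. With a table this small, some wrong hypotheses still leave a reduced peak rather than a flat signal, which is why the sketch ranks the hypotheses by peak height instead of testing for an exactly zero signal. Note also that the peak is searched over the whole acquisition window, so the critical instant does not have to be known in advance.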
If the hypothesis proves correct, it is possible to pass to the evaluation of other sub-keys, until the key has been reconstituted to the maximum possible extent. For example, with a DES algorithm, a key of 64 bits is used, of which only 56 are useful bits. With a DPA attack, it is possible to reconstitute at least 48 bits of the 56 useful bits.
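The bit counts quoted for DES imply that little remains to be found once the DPA attack has succeeded. The arithmetic can be sketched as follows, assuming (the text does not state this) that the bits not reconstituted by DPA are then recovered by exhaustive trial; `remaining_candidates` is a hypothetical helper.

```python
def remaining_candidates(useful_bits: int = 56, recovered_bits: int = 48) -> int:
    """Number of key candidates left once recovered_bits bits are known."""
    unknown_bits = useful_bits - recovered_bits
    return 2 ** unknown_bits


# With the figures quoted for DES: 56 useful bits, at least 48 reconstituted.
print(remaining_candidates())  # 2 ** 8 = 256 candidates at most
```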