Existing encryption algorithms, which cipher sensitive data, are robust against cryptanalysis and content recovery attacks. These attack techniques are called “black box techniques”, as the attacker only has knowledge of the inputs and outputs of the encryption algorithm. Most encryption algorithms are standardised, one of the most widely used being the Advanced Encryption Standard (AES). The confidentiality of the encryption relies on a shared secret ciphering key. The best option for an attacker who does not know the secret key is to try all possible key values (brute force attack). When the key is 128 bits or 256 bits long, the number of iterations required (up to 2^128 or 2^256) makes a brute force attack computationally infeasible in practice.
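The order of magnitude involved can be illustrated with a short calculation. The sketch below is not part of any cited method; the attacker speed `GUESSES_PER_SECOND` is a purely hypothetical figure chosen for illustration.

```python
# Hypothetical attacker speed: an assumption for illustration only.
GUESSES_PER_SECOND = 10**12
SECONDS_PER_YEAR = 365 * 24 * 3600


def brute_force_years(key_bits, rate=GUESSES_PER_SECOND):
    """Years needed to try every key of the given length at the given rate."""
    keyspace = 2 ** key_bits
    return keyspace / (rate * SECONDS_PER_YEAR)


if __name__ == "__main__":
    for bits in (128, 256):
        print(f"{bits}-bit key: {2 ** bits:.3e} keys, "
              f"about {brute_force_years(bits):.3e} years")
```

Even at this assumed rate, exhausting a 128-bit key space takes on the order of 10^19 years.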
However, some attacks, called side-channel attacks (SCA), give an attacker the opportunity to retrieve secret information processed by the encryption algorithm from information that leaks from its physical implementation, such as timing information, power consumption or electromagnetic emissions.
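A timing leak can be illustrated on a toy example. The sketch below is an assumption made for illustration: it uses a simple early-exit comparison routine rather than an encryption algorithm, and it models execution time by a step counter. The names `leaky_compare` and `recover_secret` are hypothetical.

```python
def leaky_compare(secret, guess):
    """Early-exit comparison; returns (equal, number of byte comparisons).

    The step count stands in for the execution time an attacker could measure.
    """
    steps = 0
    if len(secret) != len(guess):
        return False, steps
    for s, g in zip(secret, guess):
        steps += 1
        if s != g:
            return False, steps
    return True, steps


def recover_secret(secret):
    """Recover the secret byte by byte using only the step count as leakage."""
    known = bytearray(len(secret))
    for i in range(len(secret)):
        for candidate in range(256):
            known[i] = candidate
            equal, steps = leaky_compare(secret, bytes(known))
            # A correct byte at position i lets the comparison go further.
            if equal or steps > i + 1:
                break
    return bytes(known)
```

In this toy model the attacker needs at most 256 trials per byte, instead of trying every possible value of the whole secret.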
There is thus a first need to provide a method for protecting the implementation of sensitive algorithms against such attacks.
Any algorithm can be represented as a graph of operations, or call graph, which is a directed graph wherein each node is a function and each edge is an intermediate variable (also referred to as an internal variable). This call graph can also be referred to as a data flow graph, or a control flow graph.
A function can be a single operation, or a combination, linear or not, of operations. It is a straight-line piece of code without any jumps. When a function comprises a plurality of operands, it can be decomposed into a plurality of unary or binary operations.
Typical operations are those which can be implemented in a given technology. For instance, software programs can compute arithmetic and logic operations, such as additions (‘+’), or Boolean exclusive OR (‘XOR’). Digital Signal Processors (DSP) or Field Programmable Gate Arrays (FPGA) can compute any function implemented in look-up tables (LUT), or arithmetic operations using Multiply-ACcumulate (MAC) units. Application Specific Integrated Circuits (ASIC) can take advantage of standard cell libraries to compute any type of operation.
A function can be expressed in a high level language, but can also be mapped as a sequence of operations. It is the compiler's role to transform such a function, potentially described in a high level language, into machine language, while optimising processing time and resource consumption.
A graph representing an algorithm is a directed graph: each node or function has as many entering edges as input arguments and as many outgoing edges as output results. For example, if the function is a mere binary operation (with two arguments and one result), there are two inputs and one output.
The edges carry typed variables, which are passed from node to node. The type can be, for example, a byte or a 32-bit word.
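The graph representation described above can be sketched as a small data structure. This is a minimal illustration under stated assumptions: the names `Node` and `evaluate` are hypothetical, every edge carries a byte, each node has a single output, and the nodes are listed in topological order.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Node:
    name: str
    func: Callable[..., int]  # the operation computed by the node
    inputs: List[str]         # names of the entering edges (arguments)
    output: str               # name of the outgoing edge (result)


def evaluate(nodes: List[Node], values: Dict[str, int]) -> Dict[str, int]:
    """Evaluate nodes in the given (topological) order; edges carry bytes."""
    values = dict(values)
    for node in nodes:
        args = [values[edge] for edge in node.inputs]
        values[node.output] = node.func(*args) & 0xFF  # enforce the byte type
    return values


# y = ((x XOR k) + c) mod 256, as two binary nodes linked by the edge "t".
graph = [
    Node("xor", lambda a, b: a ^ b, ["x", "k"], "t"),
    Node("add", lambda a, b: a + b, ["t", "c"], "y"),
]
result = evaluate(graph, {"x": 0x3A, "k": 0xC5, "c": 0x10})
```

Here each binary node has two entering edges and one outgoing edge, and `t` is the intermediate variable passed from the first node to the second.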
It is an object of the invention to consider a sensitive algorithm, such as a cryptographic algorithm, described as a call graph and to transform the algorithm so as to protect it against side-channel attacks, regardless of the type of algorithm or of any consideration about its implementation.
To increase the robustness of an algorithm against side-channel attacks, it is known to mask the sensitive data of the algorithm. One example of masking is based on secret sharing, which consists in splitting an initial variable into a plurality of new variables, so that the sum of the new variables yields the initial variable. The sum must be understood according to the underlying type of the variables. For instance, when the variable is a byte, the sum can be a bitwise XOR, or an addition modulo 256.
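Secret sharing of a byte can be sketched as follows for the two sums mentioned above. The function names are hypothetical and the number of shares `n` is a free parameter of the illustration; Python's standard `secrets` module is used as the random source.

```python
import secrets
from functools import reduce
from operator import xor


def share_xor(value, n):
    """Split a byte into n shares whose bitwise XOR is the byte."""
    shares = [secrets.randbelow(256) for _ in range(n - 1)]
    last = reduce(xor, shares, value)
    return shares + [last]


def unshare_xor(shares):
    """Recombine XOR shares into the initial byte."""
    return reduce(xor, shares, 0)


def share_add(value, n):
    """Split a byte into n shares whose sum modulo 256 is the byte."""
    shares = [secrets.randbelow(256) for _ in range(n - 1)]
    last = (value - sum(shares)) % 256
    return shares + [last]


def unshare_add(shares):
    """Recombine additive shares into the initial byte."""
    return sum(shares) % 256
```

In both cases the first n - 1 shares are drawn at random and only the last share depends on the initial variable.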
When the operations affecting the masked data are linear functions, the value of the masked output of the function can be calculated from the masked input. However, when the functions are non-linear (for example a power function or the substitution box of a cryptographic algorithm), the mask calculation might be impossible. In that case, the mask must be removed at the input of the function, and a new mask inserted at the output of the function.
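The difference between the two cases can be sketched with first-order Boolean masking of a byte. The functions chosen here are assumptions made for illustration: `rotl8` (a one-bit rotation) stands for a function that is linear with respect to XOR, and `square8` (squaring modulo 256) stands for a non-linear power function.

```python
import secrets


def rotl8(x):
    """Rotate a byte left by one bit (linear with respect to XOR)."""
    return ((x << 1) | (x >> 7)) & 0xFF


def square8(x):
    """Square a byte modulo 256 (not linear with respect to XOR)."""
    return (x * x) % 256


def mask(x):
    """Return (masked value, mask) with masked = x XOR mask."""
    m = secrets.randbelow(256)
    return x ^ m, m


def apply_linear(masked, m):
    """Linear case: the function is applied to the masked value and to the mask."""
    return rotl8(masked), rotl8(m)


def apply_nonlinear(masked, m):
    """Non-linear case: remove the mask, compute, insert a new mask.

    Note that x exists unmasked between the two steps.
    """
    x = masked ^ m
    new_m = secrets.randbelow(256)
    return square8(x) ^ new_m, new_m
```

For instance, with x = 3 masked by m = 1, applying `square8` separately to the masked value 2 and to the mask 1 gives 4 XOR 1 = 5, whereas `square8(3)` is 9, so the output mask cannot be derived the way it is in the linear case.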
Various masking techniques are known, some of which are proven secure. They apply to straight-line programs, which are linear call graphs. The masking consists in chaining operations, with, whenever appropriate, a random resharing (or refreshing) of the masks between operations. However, when the graph is not a straight line, vulnerabilities may appear.
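A refreshing step between two operations can be sketched as follows for XOR shares of a byte. This is one simple way of re-randomising shares, given as an assumption for illustration, and the name `refresh` is hypothetical.

```python
import secrets
from functools import reduce
from operator import xor


def refresh(shares):
    """Re-randomise XOR shares without changing the value they encode."""
    out = list(shares)
    for i in range(1, len(out)):
        r = secrets.randbelow(256)
        # The same random byte is XORed into two shares, so it cancels out.
        out[0] ^= r
        out[i] ^= r
    return out


shares = [0x12, 0x34, 0x56]
fresh = refresh(shares)
```

The XOR of `fresh` equals the XOR of `shares`, while the individual shares are replaced by newly randomised values.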
A complete masked AES algorithm is presented in M. Rivain and E. Prouff, Provably secure higher-order masking of AES, CHES 2010, pages 413-427. In this paper, the masking of specific linear and non-linear functions is described, and the functions are chained in order to describe a complete AES algorithm. However, as indicated in J.-S. Coron, E. Prouff, M. Rivain and T. Roche, High-Order Side Channel Security and Mask Refreshing, FSE 2013, pages 11-13, even with an approach specifically dedicated to the AES algorithm, some implementation issues can arise. These implementation issues come from the reuse of some variables, which lowers the achieved security level.
Thus, today, most masking implementations are done manually, which is prone to implementation errors (for instance, sensitive variables left unmasked). Only a few studies consider automatic masking.
Amongst these studies is the article by A. Moss, E. Oswald, Compiler assisted masking, Cryptographic Hardware and Embedded Systems, CHES 2012, pages 58-75. In this article, sensitive data are annotated by the programmer, and their secrecy is treated as a value in a lattice, allowing the compiler to propagate secrecy information through the program. Once compiled, the secret data never appear in plain text during the program execution, thereby guaranteeing the secrecy of the masked data, in particular against side-channel attacks. The algorithm then performs a step of searching for sensitive information leakages in every value in the program, in particular in temporary variables introduced when converting expressions, and, when a leakage occurs, attempts to prevent the leakage using a set of program transformations.
The drawback of the solution presented in this article is that it applies only to first-order Boolean masking schemes, and to straight-line code. Moreover, the step of searching for leakage and transforming the program to prevent such leakage is not guaranteed to converge.
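The general idea of propagating secrecy information through a straight-line program can be sketched as follows. This is a deliberately simplified illustration and not the lattice or the algorithm of the cited article: it assumes a two-level lattice (PUBLIC below SECRET), and the names `join`, `propagate` and the variable names are hypothetical.

```python
# Two-level lattice, an assumption for illustration: PUBLIC is below SECRET.
PUBLIC, SECRET = 0, 1


def join(a, b):
    """Least upper bound of two secrecy levels."""
    return max(a, b)


def propagate(program, labels):
    """Label every assigned variable with the join of its operands' levels.

    program is a list of (destination, [source variables]) in execution order.
    """
    labels = dict(labels)
    for dest, sources in program:
        level = PUBLIC
        for src in sources:
            level = join(level, labels[src])
        labels[dest] = level
    return labels


program = [
    ("t0", ["plaintext", "key"]),      # temporary depending on the key
    ("t1", ["plaintext", "counter"]),  # temporary depending on public data only
    ("out", ["t0", "t1"]),
]
labels = propagate(program, {"plaintext": PUBLIC, "key": SECRET, "counter": PUBLIC})
```

Temporaries such as `t0` inherit the secrecy of the annotated input `key`, which is the kind of value a leakage search would then have to examine.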
In Eldib H., Wang C., Synthesis of masking countermeasures against side channel attacks, Computer aided verification (pp. 114-130), Springer International Publishing, January 2014, it is proposed to mask an entire algorithm, including all the intermediate values. To cope with non-linear functions, it is proposed to determine a functionally equivalent linear function, and to verify that the function is equivalent for all possible inputs, and is perfectly masked.
This method has drawbacks similar to those of the method of Moss et al., as it only applies to call graphs of Boolean types, which reduces the scope to hardware implementations. In addition, the method follows a trial and error methodology, whose execution time is not guaranteed.
There is accordingly a more precise need for a fully automatic and robust method to transform an unprotected algorithm into a secure version of said algorithm.