The invention generally relates to data processing, and more particularly to a modular multiplication method and device.
Modular multiplication is used in a number of fields, in particular in data securing systems such as cryptographic systems. Cryptographic systems use cryptographic schemes which rely on modular multiplications between long integer multiplicands. The performance of the overall cryptographic system closely depends on the optimization of the modular multiplication.
Cryptographic systems are widely used to ensure the privacy and authenticity of messages communicated over insecure channels, in a variety of products such as embedded devices and smartcards. Asymmetric cryptographic systems (also referred to as “public-key” cryptographic systems) use a pair of keys comprising a public key and a private key to encrypt and decrypt a message, so as to ensure that the message arrives securely without being intercepted by an unauthorized user.
An asymmetric cryptographic system may be subject to different types of attacks intended to access the private key. A first type of attack is based on searching for the private key. The computational complexity of such an attack depends on the number of bits comprised in the private key. Hence, an asymmetric cryptographic system may be protected from this type of attack by selecting a private key made of a large number of bits. However, other types of attacks target cryptographic systems with keys comprised of a large number of bits. These attacks use mathematical or analytical optimization to try to reduce the search space.
There also exist indirect attacks implemented against robust cryptographic systems based on analysis of the behavior of a cryptographic device, the cryptographic device being seen as a black box containing a known algorithm and an unknown key. Such indirect attacks comprise “side-channel attacks” or SCA which use information (“side-channel information” such as the power consumption of the cryptographic device) observed during the execution of the cryptography algorithm to retrieve some secret information embedded in a cryptographic device (Paul Kocher. Timing attacks on implementations of Diffie-Hellman, RSA, DSS, and other systems. In N. Koblitz, editor, Advances in Cryptology—CRYPTO'96, volume 1109 of Lecture Notes in Computer Science, pages 104-113. Springer-Verlag, 1996).
A particular type of side-channel attack that proved to be very efficient is realized through the injection of deliberate (malicious) faults into a cryptographic device and the observation of the corresponding erroneous outputs (differential fault analysis (DFA)). Such attacks limit the number of experiments needed to obtain the bits of the secret key.
A variety of countermeasure systems and methods have emerged to protect asymmetric cryptographic systems against such indirect attacks. Indirect attacks may be directed to a number of cryptosystems such as RSA cryptosystems, public key cryptosystems, or elliptic curve cryptosystems. Embedded devices based on elliptic curve cryptography, such as mobile devices or smart cards, are particularly sensitive to side-channel attacks (SCA). Elliptic Curve Cryptosystems (ECC) are now considered a powerful and popular alternative to RSA cryptosystems because they require shorter key sizes than RSA for a comparable security level.
In many cases, public key encryption is implemented by arithmetic operations using a multiple-length odd integer as a modulus. The speed of this arithmetic operation influences the performance of the system. One method of modular multiplication that is particularly suitable for cryptographic systems is the Montgomery Modular Multiplication. The Montgomery Modular Multiplication (referred to hereinafter as MMM) is an operation which allows arithmetic to be computed in finite fields or rings. It has a fast implementation, which makes it preferable to straightforward modular multiplication when many multiplications are needed, as in most asymmetric cryptographic algorithms. For instance, RSA (Rivest, Shamir, Adleman) computes a modular exponentiation over the multiplicative ring, the modular exponentiation thus forming a sequence of multiplications. The exponent is usually large, typically 2048 bits or more, hence a few thousand multiplications per RSA computation.
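The way a modular exponentiation decomposes into a sequence of multiplications can be illustrated with a minimal sketch. It assumes the textbook left-to-right square-and-multiply method, which is only one of several exponentiation schemes; the function name square_and_multiply and the multiplication counter are hypothetical and introduced for illustration only.

```python
def square_and_multiply(base, exponent, modulus):
    """Left-to-right binary exponentiation.

    Returns (result, number of modular multiplications performed).
    Illustrative sketch only: not constant-time, not side-channel protected.
    """
    result = 1
    mults = 0
    for bit in bin(exponent)[2:]:                # most significant bit first
        result = (result * result) % modulus     # one squaring per exponent bit
        mults += 1
        if bit == "1":
            result = (result * base) % modulus   # one extra multiply per set bit
            mults += 1
    return result, mults


value, count = square_and_multiply(7, 65537, 1000003)
```

With this scheme, a 2048-bit exponent having half of its bits set costs 2048 squarings plus 1024 multiplications, which is consistent with the order of magnitude given above.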
Another well-known cryptographic scheme which uses modular multiplication is Elliptic Curve Cryptography (ECC), which consists of operations over a finite field F_q, where q is a power of a prime. For each operation on the curve (doubling and addition, since the curve is a group), a few (typically about ten) multiplications are required. Furthermore, the number of operations on the curve is a few hundred, hence a total of a few thousand multiplications per ECC computation.
A naive implementation of the product of two elements x and y in Z_N (the integers modulo N) consists in computing xy as if x and y were integers, and then reducing the result modulo N. One possibility is to subtract N from xy until the result is strictly smaller than N. However, this algorithm is highly inefficient. A faster alternative consists in dividing xy by N and keeping only the remainder. But such a Euclidean division is very costly (from 5 to 20 times more than a product). The MMM allows a fast computation of modular multiplications which avoids divisions. It is adapted to the architecture of computers. The radix R is defined as the smallest power of 2^w greater than the modulus N, where w is the machine word size in bits; for example, when N fits on 2048 bits, R = (2^w)^64 = 2^(64w) = 2^2048 if w = 32 bits.
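The division-based product and the choice of the radix can be sketched as follows. The function names naive_modmul and montgomery_radix are hypothetical, and the word size w is passed as a parameter; this is an illustration of the definitions above rather than a reference implementation.

```python
def naive_modmul(x, y, n):
    """Full integer product followed by a Euclidean division (remainder kept)."""
    return (x * y) % n


def montgomery_radix(n, w=32):
    """Smallest power of 2**w strictly greater than n."""
    limbs = -(-n.bit_length() // w)   # ceiling of bit_length / w
    return 1 << (w * limbs)


# A modulus that fits on 2048 bits, with 32-bit words, gives R = 2**2048.
n_2048 = (1 << 2048) - 1
radix = montgomery_radix(n_2048, w=32)
```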
The MMM requires mapping the arguments into the so-called Montgomery representation. The MMM computes an actual modular multiplication within this representation, and does so efficiently because the multiplication and the reduction can be interleaved, thereby keeping intermediate products small. With an MMM, the reduction (interleaved or not) is performed modulo a power of 2 instead of the initial modulus. A reduction modulo a power of 2 consists of masking and shifting bits, instead of performing a real division.
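A minimal sketch of the non-interleaved variant is given below, assuming an odd modulus and Python 3.8 or later (for the modular inverse computed by pow with a negative exponent). The helper names montgomery_setup, to_mont and mont_mul are hypothetical; a production implementation would typically interleave the multiplication and the reduction limb by limb.

```python
def montgomery_setup(n, w=32):
    """Return (r_bits, n_prime) with R = 2**r_bits and n * n_prime = -1 mod R."""
    r_bits = w * -(-n.bit_length() // w)      # R is a power of 2**w above n
    r = 1 << r_bits
    n_prime = (-pow(n, -1, r)) % r            # requires n odd
    return r_bits, n_prime


def to_mont(a, n, r_bits):
    """Map a into the Montgomery representation a * R mod n."""
    return (a << r_bits) % n


def mont_mul(a, b, n, r_bits, n_prime):
    """Return a * b * R^-1 mod n using only a mask and a shift by r_bits."""
    mask = (1 << r_bits) - 1
    t = a * b
    m = ((t & mask) * n_prime) & mask         # reduction modulo R: bit mask
    u = (t + m * n) >> r_bits                 # exact division by R: bit shift
    if u >= n:                                # final conditional subtraction
        u -= n
    return u


n = 101
r_bits, n_prime = montgomery_setup(n, w=8)
x_m, y_m = to_mont(45, n, r_bits), to_mont(77, n, r_bits)
product_m = mont_mul(x_m, y_m, n, r_bits, n_prime)
product = mont_mul(product_m, 1, n, r_bits, n_prime)   # leave the representation
```

Multiplying by 1 leaves the Montgomery representation, so product equals 45 * 77 mod 101.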
An MMM depends on one parameter, called the radix R. For efficiency reasons, R is generally chosen to be a large power of two. Depending on the relative value of R and the modulus N, the MMM can require an extra reduction step.
Typically, for moduli of cryptographic size, such as 1024, 2048, 4096 bits for RSA, or 192, 256, 512 for ECC, the value of the radix R is chosen to be the smallest power of two (strictly) larger than the modulus, for performance reasons.
A number in a finite ring or field is represented in machines as a sequence of digits, called limbs, which correspond to the machine's natural bitwidth (for instance 8, 16, 32 or 64 bits). Further, the moduli sizes are all multiples of the limb bitwidths. Accordingly, if the radix R is chosen to be twice the minimal value (R being a power of two), then all the computations must be carried out on integers which have an extra limb, which increases the number of operations to be carried out.
With this choice of R, the MMM algorithm ends with a final test (also referred to as “extra-reduction”): it either returns the correct result, or the correct result plus the modulus, which must then be reduced. No such “extra-reduction” exists if R is not the smallest power of two greater than the modulus, but the second, third, etc. smallest power of two greater than the modulus.
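The dependence of the final test on the radix can be checked exhaustively on a toy modulus. The sketch below relies on assumptions that are not stated in the text: operands are allowed to range over [0, 2N) when the larger radix is used, and the larger radix is at least 4N, in which case the unreduced output (ab + mN)/R stays below 2N and can be fed back without any conditional subtraction. The names redc_no_sub and count_out_of_range are hypothetical.

```python
def redc_no_sub(t, n, r_bits):
    """Montgomery reduction of t WITHOUT the final conditional subtraction."""
    r = 1 << r_bits
    n_prime = (-pow(n, -1, r)) % r        # n must be odd
    m = ((t % r) * n_prime) % r
    return (t + m * n) >> r_bits


def count_out_of_range(n, r_bits, bound):
    """Count operand pairs in [0, bound) whose unreduced output is >= bound."""
    return sum(
        1
        for a in range(bound)
        for b in range(bound)
        if redc_no_sub(a * b, n, r_bits) >= bound
    )


N = 13
# R = 16 is the smallest power of two above N: some outputs reach N or more,
# so a final test and subtraction are needed.
needs_extra_reduction = count_out_of_range(N, r_bits=4, bound=N)
# R = 64 is at least 4N: with operands below 2N, outputs never reach 2N.
escapes_with_large_radix = count_out_of_range(N, r_bits=6, bound=2 * N)
```

In this toy setting the first count is positive and the second is zero, at the price of working with a wider radix.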
However, the presence of this extra-reduction has some important drawbacks:
First, the extra-reduction incurs an average increase of the computation time (which is nonetheless negligible compared with the addition of an extra limb in the computation);
Second, and most importantly, this extra-reduction causes a data-leakage, which raises a significant security issue.
The presence or the absence of the extra-reduction does depend on the inputs of the MMM. So, in this case, the user is potentially an attacker, who aims at recovering a private key. For example, if the attacker is able to predict internal values of the algorithm by guessing some bits of the private key, then the actual value used can be determined by observing the effect on the extra-reduction. The internal value which matches consistently (over several encryptions) the presence/absence of the extra-reduction is the most likely.
The observation of the extra-reduction can be done by several means, such as by monitoring the overall duration of the computation. If there is an extra-reduction, then this overall duration is larger. Nonetheless, the overall duration only yields an approximate idea of the presence of the extra-reduction, because the overall duration is actually affected by several (potentially thousands) extra-reductions. Therefore, a direct observation of the presence of an extra-reduction is generally performed. It can require a physical access to the machine, and the usage of a magnetic probe (or a means to spy on the instant current consumption). The extra-reduction radiation or consumption profile can thus be characterized, and later recognized by “pattern matching” in a fresh trace.
Often, the implementation of asymmetric algorithms is blinded by randomizing the processed data before computation, and de-randomizing the data after computation and just before outputting the result. Such a countermeasure prevents an attacker from guessing internal values and, as a result, the previous attack based on prediction of internal values is impractical.
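One common form of such blinding, shown here as an assumption since the text does not name a specific variant, is multiplicative blinding of the RSA input: the input is multiplied by r^e for a random r, the private exponentiation is performed on the blinded value, and the result is multiplied by the inverse of r. The function name blinded_private_op and the toy key (N = 3233, e = 17, d = 2753) are illustrative only.

```python
import math
import secrets


def blinded_private_op(c, d, e, n):
    """Compute c**d mod n on randomized data, then remove the randomization."""
    while True:
        r = secrets.randbelow(n - 2) + 2          # random mask in [2, n - 1]
        if math.gcd(r, n) == 1:                   # the mask must be invertible
            break
    blinded = (c * pow(r, e, n)) % n              # randomize before computation
    result = pow(blinded, d, n)                   # equals (c**d) * r mod n
    return (result * pow(r, -1, n)) % n           # de-randomize before output


# Toy key for illustration: N = 61 * 53, far too small for real use.
ciphertext = pow(65, 17, 3233)
plaintext = blinded_private_op(ciphertext, d=2753, e=17, n=3233)
```

The values processed inside the exponentiation change on every call, while the returned result is unchanged.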
However, it has been observed that sequences of multiplications can still be guessed, although they manipulate unknown data.
For example, consider the implementation of a cryptographic algorithm where the output of one MMM feeds another MMM (which happens to be a square, i.e., an MMM where both inputs are equal) if one key bit is equal to 1, and does not feed the square otherwise. It has been observed that if that key bit is equal to 1 (i.e., one MMM feeds another one as a square), then if the first MMM has an extra-reduction, it is very likely that the second MMM does not have an extra-reduction, and vice versa. It has thus been shown that there exists a negative correlation between the presence of an extra-reduction in the two successive MMMs. Conversely, if the two MMMs do not follow each other (i.e., they process independent data), then the presence of the extra-reductions is decorrelated.
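This negative correlation can be observed empirically with a small simulation. The sketch below assumes uniformly random operands, a toy 16-bit odd modulus (65521) with R = 2^16, and the non-interleaved MMM; the names mont_mul_flag and square_er_rates are hypothetical, and the measured rates depend on the modulus and on the seed.

```python
import random


def mont_mul_flag(a, b, n, r_bits, n_prime):
    """Return (a*b*R^-1 mod n, True if an extra-reduction occurred)."""
    mask = (1 << r_bits) - 1
    t = a * b
    m = ((t & mask) * n_prime) & mask
    u = (t + m * n) >> r_bits
    if u >= n:
        return u - n, True
    return u, False


def square_er_rates(n, r_bits, trials, seed=0):
    """Extra-reduction rate of a square fed by a multiplication, split by
    whether that multiplication itself had an extra-reduction."""
    rng = random.Random(seed)
    r = 1 << r_bits
    n_prime = (-pow(n, -1, r)) % r
    hits = {True: 0, False: 0}
    totals = {True: 0, False: 0}
    for _ in range(trials):
        a, b = rng.randrange(n), rng.randrange(n)
        c, er_mul = mont_mul_flag(a, b, n, r_bits, n_prime)
        _, er_sq = mont_mul_flag(c, c, n, r_bits, n_prime)   # square fed by c
        totals[er_mul] += 1
        hits[er_mul] += er_sq
    return hits[True] / totals[True], hits[False] / totals[False]


rate_after_er, rate_after_no_er = square_er_rates(65521, 16, trials=20000)
```

The first rate comes out lower than the second: an extra-reduction in the multiplication leaves a small output, and squaring a small value rarely triggers another extra-reduction.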
Therefore, the presence of an extra-reduction can allow an attacker to extract a key, even in the presence of countermeasures (data blinding, and even regular exponentiation schemes).
There is accordingly a need for improved methods and devices for protecting asymmetric cryptographic systems against attacks based on the observation of extra-reductions.