Elliptic curve cryptosystems (ECC) are public-key cryptosystems that have attracted increasing attention in recent years due to their shorter key length requirement in comparison with other public-key cryptosystems such as RSA.
Public-key cryptosystems make use of a pair of keys, called public and private keys, to perform cryptographic operations such as encryption/decryption of data and signing/verification of digital signatures. In particular for ECC, private keys are scalar values that are kept secret, and public keys are points on the elliptic curve that are made public. Given a secret scalar d and points P and dP on an elliptic curve, where dP is a multiple of the point P, the elliptic curve discrete logarithm problem (ECDLP) is defined as the problem of determining d, with P and dP known.
ECC can be defined over different finite fields. The most important finite fields used to date to implement this cryptosystem have been binary, prime and extension fields. Prime fields are denoted by $\mathbb{F}_p$, where p is a large prime and also represents the number of elements of the field.
For the case of prime fields, the generic equation to represent an elliptic curve is given by: $E: y^2 = x^3 + ax + b$, where $a, b \in \mathbb{F}_p$ and $\Delta = 4a^3 + 27b^2 \neq 0$.
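As a minimal sketch of the curve equation and the discriminant condition above, the following Python snippet checks both over a toy prime field. The parameters p = 17, a = 2, b = 2 and the helper names `discriminant` and `is_on_curve` are illustrative assumptions, not values taken from this document; real curves use primes of hundreds of bits.

```python
# Toy parameters (assumed for illustration only; far too small for real use).
P_MOD, A, B = 17, 2, 2


def discriminant(a, b, p):
    """Return 4a^3 + 27b^2 reduced modulo p."""
    return (4 * a**3 + 27 * b**2) % p


def is_on_curve(point, a, b, p):
    """Check whether the affine point (x, y) satisfies y^2 = x^3 + ax + b mod p."""
    x, y = point
    return (y * y - (x**3 + a * x + b)) % p == 0


if __name__ == "__main__":
    # A nonzero discriminant means the curve is non-singular.
    print("non-singular:", discriminant(A, B, P_MOD) != 0)
    print("(5, 1) on curve:", is_on_curve((5, 1), A, B, P_MOD))
    print("(5, 2) on curve:", is_on_curve((5, 2), A, B, P_MOD))
```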
Other variants of elliptic curve forms that also use prime fields can be found in the literature. Some examples are Hessian and Jacobi forms and elliptic curves with degree 2/3 isogenies, among others.
The central and most time-consuming operation in ECC is scalar multiplication, generically represented by dP. Computing this operation involves performing addition of points, and doubling, tripling or quintupling (or similar) of a point. These operations are referred to as ECC point operations and their efficient execution is fundamental to the acceleration of the computation of scalar multiplication.
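To make the role of point addition and doubling concrete, here is a hedged sketch of the textbook left-to-right double-and-add method for computing dP in affine coordinates. It is only one of many scalar multiplication methods and is not claimed to be the one this document builds on. The function names and the toy curve (p = 17, a = 2, base point (5, 1)) are assumptions for illustration. The modular inverse via `pow(x, -1, p)` requires Python 3.8 or later.

```python
def ec_add(P, Q, a, p):
    """Add two affine points on y^2 = x^3 + ax + b over F_p.

    The point at infinity is represented by None.
    """
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None  # P + (-P) = point at infinity
    if P == Q:
        # Doubling: slope of the tangent line.
        lam = (3 * x1 * x1 + a) * pow((2 * y1) % p, -1, p) % p
    else:
        # Addition: slope of the chord through P and Q.
        lam = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p
    x3 = (lam * lam - x1 - x2) % p
    y3 = (lam * (x1 - x3) - y1) % p
    return (x3, y3)


def scalar_mult(d, P, a, p):
    """Compute dP by scanning the bits of d from most to least significant."""
    R = None
    for bit in bin(d)[2:]:
        R = ec_add(R, R, a, p)      # point doubling on every bit
        if bit == "1":
            R = ec_add(R, P, a, p)  # point addition only on 1-bits
    return R


if __name__ == "__main__":
    G = (5, 1)  # assumed base point on the toy curve
    print(scalar_mult(2, G, 2, 17))
    print(scalar_mult(3, G, 2, 17))
```

Note that the addition step runs only when a key bit is 1, so the sequence of point operations depends directly on the secret scalar.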
Side-channel information, such as power dissipation and electromagnetic emission, leaked by real-world devices has been shown to be highly useful for revealing private keys and effectively breaking the otherwise mathematically-strong ECC cryptosystem.
There are two main strategies for these attacks: simple (SSCA) and differential (DSCA) side-channel attacks. SSCA is based on the analysis of a single execution trace of a scalar multiplication to guess the secret key by revealing the sequence of operations used in the execution of ECC point arithmetic.
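The following sketch illustrates, under strong simplifying assumptions, why a visible sequence of point operations is dangerous. It models an unprotected left-to-right double-and-add loop as a string of "D" (doubling) and "A" (addition) symbols, assumes an attacker can tell the two apart perfectly from a single trace, and then reads the key back. The function names are hypothetical and no real power measurement is simulated.

```python
def operation_trace(d):
    """Idealized operation sequence of double-and-add for the scalar d."""
    trace = []
    for bit in bin(d)[2:]:
        trace.append("D")       # a doubling is executed for every key bit
        if bit == "1":
            trace.append("A")   # an addition follows only when the bit is 1
    return "".join(trace)


def recover_scalar(trace):
    """Recover d from the trace: 'DA' encodes a 1-bit, a lone 'D' a 0-bit."""
    bits = []
    i = 0
    while i < len(trace):
        if trace[i] != "D":
            raise ValueError("malformed trace")
        if i + 1 < len(trace) and trace[i + 1] == "A":
            bits.append("1")
            i += 2
        else:
            bits.append("0")
            i += 1
    return int("".join(bits), 2)


if __name__ == "__main__":
    secret = 0b1011
    trace = operation_trace(secret)
    print(trace)
    print(recover_scalar(trace) == secret)
```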
Extensive research has been carried out to yield effective countermeasures to deal with SSCA. Among them, side-channel atomicity dissolves point operations into small homogeneous blocks, known as atomic blocks, which cannot be distinguished from one another through simple side-channel analysis because each one contains the same pattern of basic field operations. Furthermore, atomic blocks are made sufficiently small to make this approach inexpensive. For example, the structure M-A-N-A (field multiplication, addition, negation, addition) has been proposed to build SSCA-protected point operations over prime fields.
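A rough sketch of the idea behind atomicity, using the same idealized operation-level view: if every doubling and every addition is written as a whole number of identical M-A-N-A blocks, the field-operation trace is one uniform repetition with no marker showing where one point operation ends and the next begins. The block counts used below (10 per doubling, 16 per addition) are hypothetical parameters chosen for illustration, and the sketch models only the operation pattern, not trace length or data-dependent leakage.

```python
ATOMIC_BLOCK = "MANA"  # multiplication, addition, negation, addition


def atomic_trace(d, blocks_per_double, blocks_per_add):
    """Field-operation trace of double-and-add built from atomic blocks."""
    trace = []
    for bit in bin(d)[2:]:
        trace.append(ATOMIC_BLOCK * blocks_per_double)
        if bit == "1":
            trace.append(ATOMIC_BLOCK * blocks_per_add)
    return "".join(trace)


def is_uniform(trace):
    """True if the trace is nothing but repetitions of the atomic block."""
    count, remainder = divmod(len(trace), len(ATOMIC_BLOCK))
    return remainder == 0 and trace == ATOMIC_BLOCK * count


if __name__ == "__main__":
    # Two different scalars with the same bit length and Hamming weight.
    t1 = atomic_trace(0b1011, 10, 16)
    t2 = atomic_trace(0b1101, 10, 16)
    print(is_uniform(t1), is_uniform(t2))
    print(t1 == t2)  # the positions of the additions are no longer visible
```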
However, the main drawback of the traditional M-A-N-A structure is that it relies on the assumption that field multiplication and squaring are indistinguishable from each other. In software implementations, timing and power consumption have been shown to be quite different for these operations, making them directly distinguishable through power analysis. Hardware platforms can be thought to be invulnerable to this attack when one hardware multiplier executes both field squarings and multiplications. However, some studies suggest that higher-order DSCA attacks can reveal differences between those operations by detecting data-dependent information through observation of multiple sample times in the power trace.
In recent years a new paradigm has arisen in the design concept with the appearance of multiprocessor/parallel architectures, which can execute several operations simultaneously. This topic is becoming increasingly important since single processor design is reaching its limit in terms of clock frequency.
Similarly to other systems, ECC can be adapted to parallel architectures at different algorithmic levels. In particular, efforts to parallelize ECC formulae at the point arithmetic level have been shown to significantly reduce the time-complexity of scalar multiplication. However, the high number of expensive multiplications appearing in current point formulae limits the acceleration possible by taking advantage of multiple processing units in parallel implementations. In fact, given the fixed number of field squarings and multiplications in a given ECC point operation, the number of processing units that can be used effectively is limited to a maximum of 3.
Therefore, there is a need for improving ECC point arithmetic to further accelerate and effectively protect scalar multiplication on elliptic curve cryptosystems over prime fields.