Cryptoanalysis based on side-channel information has successfully been applied to attack cryptographic devices. These attacks exploit information leaked during the computation of cryptographic algorithms, such as timing information, power consumption, or electromagnetic emanations of the device. Kocher [13] has shown that the power consumption of unprotected cryptographic devices provides a side channel, which can be used with extraordinary simple equipment. Differential power analysis (DPA) allows the attacker to exploit correlations between the observable instantaneous power consumption and intermediate results involving the secret. During the last years it has become more and more obvious that differential power attacks are extremely difficult to protect against [6, 19, 3, 14].
The first class of ad-hoc approaches, called hardware countermeasures, tries to reduce the signal-to-noise ratio of the side-channel leakage and finally to bury the usable information in the noise. Hardware countermeasures include methods like detached power supplies [22], the addition of power noise generators, or the application of a probabilistic disarrangement of the times at which the attacked intermediate results are processed by using random delay insertions or randomization of the execution path. While such measures surely increase the experimental and computational workload of the attacker they do not render the attack infeasible. In practice, typically several countermeasures are combined [6, 3, 14]. This can reduce the correlation down to a level that makes a DPA virtually impossible. However, higher order differential attacks or the possibility of obtaining a spatial resolution of the power consumption by observing local electromagnetic emanations may again open a backdoor for professional attackers.
The second class of countermeasures aims at removing the root cause for side-channel leakage information. In standard CMOS style circuits the power consumption depends strongly on the processed data. In other logic styles, like sense amplifier based logic (SABL) [24], which is based on differential cascode voltage switching logic (DCVLS), the power consumption is data independent (if coupling effects are negligible). However in terms of area and power such circuit styles require more than twice as much area and power as unprotected CMOS circuits. Logic styles with a pre-charge and an evaluation phase are also basically two-cycle schemes in contrast to a standard CMOS design style, which allows an operation every clock cycle. It has also to be pointed out, that such design styles currently do not seem well suited to a semi-custom design flow, which is based on a high-level hardware description language and standard cell libraries. The wave dynamic differential logic style (WDDL) adopts the ideas of SABL, but is based on standard CMOS [25]. It overcomes the last problem, but at the costs of three times the area consumption and a two-cycle scheme.
The third class of measures counteracts DPA by randomizing intermediate results occurring during the execution of the cryptographic algorithm. The idea behind this approach is that the power consumption of operations on randomized data should not be correlated with the actual plain intermediate data. Algorithmic countermeasures in the context of symmetric ciphers based on secret sharing schemes have been independently proposed by Goubin and Paterin [11] and Chari et al. [5]. Approximately the same time masking at algorithm level for asymmetric ciphers has been developed [7, 18]. Messerges [16] introduced the idea of masking all data and intermediate values during an encryption operation. Akkar and Giraud [1] introduced masking methods for DES and AES with the fundamental contribution of a robust, albeit not perfect, masking for the non-linear parts. A suggested simplification proposed in [28] was recently found to be vulnerable to first order DPA attacks [2]. Cryptographic algorithms often combine Boolean functions (like logical XOR or AND operations) and arithmetic functions (operations in fields with characteristic bigger than two). Masking operations for these two types of functions are referred to as Boolean and arithmetic masking, respectively. This poses the problem of a secure conversion between the two types of masking in both directions [1]. It has also been noticed that the multiplicative masking in the AES leads to a problem with zero values, i.e. a zero byte will not get masked and will also be mapped to a zero byte by the S-box [10, 28]. The zero-value problem makes the original masked version of the AES vulnerable to DPA. As a consequence [26, 27] and [4] have proposed countermeasures, which would protect also against the zero-value problem.
Notably the latter three proposals apply masking no longer on the algorithm level, but at the level of logic gates. In an earlier work Messerges [17] already applied the idea of masking at the gate level and proposed to replace the multiplexer gate (MUX) used in the implementation of non-linear operations, like S-boxes, by a masked MUX gate (which in turn consists of three MUX gates). A theory of securing a circuit at the gate level against side-channel attacks (focused on probing) was developed in [12]. Masking at gate level leads to circuits where no wire carries a value, which is correlated to an intermediate result of the algorithm. Clearly this approach is more generic than the algorithmic approach. Masking at gate level is independent of the specific algorithm implemented. Once a secure masking scheme has been developed the generation of the masked circuit from the algorithm can be automated, and a computer program can convert the digital circuit of any cryptographic algorithm to a circuit of masked gates. This would also relief the authors or implementers of cryptographic algorithms from the complex task of elaborating a specific solution against side-channel leakage for each new implementation variant or algorithm. Various generic masking schemes have been proposed. These are either based on the MUX technique of [17], like [8, 9], or which use correction terms, e.g. for the AND gate [26]. The random switching logic (RSL) of [23] uses a random input per gate and introduces an enable signal, which forces the output to a definite value until all input signals are stable. Hence it is also a hidden two-cycle scheme, however, requiring a delicate adjustment of the timing of the enable signal.
In a recent publication Mangard, Popp, and Gammel [15] have shown that the security analyses of masking schemes that have been conducted so far were based on an implicit assumption, which does not hold in general: The input signals of almost any (masked) gate in a combinational CMOS circuit do not arrive at the same time. Therefore, the output of the gate possibly switches several times during one clock cycle. The transitions at the output of a gate, before the stable state right before the next clock edge is attained, are called glitches. Glitches are a typical phenomenon in CMOS circuits and extensively discussed in the literature on VLSI design (see e.g. [21]). Because a glitch can cause a full swing transition at the output of the gate, just like the “proper” transition to the final value, a glitch is not a negligible higher order effect. As made evident in [15] glitches do not just add a background noise due to uncorrelated switching activity.
Unfortunately, the dissipated energy of non-linear masked gates is correlated to the processed values whenever the input values do not arrive simultaneously (forcing the output of the gate to toggle several times). Hence glitches carry side-channel information and must be considered properly in the analysis of any secure masking scheme.