§1.1 Field of the Invention
The present invention concerns encryption, such as encryption using the Advanced Encryption Standard. In particular, the present invention concerns detecting errors in encryption operations.
§1.2 Background Information
Faults that occur in VLSI chips can broadly be classified into two categories: Transient faults that die away after sometime and permanent faults that do not die away with time but remain until they are repaired or the faulty component is replaced. The origin of these faults could be due to the internal phenomena in the system such as threshold change, shorts, opens etc. or due to external influences like electromagnetic radiation. These faults affect the memory as well as the combinational parts of a circuit and can only be detected using Concurrent Error Detection (CED). (See, e.g., S. Ozev, A. Orailoglu, “Cost-Effective Concurrent Test Hardware Design for Linear Analog Circuits” ICCD 2002, pp 258-264; and C. Metra, S. Francescantonio, G. Marrale, “On-Line Testing of Transient Faults Affecting Functional Blocks of FCMOS, Domino and FPGA-Implemented Self-Checking Circuits” DFT 2002 pp. 207-215, both incorporated herein by reference.) This is especially true for sensitive devices such as cryptographic chips. Hence, CED for cryptographic chips is growing in importance. Since cryptographic chips are a consumer product produced in large quantities, cheap solutions for CED are needed. CED for cryptographic chips has also a great potential for detecting deliberate fault injection attacks where faults are injected into a cryptographic chip to break the key. (See, e.g., D. Boneh, R. DeMillo and R. Lipton, “On the importance of checking cryptographic protocols for faults”, Proceedings of Eurocrypt, Lecture Notes in Computer Science vol 1233, Springer-Verlag, pp. 37-51, 1997; E. Biham and A. Shamir, “Differential Fault Analysis of Secret Key Cryptosystems”, Proceedings of Crypto, August 1997; J. Bloemer and J.-P. Seifert, “Fault based cryptanalysis of the Advanced Encryption Standard,” www.iacr.org/eprint/2002/075.pdf; Giraud, “Differential Fault Analysis on AES”, eprint.iacr.org/2003/008.ps; and G. Piret, J-J. Quisquater, “A Differential Fault Attack Technique against SPN Structures, with Application to the AES and KHAZAD,” CHES 2003, Springer Verlag LNCS 2779, each incorporated by reference.)
The most straightforward methods of performing CED are Hardware Redundancy and Time Redundancy. In Hardware Redundancy, multiple (≧2) copies of the algorithm are used concurrently to perform the same computation on the same data. At the end of each computation, the results are compared and any discrepancy is reported as an error. The advantage of this technique is that it has minimum error detection latency and both transient and permanent faults are detected. A drawback of this technique is that it entails ≧100% hardware overhead. In Time Redundancy, the same basic hardware is used to perform both the normal and re-computation using the same input data. The advantage of this technique is that it uses minimum hardware but the drawback is that it entails ≧100% time overhead. Also, another significant shortcoming is that it can only detect transient faults.
In 2001, Advanced Encryption Standard (AES) (See J. Daemen and V. Rijmen, “AES proposal: Rijndael”, csrc.nist.gov/CryptoToolkit/aes/rijndael/; incorporated herein by reference) was chosen as the FIPS standard to be a royalty-free encryption algorithm for use worldwide and offer security of a sufficient level to protect data for the next 20 to 30 years. Since then, it has been the most widely used, analyzed and attacked crypto algorithm. Differential Fault Attacks have been a popular method to attacks AES implementations and many CED techniques have been proposed to thwart such attacks. (See, e.g., J. Bloemer and J.-P. Seifert, “Fault based cryptanalysis of the Advanced Encryption Standard,” www.iacr.org/eprint/2002/075.pdf; Giraud, “Differential Fault Analysis on AES”, eprint.iacr.org /2003/008.ps; and G. Piret, J-J. Quisquater, “A Differential Fault Attack Technique against SPN Structures, with Application to the AES and KHAZAD,” CHES 2003, Springer Verlag LNCS 2779.) In the paper, R Karri, K. Wu, P. Mishra and Y. Kim, “Concurrent Error Detection of Fault Based Side-Channel Cryptanalysis of 128-Bit Symmetric Block Ciphers,” IEEE Transactions on CAD, December 2002 (incorporated herein by reference), a Register Transfer Level CED approach for AES that exploits the inverse relationship between the encryption and decryption at the algorithm level, round level and individual operation level was developed. This technique has an area overhead of 21% at the algorithm level, 18.9% at the round level and 38.08% at operation level.
Similarly, the time overhead is 61.15%, 26.55% and 23.56% respectively. In the paper G. Bertoni, L. Breveglieri, I. Koren and V. Piuri, “On the propagation of faults and their detection in a hardware implementation of the advanced encryption standard,” Proceedings of ASAP '02, pp. 303-312, 2002 (incorporated herein by reference), this inverse-relationship technique was extended to AES round key generation. A drawback of this approach is that it assumes that the AES crypto device operates in a half-duplex mode (i.e. either encryption or decryption but not both are simultaneously active).
In the paper G. Bertoni, L. Breveglieri, I. Koren, and V. Piuri, “Error Analysis and Detection Procedures for a Hardware Implementation of the Advanced Encryption Standard,” IEEE Transactions on Computers, vol. 52, No. 4, pp. 492-505, April 20 (incorporated herein by reference) a parity-based CED method for the AES encryption algorithm was presented. This technique has relatively high hardware overhead. The technique adds one additional parity bit per byte resulting in 16 additional bits for the 128-bit data stream. Each of the sixteen 8-bit×8-bit AES s-boxes is modified into 9-bit×9-bit S-Boxes more than duplicating the hardware for implementing the s-boxes. In addition, this technique adds one additional parity bit per byte to the outputs of the Mix-Column operation because Mix-Column does not preserve parity of its inputs at the byte-level. In this paper, we propose a time redundancy based CED technique for the AES which entails a low time overhead and can detect both transient and permanent faults. It makes use of invariances exhibited by AES. Invariances have been previously used for online testing. (See, e.g., Y. Makris, I. Bayraktaroglu, A. Orailoglu, “Invariance-Based On-Line Test for RTL Controller-Datapath Circuits” VTS 2000, incorporated herein by reference.)
§1.2.1 Advanced Encryption Standard
AES (J. Daemen and V. Rijmen, “AES proposal: Rijndael”, csrc.nist.gov/CryptoToolkit/aes/rijndael/) is an iterative block cipher with a variable block length. In this paper, we will consider a block length of 128 bits. AES encrypts a 128-bit input plain text into a 128-bit output cipher text using a user key using 10 almost identical iterative rounds. The 128-bit (or 16-byte) input and the 128-bit (or 16-byte) intermediate results are organized as a 4x4 matrix of bytes called the state X.
  X  =      [                                        x            0                                                x            4                                                x            8                                                x            12                                                            x            1                                                x            5                                                x            9                                                x            13                                                            x            2                                                x            6                                                x            10                                                x            14                                                            x            3                                                x            7                                                x            11                                                x            15                                ]  The four four-byte groups (x0, x4, x8, x12), (x1, x5, x9, x13), (x2, x6, x10, x14) and (x3, x7, x11, x15) form the four rows of the state (matrix) X.
AES encryption is shown in FIG. 1. It consists of the operations ByteSub, ShiftRow, MixColumn, and AddRoundKey. In the last round the MixColumn operation is not used.
§1.2.1.1 ByteSub Operations
All bytes are processed separately. For every byte not equal to 0=(0,0,0,0,0,0,0,0) first the inverse in GF(28) is determined. m(x)=s8+x4+x+1 is used as the modular polynomial for GF(28). The byte 0 is mapped to 0. Then a linear affine transformation is applied. Very often ByteSub is implemented using 16 copies of an 8-bit×8-bit Substitution-Box (S-Box). The result state is:
  Y  =      [                                        y            0                                                y            4                                                y            8                                                y            12                                                            y            1                                                y            5                                                y            9                                                y            13                                                            y            2                                                y            6                                                y            10                                                y            14                                                            y            3                                                y            7                                                y            11                                                y            15                                ]  
§1.2.1.2 ShiftRow Operations
The rows of the state are shifted cyclically byte-wise using a different offset for each row. Row 0 is not shifted, row 1 is cyclically shifted left 1 byte, row 2 is cyclically shifted left by 2 bytes and row 3 is cyclically shifted left 3 bytes. The result state is represented as Z:
  Z  =            [                                                  z              0                                                          z              4                                                          z              8                                                          z              12                                                                          z              1                                                          z              5                                                          z              9                                                          z              13                                                                          z              2                                                          z              6                                                          z              10                                                          z              14                                                                          z              3                                                          z              7                                                          z              11                                                          z              15                                          ]        =          [                                                  y              0                                                          y              4                                                          y              8                                                          y              12                                                                          y              5                                                          y              9                                                          y              13                                                          y              1                                                                          y              10                                                          y              14                                                          y              2                                                          y              6                                                                          y              15                                                          y              3                                                          y              7                                                          y              11                                          ]      
§1.2.1.3 MixColumn Operations
The elements of the columns of the state are considered as the coefficients of polynomials of maximal degree 3. The coefficients are considered as elements of GF(28). These polynomials are multiplied modulo the polynomial x4+1 with a fixed polynomial c(x)=(03)x3+(01)x2+(01)x+(02). The coefficients of this polynomial given in hexadecimal representation are also elements of GF(28). The MixColumn operation on a column zr=[z0,z1,z2,z3]T of the state into the column uT=[u0,u1,u2,u3]T can be formally described by Equation (1) where the constant elements of the matrix C and of the vectors zT and uT as well as the multiplication and the addition are in GF(28). The polynomial x=x8+x4+x+1 is used as the modular polynomial. The elements of matrix C are 01, 02 and 03.
                              [                                                                      u                  0                                                                                                      u                  1                                                                                                      u                  2                                                                                                      u                  3                                                              ]                =                              [                                                            02                                                  03                                                  01                                                  01                                                                              01                                                  02                                                  03                                                  01                                                                              01                                                  01                                                  02                                                  03                                                                              03                                                  01                                                  01                                                  02                                                      ]                    ×                      [                                                                                z                    0                                                                                                                    z                    1                                                                                                                    z                    2                                                                                                                    z                    3                                                                        ]                                              (        1        )            The MixColumn operation can be implemented by a simple linear network of exclusive or elements.
§1.2.1.4 AddRoundKey Operations
AddRoundKey operation is a bit-wise exclusive-or of the 128-bit round key (matrix K) with the 128-bit state. The result state is:
  A  =            [                                                  u              0                                                          u              4                                                          u              8                                                          u              12                                                                          u              1                                                          u              5                                                          u              9                                                          u              13                                                                          u              2                                                          u              6                                                          u              10                                                          u              14                                                                          u              3                                                          u              7                                                          u              11                                                          u              15                                          ]        ⊕          [                                                  k              0                                                          k              4                                                          k              8                                                          k              12                                                                          k              1                                                          k              5                                                          k              9                                                          k              13                                                                          k              2                                                          k              6                                                          k              10                                                          k              14                                                                          k              3                                                          k              7                                                          k              11                                                          k              15                                          ]      
§1.2.1.5 Invariance Properties of AES
Desmedt, Le, Sparr and Wemsdorf (See, e.g., Y. Desmedt, T. Le, R. Sparr, R. Wemsdorf, “Cyclic Properties of AES round Functions”, Invited Talk, 4th Conference on the AES, May 2004; and Tri Van Le “Novel Cyclic Properties of AES”, http://eprint.iacr.org/2003/108/, both incorporated herein by reference.) discovered that AES exhibits invariance properties. AES round can be represented as A[RKi](M(S(B(p)))) where p is the 128-bit input to the round, B is the ByteSub operation, S is the ShiftRow operation, M is the MixColumn operation and A[RKi] is the AddRoundKey operation as defined in section-2.
Consider a part of the AES round as shown in FIG. 2 where the results are observed at the output of the MixColumn function. From the 128-bit input to the round, each input byte passes through an S-Box in the ByteSub function. The resulting 128-bit ByteSub output then passes through the ShiftRow and MixColumn functions. Consider w, x, y, z, a, b, c and d to be byte quantities. From the results in Y. Desmedt, T. Le, R. Sparr, R. Wemsdorf, “Cyclic Properties of AES round Functions”, Invited Talk, 4th Conference on the AES, May 2004; and Tri Van Le “Novel Cyclic Properties of AES”, http://eprint.iacr.org/2003/108/:M(S(B({w,w,w,w,w,w,w,w,w,w,w,w,w,w,w,w})))={a,a,a,a,a,a,a,a,a,a,a,a,a,a,a,a}  (2)M(S(B({w,x,w,x,w,x,w,x,w,x,w,x,w,x,w,x})))={a,b,a,b,a,b,a,b,a,b,a,b,a,b,a,b}  (3)M(S(B({w,x,y,z,w,x,y,z,w,x,y,z,w,x,y,z})))={a,b,c,d,a,b,c,d,a,b,c,d,a,b,c,d}  (4)M(S(B({w,x,w,x,y,z,y,z,w,x,w,x,y,z,y,z})))={a,b,a,b,c,d,c,d,a,b,a,b,c,d,c,d}  (5)M(S(B({w,x,y,z,y,z,w,x,w,x,y,z,y,z,w,x})))={a,b,c,d,c,d,a,b,a,b,c,d,c,d,a,b}  (6)
These invariance properties of AES are being used to investigate its weaknesses.