Cryptographic algorithms for processing digital data are well-known. Such algorithms may include encryption and decryption algorithms, algorithms to digitally sign data, algorithms to generate message authentication codes, algorithms to authenticate or verify the origin or integrity of data, etc.
An example of a cryptographic algorithm that might be used to encrypt/decrypt digital data is the Advanced Encryption Standard (AES), which is a well-known encryption algorithm, described in Federal Information Processing Standards Publication 197 (found at http://csrc.nist.gov/publications/fips/fips197/fips-197.pdf), the entire disclosure of which is incorporated herein by reference. AES is a symmetric block cipher, where the size of an input block is 128 bits and the size of the corresponding output block is also 128 bits. There are three different variations of AES, known as AES-128, AES-192 and AES-256: for AES-n, the size of the cryptographic key is n bits.
The AES algorithm maintains a “state”, which is a 4×4 matrix S, each element of the matrix S being a byte. Let the element at row r and column c of the state S be represented by S[r,c] (0≤r<4 and 0≤c<4). An input block of data that is to be processed comprises 16 bytes, in[j] (0≤j<16). The state S is initialized by setting S[r,c]=in[r+4c] (0≤r<4 and 0≤c<4). The result of processing the input block of data is an output block of data that also comprises 16 bytes, out[j] (0≤j<16). At the end of the processing, the output block of data is formed from the state S by setting the output block of data according to out[r+4c]=S[r,c] (0≤r<4 and 0≤c<4). Each processing step or operation of the AES algorithm operates on the current state S, with the state S being modified at each step so as to move it from representing the input block of data to the output block of data. In the following, for each step or function/operation carried out when performing the AES algorithm, the result on the element S[r,c] of the state S of performing that step or applying that function/operation shall be represented by S′[r,c] (0≤r<4 and 0≤c<4).
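The column-major mapping between a block and the state described above can be sketched as follows. This is an illustrative sketch only; the function names `bytes_to_state` and `state_to_bytes` are assumptions made for this example and do not appear in the AES standard.

```python
def bytes_to_state(block):
    """Load a 16-byte input block into a 4x4 state: S[r][c] = in[r + 4c]."""
    if len(block) != 16:
        raise ValueError("an AES block is exactly 16 bytes")
    return [[block[r + 4 * c] for c in range(4)] for r in range(4)]


def state_to_bytes(state):
    """Form the 16-byte output block from the state: out[r + 4c] = S[r][c]."""
    out = bytearray(16)
    for r in range(4):
        for c in range(4):
            out[r + 4 * c] = state[r][c]
    return bytes(out)


block = bytes(range(16))          # illustrative input: bytes 0, 1, ..., 15
state = bytes_to_state(block)     # each column holds four consecutive input bytes
```

Note that consecutive input bytes fill a column of the state, not a row, so row 0 of the state here holds bytes 0, 4, 8 and 12.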
The AES algorithm involves a number, Nr, of “rounds”. For AES-128, Nr=10; for AES-192, Nr=12; for AES-256, Nr=14. The rounds shall be described shortly.
A key expansion routine is used to generate a key schedule from an initial cryptographic key K. The key schedule comprises Nr+1 so-called “round keys” RK_j (0≤j≤Nr), each round key being 128 bits. The details of the key expansion routine are not important for this disclosure and they shall, therefore, not be described in more detail herein. For more detail on this, see section 5.2 of Federal Information Processing Standards Publication 197.
In AES, bytes are viewed as elements of the field GF(2^8), where multiplication in GF(2^8) is modulo the irreducible polynomial x^8+x^4+x^3+x+1.
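By way of illustration, multiplication of two bytes in this field can be sketched with a shift-and-XOR loop. This is a minimal sketch, not an optimized or side-channel-hardened routine; the name `gf_mul` is an assumption made for this example.

```python
def gf_mul(a, b):
    """Multiply bytes a and b in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    for _ in range(8):
        if b & 1:            # lowest remaining bit of b set: add (XOR) current a
            result ^= a
        a <<= 1              # multiply a by x
        if a & 0x100:        # degree 8 reached: reduce by the polynomial 0x11B
            a ^= 0x11B
        b >>= 1
    return result
```

For example, `gf_mul(0x57, 0x83)` evaluates to `0xC1`, which is the worked example given in Federal Information Processing Standards Publication 197.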
FIG. 1 of the accompanying drawings provides an overview of encryption 100 using the AES algorithm.
The state S is initialized using an input block of data 110—data in[j] (0≤j<16)—as described above.
Next, the state S is processed by an “AddRoundKey” function 120, using the round key RK_0.
Next, rounds 1, 2, . . . , Nr−1 are performed, one after the other. For round R (1≤R<Nr), the Rth round involves:
(a) processing the state S using a “SubBytes” function 130, followed by
(b) processing the state S using a “ShiftRows” function 140, followed by
(c) processing the state S using a “MixColumns” function 150, followed by
(d) processing the state S using the AddRoundKey function 120, using the round key RK_R.
Finally, the Nrth round is performed, which involves:
(a) processing the state S using the SubBytes function 130, followed by
(b) processing the state S using the ShiftRows function 140, followed by
(c) processing the state S using the AddRoundKey function 120, using the round key RK_Nr.
Thus, the Nrth round is the same as the previous Nr−1 rounds, except that it does not include the MixColumns function 150.
An output block of data 160—data out[j] (0≤j<16)—can then be formed from the state S as described above.
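The ordering of the functions in the encryption 100 can be sketched as below. The sketch shows only the control flow: the four step functions are passed in as parameters, and the tracing stubs used in the usage lines are hypothetical placeholders, not AES operations. The name `encrypt_rounds` is likewise an assumption made for this example.

```python
def encrypt_rounds(state, round_keys, add_round_key, sub_bytes,
                   shift_rows, mix_columns):
    """Apply the AES encryption round structure to a state.

    round_keys holds the Nr + 1 round keys RK_0 .. RK_Nr.
    """
    nr = len(round_keys) - 1
    state = add_round_key(state, round_keys[0])      # initial AddRoundKey
    for rnd in range(1, nr):                         # rounds 1 .. Nr-1
        state = sub_bytes(state)
        state = shift_rows(state)
        state = mix_columns(state)
        state = add_round_key(state, round_keys[rnd])
    state = sub_bytes(state)                         # round Nr: no MixColumns
    state = shift_rows(state)
    state = add_round_key(state, round_keys[nr])
    return state


# Usage with tracing stubs (placeholders that only record the call order).
trace = []


def make_stub(name):
    def stub(state, *unused_key):
        trace.append(name)
        return state
    return stub


encrypt_rounds(None, [None] * 11,                    # 11 round keys: Nr = 10
               make_stub("AddRoundKey"), make_stub("SubBytes"),
               make_stub("ShiftRows"), make_stub("MixColumns"))
```

With Nr=10, the trace records eleven AddRoundKey calls, ten SubBytes calls, ten ShiftRows calls and nine MixColumns calls.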
The AddRoundKey function 120 involves XOR-ing the bytes of the current round key RK_R being used (0≤R≤Nr) with the bytes of the state S. In particular, if the round key RK_R is a series of bytes k[j] (0≤j<16), then element S[r,c] of the state S is XOR-ed with byte k[r+4c] (0≤r<4 and 0≤c<4), so that the element S[r,c] of the state S becomes S′[r,c]=S[r,c]⊕k[r+4c].
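A minimal sketch of this XOR step is given below, assuming the state is held as a list of four rows. The key bytes used are illustrative values only and are not taken from a real key schedule.

```python
def add_round_key(state, round_key):
    """Return S' with S'[r][c] = S[r][c] XOR k[r + 4c]."""
    return [[state[r][c] ^ round_key[r + 4 * c] for c in range(4)]
            for r in range(4)]


state = [[0x00, 0x11, 0x22, 0x33],
         [0x44, 0x55, 0x66, 0x77],
         [0x88, 0x99, 0xAA, 0xBB],
         [0xCC, 0xDD, 0xEE, 0xFF]]
round_key = bytes(range(16))      # illustrative key bytes, not a real round key
mixed = add_round_key(state, round_key)
restored = add_round_key(mixed, round_key)   # applying the XOR twice undoes it
```

Because XOR is its own inverse, applying the function a second time with the same round key restores the original state.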
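The SubBytes function 130 operates on each of the 16 bytes of the state S separately as follows. The element S[r,c] (0≤r<4 and 0≤c<4) is viewed as an element of GF(2^8) and its multiplicative inverse in GF(2^8) is determined (the zero byte, which has no inverse, being mapped to itself). If we represent this inverse as a byte b that has bits b_7, b_6, . . . , b_1, b_0 (running from most to least significant bit), and if the result of applying the SubBytes function to the element S[r,c] (i.e. the byte S′[r,c]) is a byte that has bits c_7, c_6, . . . , c_1, c_0 (running from most to least significant bit), then S′[r,c] may be calculated as:
\[
\begin{bmatrix} c_0 \\ c_1 \\ c_2 \\ c_3 \\ c_4 \\ c_5 \\ c_6 \\ c_7 \end{bmatrix}
=
\begin{bmatrix}
1 & 0 & 0 & 0 & 1 & 1 & 1 & 1 \\
1 & 1 & 0 & 0 & 0 & 1 & 1 & 1 \\
1 & 1 & 1 & 0 & 0 & 0 & 1 & 1 \\
1 & 1 & 1 & 1 & 0 & 0 & 0 & 1 \\
1 & 1 & 1 & 1 & 1 & 0 & 0 & 0 \\
0 & 1 & 1 & 1 & 1 & 1 & 0 & 0 \\
0 & 0 & 1 & 1 & 1 & 1 & 1 & 0 \\
0 & 0 & 0 & 1 & 1 & 1 & 1 & 1
\end{bmatrix}
\begin{bmatrix} b_0 \\ b_1 \\ b_2 \\ b_3 \\ b_4 \\ b_5 \\ b_6 \\ b_7 \end{bmatrix}
+
\begin{bmatrix} 1 \\ 1 \\ 0 \\ 0 \\ 0 \\ 1 \\ 1 \\ 0 \end{bmatrix}
\]
where the additions and multiplications of bits are performed modulo 2.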
The SubBytes function 130 is often implemented simply by a lookup table. In particular, for 0≤r<4 and 0≤c<4, if S[r,c]=16u+v for integer values 0≤u,v<16, then the application of the SubBytes function 130 to S[r,c] changes the value S[r,c] to the value S′[r,c] given at row u and column v of Table 1 below. The values in Table 1 are in hexadecimal.
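TABLE 1
u\v   0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
 0   63 7c 77 7b f2 6b 6f c5 30 01 67 2b fe d7 ab 76
 1   ca 82 c9 7d fa 59 47 f0 ad d4 a2 af 9c a4 72 c0
 2   b7 fd 93 26 36 3f f7 cc 34 a5 e5 f1 71 d8 31 15
 3   04 c7 23 c3 18 96 05 9a 07 12 80 e2 eb 27 b2 75
 4   09 83 2c 1a 1b 6e 5a a0 52 3b d6 b3 29 e3 2f 84
 5   53 d1 00 ed 20 fc b1 5b 6a cb be 39 4a 4c 58 cf
 6   d0 ef aa fb 43 4d 33 85 45 f9 02 7f 50 3c 9f a8
 7   51 a3 40 8f 92 9d 38 f5 bc b6 da 21 10 ff f3 d2
 8   cd 0c 13 ec 5f 97 44 17 c4 a7 7e 3d 64 5d 19 73
 9   60 81 4f dc 22 2a 90 88 46 ee b8 14 de 5e 0b db
10   e0 32 3a 0a 49 06 24 5c c2 d3 ac 62 91 95 e4 79
11   e7 c8 37 6d 8d d5 4e a9 6c 56 f4 ea 65 7a ae 08
12   ba 78 25 2e 1c a6 b4 c6 e8 dd 74 1f 4b bd 8b 8a
13   70 3e b5 66 48 03 f6 0e 61 35 57 b9 86 c1 1d 9e
14   e1 f8 98 11 69 d9 8e 94 9b 1e 87 e9 ce 55 28 df
15   8c a1 89 0d bf e6 42 68 41 99 2d 0f b0 54 bb 16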
Other ways of representing the SubBytes function 130 are possible.
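As an illustration of the two equivalent representations described above, the following sketch computes the SubBytes value of a byte from the field inverse and the affine transformation, and then collects the results into a 256-entry lookup table indexed by 16u+v. The names `gf_mul`, `gf_inv`, `sub_byte` and `SBOX` are assumptions made for this example, and the inverse is found by a simple (slow) exponentiation chosen for clarity.

```python
# Rows of the 8x8 bit matrix and the constant vector (c_0 first) from the text.
AFFINE_ROWS = [
    [1, 0, 0, 0, 1, 1, 1, 1],
    [1, 1, 0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0, 1, 1],
    [1, 1, 1, 1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1, 0, 0, 0],
    [0, 1, 1, 1, 1, 1, 0, 0],
    [0, 0, 1, 1, 1, 1, 1, 0],
    [0, 0, 0, 1, 1, 1, 1, 1],
]
AFFINE_CONST = [1, 1, 0, 0, 0, 1, 1, 0]


def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def gf_inv(a):
    """Multiplicative inverse in GF(2^8); the zero byte is mapped to itself."""
    if a == 0:
        return 0
    result = 1
    for _ in range(254):          # a^254 = a^-1, as the nonzero elements
        result = gf_mul(result, a)  # form a group of order 255
    return result


def sub_byte(x):
    """Field inverse followed by the affine transformation over GF(2)."""
    b = gf_inv(x)
    b_bits = [(b >> j) & 1 for j in range(8)]
    out = 0
    for i in range(8):
        bit = AFFINE_CONST[i]
        for j in range(8):
            bit ^= AFFINE_ROWS[i][j] & b_bits[j]
        out |= bit << i
    return out


# Lookup-table form: the entry for S[r,c] = 16u + v sits at row u, column v.
SBOX = [sub_byte(x) for x in range(256)]
```

For instance, `SBOX[16 * 5 + 3]` is `0xED`, matching the entry at row 5 and column 3 of Table 1.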
The ShiftRows function 140 cyclically shifts the bytes of the last three rows of the state S. In particular, for row r of the state S (1≤r<4), the elements of row r are cyclically shifted by r positions to the left, i.e. the application of the ShiftRows function 140 to S[r,c] sets the value S[r,c] to the value S′[r,c] given by S′[r,c]=S[r,(c+r) mod 4] (for 0≤r<4 and 0≤c<4).
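The index arithmetic above can be sketched directly, again assuming a state held as a list of four rows. The state values below are illustrative only.

```python
def shift_rows(state):
    """Return S' with S'[r][c] = S[r][(c + r) mod 4]."""
    return [[state[r][(c + r) % 4] for c in range(4)] for r in range(4)]


state = [[4 * r + c for c in range(4)] for r in range(4)]   # rows 0..3, 4..7, ...
shifted = shift_rows(state)
```

Here row 0 is left unchanged, while row 1, initially 4, 5, 6, 7, becomes 5, 6, 7, 4.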
With the MixColumns function 150, each column of the state S is processed by multiplying that column by a particular matrix. In particular, for 0≤c<4, the MixColumns function 150 operates on the cth column according to:
\[
\begin{pmatrix} S'[0,c] \\ S'[1,c] \\ S'[2,c] \\ S'[3,c] \end{pmatrix}
=
\begin{pmatrix}
2 & 3 & 1 & 1 \\
1 & 2 & 3 & 1 \\
1 & 1 & 2 & 3 \\
3 & 1 & 1 & 2
\end{pmatrix}
\begin{pmatrix} S[0,c] \\ S[1,c] \\ S[2,c] \\ S[3,c] \end{pmatrix}
\]
where: multiplication by 1 means no change; multiplication by 2 means shifting to the left by one bit; and multiplication by 3 means shifting to the left by one bit and then XOR-ing with the initial un-shifted value. The products in each row are combined by XOR, which is addition in GF(2^8). Here, “shift” means a shift of the binary representation of the respective value to the left, as is known in the art (so that, for example, the binary value 10110011 becomes 101100110). After a shifting, the shifted value should be XOR-ed with 0x11B if the shifted value is larger than 0xFF.
Other ways of representing the MixColumns function 150 are possible. For example, the elements of the cth column of the state S may be treated as coefficients of a four-term polynomial over GF(2^8), with this polynomial then being multiplied modulo x^4+1 by the polynomial 3x^3+x^2+x+2; the coefficients of the resultant polynomial then form the updated elements of the cth column of the state S.
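The shift-and-XOR description of the MixColumns function can be sketched for a single column as follows. The helper names `xtime`, `mul3` and `mix_column` are assumptions made for this example; the input column is a commonly used test column.

```python
def xtime(x):
    """Multiply by 2: shift left, then XOR with 0x11B if the result exceeds 0xFF."""
    x <<= 1
    if x > 0xFF:
        x ^= 0x11B
    return x


def mul3(x):
    """Multiply by 3: shift left, then XOR with the un-shifted value."""
    return xtime(x) ^ x


def mix_column(col):
    """Multiply one column (S[0,c], S[1,c], S[2,c], S[3,c]) by the matrix."""
    s0, s1, s2, s3 = col
    return [
        xtime(s0) ^ mul3(s1) ^ s2 ^ s3,
        s0 ^ xtime(s1) ^ mul3(s2) ^ s3,
        s0 ^ s1 ^ xtime(s2) ^ mul3(s3),
        mul3(s0) ^ s1 ^ s2 ^ xtime(s3),
    ]


mixed = mix_column([0xDB, 0x13, 0x53, 0x45])
```

For this input column the result is 0x8E, 0x4D, 0xA1, 0xBC.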
FIG. 2 of the accompanying drawings provides an overview of decryption 200 using the AES algorithm.
Each of the AddRoundKey function 120, the SubBytes function 130, the ShiftRows function 140, and the MixColumns function 150 is invertible, as set out below.
The inverse of the AddRoundKey function 120, called InvAddRoundKey 220, is exactly the same as the AddRoundKey function 120.
The inverse of the SubBytes function 130, called InvSubBytes 230, can be implemented using the inverse of the transformation set out above in the description of the SubBytes function 130, or using a lookup table given by Table 2 below. The values in Table 2 are in hexadecimal. In particular, for 0≤r<4 and 0≤c<4, if S[r,c]=16u+v for integer values 0≤u,v<16, then the application of the InvSubBytes function 230 to S[r,c] changes the value S[r,c] to the value S′[r,c] given in Table 2 below at row u and column v.
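TABLE 2
u\v   0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
 0   52 09 6a d5 30 36 a5 38 bf 40 a3 9e 81 f3 d7 fb
 1   7c e3 39 82 9b 2f ff 87 34 8e 43 44 c4 de e9 cb
 2   54 7b 94 32 a6 c2 23 3d ee 4c 95 0b 42 fa c3 4e
 3   08 2e a1 66 28 d9 24 b2 76 5b a2 49 6d 8b d1 25
 4   72 f8 f6 64 86 68 98 16 d4 a4 5c cc 5d 65 b6 92
 5   6c 70 48 50 fd ed b9 da 5e 15 46 57 a7 8d 9d 84
 6   90 d8 ab 00 8c bc d3 0a f7 e4 58 05 b8 b3 45 06
 7   d0 2c 1e 8f ca 3f 0f 02 c1 af bd 03 01 13 8a 6b
 8   3a 91 11 41 4f 67 dc ea 97 f2 cf ce f0 b4 e6 73
 9   96 ac 74 22 e7 ad 35 85 e2 f9 37 e8 1c 75 df 6e
10   47 f1 1a 71 1d 29 c5 89 6f b7 62 0e aa 18 be 1b
11   fc 56 3e 4b c6 d2 79 20 9a db c0 fe 78 cd 5a f4
12   1f dd a8 33 88 07 c7 31 b1 12 10 59 27 80 ec 5f
13   60 51 7f a9 19 b5 4a 0d 2d e5 7a 9f 93 c9 9c ef
14   a0 e0 3b 4d ae 2a f5 b0 c8 eb bb 3c 83 53 99 61
15   17 2b 04 7e ba 77 d6 26 e1 69 14 63 55 21 0c 7d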
The inverse of the ShiftRows function 140, called InvShiftRows 240, cyclically shifts the bytes of the last three rows of the state S. In particular, for row r of the state S (1≤r<4), the elements of row r are cyclically shifted by r positions to the right, i.e. the application of the InvShiftRows function 240 to S[r,c] sets the value S[r,c] to the value S′[r,c] given by S′[r,c]=S[r,(c−r) mod 4] (for 0≤r<4 and 0≤c<4). Note that, for 0≤r<4, this is equivalent to cyclically shifting the elements of the rth row (4−r) mod 4 positions to the left.
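The equivalence noted above, between a right shift by r positions and a left shift by (4−r) mod 4 positions, can be checked with a short sketch. The helper `rotate_left` and the state values are assumptions made for this example.

```python
def inv_shift_rows(state):
    """Return S' with S'[r][c] = S[r][(c - r) mod 4]."""
    return [[state[r][(c - r) % 4] for c in range(4)] for r in range(4)]


def rotate_left(row, n):
    """Cyclically shift a row n positions to the left."""
    return row[n:] + row[:n]


state = [[4 * r + c for c in range(4)] for r in range(4)]
unshifted = inv_shift_rows(state)
# The same result expressed as left shifts by (4 - r) mod 4 positions.
as_left_shifts = [rotate_left(state[r], (4 - r) % 4) for r in range(4)]
```

Both computations give the same state; for example, row 1, initially 4, 5, 6, 7, becomes 7, 4, 5, 6.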
For the inverse of the MixColumns function 150, called InvMixColumns 250, each column of the state S is processed by multiplying the column by a particular matrix. In particular, for 0≤c<4, the InvMixColumns function 250 operates on the cth column according to:
\[
\begin{pmatrix} S'[0,c] \\ S'[1,c] \\ S'[2,c] \\ S'[3,c] \end{pmatrix}
=
\begin{pmatrix}
e & b & d & 9 \\
9 & e & b & d \\
d & 9 & e & b \\
b & d & 9 & e
\end{pmatrix}
\begin{pmatrix} S[0,c] \\ S[1,c] \\ S[2,c] \\ S[3,c] \end{pmatrix}
\]
where the matrix entries are in hexadecimal and where: multiplication by e means shifting to the left, XOR-ing with the initial un-shifted value, shifting to the left again, XOR-ing with the initial un-shifted value, and shifting to the left again; multiplication by b means shifting to the left, shifting to the left again, XOR-ing with the initial un-shifted value, shifting to the left again, and XOR-ing with the initial un-shifted value; multiplication by d means shifting to the left, XOR-ing with the initial un-shifted value, shifting to the left again, shifting to the left again, and XOR-ing with the initial un-shifted value; and multiplication by 9 means shifting to the left, shifting to the left again, shifting to the left again, and XOR-ing with the initial un-shifted value. The products in each row are combined by XOR. After a shifting, the shifted value should be XOR-ed with 0x11B if the shifted value is larger than 0xFF.
Again, a polynomial representation may be used to implement the InvMixColumns function 250. In particular, the elements of the cth column of the state S may be treated as coefficients of a four-term polynomial over GF(2^8), with this polynomial then being multiplied modulo x^4+1 by the polynomial (b)x^3+(d)x^2+(9)x+(e), where the coefficients of this polynomial are in hexadecimal; the coefficients of the resultant polynomial then form the updated elements of the cth column of the state S.
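The shift-and-XOR sequences given above for multiplication by e, b, d and 9 can be sketched for a single column as follows. The helper names are assumptions made for this example, and the input column is the MixColumns output for the commonly used test column 0xDB, 0x13, 0x53, 0x45.

```python
def xtime(x):
    """Shift left, then XOR with 0x11B if the shifted value exceeds 0xFF."""
    x <<= 1
    if x > 0xFF:
        x ^= 0x11B
    return x


def mul_e(x):
    return xtime(xtime(xtime(x) ^ x) ^ x)      # shift, XOR, shift, XOR, shift


def mul_b(x):
    return xtime(xtime(xtime(x)) ^ x) ^ x      # shift, shift, XOR, shift, XOR


def mul_d(x):
    return xtime(xtime(xtime(x) ^ x)) ^ x      # shift, XOR, shift, shift, XOR


def mul_9(x):
    return xtime(xtime(xtime(x))) ^ x          # shift, shift, shift, XOR


def inv_mix_column(col):
    """Multiply one column by the InvMixColumns matrix."""
    s0, s1, s2, s3 = col
    return [
        mul_e(s0) ^ mul_b(s1) ^ mul_d(s2) ^ mul_9(s3),
        mul_9(s0) ^ mul_e(s1) ^ mul_b(s2) ^ mul_d(s3),
        mul_d(s0) ^ mul_9(s1) ^ mul_e(s2) ^ mul_b(s3),
        mul_b(s0) ^ mul_d(s1) ^ mul_9(s2) ^ mul_e(s3),
    ]


recovered = inv_mix_column([0x8E, 0x4D, 0xA1, 0xBC])
```

The recovered column is 0xDB, 0x13, 0x53, 0x45, i.e. the column before MixColumns was applied.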
Thus, decryption of a block of data can be performed by applying the InvAddRoundKey function 220, the InvSubBytes function 230, the InvShiftRows function 240, and the InvMixColumns function 250 in the reverse of the order, set out in FIG. 1, of their counterpart functions, using the same key schedule as for encryption. However, as set out in section 5.3.5 of Federal Information Processing Standards Publication 197, and as shown in FIG. 2, it is possible to perform decryption 200 of a block of data 210 to form an output block of data 260 using the same order of the functions set out in FIG. 1 (but with the functions in FIG. 1 replaced in FIG. 2 by their inverses), but with the key schedule modified to produce a corresponding decryption key schedule for the purposes of decryption (the round keys for the decryption 200 being denoted RK′_R in FIG. 2).
The skilled person will appreciate that any further details for the AES algorithm can be found in Federal Information Processing Standards Publication 197 and that the above description is provided to assist the reader (who is assumed to be knowledgeable about the AES algorithm).
Although the AES algorithm has been described here in detail by way of example, the skilled person will appreciate that there are numerous other cryptographic algorithms that might be used to process digital data. The skilled person will be assumed to be knowledgeable about the working, operation and implementation of such other cryptographic algorithms.
It is known that various functions, such as each round (or part of a round) in AES, can be implemented using mapping tables or look-up tables instead of an explicit calculation. The look-up table for a function will generally contain all possible output values for the function indexed against (or associated with) the input value(s) for the function that provide each output value. The look-up table is commonly accompanied by some code which will accept input value(s) and provide an output value based on the contents of the look-up table. The code will use the input value(s) it receives to retrieve (or ‘lookup’) from the table an output value that corresponds to the input value(s) by using the indexing or association between the input values and the output values that is present in the table.
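A minimal sketch of such a table and its accompanying code is given below. The function `explicit` is a hypothetical byte-to-byte calculation chosen only for illustration, and the names `build_lookup_table` and `evaluate` are likewise assumptions.

```python
def build_lookup_table(func, input_bits):
    """Tabulate func over every possible input of the given bit width."""
    return [func(x) for x in range(1 << input_bits)]


def evaluate(table, x):
    """Accompanying code: return the output stored against input x."""
    return table[x]


def explicit(x):
    """Hypothetical explicit calculation on a byte (illustration only)."""
    return ((x << 1) ^ (x >> 3)) & 0xFF


TABLE = build_lookup_table(explicit, 8)     # 256 entries, one per input byte
```

After the table has been built, `evaluate(TABLE, x)` returns the same value as `explicit(x)` for every byte x without performing the calculation again.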
When a program (or software) is being executed by a processor, the environment in which the execution is being performed is a so-called “white-box” environment if the user (or a third party) has access to the processing so that the user can observe and alter the execution of the program (e.g. by running a suitable debugger)—such alterations could be changes to the process flow or changes to the data being processed. This observation and/or alteration of the execution of the program may be referred to as tampering. The user may observe or alter (or in other words tamper with) the execution of the program in order to satisfy their own aims or goals, which may not be possible to satisfy if the program were to run normally without being tampered with. Such tampering to achieve a particular aim or goal may be referred to as goal-directed tampering. Goal-directed tampering may involve, for example, observing and/or altering the execution of a program being run in a white-box environment in order to obtain or deduce a cryptographic key that is used by the program to process digital data (e.g. a decryption key for decrypting data).
Various techniques are known for protecting the integrity of a data processing software application (or program or system) which is being run in a white-box environment. These techniques generally aim to hide the embedded knowledge of the application by introducing additional complexity and/or randomness in the control and/or data paths of the software application. This additional complexity and/or randomness has the effect of obscuring or obfuscating the information (or data) or execution path of the software application. As a result of this obfuscation, it becomes more difficult to extract information from the application by code inspection and it is more difficult to find and/or modify the code that is associated with particular functionality of the program. It is therefore much more difficult for an attacker with access to the program running in a white-box environment to retrieve sensitive data or alter the operation of the program in order to meet their own goals by tampering with the execution of the program. As such, the ability of the attacker to carry out goal-directed tampering is reduced. These techniques which aim to reduce the ability of an attacker to carry out goal-directed tampering may be considered to improve the tamper-resistance of the software. If it is sufficiently difficult for an attacker to carry out goal-directed tampering, then, for any practical purposes, the software may be considered to be tamper-resistant, even if theoretically tampering is still possible.
An exemplary technique for improving the tamper-resistance of software can be found in “White-Box Cryptography and an AES Implementation”, by Stanley Chow, Philip Eisen, Harold Johnson, and Paul C. Van Oorschot, in Selected Areas in Cryptography: 9th Annual International Workshop, SAC 2002, St. John's, Newfoundland, Canada, Aug. 15-16, 2002, the entire disclosure of which is incorporated herein by reference. “White-Box Cryptography and an AES Implementation” discloses an approach to protecting the integrity of a cryptographic algorithm by creating a key-dependent implementation of the algorithm using a series of lookup tables. The key(s) are embedded in the implementation by partial evaluation of the algorithm with respect to the key(s). Partial evaluation means that expressions involving the key are evaluated as much as reasonably possible, and the result is put in the code rather than the full expressions. This means that the implementation is specific to particular key(s) and that key input is unnecessary in order to use the key-dependent implementation of the algorithm. It is therefore possible to distribute a key-dependent implementation of an algorithm, which may be user-specific, for encrypting or decrypting content or data instead of distributing keys, which may be user-specific. The key-dependent implementation is created so as to hide the key(s) by: (1) using tables for compositions rather than individual steps; (2) encoding these tables with random bijections; and (3) extending the cryptographic boundary beyond the cryptographic algorithm itself further out into the containing application, thereby forcing attackers to understand significantly larger code segments to achieve their goals.
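The idea of partial evaluation with respect to a key can be illustrated with a simplified sketch in which a key byte is absorbed into a table at generation time. This is not the construction of the cited paper: `toy_substitution` is a stand-in bijection on bytes (not the AES S-box), and the key byte and all names are hypothetical.

```python
def toy_substitution(x):
    """Stand-in byte substitution (a bijection on 0..255); NOT the AES S-box."""
    return (x * 167 + 13) % 256


def build_keyed_table(key_byte):
    """Partially evaluate x -> substitution(x XOR k) for a fixed key byte k."""
    return [toy_substitution(x ^ key_byte) for x in range(256)]


SECRET_KEY_BYTE = 0x2B                      # hypothetical; used at generation time only
KEYED_TABLE = build_keyed_table(SECRET_KEY_BYTE)


def keyed_step(x):
    """Run-time code: a single table lookup that takes no key argument."""
    return KEYED_TABLE[x]
```

The run-time code takes no key input. A bare table of this kind would still reveal the key byte to anyone who knows the substitution, which is one reason why the tables are additionally encoded with random bijections.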
FIG. 3 of the accompanying drawings illustrates an implementation 310 of an exemplary function X which receives or obtains data d at, or via, an input 312 to the function X, processes the data d to generate processed data X(d), and provides the processed data X(d) via an output 316. The implementation 310 of the function might involve one or more processing steps which comprise one or more of instructions, code, logic, lookup tables or any combination thereof in order to provide the processed data X(d) at the output 316 in response to receiving data d at the input 312. FIG. 3 further illustrates an encoded or obfuscated implementation 320 of the function X, which comprises an obfuscated function X′. In the implementation 320, the function X is obfuscated to form the function X′ by using an input encoding F and an output encoding G. The obfuscated function X′ receives or obtains an encoded representation F(d) of the input data d at, or via, an input 322 to the obfuscated function X′, processes the encoded representation F(d) to generate an encoded representation G(X(d)) of the processed data X(d), and provides the encoded representation G(X(d)) via an output 328. The encoded representation F(d) is the data d encoded using the function F. The encoded representation G(X(d)) is the data X(d) encoded using the function G. The obfuscated function X′ can be considered as:
X′ = G∘X∘F^{-1}
where ∘ denotes function composition as usual (i.e. for any two functions a(x) and b(x), (a∘b)(x)=a(b(x)) by definition). The functions F^{-1}, X and G are obfuscated in the implementation by combining them into a single lookup table. This combination of the functions into a single lookup table means that as long as the functions F and G remain unknown to an attacker, the attacker cannot extract information about the function X and hence cannot, for example, extract secret information (such as a cryptographic key) that is the basis for, or that is used by, the function X.
Whilst the middle of FIG. 3 illustrates the obfuscated function X′ as the series of functions F^{-1}, X and G, this is merely for the purpose of illustration. In particular, the obfuscated function X′ does not implement each of the functions F^{-1}, X and G separately (as to do so would expose the data d and X(d) and the operation of the function X to an attacker). Instead, as mentioned above, the functions F^{-1}, X and G are implemented together as a single function (such as via a look-up table), so that the obfuscated function X′ does not expose the data d and X(d) to an attacker and does not expose the processing or operation of the function X to an attacker.
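The combination of F^{-1}, X and G into a single lookup table can be sketched as follows for functions on bytes. The function X used here is an arbitrary stand-in, the encodings are random byte permutations drawn from a fixed-seed generator purely for repeatability (which would not be suitable for real use), and all names are assumptions made for this example.

```python
import random


def random_bijection(rng, size=256):
    """Return a random permutation of 0..size-1, used as an encoding."""
    perm = list(range(size))
    rng.shuffle(perm)
    return perm


def invert(perm):
    """Return the inverse permutation."""
    inverse = [0] * len(perm)
    for index, value in enumerate(perm):
        inverse[value] = index
    return inverse


def encode_function(x_table, f, g):
    """Build the single table for X' = G o X o F^-1."""
    f_inv = invert(f)
    return [g[x_table[f_inv[e]]] for e in range(len(x_table))]


rng = random.Random(0)                       # fixed seed: illustration only
X = [(x * 5 + 1) % 256 for x in range(256)]  # stand-in for the function X
F = random_bijection(rng)                    # input encoding
G = random_bijection(rng)                    # output encoding
X_ENC = encode_function(X, F, G)             # only this table would be shipped
```

Looking up the encoded input F(d) in `X_ENC` yields G(X(d)) for every byte d, while neither d nor X(d) appears in the clear during the lookup.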
Any given program can be thought of as a sequence or network of functions. FIG. 4 of the accompanying drawings illustrates an exemplary implementation 410 of a program or part of a program whereby two functions X and Y are to be evaluated sequentially (i.e. as part of a sequence) in order to provide the operation:
(Y∘X)(d) = Y(X(d))
In other words, the sequence of functions receives or obtains data d at, or via, an input 312 to the first function in the sequence, namely the function X, the function X then processes the data d to generate processed data X(d) and provides the processed data X(d) via an output 316, as discussed above. The processed data X(d) is provided via the output 316 of the first function X to an input 412 of the second function in the sequence of functions, namely the function Y, the function Y then processes the data X(d) to generate processed data Y(X(d)) and provides the processed data Y(X(d)) via an output 416. In this manner, the processed data Y(X(d)) provided at the output 416 of the second function Y is provided as the output from the sequence of functions X and Y. Again, each of the functions X and Y can respectively be implemented as one or more of instructions, code, logic or lookup tables or any combination thereof, as discussed above.
However, when the implementation 410 of the sequence of functions X and Y is executed in a white-box environment, an attacker can observe and/or modify one or more of: the operation of each of the functions X and Y; the data d provided to the input 312 of the sequence of functions; the processed data Y(X(d)) provided at the output 416 of the sequence of functions; and the processed data X(d), which is provided to the input 412 of the second function Y from the output 316 of the first function X. Therefore, when the sequence of functions X and Y is executed as the implementation 410 in a white-box environment, the operation provided by that sequence of functions is susceptible to tampering.
Where the implementation 410 of the sequence of functions X and Y forms a key-dependent implementation of a cryptographic component for a program, for example, it may be possible for an attacker to extract or deduce a cryptographic key by observing or tampering with the functions X and/or Y and/or the data that is provided to/between them. To overcome this problem, the functions X and Y in the sequence of functions X and Y can be implemented as obfuscated versions X′ and Y′ of those functions X and Y respectively.
FIG. 4 further illustrates such an encoded or obfuscated implementation 420 of the sequence of functions X and Y: the implementation 420 comprises an obfuscated function X′ and an obfuscated function Y′. In the implementation 420, the obfuscated function X′ of the function X is formed by combining the function X with an input encoding F and an output encoding G, as described earlier in relation to FIG. 3. The obfuscated function Y′ of the function Y is formed in a similar manner to the obfuscated function X′, albeit that the input encoding G and output encoding H that are used for the implementation of obfuscated function Y′ may differ from the input encoding F and the output encoding G that are used for the implementation of obfuscated function X′. The obfuscated implementation Y′ of function Y can therefore be represented as:
Y′=H∘Y∘G−1
The input encoding G used with obfuscated function Y′ should match the output encoding G used with the obfuscated implementation of the preceding function X′. This means that the representation of the processed data G(X(d)) provided at the output 328 of the obfuscated function X′ using the output encoding G can be used as the input to the obfuscated function Y′ which expects to receive the data X(d) represented using input encoding G (i.e. it expects to receive G(X(d))). It will be appreciated that whilst the function G is referred to as being the input encoding for the obfuscated function Y′ (since the data X(d) that is to be received at the input 328 to the obfuscated function Y′ is encoded with the function G such that it is the encoded representation G(X(d)) of the data X(d)), the actual function that is combined with the function Y to implement the obfuscated function Y′ is the inverse of the function G, namely the function G−1, which has the effect of cancelling out the input encoding G to allow the operation of the function Y on the data X(d).
The obfuscated function Y′ receives the data X(d) represented as G(X(d)) (i.e. encoded by the function G) from the output 328 of obfuscated function X′. The obfuscated function Y′ processes the encoded representation G(X(d)) of the processed data X(d) to generate an encoded representation H(Y(X(d))) of the processed data Y(X(d)), and provides the encoded representation H(Y(X(d))) via output 428. Since the obfuscated function Y′ is the last function in the sequence of functions, the output 428 of the obfuscated function Y′ is the output of the obfuscated implementation 420 of the sequence of functions.
Again, whilst the middle of FIG. 4 illustrates the obfuscated function Y′ as the series of functions G−1, Y and H, this is merely for the purpose of illustration. In particular, the obfuscated function Y′ does not implement each of the functions G−1, Y and H separately (as to do so would expose the data X(d) and Y(X(d)) and the operation of the function Y to an attacker)—instead, as mentioned above, the functions G−1, Y and H are implemented together as a single function (such as via a look-up table), so that the obfuscated function Y′ does not expose the data X(d) and Y(X(d)) to an attacker and does not expose the processing or operation of the function Y to an attacker.
It will be appreciated that in order for the representation of the output H(Y(X(d))) of the obfuscated implementation 420 of the sequence of functions to be correctly calculated, the input d to the implementation 420 must be represented as F(d) using the input encoding of the first obfuscated function in the sequence of obfuscated functions (i.e. F), whilst the output encoding of each obfuscated function in the sequence (except for the last obfuscated function in the sequence) must match the input encoding of the next function. The output encoding of the last obfuscated function in the sequence (i.e. H) dictates the representation of the output that is provided from the obfuscated sequence of functions (i.e. H(Y(X(d)))).
The obfuscated implementation 420 of the sequence of functions X and Y can therefore be represented as:
Y′∘X′=(H∘Y∘G−1)∘(G∘X∘F−1)=H∘(Y∘X)∘F−1
In this way, Y∘X is properly computed albeit that the input d needs to be encoded with the function F and the output H(Y(X(d))) needs to be decoded with the function H−1. Each obfuscated function X′ and Y′ can be separately represented in respective lookup tables, such that the functions H, Y and G−1 are combined in a table implementing the obfuscated function Y′ and the functions G, X and F−1 are implemented in a different table implementing the obfuscated function X′. By combining the functions into single lookup tables in this manner, the details of the functions X and Y, the data they operate on and output, as well as functions F, G and H are hidden. Meanwhile, the data X(d) that is passed between the lookup tables in the obfuscated implementation 420 is represented using the encoding G (i.e. as G(X(d))). This means that an attacker cannot observe any useful information in the data flows between the obfuscated functions in the obfuscated implementation 420.
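The table-based composition described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the described embodiment: the data are taken to be single bytes, the encodings F, G and H are seeded random permutations of the byte values, and the functions X and Y (and the constant KEY) are hypothetical placeholders chosen only so that the example runs.

```python
import random


def random_bijection(rng):
    """Return (encoding, inverse) as two 256-entry tables over byte values."""
    encoding = list(range(256))
    rng.shuffle(encoding)
    inverse = [0] * 256
    for plain, encoded in enumerate(encoding):
        inverse[encoded] = plain
    return encoding, inverse


rng = random.Random(2024)  # fixed seed only so the sketch is reproducible
F, F_inv = random_bijection(rng)  # input encoding of X'
G, G_inv = random_bijection(rng)  # output encoding of X', input encoding of Y'
H, H_inv = random_bijection(rng)  # output encoding of Y'

KEY = 0x5A  # hypothetical key byte, for illustration only


def X(d):
    """Placeholder for the function X (assumption)."""
    return d ^ KEY


def Y(d):
    """Placeholder for the function Y (assumption)."""
    return (5 * d + 17) % 256


# Each obfuscated function is stored as ONE table, so F_inv, X and G
# (respectively G_inv, Y and H) never appear as separate steps at run time.
X_table = [G[X(F_inv[e])] for e in range(256)]  # X' = G o X o F^-1
Y_table = [H[Y(G_inv[e])] for e in range(256)]  # Y' = H o Y o G^-1


def obfuscated_sequence(encoded_d):
    """Evaluate Y'(X'(F(d))); the intermediate value is G(X(d)), never X(d)."""
    return Y_table[X_table[encoded_d]]


def check_all_inputs():
    """Confirm H^-1(Y'(X'(F(d)))) == Y(X(d)) for every byte d."""
    return all(H_inv[obfuscated_sequence(F[d])] == Y(X(d)) for d in range(256))
```

Calling check_all_inputs() confirms that encoding the input with F and decoding the output with the inverse of H recovers Y(X(d)) for every byte value, while only the two combined tables are present at run time.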
The representation of the output H(Y(X(d))) that is provided from the sequence of obfuscated functions will correspond to the output Y(X(d)) of the sequence of non-obfuscated functions encoded by the function H, assuming that the input data d is provided to the obfuscated sequence of functions represented as F(d) (i.e. encoded by the function F) and that no errors occur during processing.
The use of input and output encodings for the obfuscated implementation 420 of the sequence of functions has the effect that the obfuscated functionality is bound more tightly into the rest of the program or system in which implementation 420 operates. This is because the functions in the rest of the program or system which provide data to (or call) the obfuscated sequence of functions provide a representation of the data encoded using the input encoding F, whilst the functions in the rest of the program or system which receive data from the obfuscated sequence of functions receive a representation of the processed data encoded using the output encoding H. Therefore, the effect of the obfuscation extends the code which an attacker would have to understand beyond the sequence of functions themselves into the surrounding functions or parts of the program. In the case where the obfuscated implementation 420 is a cryptographic component of a program, which will commonly be part of a larger containing system or application, the use of input and output encodings has the effect of extending the cryptographic boundary beyond the cryptographic algorithm itself further out into the containing system or application. This makes it harder to extract a key-specific implementation of the cryptographic algorithm from the rest of the application and forces an attacker to understand larger parts of the code in order to tamper with the software, thereby making the software harder to tamper with.
Although FIGS. 3 and 4 illustrate obfuscated functions which have both input and output encodings applied to them, it will be appreciated that it is possible to obfuscate a function by only combining either an input or an output encoding with the function. As an example, although not illustrated in FIG. 4, the obfuscated function X′ could be implemented so that it uses an output encoding G, but not input encoding F. Similarly, the obfuscated function Y′ could be implemented so that it uses an input encoding G, but not output encoding H. This arrangement can be represented as:
Y′∘X′=(Y∘G−1)∘(G∘X)=Y∘X
As a result, the input to the sequence of obfuscated functions could be the data d, which is the same representation of the input as would be provided to the non-obfuscated sequence of functions, and the output of the sequence of obfuscated functions would be Y(X(d)), which is the same representation of the output that would be provided by the non-obfuscated sequence of functions. However, the sequence of functions is still obfuscated in so far as an attacker is unable to observe the result of function X or the input of function Y. Therefore, provided that the details of the function G are unknown to the attacker, it will still be hard for an attacker to ascertain the details of these functions in order to extract a key.
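A minimal sketch of this arrangement, in which only the internal encoding G is used, is given below. As before, the byte-wide domain, the seeded random permutation used for G, and the placeholder functions X and Y are assumptions made purely for illustration.

```python
import random

rng = random.Random(7)  # fixed seed only so the sketch is reproducible

# Internal encoding G as a random permutation of byte values (assumption).
G = list(range(256))
rng.shuffle(G)
G_inv = [0] * 256
for plain, encoded in enumerate(G):
    G_inv[encoded] = plain


def X(d):
    """Placeholder for the function X (assumption)."""
    return (d + 0x3C) % 256


def Y(d):
    """Placeholder for the function Y (assumption)."""
    return d ^ 0xA5


# X' applies only an output encoding; Y' applies only an input encoding.
X_table = [G[X(d)] for d in range(256)]      # X' = G o X
Y_table = [Y(G_inv[e]) for e in range(256)]  # Y' = Y o G^-1


def run(d):
    """Return (intermediate, output) for plain input d.

    The intermediate value is G(X(d)); the output is the plain Y(X(d)).
    """
    intermediate = X_table[d]
    return intermediate, Y_table[intermediate]
```

Here the caller supplies d and receives Y(X(d)) unencoded, but the value passed between the two tables is G(X(d)) rather than X(d).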
Whilst FIG. 4 illustrates a sequence of two functions X and Y that are then implemented as obfuscated functions X′ and Y′, it will be appreciated that any number of functions (in a series, network, chain, etc.) could be implemented as a series, network, chain, etc. of corresponding obfuscated functions.
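The generalisation to a longer series can be sketched in the same way. The following is one possible illustration, assuming byte-wide data, one freshly generated random encoding per boundary between functions, and a simple linear chain; the helper names obfuscate_chain and run_chain and the three example functions are hypothetical.

```python
import random


def random_bijection(rng):
    """Return (encoding, inverse) as two 256-entry tables over byte values."""
    encoding = list(range(256))
    rng.shuffle(encoding)
    inverse = [0] * 256
    for plain, encoded in enumerate(encoding):
        inverse[encoded] = plain
    return encoding, inverse


def obfuscate_chain(functions, rng):
    """Build one table per function, sharing an encoding at each boundary.

    Returns (tables, first_encoding, last_decoding): inputs must be encoded
    with first_encoding, and outputs decoded with last_decoding.
    """
    encodings = [random_bijection(rng) for _ in range(len(functions) + 1)]
    tables = []
    for i, func in enumerate(functions):
        decode_in = encodings[i][1]       # inverse of this stage's input encoding
        encode_out = encodings[i + 1][0]  # this stage's output encoding
        tables.append([encode_out[func(decode_in[e])] for e in range(256)])
    return tables, encodings[0][0], encodings[-1][1]


def run_chain(tables, encoded_input):
    """Pass an encoded value through each obfuscated table in turn."""
    value = encoded_input
    for table in tables:
        value = table[value]
    return value


# Three placeholder byte functions (assumptions, for illustration only).
functions = [
    lambda d: d ^ 0x1F,
    lambda d: (3 * d + 1) % 256,
    lambda d: ((d << 1) | (d >> 7)) & 0xFF,  # rotate left by one bit
]


def plain_chain(d):
    """Evaluate the non-obfuscated sequence for comparison."""
    for func in functions:
        d = func(d)
    return d


tables, encode_first, decode_last = obfuscate_chain(functions, random.Random(99))
```

For any byte d, decode_last[run_chain(tables, encode_first[d])] equals plain_chain(d), because the output encoding of each table is cancelled by the input decoding folded into the next.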