A first entity X may desire to send an amount of data D1 to a second entity Y. The communication channel between the first entity X and the second entity Y may be an insecure or untrusted channel, insofar as data communicated across this channel may be inadvertently modified (due to noise on the channel) and/or may be deliberately modified (e.g. a malicious attacker may modify data communicated across this channel and/or a malicious attacker may include or inject new data into the communication channel, potentially whilst “pretending” to be the first entity X). The second entity Y may therefore receive data D2, where (a) the received data D2 may be the same as the initial data D1 (if no modification or corruption of the initial amount of data D1 has occurred), (b) the received data D2 may be a modified version of the initial data D1 sent by the first entity X to the second entity Y (e.g. if there has been noise added by the communication channel and/or modifications by an attacker) or (c) the received data D2 may be new data not originating from, or not based on data sent by, the first entity X (e.g. if an attacker is trying to introduce new/malicious data whilst pretending to be the entity X). The second entity Y may wish to only process (or provide functionality based on) the received data D2 if the second entity Y has confidence that the received data D2 originated from the first entity X and/or only process (or provide functionality based on) the received data D2 if the second entity Y has confidence in the integrity of the received data D2 (i.e. process data that has not been modified, or, put another way, only provide functionality if the received data D2 is the same as the initial data D1 that the first entity X sent to the second entity Y).
It is well-known to use a message authentication code (MAC) to address this situation. FIG. 1 of the accompanying drawings is a flowchart illustrating the use of a MAC.
At a step 100, the first entity X generates a MAC for the initial data D1. In particular, the first entity X generates a MAC M1, which is an amount of data or a value (e.g. a checksum) based on the data D1, using a MAC function F, i.e. M1=F(D1). In general, the MAC function F is a keyed (or cryptographic) hash function or a so-called keyed (or cryptographic) one-way function. In other words, the function F may use a secret key K shared by both the first entity X and the second entity Y so that only the first entity X and the second entity Y know the configuration/settings for the MAC function F that is to be performed. Additionally, the MAC function F is a function such that, given the MAC value M1 (and possibly even the key K), it is computationally infeasible to create a further amount of data D* such that F(D*)=M1. An example of a hash function on which such a MAC function (or algorithm or process) may be based, for instance via a keyed construction such as HMAC, is SHA-1, details of which can be found at http://en.wikipedia.org/wiki/SHA-1 (the entire contents of which are incorporated herein by reference).
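By way of illustration only, the step 100 might be sketched as follows. This sketch assumes that the MAC function F is instantiated as HMAC over SHA-256 using Python's standard hmac and hashlib modules; the function name generate_mac, the key value and the data value are hypothetical and are chosen purely for the example.

```python
import hashlib
import hmac


def generate_mac(key: bytes, data: bytes) -> bytes:
    """Compute M = F(D), with F assumed here to be HMAC-SHA-256 keyed by K."""
    return hmac.new(key, data, hashlib.sha256).digest()


# Hypothetical values: the secret key K shared by X and Y, and the initial data D1.
K = b"illustrative shared secret key"
D1 = b"initial data sent by entity X"

# Step 100: the first entity X generates the MAC M1 for the initial data D1.
M1 = generate_mac(K, D1)
```

Because F is keyed, an entity that does not hold the key K cannot, in this sketch, compute the same value M1 for the data D1.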
At a step 102, the first entity X sends both the amount of data D1 and the MAC M1 to the second entity Y.
At a step 104, the second entity Y receives an amount of data D2 and a value M2. The second entity Y will use, or treat, the value M2 as a MAC value which is meant to correspond to the received data D2. If the data sent over the communication channel has not been corrupted or modified, then the amount of data D2 is the amount of data D1 and the MAC value M2 is the MAC value M1. However, if there has been corruption of the data sent over the communication channel it is possible that the amount of data D2 is different from the amount of data D1 and/or the MAC value M2 is different from the MAC value M1. Indeed, if an attacker has introduced completely new data into the communication channel and sent that new data to the second entity Y, then the amount of data D2 may be completely unrelated to the amount of data D1 and the MAC value M2 may be completely unrelated to the MAC value M1. However, the second entity Y can distinguish between valid (i.e. uncorrupted or authentic) data and invalid (i.e. corrupted or inauthentic) data, as set out below.
At a step 106, the second entity Y may generate a MAC M3 based on the received amount of data D2, i.e. M3=F(D2). The second entity Y uses the same MAC function F, configured in the same way as for the first entity X (e.g. using the same key K), as was used by the first entity X at the step 100 when the first entity X generated the MAC M1 based on the initial data D1.
At a step 108, the second entity Y performs a comparison operation to determine whether the received MAC M2 is the same as the generated MAC M3 (i.e. whether M3=M2).
If the received MAC M2 is the same as the generated MAC M3 (i.e. if M3=M2), then at a step 110, the second entity Y can assume that (a) the received data (D2 and M2) is the same as the initial data (D1 and M1) sent by the first entity X and (b) the received data (D2 and M2) originated from the first entity X. This is because only the first and second entities share the secret K and because it is computationally infeasible for an attacker to create a further amount of data D* such that F(D*)=M1. Therefore, at the step 110, the second entity Y may perform data processing on the basis that the received data (D2 and M2) is authentic (i.e. on the basis that the integrity and origin of the received data (D2 and M2) have been successfully verified).
If, on the other hand, the received MAC M2 is not the same as the generated MAC M3 (i.e. if M3≠M2), then at a step 112, the second entity Y can assume that (a) the received data (D2 and M2) is not the same as the initial data (D1 and M1) sent by the first entity X and/or (b) the received data (D2 and M2) did not originate from the first entity X. Therefore, at the step 112, the second entity Y may perform data processing on the basis that the received data (D2 and M2) is not authentic (i.e. on the basis that the integrity and/or origin of the received data (D2 and M2) have not been successfully verified).
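By way of illustration only, the steps 104 to 112 might be sketched as follows. As an assumption made for this example, the MAC function F is again instantiated as HMAC over SHA-256; the names mac_f and verify, the key and all data values are hypothetical. The comparison at the step 108 is performed here with hmac.compare_digest, which is one possible way of testing whether M3=M2.

```python
import hashlib
import hmac


def mac_f(key: bytes, data: bytes) -> bytes:
    """The MAC function F, assumed here to be HMAC-SHA-256 keyed by K."""
    return hmac.new(key, data, hashlib.sha256).digest()


def verify(key: bytes, d2: bytes, m2: bytes) -> bool:
    """Steps 106 and 108: generate M3 = F(D2) and test whether M3 = M2."""
    m3 = mac_f(key, d2)
    return hmac.compare_digest(m3, m2)


K = b"illustrative shared secret key"   # hypothetical key shared by X and Y
D1 = b"initial data sent by entity X"   # hypothetical initial data
M1 = mac_f(K, D1)                       # step 100, performed by X

# Case (a): D2 and M2 arrive unmodified, so processing proceeds at step 110.
unmodified_ok = verify(K, D1, M1)

# Case (b): the data is modified in transit, so processing proceeds at step 112.
tampered = b"initial data sent by entity Z"
tampered_ok = verify(K, tampered, M1)

# Case (c): an attacker without K injects new data with a MAC of its own making.
forged_data = b"new data injected by an attacker"
forged_mac = mac_f(b"attacker's guess at the key", forged_data)
forged_ok = verify(K, forged_data, forged_mac)
```

In this sketch only case (a) passes the comparison, so the second entity Y would treat the received data as authentic in case (a) alone.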
More information on MACs and how they can be used can be found at http://en.wikipedia.org/wiki/Message_authentication_code (the entire contents of which are incorporated herein by reference).
A “white-box” environment is an execution environment for an item of software in which an attacker of the item of software is assumed to have full access to, and visibility of, the data being operated on (including intermediate values), memory contents and execution/process flow of the item of software. Moreover, in the white-box environment, the attacker is assumed to be able to modify the data being operated on, the memory contents and the execution/process flow of the item of software, for example by using a debugger. In this way, the attacker can experiment on, and try to manipulate the operation of, the item of software, with the aim of circumventing initially intended functionality and/or identifying secret information and/or for other purposes. Indeed, one may even assume that the attacker is aware of the underlying algorithm being performed by the item of software. However, the item of software may need to use secret information (e.g. one or more cryptographic keys), where this information needs to remain hidden from the attacker. Similarly, it would be desirable to prevent the attacker from modifying the execution/control flow of the item of software, for example preventing the attacker from forcing the item of software to take one execution path after a decision block instead of a legitimate execution path.
There are numerous techniques, referred to herein as “white-box obfuscation techniques”, for transforming the item of software 12 so that it is resistant to white-box attacks. Examples of such white-box obfuscation techniques can be found in “White-Box Cryptography and an AES Implementation”, S. Chow et al, Selected Areas in Cryptography, 9th Annual International Workshop, SAC 2002, Lecture Notes in Computer Science 2595 (2003), p 250-270 and “A White-box DES Implementation for DRM Applications”, S. Chow et al, Digital Rights Management, ACM CCS-9 Workshop, DRM 2002, Lecture Notes in Computer Science 2696 (2003), p 1-15, the entire disclosures of which are incorporated herein by reference. Additional examples can be found in U.S. 61/055,694 and WO2009/140774, the entire disclosures of which are incorporated herein by reference. Some white-box obfuscation techniques implement data flow obfuscation; see, for example, U.S. Pat. Nos. 7,350,085, 7,397,916, 6,594,761 and 6,842,862, the entire disclosures of which are incorporated herein by reference. Some white-box obfuscation techniques implement control flow obfuscation; see, for example, U.S. Pat. Nos. 6,779,114, 6,594,761 and 6,842,862, the entire disclosures of which are incorporated herein by reference. However, it will be appreciated that other white-box obfuscation techniques exist.