Many encryption algorithms (processes) are primarily concerned with producing encrypted data that is resistant to decrypting by an attacker who can interact with the encryption algorithm only as a “Black Box” (input-output) model, and cannot observe internal workings of the algorithm or memory contents, etc. due to lack of system access. The Black Box model is appropriate for applications where trusted parties control the computing systems for both encoding and decoding ciphered materials.
However, many applications of encryption do assume that an attacker can access internal workings of the algorithm. For example, encrypted digital media often needs to be decrypted on computing systems that are completely controlled by an adversary (attacker). There are many degrees to which the Black Box model can be relaxed. An extreme relaxation is called the “White Box” model. In a White Box model, it is presumed that an attacker has total access to the system performing an encryption, including being able to observe directly a state of memory, program execution, modifying an execution, etc. In such a model, an encryption key can be observed in or extracted from memory, and so ways to conceal operations indicative of a secret key are important. Thereby a White Box cryptographic process is secure even in an untrusted computing environment.
Classically, software implementations of cryptographic building blocks are insecure in the White Box threat model where the attacker controls the execution process. The attacker can easily lift the secret key from memory by just observing the operations acting on the secret key. For example, the attacker can learn the secret key of an AES software implementation by observing the execution of the key schedule algorithm.
Hence there are two basic principles in the implementation of secure computer applications (software). The Black Box model implicitly supposes that the user does not have access to the computer code nor any cryptographic keys themselves. The computer code security is based on the tampering resistance over which the application is running, as this is typically the case with SmartCards. For the White Box model, it is assumed the (potentially hostile) user has partially or fully access to the implemented code algorithms; including the cryptographic keys themselves. It is assumed the user can also become an attacker and can try to modify or duplicate the code since he has full access to it in a binary (object code) form. The White Box implementations are widely used (in particular) in content protection applications to protect e.g. audio and video content.
Software implementations of cryptographic building blocks are insecure in the White Box threat model where the attacker controls the computer execution process. The attacker can easily extract the (secret) key from the memory by just observing the operations acting on the secret key. For instance, the attacker can learn the secret key of an AES cipher software implementation by passively monitoring the execution of the key schedule algorithm. Also, the attacker could be able to retrieve partial cryptographic result and use it in another context (using in a standalone code, or injecting it in another program, as an example).
Content protection applications such as for audio and video data are one instance where it is desired to keep the attacker from finding the secret key even though the attacker has complete control of the execution process. The publication “White-Box Cryptography in an AES implementation” Lecture Notes in Computer Science Vol. 2595, Revised Papers from the 9th Annual International Workshop on Selected Areas in Cryptography pp. 250-270 (2002) by Chow et al. discloses implementations of the well known AES cipher that hide the operations performed during AES encryption/decryption by using table lookups (also referred to as TLUs) to hide the secret key within the table lookups, and hide intermediate state information that would otherwise be available in arithmetic implementations of AES. In the computer field, a table lookup is an operation consisting of looking up a value in a table (also called an array) at a given index position in the table.
Chow et al. (for his White Box implementation where the key is known at the computer code compilation time) uses 160 separate tables to implement the 11 AddRoundKey operations and 10 SubByte Operations (10 rounds, with 16 tables per round, where each table is for 1 byte of the 16 byte long—128 bit—AES block). These 160 tables embed a particular AES key which is known at the time of compilation of the associated software (source code), such that output from lookups involving these tables embeds data that would normally result from the AddRoundKey and SubByte operations of the AES algorithm, except that this data includes input/output permutations that make it more difficult to determine what parts of these tables represent round key information derived from the AES key. Chow et al. provide a construction of the AES algorithm for such White Box model. The security of this construction resides in the use of table lookups and masked data. The input and output masks applied to this data are never removed during the process. In this solution, there is a need to know the key value at the code compilation time, or at least to be able to derive the tables from the original key in a secure environment.
The conventional implementation of a block cipher in the White Box model is carried out by creating a set of table lookups. Given a dedicated cipher key which is known at the code compilation time, the goal is to store in a table the results for all the possible input messages. This principle is applied for each basic operation of the block cipher. In the case of the AES cipher, these are the shiftRow, the add RoundKey, the subByte and the mixColumns operations.
Hash functions are also well known in the field of data security and cryptography. The principle is to take data (a digital message, digital signature, etc.) and use it as an entry to a hash function resulting in an output called a “digest” of predetermined length which is intended to uniquely identify (“fingerprint”) the message. A secure (cryptographic) hash is such that any alteration in the message results in a different digest, even though the digest is much shorter than the message. Such hash functions are “collision-resistant” and “one-way” examples of a compression function.
A hash function thus is a deterministic procedure that accepts an arbitrary length input value, and returns a hash value of fixed or defined size. The input value is called the message, and the resulting output hash value is called the digest. The message integrity check is done by comparing the computed digest to an expected digest associated with the message.
Cryptography and data security deal with digital signatures, encryption, document authentication, and hashing. In all of these fields, there is a set of basic tools/functions which are widely used: hash functions. Several properties are required for the use of hash functions in cryptographic applications: preimage resistance, second preimage resistance and collision resistance.
In the recent years, much energy has been expended finding new hash functions, since collisions (weaknesses or successful attacks) have been found in the widely used SHA-0/1 and MD5 standard hash functions. After this security crisis involving MD5 and SHA-0/1, two hash function standards used for a long time without concern for their security, the U.S. NIST (National Institute of Standard and Technology) launched an international competition to define the new standard for hash functions. The competition started in 2008. Amongst the competitors, many were broken easily, since the submitters were not really aware of the cryptographic issues. Such hash functions are used in computing HMAC (Hash-based Message Authentication Code) which uses a cryptographic hash function in combination with a secret key to verify the authenticity and integrity of a message.