The nature of computer software renders it susceptible to analysis and copying by third parties. There have been considerable efforts to enhance software security; see for instance U.S. Pat. No. 6,668,325 assigned to Intertrust Technologies Inc. Several of these efforts provide technical protection for software. A well-known protection approach is called obfuscation, which typically relies on a rearrangement of the source code. Computer code (software or programs) comes in two chief types. The first is source code, which is the code as written by a human being (programmer) in a particular computer language. The source code itself is often then obfuscated. The other chief type is called object code, compiled code, binary code, or machine code. This is the source code after having been processed by a special type of computer software program called a compiler; a compiler is routinely provided for each computer language. The compiler takes as input the alphanumeric character strings of the source code as written by the programmer, and processes them into a string of binary ones and zeros, which can then be operated on by a computer processor.
It is also known to obfuscate the compiled (object) code. The term “code morphing” is also applied to obfuscating object code. This is typically achieved by completely replacing a section of the object code with an entirely new block of object code that expects the same machine (computer or processor) state when it begins execution as the original code section and leaves the same machine state after execution as does the original code (thereby being semantically equivalent code). However, the morphed code typically completes a number of additional operations compared to those of the original code, as well as some operations with an equivalent effect. Code morphing makes disassembly or decompiling of such a program much more difficult. Decompiling is the act of taking the machine code and transforming it back into source code, and is done by reverse engineers or “hackers” who wish to penetrate the object code, using a special decompiler program. A drawback of code morphing is that, by unnecessarily complicating operations and hindering compiler-made optimizations, it increases the execution time of the obfuscated object code. Thus code morphing is typically limited to critical portions of a program and is often not used on the entire computer program application. Code morphing is also well known for obfuscating copy protection or other checks that a program makes, for security purposes, to determine whether it is a valid, authentic installation or a pirated copy.
Therefore, typically the goal of obfuscation is to start with the original code and arrive at a second form of the code, which is semantically or logically equivalent from an input/output point of view. As pointed out above, this means that for any input to the code in the field of possible inputs, the output value of the code is the same for both the original code and the obfuscated code. Thus a requirement of successful obfuscation is to produce a semantically equivalent (but also protected) code to the original (unprotected) code.
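As a minimal sketch of semantic equivalence from an input/output point of view, the following Python fragment pairs an ordinary addition with a rewritten form based on the identity a + b = (a XOR b) + 2·(a AND b). The function names and the sampling helper are hypothetical and chosen only for illustration; random sampling gives evidence of equivalence over the tested inputs, not a proof.

```python
import random


def add_original(a: int, b: int) -> int:
    """Unprotected form: a plain addition."""
    return a + b


def add_obfuscated(a: int, b: int) -> int:
    """Rewritten form with the same input/output behavior.

    XOR yields the sum without carries; AND shifted left by one
    yields the carries. Their sum equals a + b.
    """
    partial = a ^ b
    carry = (a & b) << 1
    return partial + carry


def equivalent_on_samples(f, g, trials=1000, bits=32, seed=0) -> bool:
    """Compare f and g on random inputs (hypothetical helper, not a proof)."""
    rng = random.Random(seed)
    for _ in range(trials):
        a = rng.getrandbits(bits)
        b = rng.getrandbits(bits)
        if f(a, b) != g(a, b):
            return False
    return True


if __name__ == "__main__":
    print(equivalent_on_samples(add_original, add_obfuscated))
```

The rewritten form performs more operations than the original, which mirrors the execution-time cost of obfuscation noted earlier.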
As is well known, computer programs called obfuscators or tools perform the obfuscating; they transform a particular software application (program) in source or object code form into one that is functionally identical to the original, but is much more difficult for a hacker to penetrate, that is, to decompile. Note that the level of security from obfuscation depends on the sophistication of the transformations employed by the obfuscator, the power of the available deobfuscation algorithms as used by the hacker, and the amount of resources available to the hacker. The goal in obfuscating is to provide many orders of magnitude of difference between the cost (difficulty) of obfuscating vs. deobfuscating.
Hence it is conventional that the obfuscation process is performed at one location or in one secure computer (machine) after the source code has been written. The obfuscated source code is compiled and then transferred to a second (insecure) computing device, where it is executed after installation in associated memory at the second computing device. (Note that the normal execution does not include any decompiling since there is no need on a machine-level basis to restore the source code. Decompiling is strictly done for reverse engineering purposes.) At the second (recipient) computing device, the obfuscated code is installed and then can be routinely executed by the processor at the second computing device. The obfuscated code is executed as is. Generally it is slower to execute than the original code.
Implementations of security related computer code running on “open platform” (insecure) systems are often subject to attack in order to recover cryptographic materials (keys, etc.), cryptographic algorithms, etc. The attacks are also referred to here by the term “reverse-engineering”, which is the way to recover code internals from a software binary (object code). Open platform means that internal operations of the computing system are observable by an attacker. This also means that under some circumstances, the attacker can break into the computer programs, modify values, modify instructions, or inject code.
Several solutions are known to protect computer software code against reverse-engineering. They are implemented to make it harder for attackers to understand the process, or to hide cryptographic data or operations.
In obfuscation, the code is typically re-written by a person referred to as a software developer (programmer) who reviews the source code and makes the necessary changes, or by using a software “tool” which does the same tasks as the developer, in a very complex way. Then an attacker must do substantial additional work to recover something (humanly) understandable from the object code. This obfuscation includes, for instance, re-writing loops, splitting basic blocks of instructions (adding a jump in the code, using predicates), flattening the control flow (not executing linear blocks of code), etc.
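Control-flow flattening, one of the transformations listed above, can be illustrated with a small Python sketch. The function names and state numbering are hypothetical; real obfuscators operate on source or object code in far more elaborate ways. The loop of the first function is broken into blocks that are selected by a dispatcher variable, so the blocks no longer appear in their natural linear order.

```python
def sum_of_squares(n: int) -> int:
    """Original, readable form: a simple counted loop."""
    total = 0
    i = 0
    while i < n:
        total += i * i
        i += 1
    return total


def sum_of_squares_flattened(n: int) -> int:
    """Flattened form: every basic block is a case of one dispatcher loop."""
    state = 0          # dispatcher variable selecting the next block
    total = 0
    i = 0
    while True:
        if state == 3:     # exit block
            return total
        elif state == 1:   # loop body
            total += i * i
            i += 1
            state = 2
        elif state == 0:   # initialization block
            total, i = 0, 0
            state = 2
        elif state == 2:   # loop test
            state = 1 if i < n else 3


if __name__ == "__main__":
    print(sum_of_squares(10), sum_of_squares_flattened(10))
```

Both functions return the same value for every input, but the second hides the loop structure behind the state transitions.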
In the field of digital content protection, hiding data is necessary since it helps keep some values, and what the program is doing, unknown to an attacker. The goal of obfuscation is to create computer code that is as hard as possible for an attacker to understand. For instance, assume that one wants to hide in computer memory data designated D that is used in a Boolean exclusive OR operation (the XOR) with another value designated X. The problem is how to compute X XOR D without revealing the value of D.
One known way to hide D, while computing X XOR D is:
    Store D′=D XOR M1 in memory for a mask value M1,
    Compute X′=X XOR M2 for a second mask value M2,
    Compute Y′=X′ XOR D′ (this is equal to X XOR D XOR M1 XOR M2),
    Compute Y=Y′ XOR (M1 XOR M2) (which by definition is equal to X XOR D).
It is assumed here that data (variables) X, D are expressed numerically, in binary (1's and 0's) form.
The variables M1 and M2 are used to mask (hide) the values of D and X in the memory; an attacker who retrieves D′ in memory has to find M1 to retrieve D. Furthermore, the value D′ may have been computed on a safe (secure) server not accessible to the attacker, such that it looks complicated to recover D and M1.
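The four masking steps described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the 32-bit width, the function names, and the sample values are hypothetical, and the masked value D′ is computed in the same script here, whereas the text notes it may instead be prepared on a secure server.

```python
import secrets

WIDTH = 32  # assumed word size in bits (not specified in the text)


def mask_data(d: int, m1: int) -> int:
    """Step 1: D' = D XOR M1, the value actually stored in memory."""
    return d ^ m1


def masked_xor(x: int, d_masked: int, m2: int, m1_xor_m2: int) -> int:
    """Compute X XOR D without D appearing in the clear."""
    x_masked = x ^ m2               # step 2: X' = X XOR M2
    y_masked = x_masked ^ d_masked  # step 3: Y' = X XOR D XOR M1 XOR M2
    return y_masked ^ m1_xor_m2     # step 4: Y = Y' XOR (M1 XOR M2)


if __name__ == "__main__":
    m1 = secrets.randbits(WIDTH)    # first mask value
    m2 = secrets.randbits(WIDTH)    # second mask value
    d = 0x0F0F1234                  # sample secret data (hypothetical)
    x = 0xDEADBEEF                  # sample input (hypothetical)

    d_masked = mask_data(d, m1)
    y = masked_xor(x, d_masked, m2, m1 ^ m2)
    assert y == x ^ d               # the masks cancel out
```

Note that the code performing the computation needs only D′, M2, and the combined constant M1 XOR M2; neither D nor M1 is used directly.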
However, this method has drawbacks:
    1. It is easy to retrieve the mask variables from several pairs (D, D′) of masked/unmasked data. Indeed, if an attacker is able to obtain a single data pair (D, D′), he can retrieve mask M1 by computing:
        M1=D XOR D′
       Unmasking any other masked data (having the same mask value) is then easy.
    2. The mask value M1 can be computed from data in the code, since M2 and (M1 XOR M2) must appear in the code in order to mask X and to unmask Y′.
Note that in practice, the attack is not so easy: the implementation is done in a complex way, with split data and fake operations, and it is hard for the attacker to pick out the useful elements in the middle of many operations.
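The two drawbacks of the masking method can be made concrete with a short Python sketch. The function names and numeric values are hypothetical, and the sketch ignores the split data and fake operations that complicate a real attack.

```python
def recover_mask_from_pair(d: int, d_masked: int) -> int:
    """Drawback 1: one known pair (D, D') reveals M1 = D XOR D'."""
    return d ^ d_masked


def recover_mask_from_constants(m2: int, m1_xor_m2: int) -> int:
    """Drawback 2: the constants M2 and (M1 XOR M2) in the code reveal M1."""
    return m2 ^ m1_xor_m2


if __name__ == "__main__":
    m1, m2 = 0x5A5A5A5A, 0x3C3C3C3C        # sample masks (hypothetical)
    d_known, d_other = 0x11112222, 0x33334444

    # Attack 1: learn M1 from a known pair, then unmask other data.
    m1_found = recover_mask_from_pair(d_known, d_known ^ m1)
    assert m1_found == m1
    assert (d_other ^ m1) ^ m1_found == d_other

    # Attack 2: learn M1 from the two constants embedded in the code.
    assert recover_mask_from_constants(m2, m1 ^ m2) == m1
```

In both cases a single XOR suffices once the attacker has located the relevant values.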