It is generally desirable for the manufacturer and/or distributor of software to control the distribution of such software, in particular to be able to protect software against theft, establish/prove ownership of the software, validate software and/or identify/trace copies of distributed software. Hence, efficient techniques for watermarking of computer software, in particular of source code or object code, are desirable. The purpose of such watermarking techniques is to add information (a watermark, or simply a mark) to the software, e.g. by manipulating/altering or adding program code. The information may be used as a copyright notice, for identification purposes, e.g. to identify the buyer of the software, or the like. It is generally desirable that the information is embedded in such a way that it cannot be removed by the buyer but can be extracted from the software using knowledge about the process that put the mark into the software. In particular, a watermark is said to be stealthy if it is not easily detectable (e.g. by statistical analysis). A watermark is said to be resilient if it is able to survive semantic-preserving transformations such as code obfuscation or code optimization, and/or able to survive collusion attacks.
In general, a watermark may be subject to different attacks in order to render the mark unrecognisable. Examples of kinds of attacks include:
Additive attacks: New watermarks are added to the code so that the original mark can no longer be extracted, or so that it is impossible to determine which is the original mark.
Distortive attacks: The code is subjected to semantic-preserving transformations such as code obfuscation and code optimization in the hope that the watermark will be distorted and can no longer be recognized.
Subtractive attacks: The location of the watermark is determined and the mark is cropped out of the program.
Collusion attacks: Different marked programs are compared in order to determine the location of the mark.
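As an illustration of the collusion attack listed above, the following minimal Python sketch compares two differently marked copies of the same toy program. The program text, the `fingerprint` helper, the buyer numbers, and the choice of an unused assignment as the mark are all hypothetical and chosen only for this example; they are not taken from any particular watermarking scheme.

```python
import difflib

# Hypothetical toy program; the {mark} slot is where a per-buyer mark is placed.
BASE = [
    "def price(qty):",
    "    rate = 3",
    "    {mark}",
    "    return qty * rate",
]

def fingerprint(buyer_id):
    """Return a copy of the program carrying a naive per-buyer mark."""
    mark = f"_unused = {buyer_id}"  # assumed embedding: a dead assignment
    return [line.replace("{mark}", mark) for line in BASE]

def locate_mark(copy_a, copy_b):
    """Return the line indices of copy_a that differ from copy_b."""
    matcher = difflib.SequenceMatcher(a=copy_a, b=copy_b, autojunk=False)
    suspicious = []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag != "equal":
            suspicious.extend(range(i1, i2))
    return suspicious

copy_a = fingerprint(1001)
copy_b = fingerprint(2002)
print(locate_mark(copy_a, copy_b))
```

Because the two copies are identical except for the mark, the differing lines point directly at the embedding location, which can then be cropped out as in a subtractive attack.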
Thus, it is a general problem to provide watermarking techniques that yield markings that are robust under such attacks, e.g. by the buyer of the software.
When the embedded watermark is detectable, it can be removed (cropped out) from the program or be replaced by an equivalent expression, which very likely destroys the original mark. In existing solutions, embeddings are often relatively easy to identify and thus can be cropped out.
The article “Watermarking, Tamper-Proofing, and Obfuscation - Tools for Software Protection”, by Christian Collberg et al., IEEE Trans. on Softw. Eng., Vol. 28, No. 8, pp. 735-746, describes watermarking of program code.
Obfuscation is a technique used to complicate code, i.e. to transform the program code into code that has the same observable behaviour but is more difficult to understand. The technique is used in order to make software harder to reverse engineer. It typically involves renaming, reordering, splitting/merging, loop transformations, etc. Hence, obfuscation makes code harder to understand when it is de-compiled, but it typically has no effect on the functionality of the code. U.S. Pat. No. 6,668,325 discloses a number of code obfuscation techniques that may be used in a watermarking context.
However, even though the above prior art methods provide a watermarking of computer program code, it remains a problem to provide a watermarking technique that results in watermarks that are more difficult to detect when studying the marked software.
In particular, the embedding of watermarks by simple obfuscating changes in the program code, e.g. by renaming of variables, reordering of instructions, loop transformations, etc., involves the problem that such watermarks are not sufficiently resilient, since obfuscation techniques typically change exactly these properties, thereby rendering the watermark vulnerable to an obfuscating attack.
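The lack of resilience described above can be sketched with a small Python example. It is a toy illustration only: the `wm_` naming convention, the helper functions, and the sample program are assumptions made for this sketch and do not represent any of the cited prior-art methods. The mark is stored in a variable name, and a renaming obfuscator then removes it without changing the computed values.

```python
import ast

TAG = "wm_"  # hypothetical prefix that carries the mark in a variable name

ORIGINAL = """
total = 0
for i in range(5):
    total = total + i * i
"""

class _Rename(ast.NodeTransformer):
    """Rename identifiers according to a mapping."""
    def __init__(self, mapping):
        self.mapping = mapping

    def visit_Name(self, node):
        node.id = self.mapping.get(node.id, node.id)
        return node

def rename(source, mapping):
    return ast.unparse(_Rename(mapping).visit(ast.parse(source)))

def stored_names(source):
    """Assigned variable names, in order of first appearance in the source."""
    nodes = [n for n in ast.walk(ast.parse(source))
             if isinstance(n, ast.Name) and isinstance(n.ctx, ast.Store)]
    nodes.sort(key=lambda n: (n.lineno, n.col_offset))
    ordered = []
    for n in nodes:
        if n.id not in ordered:
            ordered.append(n.id)
    return ordered

def embed_mark(source, mark):
    """Naive embedding: rename the first assigned variable to TAG + mark."""
    return rename(source, {stored_names(source)[0]: TAG + mark})

def extract_mark(source):
    """Return the mark if a TAG-prefixed identifier is present, else None."""
    for n in ast.walk(ast.parse(source)):
        if isinstance(n, ast.Name) and n.id.startswith(TAG):
            return n.id[len(TAG):]
    return None

def obfuscate(source):
    """Distortive attack: rename every assigned variable to a0, a1, ..."""
    mapping = {name: f"a{i}" for i, name in enumerate(stored_names(source))}
    return rename(source, mapping)

def run(source):
    """Execute the program and return its final variable values, sorted."""
    namespace = {}
    exec(source, namespace)
    return sorted(v for k, v in namespace.items() if k != "__builtins__")

marked = embed_mark(ORIGINAL, "c0ffee")
attacked = obfuscate(marked)
print(extract_mark(marked))    # mark is recoverable from the marked copy
print(extract_mark(attacked))  # mark is gone after renaming
print(run(marked) == run(attacked))  # computed values are unchanged
```

The renaming step is semantic-preserving, so the attacked program computes the same values as the marked one, yet the extractor no longer finds anything to recover.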
Furthermore, it remains a problem to provide a watermarking technique that allows a robust way of identifying the origin of a specific copy of the software.