The need to protect information contained in software programs and hardware designs, or to provide tamper protection, is not new. Many mechanisms have been applied to achieve such objectives.
The article entitled "Operating System Protection Through Program Evolution" by F. B. Cohen in Computers & Security, Vol. 12, (1993) pp. 565-584 proposes such a mechanism. It describes attacks on, and defenses of, a computer operating system as follows:
"One of the major factors in the successful application of information protection techniques is the exploitation of computational advantage. Computational advantage shows up historically in cryptography, where Shannon's theory clearly demonstrates the effect of 'workload' on the complexity of cryptanalysis, and introduces the concept of diffusion and confusion as they relate to statistical attacks on cryptosystems. Most modern cryptosystems exploit this as their primary defenses. The same basic principle applies in computer virus analysis in which evolutionary viruses drive the complexity of detection and eradication up dramatically and in password protection in which we try to drive the number of guesses required for a successful attack up by limiting the use of obvious passwords. One of the major reasons attacks succeed is because of the static nature of defense, and the dynamic nature of attack." (page 565)
"The ultimate attack against any system begins with physical access, and proceeds to disassembly and reverse engineering of whatever programmed defenses are in place. Even with a cryptographic key provided by the user, an attacker can modify the mechanism to examine and exploit the key, given ample physical access. Eventually, the attacker can remove the defenses by finding decision points and altering them to yield altered decisions." (pages 565-66)
"Without physical protection, nobody has ever found a defense against this attack, and it is unlikely that anyone ever will. The reason is that any protection scheme other than a physical one depends on the operation of a finite state machine, and ultimately, any finite state machine can be examined and modified at will, given enough time and effort. The best we can ever do is delay attack by increasing the complexity of making desired alterations." (page 566)
"The ultimate defense is to drive the complexity of the ultimate attack up so high that the cost of attack is too high to be worth performing. This is, in effect, security through obscurity, and it is our general conclusion that all technical information protection in computer systems relies at some level either on physical protection, security through obscurity, or combinations thereof.
"The goal of security through obscurity is to make the difficulty of attack so great that in practice it is not worth performing even though it could eventually be successful. Successful attacks against obscurity defenses depend on the ability to guess some key piece of information. The most obvious example is attacking and defending passwords, and since this problem demonstrates precisely the issues at hand, we will use it as an example. In password protection, there are generally three aspects to making attack difficult. One aspect is making the size of the password space large, so that the potential number of guesses required for an attack is enormous. The second aspect is spreading the probability density out so that there is relatively little advantage to searching the space selectively. This is basically the same as Shannon's concept of diffusion. The third aspect is obscuring the stored password information so that the attacker cannot simply read it in stored form. This is basically the same as Shannon's concept of confusion." (page 566)
The article proposes an evolutionary defense as follows:
"A more practical solution to this problem might be the use of evolutionary defenses. To make such a defensive strategy cost effective for numerous variations (e.g. one per computer worldwide), we probably have to provide some sort of automation. If the automation is to be effective, it must produce a large search space and provide a substantial degree of confusion, and diffusion. This then is the goal of evolutionary defenses.
"Evolution can be provided in many ways and at many different places, ranging from a small finite number of defenses provided by different vendors, and extending toward a defensive system that evolves itself during each system call. With more evolution, we get less performance, but higher cost of attack. Thus, as in all protection functions, there is a price to pay for increased protection. Assuming we can find reasonably efficient mechanisms for effective evolution, we may be able to create a great deal of diversity at practically no cost to the end-user, while making the cost of large scale attack very high. As a very pleasant side effect, the ultimate attack may become necessary for each system under attack. In other words, except for endemic flaws, attackers may again be reduced to a case-by-case expert attack and defense scenario involving physical access." (page 567)
A large number of patents exist which describe various ways of protecting software and/or hardware and information contained therein. The following are only a few examples of patents in the related field.
According to U.S. Pat. No. 4,525,599 issued on Jun. 25, 1985 to Curran et al. and entitled "Software Protection Methods and Apparatus", in order to prevent copying of ROM-resident software, a protection circuit includes encryption/decryption means coupled between the microprocessor and the ROM memory.
According to U.S. Pat. No. 4,634,807 issued on Jan. 6, 1987 to Chorley et al. and entitled "Software Protection Device", in order to prevent the unauthorized copying of software, a software module is encrypted using DES and the key is itself encrypted using the public key of a public/private key algorithm.
In U.S. Pat. No. 4,740,890 issued on Apr. 26, 1988 to William and entitled "Software Protection System With Trial Period Usage Code and Unlimited Use Unlocking Code Both Recorded on Program Storage Media", the disk becomes inoperable once the trial period has expired: the system prevents further use of the program until a proper unlocking code is entered.
U.S. Pat. No. 4,866,769 issued on Sep. 12, 1989 to Karp and entitled "Hardware Assist for Protecting PC Software" describes a copy protection technique for PC software. In this technique, a unique ID is stored in the ROM of the personal computer on which software from a diskette is to be used. This ID is accessible to the user of the computer. A vendor who wishes to protect diskette-distributed software from illegal copying or use provides a source ID on the diskette.
According to U.S. Pat. No. 4,903,296 issued on Feb. 20, 1990 to Chandra et al. and entitled "Implementing a Shared Higher Level of Privilege on Personal Computers for Copy Protection of Software", the original medium is functionally unreproducible until it is modified by the execution of a program stored in a tamperproof co-processor which forms a part of the computing machine.
The license management system of U.S. Pat. No. 4,937,863, issued on Jun. 26, 1990 to Robert et al. and entitled "Software Licensing Management System", maintains a license unit value for each licensed program and a pointer to a table identifying an allocation unit value associated with each use of the licensed program. In response to a request to use a licensed program, the license management system responds with an indication as to whether the license unit value exceeds the allocation unit value associated with the use.
U.S. Pat. No. 5,047,928 issued on Sep. 10, 1991 to Wiedemer and entitled "Billing System for Computer Software" teaches a billing system in which the application program is enciphered in accordance with an algorithm driven by a numeric key. The user's computer is provided with a hardware security module and a removable billing module, both of which carry unique codes.
The system of U.S. Pat. No. 5,123,045, issued on Jun. 16, 1992 to Ostrovsky and entitled "Comprehensive Software Protection System", protects the pattern of memory accesses made during execution of a program and also protects the data stored in memory. The patent describes a data processing system which includes a plurality of "buffer" data structures for storing encrypted software and data in unprotected memory. The software and data are stored in accordance with a pseudo-random mapping such that the pattern of access during execution of the program reveals no information to adversarial observers. The scheme is secure assuming the existence of a physically shielded chip containing a constant number of registers and the existence of any one-way function.
In U.S. Pat. No. 5,212,728 issued on May 18, 1993 to Glover et al. and entitled "Dynamic Trace Elements", tracer circuitry connects to the rest of the circuitry of a product but its function has nothing to do with the actual operation of the product. One or more lines of tracer code are embedded in lines of real code. The tracer software code interacts with the tracer circuitry. Even though the tracer software code does nothing with respect to the running of the real software code, it interacts with actual hardware, i.e. the tracer circuitry. A copier who has disassembled the program would have considerable difficulty in determining this fact. In another embodiment, one or more lines of tracer code can be embedded in the real code, where they interact with lines of real code to produce results which are not related to the operation or running of the real code.
In the protection scheme of U.S. Pat. No. 5,287,407, issued on Feb. 15, 1994 to Holmes and entitled "Computer Software Protection", a master copy of a software file has within it a predetermined block of data. When a copy of the file is made, that block of data within the copied file is located and overwritten with data identifying the copied file. When an unauthorized copy is found, the data identifying the copy can be read and the source of the unauthorized copy may be traced.
Generally speaking, protection techniques including some of those discussed above can be understood, basically, as applying the opposites of "clear design" principles. In engineering software or hardware, there are certain principles which are applied to make the design clear, understandable, manageable, and well organized. In software, such principles are called "principles of software engineering". Now, plainly, if application of a set of principles makes designs easier to understand and modify, then application of their opposites is likely to make designs harder to understand and modify.
In software, for example, the choice of mnemonic variable names which suggest the actual uses of the variables is important to program understanding. Hence choosing variable names which either suggest nothing about their use, or suggest a use different from actual uses, would make understanding and modifying the software more difficult.
Let us call the reverse application of "clear design principles" by the name "anti-clear design".
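The naming point above can be illustrated with a minimal sketch. The loan-payment computation and every identifier below are hypothetical, chosen only for illustration, and are not taken from any of the cited patents. The two functions perform the same operations in the same order; only the names differ.

```python
def monthly_payment(principal, annual_rate, months):
    """Clear design: each name states what the value is used for."""
    monthly_rate = annual_rate / 12.0
    if monthly_rate == 0:
        return principal / months
    growth = (1.0 + monthly_rate) ** months
    return principal * monthly_rate * growth / (growth - 1.0)


def parse_header(buffer, offset, checksum):
    """Anti-clear design: the same computation, but the names
    (all hypothetical) suggest an unrelated parsing task."""
    token = offset / 12.0
    if token == 0:
        return buffer / checksum
    length = (1.0 + token) ** checksum
    return buffer * token * length / (length - 1.0)


if __name__ == "__main__":
    # Both calls evaluate the same arithmetic and print the same number.
    print(monthly_payment(1000.0, 0.12, 12))
    print(parse_header(1000.0, 0.12, 12))
```

A reader of `parse_header` is misled about its purpose, yet the sequence of operations is untouched, so a change of this kind affects only how the code reads and not what it computes.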
The present invention is analogous in purpose and intended effect to such approaches; that is to say, both "anti-clear design" approaches and the present invention are intended to protect intellectual property and frustrate (effective) tampering. However, the basis for the instant inventive process is not in applying the opposites of "clear design principles". The invention differs in two profound ways from such previous approaches.
Firstly, the previous approaches do not work against a truly determined attack. Many of the kinds of obfuscation used in "anti-clear design" approaches are easily penetrated by analysis tools such as data flow analyzers, optimizing compilers, or program slicing tools.
Secondly, the instant process is founded on notions from Kolmogorov complexity theory and computational graph theory, not on reversing the rules of thumb from software engineering or the principles of clear hardware design. Hence where the kinds of operations employed in an "anti-clear design" process are shallow and local, those involved in the instant process are deep and global.
Kolmogorov complexity theory provides a way of measuring the "essential" information content of a piece of information (of any kind). In ordinary information theory, for example, the information represented by sending a message consisting of the binary encoding of the number π = 3.14159 . . . is infinite: there are infinitely many digits and the digit sequence never repeats. However, the essential information is not infinite: it is possible to define the string of digits in terms of a small program which computes as many digits of π as desired. Since the program is small, the amount of "essential" information in π is also small. In the present disclosure, we use analogous notions to deal with the essential complexity of a computer program P and the essential complexity of deriving another program Q from a program P. For an introduction to Kolmogorov complexity, reference can be made to the "Handbook of Theoretical Computer Science", Elsevier/MIT Press, ISBN 0-444-88074-7 (Volume A, Chapter 4).
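To make the "small program" concrete, the following is a minimal sketch of one such program. It assumes Gibbons' unbounded spigot algorithm as the digit generator, which is one possible choice and is not named in this disclosure; the function names are likewise illustrative.

```python
from itertools import islice


def pi_digits():
    """Yield the decimal digits of pi one at a time (3, 1, 4, ...).

    Streaming spigot algorithm using only integer arithmetic, so the
    digits are exact however many are requested.
    """
    q, r, t, k, n, l = 1, 0, 1, 1, 3, 3
    while True:
        if 4 * q + r - t < n * t:
            # The next digit is settled: emit it and rescale the state.
            yield n
            q, r, n = 10 * q, 10 * (r - n * t), (10 * (3 * q + r)) // t - 10 * n
        else:
            # Not enough information yet: consume one more series term.
            q, r, t, k, n, l = (
                q * k,
                (2 * q + r) * l,
                t * l,
                k + 1,
                (q * (7 * k + 2) + r * l) // (t * l),
                l + 2,
            )


def first_digits(count):
    """Return the first `count` digits of pi as a list of ints."""
    return list(islice(pi_digits(), count))


if __name__ == "__main__":
    print(first_digits(10))
```

The program text has a fixed, short length, yet it can emit as many digits as are requested. That fixed length, rather than the unbounded length of the digit string, is what the notion of "essential" information measures.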
Extending this concept, we can measure the magnitude of the difference between programs by what we call the Kolmogorov directed distance between them. For any fixed program alphabet and encoding method, the Kolmogorov directed distance from program P to program Q is defined as the length of the smallest program which takes P as input and yields Q as output. Although this distance will vary from one encoding to another, the variations are sharply restricted according to the invariance theorem of Kolmogorov complexity theory. Reference can also be made to "An Introduction to Kolmogorov Complexity and its Applications" by Ming Li and Paul Vitanyi, ISBN 0-387-94053-7: Section 2.1: "The Invariance Theorem". Note the quote in Example 2.2: "The Invariance Theorem in fact shows that, to express an algorithm succinctly in a program, it does not matter which programming language we use (up to a fixed additive constant that depends only on the two programming languages compared)".
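The directed distance defined above cannot in general be computed exactly, but a rough feel for it can be obtained from an upper bound. The sketch below is an illustrative proxy only and is not part of the process of the invention: it assumes that a general-purpose compressor (Python's `zlib`, primed with P as a preset dictionary) stands in for "the smallest program which takes P as input and yields Q as output". The compressed stream, together with a fixed decompressor, is one such program, so its length bounds the distance from above up to an additive constant. All function names are hypothetical.

```python
import random
import zlib


def encode_given(p: bytes, q: bytes) -> bytes:
    """Describe q to a decoder that already holds p (p is the zlib dictionary)."""
    compressor = zlib.compressobj(level=9, zdict=p)
    return compressor.compress(q) + compressor.flush()


def rebuild(p: bytes, stream: bytes) -> bytes:
    """Recover q from p plus the stream produced by encode_given."""
    decompressor = zlib.decompressobj(zdict=p)
    return decompressor.decompress(stream) + decompressor.flush()


def directed_distance_upper_bound(p: bytes, q: bytes) -> int:
    """Length in bytes of one particular description of q given p."""
    return len(encode_given(p, q))


if __name__ == "__main__":
    rng = random.Random(0)
    # Stand-in "programs": p is arbitrary bytes, q_near is p with a small
    # patch inserted, q_far is unrelated data of the same length as q_near.
    p = bytes(rng.randrange(256) for _ in range(2000))
    q_near = p[:1000] + b"patched" + p[1000:]
    q_far = bytes(rng.randrange(256) for _ in range(len(q_near)))
    print(directed_distance_upper_bound(p, q_near))
    print(directed_distance_upper_bound(p, q_far))
```

Under these assumptions, a Q that differs from P by a small patch gets a short description given P, while an unrelated Q of the same length does not. A real compressor only ever gives an upper bound: a shorter description may exist that the compressor fails to find.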
A design of the kind produced by the process of the invention is analogous to an encrypted message, where discovery of the message without knowledge of the key is possible in principle, but is so difficult in practice that the message is only very rarely discovered without the key. Despite the analogy with cryptography in terms of purpose and intended effect, however, the invention is not cryptographic. A software program or hardware device resulting from the present process is executable "as is": its information does not need to be decoded for use. The process takes information which is executable (software which can be run on a computer or a hardware design which can be used to produce an integrated circuit or other hardware device), and transforms it into a new form which is still executable, but which protects the content from both disclosure and tampering.
The process preserves the value of designs: there is no need for decryption to recover the value, and no need for a key to access the value. An encrypted message, however, has value only in connection with its key, and its value (the information in the message) can only be obtained by decryption. Without the key or decryption, the encrypted message itself is of no practical value. Designs encoded according to the invention are useful in themselves with no key; encrypted messages are useful only with a key and only when decrypted.