1. Field of the Invention
The invention relates to the design of software products and the production of copies thereof for commercial distribution. More particularly, the invention relates to methods and systems for producing multiple copies of a software product wherein a copy of the software product is functionally identical to all other copies while being structurally unique.
2. Description of Prior Art
The co-pending U.S. patent application Ser. No. 09/328,737, “Methods and Apparatus for Secure Distribution of Software,” A. Torrubia-Saez, Jun. 9, 1999 discloses various methods of producing and commercially distributing access-controlled software packages via data transmission. Such access-controlled packages prevent a user from making full use of a downloaded software product until a purchase transaction has been completed. Further disclosed are methods of watermarking executable objects and data objects to prevent unauthorized use or copying. However, methods and systems for producing different copies of a software application in which each copy is functionally identical and structurally unique are not described.
FIG. 1 provides a flowchart of a conventional process for designing and producing software products. During the design phase, programmers or software engineers typically write the software application in a high-level programming language such as C or C++. At this stage of production, the application generally consists of several modules, each written in the programming language of choice. A compiler 10 then translates the high-level code to assembly code. Subsequently, an assembler 11 translates the assembly code to object code, machine-readable code consisting entirely of binary instructions. A linker 12 takes the separate modules of object code and combines them with required routines from external libraries to produce an executable program 13. Multiple identical copies 14 are made of this executable file and distributed. Typically, when a user installs the application to their computer, they must supply a serial number or an authorization code in order for the setup program to complete the installation.
R. Davidson and N. Myhrvold, “Method and System for Generating and Auditing a Signature for a Computer Program,” U.S. Pat. No. 5,559,884, Sep. 24, 1996, provide a means of uniquely identifying an authorized version of an executable module. Davidson, et al. teach selecting several portions of executable code, and reordering and modifying the executable code portions in a manner that preserves the original flow of execution. The new placement order of the code sections forms a signature for the version that is difficult to detect or remove. S. Moskowitz and M. Cooperman, “Method for Stega-Cipher Protection of Computer Code,” U.S. Pat. No. 5,745,569 (Apr. 28, 1998) provide a method for protecting computer code copyrights by encoding a digital watermark into essential portions of code resources. Encoding the code portions into a data resource further conceals the watermark. The watermark may not be removed without destroying the functionality of the software application.
In either of the disclosed prior art methods, the typical end user would indeed have difficulty detecting and removing the security mechanisms without damaging the executable code. However, the modifications to the executable code are simple enough that a skilled cracker or pirate, by disassembling or reverse compiling the executable code and examining the resulting source code, could detect and eliminate at least a portion of the signature or watermark without disrupting the normal flow of execution. Several people, each having a version of the program and comparing copies, could accomplish the task even more easily. Furthermore, because each version of the software is identical, the method of the Moskowitz, et al. teachings is vulnerable to circumvention by patching. If a program has security logic built in to prevent unauthorized copying or use, a cracker can remove the security from a copy by creating a small patch program that patches byte locations to remove security checks. For example, if a program has the code:
0X123001: call checksecurity
0X123008: test ax,ax
0X123009: je badguy
a cracker would merely have to patch 0X123009 with two NOP instructions to override the security protection. Since all copies of the program are the same, patching location 0X123009 would disable security in all copies of the program.
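The patching attack described above can be illustrated with a minimal Python sketch. The byte sequences are hypothetical toy images rather than real program code, and `apply_patch`, `copy_a`, `copy_b`, and `copy_c` are names assumed for illustration only. The sketch shows that one fixed-offset patch removes the conditional jump from every identical copy, while the same patch applied to a structurally different copy misses the jump and overwrites unrelated bytes.

```python
NOP = 0x90  # single-byte x86 no-op instruction


def apply_patch(image: bytes, offset: int, length: int) -> bytes:
    """Return a copy of `image` with `length` bytes at `offset` replaced by NOPs."""
    patched = bytearray(image)
    patched[offset:offset + length] = bytes([NOP]) * length
    return bytes(patched)


# Hypothetical toy image: the two bytes 0x74 0x05 stand in for "je badguy".
JE = bytes([0x74, 0x05])
copy_a = bytes([0xE8, 0x10, 0x00, 0x85, 0xC0]) + JE + bytes([0xC3])
copy_b = bytes(copy_a)                 # an identical copy, as in conventional distribution
je_offset = copy_a.index(JE)           # the location the cracker publishes in a patch

# The same patch works on every identical copy.
patched_a = apply_patch(copy_a, je_offset, len(JE))
patched_b = apply_patch(copy_b, je_offset, len(JE))

# A structurally different copy (assumed layout): a short jump over two dummy
# bytes is placed in front, so the security check sits at a different offset.
copy_c = bytes([0xEB, 0x02, 0xAA, 0xBB]) + copy_a
patched_c = apply_patch(copy_c, je_offset, len(JE))

print(JE in patched_a, JE in patched_b)  # the jump is gone from both identical copies
print(JE in patched_c)                   # the jump survives in the unique copy
```

In the unique copy, the patch lands on the operand bytes of the preceding call instead of the conditional jump, so the published patch damages the program rather than disabling its security check.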
Thus, as a means of uniquely identifying and controlling every copy of a computer program, it would be desirable to provide a way of producing copies of the program in which a copy is functionally identical to all other copies and structurally unique, so that each copy actually constitutes a separate version of the program. It would be desirable to produce these structurally unique copies by selecting portions of the source code to the application at random and merging individual procedures into larger procedures. It would be desirable to further modify the source code by reordering instructions in a manner that conserves the normal flow of execution. It would be desirable to make the code modifications extremely difficult to detect and render the program invulnerable to reverse engineering by modifying the source code still further with insertion of randomly selected dummy opcodes between the instructions, so that the executable code could not be reverse-compiled or disassembled.
The invention provides a method and a system for producing different versions of a software application wherein each version is functionally identical to all other versions and structurally unique. The intended utility for the invention is as a security measure in a system for secure distribution of software to provide a unique fingerprint for each copy of a software application distributed in response to a purchase request. The invention has the further utility of preventing unauthorized use and copying. The invention also incorporates measures for guarding the software application against reverse-engineering.
The invention, embodied herein as a system and a method, operates on the assembly code to a software application. The method of the invention is a pseudo-random procedure driven by a seed and a set of preferences. Various modifications are made to selected instruction sequences in the assembly code, the selection of the instruction sequences being pseudo-random. In a first step, multiple procedures within an instruction sequence are blended, or merged, so that a single, larger procedure is formed that is functionally equivalent to the original procedures. In a second step, the instructions within an instruction sequence are shuffled, or reordered, in a manner that conserves the original order of execution. In a third step, dummy opcodes are inserted between instructions in an instruction sequence, preceded by a branching instruction from one instruction to the next one following it in the sequence. The steps of the invented method may be executed in any order, or they may be performed simultaneously. It is unnecessary that all steps of the invented method be applied to all selected instruction sequences. The insertion of sufficiently large dummy opcodes provides a measure of protection against reverse-engineering because they will confuse a disassembly program and cause it to disassemble the executable code incorrectly.
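The shuffling and dummy-opcode steps described above can be sketched on a toy instruction set. This is an illustrative sketch under stated assumptions, not the implementation of the invention: the instruction tuples, the interpreter `run`, the function `make_unique_copy`, and the `junk_probability` preference are all hypothetical, the input is assumed to be a straight-line sequence containing no jumps, and the procedure-merging step is omitted. Each instruction is placed at a pseudo-randomly chosen physical position and followed by a jump to its logical successor, and dummy values are optionally inserted after that jump so that they are never executed.

```python
import random


def run(program):
    """Interpret a toy program and return the final register state."""
    regs, pc = {}, 0
    while True:
        op = program[pc]
        if op[0] == "halt":
            return regs
        if op[0] == "jmp":
            pc = op[1]
            continue
        if op[0] == "junk":
            raise RuntimeError("executed a dummy opcode")
        name, reg, val = op
        if name == "set":
            regs[reg] = val
        elif name == "add":
            regs[reg] += val
        elif name == "mul":
            regs[reg] *= val
        pc += 1


def make_unique_copy(straight_line, seed, junk_probability=0.5):
    """Build a structurally different but functionally equivalent layout."""
    rng = random.Random(seed)          # the seed drives every pseudo-random choice
    n = len(straight_line)
    order = list(range(n))
    rng.shuffle(order)                 # physical placement order of the instructions

    layout = [("jmp", 0)]              # entry jump to logical instruction 0
    address = {}                       # logical index -> physical address
    for idx in order:
        address[idx] = len(layout)
        layout.append(straight_line[idx])
        layout.append(("jmp", idx + 1))            # branch to the logical successor
        if rng.random() < junk_probability:
            layout.append(("junk", rng.getrandbits(16)))  # dummy, skipped by the jump
    address[n] = len(layout)
    layout.append(("halt",))

    # Resolve logical jump targets to physical addresses.
    return [("jmp", address[op[1]]) if op[0] == "jmp" else op for op in layout]


original = [("set", "a", 2), ("add", "a", 3), ("mul", "a", 4)]
baseline = run(original + [("halt",)])
copy_1 = make_unique_copy(original, seed=1)
copy_2 = make_unique_copy(original, seed=2)
print(baseline, run(copy_1), run(copy_2))
```

Every copy produced this way computes the same register state as the original sequence, while the physical layout depends on the seed. The same seed always reproduces the same layout, which is what allows a particular copy to be associated with a particular purchase request.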