Many attempts have been made in recent years to protect original computer software from duplication and mass distribution. One method in use today requires a license or sequence key, which the customer enters manually during installation or at run time. Another popular method for preventing duplicate usage of software involves activating the software after installation. The activation process requires the software to read the ID serial numbers of hardware elements in the computer, such as the processor's serial number or the graphics card's serial number. Once the hardware ID serial numbers are read, they may be sent, together with the software ID number, through the Internet to the vendor. The vendor stores the ID numbers and sends a license code to the program through the Internet. The software may be programmed to cease functioning properly without a verified license code from the vendor. In this case, if the software is illegally copied and installed on a different computer, it cannot be activated, since the software license is already associated with the hardware of the first installation and the license code can only be sent to a computer whose hardware profile matches the one stored by the vendor. However, these methods do not prevent an unauthorized party from reverse engineering the software code, modifying it to exclude these software protection tools, and mass distributing the modified software.
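By way of illustration only, the hardware-bound activation scheme described above may be sketched as follows. The class name `Vendor`, the function `fingerprint`, the secret `VENDOR_SECRET`, and the use of an HMAC as the license code are all assumptions made for this sketch and are not taken from any particular product.

```python
import hashlib
import hmac

# Hypothetical secret known only to the vendor (assumption for this sketch).
VENDOR_SECRET = b"hypothetical-vendor-secret"


def fingerprint(hardware_ids):
    """Combine hardware serial numbers into a single hardware profile."""
    joined = "|".join(sorted(hardware_ids))
    return hashlib.sha256(joined.encode()).hexdigest()


class Vendor:
    """Stores one hardware profile per software ID and issues license codes."""

    def __init__(self):
        self.registered = {}  # software ID -> hardware profile of first install

    def activate(self, software_id, hardware_ids):
        profile = fingerprint(hardware_ids)
        # The first activation binds the software ID to this hardware profile.
        stored = self.registered.setdefault(software_id, profile)
        if stored != profile:
            return None  # a copy on different hardware is refused
        message = (software_id + ":" + profile).encode()
        return hmac.new(VENDOR_SECRET, message, hashlib.sha256).hexdigest()


vendor = Vendor()
first = vendor.activate("SW-0001", ["CPU-1111", "GPU-2222"])
again = vendor.activate("SW-0001", ["CPU-1111", "GPU-2222"])
copied = vendor.activate("SW-0001", ["CPU-9999", "GPU-8888"])
```

In this sketch the first computer receives the same license code on every activation, while the copy installed on different hardware receives none. The check nevertheless lives inside the distributed software, which is why removing it by code modification defeats the scheme.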
Many tools are in use today for software reverse engineering, such as the hexadecimal dumper, which prints or displays the binary numbers of a software code in hexadecimal format. By knowing the bit patterns that represent the instructions, as well as the instruction lengths, a person who wishes to reverse engineer the software can identify certain portions of a code to see how they work, and then modify them. Another common tool for reverse engineering and code modification is the disassembler. The disassembler reads the binary code and then displays each executable instruction in a textual format. Since the disassembler alone cannot tell the difference between an executable instruction and the data used by the code, a debugger may also be used. The debugger allows the disassembler to avoid disassembling the data portions of a code. For example, if the disassembler reads a command “ADD_INT8”, which means “add the number depicted in the next 8 bits”, the debugger processes the next 8 bits as the data portion of the command “ADD_INT8”, and the next group of bits is processed as a new command. However, these tools rely on publicized knowledge of how the instruction code is built, where the information resides in the memory, which registers are used, and how the stack (a data buffer used for storing requests that need to be handled, in the form of a push-down list) is used.
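The roles of the hexadecimal dumper and the disassembler may be illustrated by the following minimal sketch. The instruction set shown, including the numeric opcode values and the commands `PUSH_INT8` and `HALT`, is hypothetical; only the command “ADD_INT8” and its 8-bit data portion come from the example above.

```python
# Hypothetical instruction set: opcode -> (mnemonic, number of data bytes).
OPCODES = {
    0x00: ("HALT", 0),
    0x01: ("ADD_INT8", 1),   # add the number depicted in the next 8 bits
    0x02: ("PUSH_INT8", 1),  # push the number depicted in the next 8 bits
}


def hex_dump(code):
    """Display the binary code in hexadecimal format."""
    return " ".join(f"{byte:02X}" for byte in code)


def disassemble(code):
    """Display each instruction as text, skipping over its data portion."""
    lines = []
    pc = 0
    while pc < len(code):
        name, data_length = OPCODES.get(code[pc], ("UNKNOWN", 0))
        data = code[pc + 1 : pc + 1 + data_length]
        lines.append(" ".join([name] + [str(value) for value in data]))
        # Known instruction lengths keep data from being read as a command.
        pc += 1 + data_length
    return lines


program = bytes([0x02, 0x05, 0x01, 0x01, 0x00])
dump = hex_dump(program)        # "02 05 01 01 00"
listing = disassemble(program)  # ["PUSH_INT8 5", "ADD_INT8 1", "HALT"]
```

The fourth byte, 0x01, has the same bit pattern as the `ADD_INT8` opcode, yet it is displayed as data because the preceding command is known to consume the next 8 bits. This depends entirely on the instruction format being known to the person doing the analysis.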
The problem of reverse engineering and code modification by unauthorized users is even more apparent when dealing with interpreter-based programming languages, as opposed to compiler-based programming languages. A description of a compiler-based programming language can be found in FIG. 1, which generally illustrates the prior art software process of compiler-based programming languages, such as C or Pascal. When a programmer programs in a high-level language using an editor or the like, the instructions of the code 10, or source code, cannot be read directly by the computer's hardware. Therefore, the source code 10 has to undergo a translation process, known as compilation, by compiler 11. Compiler 11 compiles source code 10 into a specific Machine Code (MC) 12, which the computer's hardware is able to read and execute. Since the MC 12 is compiled specifically for a certain platform, it cannot be transferred from one platform to another. In compiler-based programming languages, the source code 10 is compiled for each platform individually, producing a different specific MC 12 for each platform. Examples of different platforms are an Intel®-based PC running Windows® XP and a computer running Mac® OS X.
A description of an interpreter-based programming language can be found in FIG. 2a, which is a flow chart generally illustrating the prior art software process of interpreter-based programming languages, such as JAVA. As with compiler-based programming languages, programs in interpreter-based languages are written in a high-level language, using an editor or the like, and are referred to hereinafter as source statements 20. However, according to this approach, the compiler 21 translates the high-level source statements 20 into a Byte Code (BC) 22, which is a generalized MC not limited to a certain platform. Nevertheless, in order to execute the BC 22, a specific interpreter 23 is needed to translate BC 22 into specific MC 24. The specific interpreter 23 is usually installed along with the operating system. The main advantage of this approach is that BC 22 may be distributed for different platforms. When BC 22 is executed on a certain platform, the specific interpreter 23 translates only one BC 22 instruction at a time, producing a specific MC 24 instruction for the computer hardware to execute. However, since the interpreter's method of processing is common knowledge, it is fairly easy to read, understand, and modify the BC 22, which is an instruction set for interpreter 23. A hacker may buy a legal copy of a code written in BC, decipher its instructions, and erase or modify some of the original instructions of the BC. Once the BC has been modified, it can be mass copied and resold.
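The ease with which publicly documented byte code can be modified may be illustrated by the following minimal sketch of an interpreter that processes one instruction at a time. The opcodes, including `CHECK_LICENSE` and `NOP`, and the function `interpret` are hypothetical and do not correspond to the instruction set of any real interpreter.

```python
# Hypothetical byte code instruction set (assumption for this sketch).
HALT, PUSH_INT8, ADD, CHECK_LICENSE, NOP = 0x00, 0x01, 0x02, 0x03, 0x04


def interpret(byte_code, licensed):
    """Execute the byte code one instruction at a time on a small stack."""
    stack = []
    pc = 0
    while pc < len(byte_code):
        op = byte_code[pc]
        if op == HALT:
            break
        if op == PUSH_INT8:
            stack.append(byte_code[pc + 1])  # next 8 bits are data
            pc += 2
            continue
        if op == ADD:
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == CHECK_LICENSE:
            if not licensed:
                raise PermissionError("no valid license")
        elif op != NOP:
            raise ValueError(f"unknown opcode {op:#04x}")
        pc += 1
    return stack[-1] if stack else None


# Byte code as sold: a license check followed by the useful work (2 + 3).
original = bytes([CHECK_LICENSE, PUSH_INT8, 2, PUSH_INT8, 3, ADD, HALT])

# A hacker who knows the instruction set overwrites the check with a NOP.
patched = bytes([NOP]) + original[1:]

result_licensed = interpret(original, licensed=True)
result_patched = interpret(patched, licensed=False)
```

In this sketch a single overwritten byte is sufficient to make the unlicensed copy produce the same result as the licensed one, and the patched byte code can then be copied without limit.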
Another method used by hackers is known in the art as “runtime data interception”. By intercepting and reading the data flow during the execution of a legal program by the interpreter, the hacker can simulate the process when executing an illegal program.
One method for preventing easy understanding and deciphering of the code behavior utilizes encryption of the code, as described in US 2004/0015710. According to this approach, the encrypted code is sold with a decryption key for decrypting the code. Each instruction in the code is first decrypted and then interpreted by an interpreter for execution by the processor. However, once the code has been decrypted, a hacker may read the decrypted code to reverse engineer the original code. Furthermore, the decryption process may be monitored by a user in order to derive the decryption key. In addition, once the code is decrypted, it is loaded unprotected into the memory of the computer and may be copied from there as well.
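The weaknesses noted above may be illustrated by the following minimal sketch. A single-byte XOR is used purely as a toy stand-in for a real cipher; it is not the scheme of US 2004/0015710, and the names `KEY`, `encrypt`, and `decrypt_and_run` are hypothetical.

```python
KEY = 0x5A  # hypothetical decryption key sold with the code


def encrypt(code):
    """Toy stand-in for encryption: XOR every byte with the key."""
    return bytes(byte ^ KEY for byte in code)


def decrypt_and_run(encrypted, memory):
    """Decrypt each instruction just before it would be interpreted."""
    for byte in encrypted:
        instruction = byte ^ KEY
        # The decrypted instruction now sits unprotected in memory, where
        # anyone monitoring the process can read or copy it.
        memory.append(instruction)


original = bytes([0x03, 0x01, 0x02, 0x00])
shipped = encrypt(original)

memory = []
decrypt_and_run(shipped, memory)

# An observer of memory recovers the original code in full.
recovered = bytes(memory)

# With this toy cipher, one observed pair of encrypted and decrypted
# bytes is enough to derive the key itself.
derived_key = shipped[0] ^ memory[0]
```

A real cipher would not yield its key from a single observed pair as this toy one does, but the decrypted instructions would still have to appear in memory before they can be interpreted.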
Another method for preventing modification of a software code is to split the code into two parts: a sensitive part comprising the code protection, and a less sensitive part. The less sensitive part of the code is sold to the user, as before, ready for interpretation, whereas the sensitive part of the code is stored on hardware products, such as smart-cards. The interpretation of the sensitive part of the code is done in hardware, such as a smart-card reader, where it cannot be monitored or read. However, in some cases the additional hardware may be expensive, and redistribution of code updates generated by the provider is complicated.
A method for preventing modification of a software code is described in a paper by Enriquillo Valdez and Moti Yung, “DISSECT: Distribution for SECurity Tool” (G. I. Davida and Y. Frankel (Eds.): ISC 2001, LNCS 2200, pp. 125-143, 2001. Springer-Verlag Berlin Heidelberg 2001). The method suggests splitting the code into two parts, a sensitive part and a less sensitive part. The less sensitive part of the code is sold to the user, as before, ready for interpretation, whereas the sensitive part of the code is stored on a secured server. The interpretation of the sensitive part of the code is done on the secured server, where it cannot be monitored or read. However, this approach requires maintaining a direct connection to the designated server in order to execute the code.
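The dependency on the server may be illustrated by the following minimal sketch, in which the secured server is simulated by an ordinary object. The names `SecuredServer`, `run_sensitive`, and `run_program`, as well as the license key value, are hypothetical and are not taken from the cited paper.

```python
class SecuredServer:
    """Simulated server holding the sensitive part of the code."""

    def __init__(self):
        self.online = True

    def run_sensitive(self, license_key):
        # The sensitive part executes here, out of the user's reach.
        if not self.online:
            raise ConnectionError("secured server unreachable")
        return license_key == "VALID-KEY"  # hypothetical check


def run_program(server, license_key):
    """Less sensitive part, distributed to the user and run locally."""
    if not server.run_sensitive(license_key):
        return "refused"
    return "running"


server = SecuredServer()
with_valid_key = run_program(server, "VALID-KEY")
with_bad_key = run_program(server, "GUESSED-KEY")

server.online = False  # the connection to the server is lost
try:
    run_program(server, "VALID-KEY")
    offline_result = "running"
except ConnectionError:
    offline_result = "cannot execute"
```

In this sketch even a legitimate user holding a valid key cannot execute the program once the connection to the server is lost.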
It is therefore an object of the present invention to provide an inexpensive method for preventing software reverse engineering, unauthorized modification, and runtime data interception.
It is another object of the present invention to provide a method for preventing unauthorized modification of software, without needing additional hardware.
It is still another object of the present invention to provide a method that, on the one hand, prevents any modification by an unauthorized user and, on the other hand, allows modification and update by the vendor.
Other objects and advantages of the invention will become apparent as the description proceeds.