1. Field of the Invention
The present invention relates to a method of analyzing and decrypting encrypted malicious scripts, and more particularly, to a technology for flexibly coping with a new encryption scheme through an analytical approach to a conventional script encryption scheme.
2. Description of the Related Art
In general, encryption means a process or technique of encoding messages so that the meanings of the messages are not revealed. However, encryption in computer viruses or malicious codes means a technique for hiding signatures of malicious codes from a virus scanner by scrambling the malicious codes. A signature, which is a short character string present only in a specific malicious code but not present in other programs, is used for distinguishing the specific malicious code from legitimate programs and identifying the kind of malicious code. Since a malicious code detection system using the signature is relatively fast as compared to other techniques, most of the existing anti-virus products have generally employed such a signature-based detection system augmented with some heuristic algorithms.
However, to avoid such a signature-based detection system, malicious code creators add separate encryption functions to viruses. In general, an encrypted malicious code consists of a decryption routine, a key value, and an encrypted code body. The decryption routine is performed first when the encrypted malicious code is executed. The decryption routine decrypts the encrypted body and then passes control to the decrypted malicious code so that it can be executed. Because the malicious code becomes a completely different code merely by re-encrypting itself with a new key value each time it attempts to replicate itself into other systems or files, it cannot be detected by simple scanning.
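The effect of re-encryption on a signature can be illustrated with a minimal sketch. It assumes a single-byte XOR cipher, which is chosen here only for brevity and is not stated in the text above; the payload, the signature, and the key values are all hypothetical placeholders.

```python
def xor_bytes(data: bytes, key: int) -> bytes:
    """Single-byte XOR; applying it twice with the same key restores the input."""
    return bytes(b ^ key for b in data)


# Hypothetical, harmless placeholder data (assumptions for illustration only).
SIGNATURE = b"FormatDrive"
payload = b"call FormatDrive() : rem placeholder body"


def contains_signature(blob: bytes) -> bool:
    """Simple signature scan: a plain substring search."""
    return SIGNATURE in blob


# Two "replicas" of the same payload, each encrypted with a different key.
copy_a = xor_bytes(payload, 0x5A)
copy_b = xor_bytes(payload, 0xC3)

print("plain payload matches signature:", contains_signature(payload))
print("copy_a matches signature:", contains_signature(copy_a))
print("copy_b matches signature:", contains_signature(copy_b))
print("copies are byte-identical:", copy_a == copy_b)
print("copy_a decrypts back to payload:", xor_bytes(copy_a, 0x5A) == payload)
```

Each replica carries the same payload, yet the two encrypted bodies differ from each other and from the plain text, so a scanner that searches only for the plain signature finds neither.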
Meanwhile, X-raying and emulation techniques are used to cope with such encrypted malicious codes. The X-raying technique tries all possible cases (brute-force decryption) after narrowing the detection range by using known information about the signature discovered from the relevant malicious code and about the decryption algorithm used by the malicious code. In other words, in a case where all the information about the encryption technique and the signature of the relevant malicious code is known and only the precise key value is unknown, the character string at a position where the signature can appear is decrypted with every possible key value, and each decrypted character string is compared with the signature to determine whether a malicious code is present. However, the X-raying technique is difficult to apply to a new, unknown malicious code, because it becomes feasible only after sufficient information has been obtained by thoroughly analyzing the encryption scheme and properties of the malicious code to be searched for.
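The brute-force step of the X-raying technique can be sketched as follows. This is a minimal illustration that assumes the known encryption scheme is a single-byte XOR and that the offset at which the signature may appear is already known; the function names, the signature, and the sample data are hypothetical.

```python
from typing import Optional


def xor_bytes(data: bytes, key: int) -> bytes:
    """Single-byte XOR, standing in for the known decryption algorithm."""
    return bytes(b ^ key for b in data)


def xray_scan(blob: bytes, signature: bytes, offset: int) -> Optional[int]:
    """Decrypt the candidate window with every possible key value.

    Returns the key under which the window equals the signature,
    or None if no key produces a match.
    """
    window = blob[offset:offset + len(signature)]
    if len(window) < len(signature):
        return None
    for key in range(256):
        if xor_bytes(window, key) == signature:
            return key
    return None


# Hypothetical sample: a known signature hidden at offset 4 under key 0x37.
signature = b"SIGNATURE"
encrypted = xor_bytes(b"xxxxSIGNATUREyyy", 0x37)
print("recovered key:", xray_scan(encrypted, signature, 4))
```

The sketch also shows why the technique depends on prior analysis: the signature, the cipher, and the candidate offset must all be supplied before the search over key values can begin.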
The emulation technique obtains a decrypted code by emulating a malicious code. For a binary malicious code, the decrypted malicious code can be obtained by executing only a portion of the relevant code in a virtual machine, because the decryption routine is executed first and is very small in size. If the entire decrypted code is to be obtained, each memory unit in the virtual machine should be monitored and the execution should continue until the values in the memory holding the code portion no longer change. In addition, if the emulation technique is used in combination with a signature-based detection method, the emulation stops immediately after the memory region in which the signature resides has been decrypted, and signature comparison is then performed. Although the emulation technique of completely (unconditionally) emulating the malicious code up to a specific point in time is effective for decrypting a malicious code in a binary file format, it is more difficult to construct an emulator for scripts than for binary executable files. In other words, in order to achieve complete emulation, all the possible environments where an object code can be executed must be virtually created. However, for a script language such as Microsoft Visual Basic Script, it is realistically difficult to emulate the variety of objects and environments used in a relevant program, and doing so imposes a heavy load. In addition, unlike ordinary harmless codes, a malicious code cannot be profiled by simply executing it and recording the execution details.
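The stopping condition described above, namely continuing the emulation until the monitored memory no longer changes, can be sketched with a toy model. This is only an assumption-laden illustration and not an actual virtual machine: `run_until_stable`, `make_xor_decryptor`, and the `quiet_steps` threshold are hypothetical names introduced here, and the "decryption routine" is a one-byte-per-step XOR loop.

```python
from typing import Callable


def run_until_stable(memory: bytearray,
                     step: Callable[[bytearray, int], bool],
                     max_steps: int = 10_000,
                     quiet_steps: int = 8) -> int:
    """Execute `step` repeatedly while monitoring the code region.

    Stops when `step` reports that it halted, when the memory has not
    changed for `quiet_steps` consecutive steps, or at `max_steps`.
    Returns the number of steps executed.
    """
    unchanged = 0
    for n in range(max_steps):
        before = bytes(memory)          # snapshot of the monitored region
        running = step(memory, n)
        unchanged = unchanged + 1 if bytes(memory) == before else 0
        if not running or unchanged >= quiet_steps:
            return n + 1
    return max_steps


def make_xor_decryptor(key: int) -> Callable[[bytearray, int], bool]:
    """Toy decryption routine: decrypts one byte per emulated step."""
    def step(memory: bytearray, n: int) -> bool:
        if n < len(memory):
            memory[n] ^= key
        return True  # never signals a halt; the monitor must detect stability
    return step


plain = b"MsgBox 1"                                # harmless placeholder
memory = bytearray(b ^ 0x21 for b in plain)        # "encrypted" code region
steps = run_until_stable(memory, make_xor_decryptor(0x21))
print(bytes(memory), steps)
```

Taking a full snapshot on every step is wasteful and is used here only to keep the sketch short; a practical monitor would more plausibly track writes to the region.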
In conclusion, since the aforementioned methods are either applicable only to a case where the properties and behaviors of a relevant malicious code are known or suitable only for decrypting a malicious code in a binary file format, it is difficult to apply them to unknown encrypted scripts. Therefore, a heuristic-based methodology, in which the patterns of the encryption techniques used in conventional script malicious codes and their corresponding decryption methods are defined in advance and then applied, is regarded as the most realistic decryption technique for script malicious codes. For example, various conventional Visual Basic Script malicious codes are configured in such a manner that the actual malicious code is encrypted into one character string and is executed through an ‘execute’ statement defined in the script language. In this case, the decrypted malicious code can be obtained by regarding the function called from the ‘execute’ statement found in a given script as the decryption function and executing or emulating this function. This type of decryption function consists only of basic language constructs that do not use any of the aforementioned objects and environments, and it is executed only once at the head of the program. Therefore, since this type of decryption function can be executed by a lightweight emulator with only basic functions, without requiring the complete emulation mentioned above, the burden of emulation and of constructing the emulator is not serious.
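The first step of this heuristic, locating the function called from an ‘execute’ statement, can be sketched as a simple pattern match. The regular expression, the function name `find_decryption_function`, and the sample script are assumptions made for illustration; a practical scanner would need a proper script tokenizer rather than a single pattern.

```python
import re
from typing import Optional

# Matches `Execute Name(` or `Execute(Name(` at the start of a line,
# case-insensitively, and captures the called function's name.
EXECUTE_CALL = re.compile(
    r"^\s*execute(?:\s*\(\s*|\s+)([A-Za-z_]\w*)\s*\(",
    re.IGNORECASE | re.MULTILINE,
)


def find_decryption_function(script: str) -> Optional[str]:
    """Return the name of the function called from the first 'execute'
    statement, which the heuristic treats as the decryption function."""
    match = EXECUTE_CALL.search(script)
    return match.group(1) if match else None


# Hypothetical, harmless sample in the style of an encrypted script.
sample = '''
Function UnScramble(s)
    For i = 1 To Len(s)
        UnScramble = UnScramble & Chr(Asc(Mid(s, i, 1)) Xor 1)
    Next
End Function
Execute UnScramble("LrfCny!#ih#")
'''

print(find_decryption_function(sample))
```

Once the candidate function is identified, only that function and its string argument need to be passed to a lightweight emulator, which is the reason the emulation burden stays small.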
According to a heuristic-based approach, however, code capable of dealing with a new encryption pattern must be added to the virus scanner whenever such a new encryption pattern appears. Thus, there is a fundamental problem in that it is difficult to cope effectively with unknown malicious scripts. In particular, it is difficult to cope with partial encryption performed in units of character strings, which is found uniquely in script malicious codes.