1. Field of the Invention
The present invention relates to a method for detecting malicious code patterns present in malicious scripts, and more particularly, to a method for detecting malicious code patterns by using a static analysis in consideration of control and data flows.
2. Description of the Prior Art
Generally, malicious scripts are detected either by directly applying techniques developed for binary codes or by modifying those techniques so that they can be adapted to scripts in the form of source programs. In particular, signature recognition through scanning is the most commonly used malicious-code detection scheme. Because this scheme determines whether a given code is malicious by searching for a special character string present in only one malicious code, it has the advantages that diagnosis is fast and that the kind of malicious code can be clearly identified. However, this scheme has a problem in that it cannot cope with unknown malicious codes at all.
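The signature scanning scheme described above can be illustrated with a minimal sketch. The signature table, the sample names, and the function name `scan_signatures` are hypothetical and serve only to show the idea; a real scanner would use a far larger database and faster matching.

```python
# Hypothetical signature database: each special character string is assumed
# to occur in exactly one known malicious code, so a hit identifies its kind.
SIGNATURES = {
    "LOVE-LETTER-FOR-YOU.TXT.VBS": "known-sample-1",
    "HYPOTHETICAL-MARKER-STRING": "known-sample-2",
}


def scan_signatures(script_text):
    """Return the names of known samples whose signature occurs in the script."""
    return sorted(
        name for signature, name in SIGNATURES.items() if signature in script_text
    )


if __name__ == "__main__":
    known = 'c.Copy(dirsystem & "\\LOVE-LETTER-FOR-YOU.TXT.VBS")'
    variant = 'c.Copy(dirsystem & "\\RENAMED.TXT.VBS")'
    print(scan_signatures(known))    # the known string is found
    print(scan_signatures(variant))  # a renamed variant is not found
```

The second call shows the limitation noted above: a code whose signature is not in the database goes undetected, however similar its behavior.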
Meanwhile, the heuristic analytical technique is considered the most practical among the techniques for detecting unknown malicious scripts. In this technique, code segments frequently used in malicious codes are organized into a database, and target codes are scanned to determine whether those code segments are present or how many times they appear. Although this scheme has the advantages of a relatively high speed and a high detection ratio, it has the disadvantage of a somewhat high likelihood of false positive errors. Accordingly, in order to alleviate this disadvantage, a method for detecting malicious scripts using a static analysis has been proposed. Since this method checks not only the presence of method sequences but also the associated parameters and return values, it yields considerably more precise detection results than a method using a simple heuristic analysis.
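The heuristic scheme described above can be sketched as follows. The segment list, the threshold, and the function names are assumptions chosen for illustration and are not taken from any particular product.

```python
# Hypothetical database of code segments assumed to be frequent in malicious scripts.
SUSPICIOUS_SEGMENTS = ["CreateObject", ".Copy(", "Attachments.Add"]


def heuristic_score(script_text):
    """Count how many times the database segments appear in the target code."""
    return sum(script_text.count(segment) for segment in SUSPICIOUS_SEGMENTS)


def is_suspicious(script_text, threshold=2):
    """Flag the script when the number of matches reaches the assumed threshold."""
    return heuristic_score(script_text) >= threshold


if __name__ == "__main__":
    # A harmless backup script that happens to use two of the listed segments.
    benign = (
        'Set fso = CreateObject("Scripting.FileSystemObject")\n'
        'file.Copy("backup.txt")\n'
    )
    print(heuristic_score(benign), is_suspicious(benign))
```

Because only the presence and count of segments are considered, the harmless script in the usage example reaches the threshold, which is the kind of false positive error mentioned above.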
FIG. 1 shows an example of malicious Visual Basic script codes and explains the concept of a method for detecting malicious scripts using a static analysis. As can be seen in FIG. 1, for a plurality of method calls to constitute one malicious behavior, a special relationship between their parameters and return values is necessarily required. For example, the Copy method at the fourth line copies the script under execution to make a file with the name “LOVE-LETTER-FOR-YOU.TXT.VBS,” and the “Attachments.Add” method at the seventh line attaches the file to a newly created mail object to achieve self-replication through mail. However, where a scheme that checks only the presence of method calls is used, even unrelated method calls that create a script file named “A” and attach a file named “B” to the created script file are regarded as a malicious code, which results in a high false positive error rate. By contrast, the detection method using the static analysis can obtain more precise detection results than methods using a simple search of character strings, because it checks not only whether method calls are present but also whether all relevant values such as the file names used, for example, “fso,” “c,” “out,” “male” or the like, are found.
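The difference between checking only the presence of method calls and also checking the relationship between their parameters can be sketched as follows. The `Call` record and both checking functions are hypothetical simplifications; they do not reproduce the script of FIG. 1, and they model each call by its method name and a single file-name argument.

```python
from typing import List, NamedTuple


class Call(NamedTuple):
    """A method call reduced to its name and the file name it operates on."""
    method: str
    file_name: str


def presence_only(calls: List[Call]) -> bool:
    """Report malicious when both method names occur, ignoring their parameters."""
    methods = {call.method for call in calls}
    return {"Copy", "Attachments.Add"} <= methods


def with_parameter_check(calls: List[Call]) -> bool:
    """Report malicious only when the copied file is also the attached file."""
    copied = {call.file_name for call in calls if call.method == "Copy"}
    attached = {call.file_name for call in calls if call.method == "Attachments.Add"}
    return bool(copied & attached)


if __name__ == "__main__":
    self_replicating = [
        Call("Copy", "LOVE-LETTER-FOR-YOU.TXT.VBS"),
        Call("Attachments.Add", "LOVE-LETTER-FOR-YOU.TXT.VBS"),
    ]
    unrelated = [Call("Copy", "A"), Call("Attachments.Add", "B")]
    for calls in (self_replicating, unrelated):
        print(presence_only(calls), with_parameter_check(calls))
```

In this sketch the unrelated calls on files “A” and “B” are flagged by the presence-only check but not by the parameter check, whereas the self-replicating sequence is flagged by both.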
However, the detection method using the static analysis still has a problem in terms of detection accuracy. Conventional detection methods using the static analysis compare only the visible names of variables with each other. Therefore, an error may occur in which the values of two variables are regarded as the same during execution merely because the two variables have the same name. FIG. 2 shows an example in which a false positive error may occur in the detection method using the static analysis. In the conventional detection methods using the static analysis, it is confirmed only that the variables “c” used at the first and fourth lines are the same, and the values of the two variables are regarded as being identical to each other. However, when the program is analyzed, it can be seen that since variable “c” is newly defined at the third line, the variables “c” at the first and fourth lines have different values upon actual execution. In contrast to FIG. 2, FIG. 3 shows an example in which a false negative error may occur in the detection methods using the static analysis. In the conventional methods using the static analysis, since variable “c” at the first line and variable “d” at the third line are different variables, it is determined that the values of the two variables are not the same. However, the values of the two variables become identical to each other upon actual execution due to the assignment statement “d=c” at the second line. Consequently, in view of the detection of entire malicious behavior patterns, the two types of errors mentioned above induce false positive and false negative errors, respectively. Therefore, there is a need for a method for solving these errors.
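The two errors described above can be reproduced with a small sketch that contrasts a comparison of names with a simple tracking of values through definitions and copies. The statement encoding, the two sample programs, and the function names are hypothetical; the programs only follow the line structure described for FIG. 2 and FIG. 3, and the sketch handles straight-line code without branches.

```python
def value_ids(program):
    """Return, for each line number, a mapping from variable to a value identifier.

    Each statement is a tuple (op, target, source):
      ("new", x, None)  x receives a fresh value
      ("copy", x, y)    x receives the value currently held by y
      ("use", x, None)  x is only read
    """
    env, next_id, snapshots = {}, 0, {}
    for line_no, (op, target, source) in enumerate(program, start=1):
        if op == "new":
            env[target] = next_id
            next_id += 1
        elif op == "copy":
            env[target] = env[source]
        snapshots[line_no] = dict(env)  # state after executing this line
    return snapshots


def same_by_name(var_a, var_b):
    """Name comparison: values are assumed equal when the names match."""
    return var_a == var_b


def same_by_flow(program, line_a, var_a, line_b, var_b):
    """Flow-based comparison: values are equal when their identifiers match."""
    snapshots = value_ids(program)
    return snapshots[line_a][var_a] == snapshots[line_b][var_b]


# Same name, different values: "c" is newly defined at the third line.
REDEFINED = [("new", "c", None), ("use", "c", None),
             ("new", "c", None), ("use", "c", None)]
# Different names, same value: "d=c" at the second line.
COPIED = [("new", "c", None), ("copy", "d", "c"), ("use", "d", None)]

if __name__ == "__main__":
    print(same_by_name("c", "c"), same_by_flow(REDEFINED, 1, "c", 4, "c"))
    print(same_by_name("c", "d"), same_by_flow(COPIED, 1, "c", 3, "d"))
```

In the first program the name comparison reports a match that the flow-based comparison rejects, and in the second program it misses a match that the flow-based comparison finds, corresponding to the false positive and false negative cases respectively.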