1. Field of the Invention
The present invention relates to systems for detecting malicious computer software. More specifically, the present invention relates to a method and an apparatus for detecting malicious software by analyzing patterns of system calls generated by the software during emulation.
2. Related Art
Malicious software can enter a computer system in a number of ways. It can be introduced on a disk or a CD-ROM that is inserted into the computer system. It can also enter from a computer network, for example, in an email message.
If malicious software is executed by a computer system, it can cause a number of problems. The software can compromise security, for example, by stealing passwords, by creating a “back door” into the computer system, or by otherwise accessing sensitive information. The software can cause damage to the computer system, for example, by deleting files or by causing the computer system to fail.
Some types of malicious programs can be easily detected using simple detection techniques, such as scanning for a search string. However, this type of detection process can be easily subverted by converting a malicious algorithm into program code in different ways. Furthermore, since most malicious software programs are written in a high-level language, it is hard to analyze these programs because much of the code within them is taken from standard code libraries.
At present, a malicious program is typically analyzed manually by a human expert, who runs the program and observes the results to see if the program exhibits malicious behavior.
A human expert can also decompile the program and remove library code, which enables the human expert to more easily examine the algorithm. In examining the algorithm, the human expert typically pays special attention to system calls (or application program interface (API) calls) that interact with the computer system and the outside world to determine whether the system calls indicate that the program is likely to exhibit malicious behavior.
Yet another approach is to run a program on a real machine while attempting to intercept malicious actions. This technique, which is known as “behavior blocking,” has a number of disadvantages. In spite of the attempt to intercept malicious actions, the program may nevertheless cause harm to the computer system. Furthermore, the behavior blocking mechanism typically cannot view an entire log of actions in making a blocking determination. Hence, the behavior blocking mechanism may make sub-optimal blocking decisions, which means harmless programs may be blocked or harmful programs may be allowed to execute.
What is needed is a method and an apparatus that detects malicious software without requiring manual analysis of the software by a human expert, and without exposing the computer system to potentially malicious actions of the software.
One embodiment of the present invention provides a system for determining whether software is likely to exhibit malicious behavior by analyzing patterns of system calls made during emulation of the software. The system operates by emulating the software within an insulated environment in a computer system so that the computer system is insulated from malicious actions of the software. During the emulation process, the system records a pattern of system calls directed to an operating system of the computer system. The system compares the pattern of system calls against a database containing suspect patterns of system calls. Based upon this comparison, the system determines whether the software is likely to exhibit malicious behavior.
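The comparison step described above can be illustrated with a minimal sketch. It assumes the recorded pattern is an ordered list of system call names and that a suspect pattern matches when its calls appear in the log in the same order, possibly with other calls in between. The pattern names, the call names, and the in-order matching rule are illustrative assumptions and are not taken from the specification.

```python
from typing import Dict, List, Sequence

# Hypothetical database of suspect patterns: pattern name -> ordered call names.
SUSPECT_PATTERNS: Dict[str, Sequence[str]] = {
    "mass_file_delete": ("FindFirstFile", "DeleteFile", "FindNextFile", "DeleteFile"),
    "mail_self": ("OpenFile", "ReadFile", "connect", "send"),
}


def contains_in_order(log: Sequence[str], pattern: Sequence[str]) -> bool:
    """Return True if every call in `pattern` occurs in `log` in the same order."""
    remaining = iter(log)
    # `call in remaining` advances the iterator, so each later pattern element
    # is only searched for after the position of the previous match.
    return all(call in remaining for call in pattern)


def match_suspect_patterns(log: Sequence[str]) -> List[str]:
    """Return the names of all suspect patterns found in the recorded log."""
    return [
        name
        for name, pattern in SUSPECT_PATTERNS.items()
        if contains_in_order(log, pattern)
    ]
```

A non-empty result from match_suspect_patterns would correspond to a determination that the software is likely to exhibit malicious behavior. A practical database could use a richer pattern language than plain ordered name lists.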
In one embodiment of the present invention, if the software is determined to be likely to exhibit malicious behavior, the system reports this fact to a user of the computer system.
In one embodiment of the present invention, the process of comparing the pattern of system calls is performed on-the-fly as the emulation generates system calls.
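One way to picture on-the-fly comparison is a matcher that is fed each system call as the emulator produces it, instead of waiting for a complete log. The sketch below is an assumption-laden illustration: the StreamingMatcher class, its feed method, and the in-order matching rule are hypothetical, and every pattern is assumed to be non-empty.

```python
from typing import Dict, Optional, Sequence


class StreamingMatcher:
    """Tracks, per suspect pattern, how many of its calls have been seen so far."""

    def __init__(self, patterns: Dict[str, Sequence[str]]) -> None:
        # Assumption: every pattern contains at least one call name.
        self._patterns = patterns
        self._progress = {name: 0 for name in patterns}

    def feed(self, call: str) -> Optional[str]:
        """Consume one system call; return a pattern name once it is completed."""
        for name, pattern in self._patterns.items():
            position = self._progress[name]
            if pattern[position] == call:
                position += 1
                if position == len(pattern):
                    return name  # full pattern observed in order
                self._progress[name] = position
        return None
```

Because feed reports a match as soon as the last call of a pattern arrives, the emulation can stop at that point without generating the remainder of the log.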
In one embodiment of the present invention, the system emulates the generation of results for system calls, so that the emulation accurately follows an actual execution path through the software.
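The idea of emulating results for system calls can be sketched as a stand-in operating system that records each call and returns a plausible value without touching the host. Everything named here (EmulatedOS, the handler table, the call names, the fabricated handle values) is a hypothetical illustration, not the emulator of the specification.

```python
from typing import Callable, Dict, List, Tuple


class EmulatedOS:
    """Records system calls and fabricates results; no real resource is touched."""

    def __init__(self) -> None:
        self.log: List[Tuple[str, tuple]] = []
        self._next_handle = 100
        self._handlers: Dict[str, Callable[..., int]] = {
            "OpenFile": self._open_file,
            "DeleteFile": self._delete_file,
        }

    def syscall(self, name: str, *args: object) -> int:
        """Record the call, then return an emulated result (0 if unknown)."""
        self.log.append((name, args))
        handler = self._handlers.get(name)
        return handler(*args) if handler else 0

    def _open_file(self, path: str) -> int:
        # Pretend the file exists and hand back a fresh, fake handle.
        self._next_handle += 1
        return self._next_handle

    def _delete_file(self, path: str) -> int:
        # Report success so the program continues; nothing is deleted.
        return 1


def sample_program(emulated_os: EmulatedOS) -> None:
    """Stand-in for emulated software whose next step depends on a call result."""
    handle = emulated_os.syscall("OpenFile", "target.txt")
    if handle > 0:
        emulated_os.syscall("DeleteFile", "target.txt")
```

If OpenFile returned a failure value here, the sample program would never reach DeleteFile, and the recorded pattern would not reflect the path taken during an actual execution.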
In one embodiment of the present invention, the software is received on a computer-readable storage medium.
In one embodiment of the present invention, the software is received across a network.
In one embodiment of the present invention, recording the pattern of system calls includes recording parameters of individual system calls within the pattern of system calls.
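Recording parameters alongside call names allows a comparison to distinguish, for example, deletion of a temporary file from deletion of an executable. The following sketch assumes a simple record type; SystemCallRecord, record_call, and the ".exe" check are hypothetical names and rules chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class SystemCallRecord:
    """One entry in the recorded pattern: the call name plus its parameters."""
    name: str
    params: Tuple[object, ...]


def record_call(log: List[SystemCallRecord], name: str, *params: object) -> None:
    """Append a call and its parameters to the recorded pattern."""
    log.append(SystemCallRecord(name, params))


def deletes_executable(log: List[SystemCallRecord]) -> bool:
    """Hypothetical parameter-aware check: any DeleteFile aimed at a .exe path."""
    return any(
        entry.name == "DeleteFile"
        and len(entry.params) > 0
        and str(entry.params[0]).lower().endswith(".exe")
        for entry in log
    )
```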
In one embodiment of the present invention, the system terminates analysis of the software if: a maximum number of instructions are executed during the emulation; a maximum number of system calls are made during the emulation; the emulation completes; or the pattern of system calls is determined to exhibit malicious behavior.
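The four termination conditions listed above can be sketched as a single loop over emulation events. The event representation, the analyze function, its default limits, and the returned reason strings are assumptions made for this illustration; each event is assumed to stand for one executed instruction, some of which are system calls.

```python
from typing import Callable, Iterable, List, Optional, Tuple

# Assumed event shape: ("insn", None) for an ordinary instruction,
# ("syscall", name) for an instruction that issues a system call.
Event = Tuple[str, Optional[str]]


def analyze(
    events: Iterable[Event],
    is_malicious: Callable[[List[str]], bool],
    max_instructions: int = 1_000_000,
    max_system_calls: int = 10_000,
) -> str:
    """Run until one termination condition holds and return the reason."""
    instructions = 0
    calls: List[str] = []
    for kind, name in events:
        instructions += 1
        if kind == "syscall" and name is not None:
            calls.append(name)
            if is_malicious(calls):
                return "malicious_pattern"
            if len(calls) >= max_system_calls:
                return "max_system_calls"
        if instructions >= max_instructions:
            return "max_instructions"
    return "emulation_complete"
```

The two limits bound the cost of analyzing software that loops indefinitely or never issues a suspect pattern.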
In one embodiment of the present invention, comparing the pattern of system calls includes computing a function of the pattern of system calls.
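The specification does not state which function is computed. As one hypothetical possibility, the function could be a weighted score over the recorded calls that is compared against a threshold. The weights, the threshold, and the call names below are invented for illustration.

```python
from typing import Dict, Sequence

# Hypothetical per-call weights; calls not listed contribute nothing.
CALL_WEIGHTS: Dict[str, int] = {
    "DeleteFile": 3,
    "WriteProcessMemory": 5,
    "send": 2,
}
SUSPICION_THRESHOLD = 8  # assumed cut-off, chosen arbitrarily for the example


def suspicion_score(log: Sequence[str]) -> int:
    """Sum the weights of all recorded calls."""
    return sum(CALL_WEIGHTS.get(call, 0) for call in log)


def is_suspect(log: Sequence[str]) -> bool:
    """Flag the log when its score reaches the assumed threshold."""
    return suspicion_score(log) >= SUSPICION_THRESHOLD
```

Other functions, such as a hash of the call sequence looked up in the database, would fit the same description.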
The present invention provides a number of advantages. (1) Because the system analyzes patterns of system calls, it does not depend upon the specific properties of a high-level language compiler or the geometry of an executable file. (2) A malicious program does not have to be run directly on the computer system in order to detect the potentially malicious activity. Hence, the emulation process is insulated from the host computer system. (3) The system can determine whether a program exhibits malicious behavior based upon patterns within an entire log of system calls, not just an individual system call. (4) The present invention also generates fewer false alarms because only actually executed code is analyzed. This eliminates the problem of detecting suspicious code fragments which may never be executed.