1. Field of the Invention
This invention pertains in general to detecting computer viruses and in particular to detecting polymorphic computer viruses.
2. Background Art
Modern computer systems are under constant threat of attack from computer viruses and other malicious code. Viruses often spread through the traditional route: a computer user inserts a disk or other medium infected with a virus into a computer system. The virus infects the computer system when data on the disk are accessed.
Viruses also spread through new routes. A greater number of computer systems are connected to the Internet and other communications networks than ever before. These networks allow a networked computer to access a wide range of programs and data, but also provide a multitude of new avenues by which a computer virus can infect the computer. For example, a virus can be downloaded to a computer as an executable program, as an email attachment, as malicious code on a web page, etc. Accordingly, it is common practice to install anti-virus software on computer systems in order to detect the presence of viruses.
Simple computer viruses work by copying exact duplicates of themselves to each executable program file they infect. When an infected program is executed, the simple virus gains control of the computer system and attempts to infect other files. If the virus locates a target executable file for infection, it copies itself byte-for-byte to the target executable file. Because this type of virus replicates an identical copy of itself each time it infects a new file, anti-virus software can detect the virus easily by scanning the file for a specific string of bytes (i.e., a “signature”) characteristic of the virus.
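As an illustration only, signature scanning of this kind can be sketched as a byte-string search. The signature name `Example.A` and its bytes are hypothetical placeholders chosen for this sketch, not a real virus signature, and the function name `scan_for_signatures` is likewise an assumption.

```python
def scan_for_signatures(data, signatures):
    """Return (name, offset) for every signature found in the file bytes."""
    hits = []
    for name, pattern in signatures.items():
        offset = data.find(pattern)  # bytes.find returns -1 when absent
        if offset != -1:
            hits.append((name, offset))
    return hits


# Hypothetical, harmless placeholder bytes standing in for a signature.
SIGNATURES = {"Example.A": b"\xde\xad\xbe\xef\x01\x02"}

clean_file = b"ordinary program bytes"
infected_file = b"ordinary program" + SIGNATURES["Example.A"] + b" bytes"

print(scan_for_signatures(clean_file, SIGNATURES))     # no hits
print(scan_for_signatures(infected_file, SIGNATURES))  # one hit
```

Because a simple virus copies itself byte-for-byte, a single fixed pattern per virus is sufficient in this model.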
The designers of computer viruses continually develop new techniques for eluding anti-virus software. Encrypted viruses are one example. An encrypted virus includes a decryption routine (also known as a “decryption loop”) and an encrypted viral body. When a file infected with an encrypted virus executes, the decryption routine gains control of the computer and decrypts the encrypted viral body. The decryption routine then transfers control to the decrypted viral body, which is capable of spreading the virus. The virus spreads by copying the identical decryption routine and the encrypted viral body to the target executable file. Although the viral body is encrypted and thus hidden from view, anti-virus software can detect these viruses by searching for a signature in the unchanging decryption routine.
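The following toy model, offered only as a sketch, shows why the unchanging decryption routine remains detectable. It assumes a single-byte XOR as the "encryption" and uses inert placeholder bytes; `DECRYPTOR_STUB`, `BODY`, and `make_sample` are hypothetical names introduced for the illustration.

```python
DECRYPTOR_STUB = b"\x90STUB\x90"   # placeholder for the fixed decryption routine
BODY = b"PLACEHOLDER-BODY"         # placeholder for the viral body


def xor_bytes(data, key):
    """XOR every byte with a one-byte key (assumed toy cipher)."""
    return bytes(b ^ key for b in data)


def make_sample(key):
    """Model an infected file: fixed stub, stored key, encrypted body."""
    return DECRYPTOR_STUB + bytes([key]) + xor_bytes(BODY, key)


sample_a = make_sample(0x5A)
sample_b = make_sample(0xC3)

# The body is hidden in the stored form, so a body signature finds nothing.
print(BODY in sample_a, BODY in sample_b)
# The stub is identical in every copy, so a stub signature still matches.
print(DECRYPTOR_STUB in sample_a, DECRYPTOR_STUB in sample_b)
```

In this model the encrypted portion differs between copies, but the stub bytes do not, which is the property a scanner exploits.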
A polymorphic encrypted virus (“polymorphic virus”) includes a decryption routine and an encrypted viral body. The viral body includes a static portion and a machine-code generator often referred to as a “mutation engine.” The operation of a polymorphic virus is similar to the operation of an encrypted virus, except that the polymorphic virus generates a new decryption routine each time it infects a file. Many polymorphic viruses use decryption routines that are functionally the same for all infected files, but have different sequences of instructions.
Because of these mutations, each decryption routine can have a different signature. Therefore, anti-virus software cannot detect polymorphic viruses by simply searching for a signature from a decryption routine. Instead, the software loads a possibly infected program into a software-based CPU emulator acting as a simulated virtual computer. The program is allowed to execute freely within this virtual computer. If the program does in fact contain a polymorphic virus, the decryption routine is allowed to decrypt the viral body. The anti-virus software detects the virus by searching through the virtual memory of the virtual computer for a signature from the decrypted viral body.
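A minimal sketch of emulator-based detection follows, under strong simplifying assumptions. The instruction set (`xor`, `nop`, `next`, `loop`), the two routine variants, and the placeholder body are all hypothetical. A real emulator models an actual CPU and keeps the routine and body in the same memory image.

```python
BODY = b"PLACEHOLDER-BODY"       # inert stand-in for a viral body
BODY_SIGNATURE = b"PLACEHOLDER"  # signature taken from the decrypted body
ENCRYPTED = bytes(b ^ 0x5A for b in BODY)

# Two decryption routines with different instruction sequences but the
# same effect (0x0F ^ 0x55 == 0x5A), as a mutation engine might emit.
VARIANT_A = [("xor", 0x5A), ("next", 0), ("loop", 0)]
VARIANT_B = [("nop", 0), ("xor", 0x0F), ("xor", 0x55),
             ("next", 0), ("nop", 0), ("loop", 0)]


def emulate(program, memory, max_steps=10_000):
    """Run a toy program over a private copy of memory and return it."""
    mem = bytearray(memory)
    ptr = pc = steps = 0
    while pc < len(program) and steps < max_steps:
        op, arg = program[pc]
        steps += 1
        if op == "xor":
            mem[ptr] ^= arg
        elif op == "nop":
            pass
        elif op == "next":
            ptr += 1
        elif op == "loop":
            if ptr < len(mem):  # more bytes remain: jump back to arg
                pc = arg
                continue
        else:
            raise ValueError(f"unknown opcode: {op}")
        pc += 1
    return bytes(mem)


def detect(program, memory):
    """Emulate first, then scan the resulting virtual memory."""
    return BODY_SIGNATURE in emulate(program, memory)


print(BODY_SIGNATURE in ENCRYPTED)   # scanning the stored bytes finds nothing
print(detect(VARIANT_A, ENCRYPTED))  # scanning after emulation succeeds
print(detect(VARIANT_B, ENCRYPTED))  # a different routine gives the same result
```

The two variants share no fixed byte pattern that a routine signature could rely on, yet both leave the same decrypted body in the emulated memory. The `max_steps` budget is an assumed safeguard so that emulation always terminates.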
Virus creators have developed several techniques for attempting to defeat emulator-based virus detection. First, virus creators have produced “metamorphic” viruses that are not necessarily encrypted, but vary the instructions in the viral body with each infection. The varying instructions make it difficult to detect the viruses using signature scanning. Second, virus creators have produced decryption engines that use CPU instructions that the emulator does not emulate, so that the viral body is never decrypted within the emulator and signature scanning fails. Third, virus creators have produced entry-point-obscuring viruses that make it difficult to determine where in a file the viral code resides, and therefore which instructions to emulate in order to decrypt the viral body.
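The second technique can be illustrated with a toy emulator whose instruction coverage is incomplete. This is a sketch under stated assumptions: the operations `xor` and `rol`, the handler tables `PARTIAL` and `FULL`, and the placeholder body are hypothetical and do not correspond to any particular emulator or CPU.

```python
def rol8(b, n):
    """Rotate an 8-bit value left by n bits."""
    return ((b << n) | (b >> (8 - n))) & 0xFF


def ror8(b, n):
    """Rotate an 8-bit value right by n bits."""
    return ((b >> n) | (b << (8 - n))) & 0xFF


def emulate_decoder(steps, data, handlers):
    """Apply each step to every byte; abort at the first unsupported op."""
    mem = bytearray(data)
    for op, arg in steps:
        handler = handlers.get(op)
        if handler is None:
            return bytes(mem), False  # emulation stopped early
        for i in range(len(mem)):
            mem[i] = handler(mem[i], arg)
    return bytes(mem), True


PARTIAL = {"xor": lambda b, k: b ^ k}  # emulator lacking "rol"
FULL = dict(PARTIAL, rol=rol8)         # emulator supporting both ops

BODY = b"PLACEHOLDER-BODY"             # inert stand-in for a viral body
ENCODED = bytes(ror8(b, 3) ^ 0x21 for b in BODY)
DECODER = [("xor", 0x21), ("rol", 3)]  # inverse of the encoding above

partial_mem, partial_ok = emulate_decoder(DECODER, ENCODED, PARTIAL)
full_mem, full_ok = emulate_decoder(DECODER, ENCODED, FULL)

print(partial_ok, b"PLACEHOLDER" in partial_mem)
print(full_ok, b"PLACEHOLDER" in full_mem)
```

With the partial handler table, emulation stops at the unsupported step and the body signature never appears in the emulated memory. With full coverage, the same scan succeeds.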
Therefore, there is a need in the art for a technique that can reliably detect viruses having non-emulated instructions and/or obscured entry points.