1. Field of the Invention
This invention pertains in general to detecting viruses within files in digital computers and more particularly to determining the number of instructions to emulate in order to decrypt and/or detect a virus.
2. Background Art
Simple computer viruses work by copying exact duplicates of themselves to each executable program file they infect. When an infected program executes, the simple virus gains control of the computer and attempts to infect other files. If the virus locates a target executable file for infection, it copies itself byte-for-byte to the target executable file. Because this type of virus replicates an identical copy of itself each time it infects a new file, the simple virus can be easily detected by searching files for a specific string of bytes (i.e., a “signature”) that has been extracted from the virus.
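By way of illustration only, the signature search described above can be sketched as a plain byte-string search. The signature bytes, the file contents, and the function name `find_signature` are hypothetical and are chosen solely for this example; they are not taken from any actual virus or scanner.

```python
def find_signature(data: bytes, signature: bytes) -> int:
    """Return the offset of the first occurrence of signature in data, or -1."""
    return data.find(signature)


# Hypothetical signature and file contents, for illustration only.
SIGNATURE = bytes.fromhex("deadbeef4242")
clean_file = b"\x90" * 32
# A simple virus copies itself byte-for-byte, so the signature appears verbatim.
infected_file = clean_file[:10] + SIGNATURE + clean_file[10:]

offset_in_infected = find_signature(infected_file, SIGNATURE)
offset_in_clean = find_signature(clean_file, SIGNATURE)
```

In this sketch the search reports an offset for the infected file and -1 for the clean file, which is possible only because the viral bytes are identical in every infected file.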
Encrypted viruses comprise a decryption routine (also known as a decryption loop) and an encrypted viral body. When a program file infected with an encrypted virus executes, the decryption routine gains control of the computer and decrypts the encrypted viral body. The decryption routine then transfers control to the decrypted viral body, which is capable of spreading the virus. The virus is spread by copying the identical decryption routine and the encrypted viral body to the target executable file. Although the viral body is encrypted and thus hidden from view, these viruses can be detected by searching for a signature from the unchanging decryption routine.
Polymorphic encrypted viruses (“polymorphic viruses”) comprise a decryption routine and an encrypted viral body which includes a static viral body and a machine-code generator often referred to as a “mutation engine.” The operation of a polymorphic virus is similar to the operation of an encrypted virus, except that the polymorphic virus generates a new decryption routine each time it infects a file. Many polymorphic viruses use decryption routines that are functionally the same for all infected files, but have different sequences of instructions.
These multifarious mutations allow each decryption routine to have a different signature. Therefore, polymorphic viruses cannot be detected by simply searching for a signature from a decryption routine. Instead, antivirus software uses emulator-based antivirus technology that loads the program into a software-based CPU emulator which acts as a simulated virtual computer. The program is allowed to execute freely within this virtual computer. If the program does in fact contain a polymorphic virus, the decryption routine is allowed to decrypt the viral body. The virus detection engine can then detect the virus by searching through the virtual memory of the virtual computer for a signature from the decrypted viral body.
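By way of illustration only, the emulate-then-search approach described above can be sketched with a toy emulator. The two-instruction set ("xor" and "nop"), the single-byte XOR cipher, the body and signature strings, and the names `emulate` and `make_decryptor` are all assumptions made for this example; an actual CPU emulator interprets real machine code and is far more involved.

```python
def emulate(program, memory, max_steps=10_000):
    """Run toy instructions against a private copy of memory and return it."""
    virtual_memory = bytearray(memory)
    for step, instruction in enumerate(program):
        if step >= max_steps:
            break
        if instruction[0] == "xor":
            _, address, key = instruction
            virtual_memory[address] ^= key
        # A "nop" instruction changes nothing.
    return bytes(virtual_memory)


def make_decryptor(length, key, padding):
    """Build a decryption routine with `padding` dummy instructions per byte."""
    program = []
    for address in range(length):
        program.extend([("nop",)] * padding)
        program.append(("xor", address, key))
    return program


# Hypothetical viral body, signature, and key, for illustration only.
BODY = b"VIRAL-BODY"
SIGNATURE = b"VIRAL"
KEY = 0x5A
encrypted = bytes(b ^ KEY for b in BODY)

# Two routines with different instruction sequences but the same function.
routine_a = make_decryptor(len(BODY), KEY, padding=1)
routine_b = make_decryptor(len(BODY), KEY, padding=3)

found_before = SIGNATURE in encrypted
found_after_a = SIGNATURE in emulate(routine_a, encrypted)
found_after_b = SIGNATURE in emulate(routine_b, encrypted)
```

In this sketch the two decryption routines differ from one another, so neither yields a stable signature, and the signature is absent from the encrypted image. It appears in the virtual memory only after either routine has been emulated.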
Metamorphic viruses are not encrypted but vary the instructions in the viral body with each infection of a host file. Accordingly, metamorphic viruses often cannot be detected with a string search because they do not have static strings.
When detecting viruses through emulation, antivirus engineers must often adjust the emulation parameters to accommodate a new virus, and such adjustments slow the virus detection engine for all files and not just for possible virus host files. For example, assume that the emulation control module must detect a memory modification at least once every 500 instructions or it will assume it is emulating a non-virus and abort emulation. If a new virus were to be developed which decrypted a byte (i.e., performed a memory fetch and store) every 1000 instructions, the engineers would need to change the emulator's modification detection constant from 500 to 1000 instructions. This change means that all clean programs, as well as infected programs, would need to be emulated for 1000 instructions before the programs could be excluded as not infected. Many recent viruses try to escape detection by increasing the number of dummy instructions between decryption instructions and, as a result, the virus detection engine must emulate a large number of instructions for each file. Accordingly, emulation-based virus detection schemes can be quite slow when analyzing files.
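By way of illustration only, the 500-versus-1000 instruction example above can be sketched as follows. The instruction stream is modeled as a list of the strings "write" and "nop", and the function name `emulate_with_cutoff` and the program lengths are assumptions made for this example rather than details of any actual emulation control module.

```python
def emulate_with_cutoff(program, threshold):
    """Emulate until `threshold` consecutive instructions modify no memory.

    Returns (instructions_emulated, memory_writes_seen).
    """
    since_last_write = 0
    writes = 0
    executed = 0
    for instruction in program:
        executed += 1
        if instruction == "write":
            writes += 1
            since_last_write = 0
        else:
            since_last_write += 1
            if since_last_write >= threshold:
                break  # Assume a non-virus and abort emulation.
    return executed, writes


# Hypothetical virus that decrypts one byte every 1000 instructions.
slow_virus = (["nop"] * 999 + ["write"]) * 4
# Hypothetical clean program that never modifies memory.
clean_program = ["nop"] * 5000

virus_at_500 = emulate_with_cutoff(slow_virus, 500)
virus_at_1000 = emulate_with_cutoff(slow_virus, 1000)
clean_at_500 = emulate_with_cutoff(clean_program, 500)
clean_at_1000 = emulate_with_cutoff(clean_program, 1000)
```

With a constant of 500, emulation of the hypothetical virus is aborted before its first memory write. With a constant of 1000, every write is observed, but the clean program is emulated for 1000 instructions instead of 500 before it is excluded.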
Therefore, there is a need in the art for a virus detection scheme that can detect the presence of viruses through emulation but does not suffer the performance drawbacks of prior art virus detection schemes.