1. Field of the Invention
This invention relates to systems and methods for repairing files that have been infected with computer viruses, and more particularly to systems and methods that repair infected files using a combination of generic and virus-specific repair routines.
2. Background Art
The infection of various boot objects and executable files with software viruses is a well-known and growing problem. A major objective of anti-virus efforts is the detection of infected files, and a number of different systems, including the above-referenced Polymorphic Virus Detection Module, have been developed for this purpose. Virus detection limits the spread of damage to uninfected parts of the system, but it is only one of the objectives of anti-virus programmers. Another objective is to repair those files already infected with the detected virus (host files). This requires excising the viral code from the stored version of the host file and restoring to the host file any code that may have been modified by the virus.
Most computer viruses employ one of several well-known methods to infect host files and boot objects. Typically, the virus appends or prepends itself to the host file and replaces commands at the entry point of the host file (host bytes) with viral code that passes control of the processor to the virus when the host file is executed. Another group of viruses simply overwrites critical code (host bytes) at the host file entry point. In almost all cases, the virus saves the host bytes that it overwrites so the host bytes can be restored to the host file image in memory following execution of the viral code. Control of the processor is then returned to the host file image in memory, allowing it to execute normally. Without this last step, the processor would likely crash, calling attention to the fact that there may be some problem with the host file. Such early detection would severely limit the damage a virus could inflict on a computer system. Consequently, virus designers invest much effort in hiding the presence of their viruses.
For host files infected by standard strategies, repair can often be accomplished using a small library of anti-virus repair functions, provided the host bytes can be located and restored. For simple, unencrypted viruses, table-based programs comprising entries for known viruses are usually sufficient. Each entry is associated with a repair routine that includes the storage locations of the host bytes in the virus and subroutines for restoring the host bytes to their proper location in the host file. Such table-based methods work only with viruses that are identical in each instance of infection and employ standard infection strategies. The "Thunderbyte Anti-Virus" employs a repair system that steps through the viral code one instruction at a time, evaluates each instruction, intercepts those instructions that appear likely to damage the computer system, and allows all other instructions to execute. This system is designed to allow the virus's own repair code to execute and restore the host bytes to their proper location in the host file.
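The table-based approach described above can be illustrated with a minimal sketch. It assumes a simplified appending virus that overwrites the first bytes of the host and keeps a copy of them at a fixed offset inside its own body. The names RepairEntry and repair, the field layout, and the sample byte strings are hypothetical and are not taken from any actual anti-virus product.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class RepairEntry:
    """Hypothetical table entry for one known, unencrypted appending virus."""
    name: str
    signature: bytes        # fixed byte string present in every infection
    virus_length: int       # number of bytes the virus appends to the host
    host_bytes_offset: int  # where the saved host bytes sit inside the virus
    host_bytes_length: int  # how many entry-point bytes were overwritten


def repair(infected: bytes, table: List[RepairEntry]) -> Optional[bytes]:
    """Return the repaired host file, or None if no table entry matches."""
    for entry in table:
        virus_start = len(infected) - entry.virus_length
        if virus_start < entry.host_bytes_length:
            continue  # file is too short to hold this virus
        virus = infected[virus_start:]
        if entry.signature not in virus:
            continue  # not this virus
        start = entry.host_bytes_offset
        saved = virus[start:start + entry.host_bytes_length]
        # Restore the host bytes at the entry point and excise the viral code.
        return saved + infected[entry.host_bytes_length:virus_start]
    return None


# Toy infection: the virus stores the 4 overwritten host bytes right after
# its 4-byte signature, then appends itself to the host.
host = b"HOSTCODE-rest-of-program"
virus_body = b"VIR!" + host[:4] + b"payload"
infected_file = b"JMPV" + host[4:] + virus_body

TABLE = [RepairEntry("toy", b"VIR!", len(virus_body), 4, 4)]
repaired = repair(infected_file, TABLE)
```

Because the entry hard-codes both the signature and the offset of the saved host bytes, the lookup succeeds only when every infection is byte-for-byte identical, which is the limitation noted above.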
New infection techniques and virus types have made these known repair systems increasingly unreliable. For example, once the Thunderbyte Anti-Virus system became known to virus designers, they devised ways to make their computer system-damaging instructions appear innocuous. Thunderbyte executes these seemingly innocuous instructions, facilitating infection of the system memory and files it was designed to protect. Table-based repair schemes are useless with polymorphic viruses, which employ complex encryption schemes to conceal a static virus body within a polymorphic decryption loop (PDL). The PDL appears different in each instance of infection, making the fixed repair routines of table-based repair schemes useless. When the host file is run, control passes to the polymorphic virus, which decrypts itself until the static virus body is regenerated and then does its damage. As with most viruses, polymorphic viruses store overwritten host bytes and include virus repair routines that restore the saved host bytes and pass control to the host file once the viral code has run. With polymorphic viruses, however, the host bytes cannot be accessed without first decrypting the PDL.
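The following sketch illustrates why a fixed signature or fixed host-byte location is of no use until the virus has been decrypted. It is a deliberate simplification: a real PDL also varies its decryption code from one infection to the next, whereas here the only thing that varies is a single-byte XOR key, which is an assumption made for illustration. All names and byte values are hypothetical.

```python
def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte with a one-byte key (stand-in for a real cipher)."""
    return bytes(b ^ key for b in data)


# Static virus body: signature, saved host bytes, then the payload.
STATIC_BODY = b"VIR!" + b"HOST" + b"payload"
SIGNATURE = b"VIR!"


def infect_copy(key: int) -> bytes:
    """Model one infection as a key byte followed by the encrypted body."""
    return bytes([key]) + xor_bytes(STATIC_BODY, key)


def decrypt_copy(copy: bytes) -> bytes:
    """Recover the static body, as the virus itself would at run time."""
    return xor_bytes(copy[1:], copy[0])


copy_a = infect_copy(0x5A)
copy_b = infect_copy(0xC3)

# The two infections share no recognizable bytes until they are decrypted.
found_before = SIGNATURE in copy_a or SIGNATURE in copy_b
found_after = SIGNATURE in decrypt_copy(copy_a) and SIGNATURE in decrypt_copy(copy_b)
saved_host_bytes = decrypt_copy(copy_a)[4:8]
```

In this toy model the saved host bytes become readable at a known offset only after decryption, mirroring the constraint stated above.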
At least one polymorphic virus employs a new infection strategy in which control-passing instruction packets are inserted in various locations throughout the host file rather than just at its entry point. In this case, a series of overwritten host bytes must be located in the encrypted virus and restored to the host.
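Restoring a series of overwritten locations can be sketched as follows, assuming the decrypted virus yields a list of (offset, original bytes) records. That record format, the function name restore_patches, and the sample data are assumptions made for illustration; the source does not specify how the virus stores these bytes.

```python
from typing import List, Tuple


def restore_patches(infected: bytes, saved: List[Tuple[int, bytes]]) -> bytes:
    """Write each saved run of host bytes back at its recorded offset."""
    image = bytearray(infected)
    for offset, original in saved:
        end = offset + len(original)
        if offset < 0 or end > len(image):
            raise ValueError(f"patch at offset {offset} falls outside the file")
        image[offset:end] = original
    return bytes(image)


# Toy host with three locations overwritten by control-passing packets.
clean_host = b"AAAA....BBBB....CCCC"
patched_host = b"JMP1....JMP2....JMP3"
saved_records = [(0, b"AAAA"), (8, b"BBBB"), (16, b"CCCC")]

restored = restore_patches(patched_host, saved_records)
```

The bounds check is a design choice of this sketch: a record pointing outside the file suggests the records were misread, and writing it blindly would corrupt the host further.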
New repair schemes specifically designed to deal with polymorphic viruses include CPU emulators, which allow the polymorphic virus to decrypt itself without exposing the computer's memory and files to infection by the virus. Once the virus is decrypted, its type is identified from its static virus body, and the repair system tries to locate the host bytes and restore them to their proper location in the host file. These emulation-based repair systems employ the generic subroutines described above for repairing decrypted files, and at least one of these systems employs macros for identifying and dealing with viruses that deviate from the standard infection schemes. However, the macro language has only moderate functionality and is inadequate for repairing complex viruses.