Malicious programs, or malware, fall into many categories. Trojans and backdoors provide remote access to the machine across the network. Spyware attempts to log information (e.g. browsing habits, logins, passwords) and then transmit it to a third party, who can use it for personal gain. Other programs, such as worms and viruses, cause damage by self-replicating among machines. Malware causes considerable damage to enterprises, measured in lost information, damage to brand, and computer administration costs. Malware often evades detection through stealth. It also evades removal, either by actively resisting it or by being so complicated that it is difficult to remove all traces.
One common methodology for malware removal is signature-based. Companies obtain samples of the malware (either from their customers or by scanning the Internet for the malware), analyze the code, and generate a “signature” and a cleanup script. The signature is generally a list of unique identifiers of the executables involved, such as a hash of all or part of the executable image on disk. A hash is a compact representation of the data in the file, with the property that different executables will, for practical purposes, result in different hashes. The signature can also include lists of other pieces of data associated with the malware, for example configuration files and configuration data written in the configuration database on the computer. On the Windows operating system, this database is called the registry. The registry contains a large proportion of the configuration data for the computer, for example specifying which programs survive reboot. The cleanup script is a list of processes to kill, programs to remove from the filesystem, and configuration data to delete or restore to default values. Cleaning is an example of removing.
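The hash-list portion of a signature and the file-removal portion of a cleanup script can be sketched as follows. This is a minimal illustration rather than any vendor's implementation: SHA-256 is an assumed choice of hash, and the function names, file names, and file contents are hypothetical stand-ins. Process killing and registry cleanup are omitted.

```python
import hashlib
import tempfile
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def scan(root: Path, known_hashes: set) -> list:
    """Return the files under root whose digest appears in the signature."""
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and file_sha256(p) in known_hashes
    )


def clean(matches: list) -> None:
    """Cleanup step (files only): delete every matched file."""
    for path in matches:
        path.unlink()


def demo() -> tuple:
    """Build a throwaway directory, scan it, clean it, report the result."""
    payload = b"stand-in for a malicious executable image"  # hypothetical sample
    signature = {hashlib.sha256(payload).hexdigest()}
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "notes.txt").write_bytes(b"harmless user data")
        (root / "svch0st.exe").write_bytes(payload)
        matches = scan(root, signature)
        found = [p.name for p in matches]
        clean(matches)
        remaining = sorted(p.name for p in root.iterdir())
    return found, remaining


if __name__ == "__main__":
    print(demo())
```

The scan reads every file under the root in full, which hints at why this kind of matching is expensive to run continuously.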
Unfortunately, this signature-based approach has some significant problems. The first problem is that it is easy for the signature to be out of date. Infected machines are often connected to the Internet, so the malware on them can easily update or morph itself, making the malware on the infected machine different from the sample analyzed at the security vendor's laboratory. The malware on the infected machine then evades detection attempts based on the signature that was developed from the analyzed sample. This results in incomplete and unsatisfactory removal of the malware.
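The sensitivity of a hash-based signature to even trivial changes can be shown with a short sketch. The byte strings below are hypothetical stand-ins for executable images, and SHA-256 is an assumed choice of hash; the point is only that a one-byte difference yields a digest that no longer appears in the signature.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the hex SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical stand-in for the sample analyzed in the laboratory.
lab_sample = b"stand-in bytes for the analyzed sample"

# Hypothetical self-update on the infected machine: one byte appended.
field_variant = lab_sample + b"\x00"

# The signature holds only the digest of the laboratory sample.
signature = {sha256_hex(lab_sample)}


def is_detected(image: bytes) -> bool:
    """Report whether an image's digest appears in the signature."""
    return sha256_hex(image) in signature


if __name__ == "__main__":
    print(is_detected(lab_sample), is_detected(field_variant))
```

The laboratory sample matches its own signature, while the variant does not, even though the two images differ by a single byte.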
Second, malware is easily customized to each machine, for example by choosing a random file name. This machine-specific customization makes each individual infection different from the malware analyzed in the laboratory, and again results in incomplete and unsatisfactory removal.
Third, malware programs can actively resist removal, for example by having two programs that watch and restart each other should one be killed, or by loading themselves into system processes, thereby locking the executable so that it cannot be deleted. In another example, the malware automatically monitors the configuration data and reverts any changes made to it. The removal program has to run with sufficient privileges to guarantee control over user mode processes and provide an avenue to deal with kernel mode malware, preferably in the kernel of the operating system. The kernel is the part of the operating system that controls access to system resources and schedules running processes, and thus has total control of the computer. Most signature-based schemes are not implemented in the kernel, because the scanning they require to match signatures against the data on the computer would not be very efficient to implement there. As a consequence, signature-based schemes are not in a good position to remove recalcitrant malware, again resulting in incomplete and unsatisfactory removal.
Other solutions to malware removal are based on the idea of “undo”, whereby the actions of untrusted programs are recorded, and can be “undone” to restore the system to a clean state. Because it is difficult to know which programs to trust, “undo” systems tend to end up recording all the actions of the majority of the programs on the system, and thus require large amounts of storage. “Undo” schemes also fail to address the problem of identifying all portions of the malware.
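A minimal sketch of the “undo” idea follows, assuming a simple key-value configuration store in place of a real registry or filesystem. The class and key names are hypothetical. Each write first records the prior value so that the writes can later be rolled back in reverse order.

```python
from dataclasses import dataclass, field

# Sentinel marking a key that did not exist before the recorded write.
_MISSING = object()


@dataclass
class UndoJournal:
    """Record the prior value of each key before it is overwritten."""
    store: dict
    entries: list = field(default_factory=list)

    def write(self, key: str, value: str) -> None:
        """Journal the previous value, then apply the write."""
        self.entries.append((key, self.store.get(key, _MISSING)))
        self.store[key] = value

    def undo_all(self) -> None:
        """Roll back every recorded write, newest first."""
        while self.entries:
            key, previous = self.entries.pop()
            if previous is _MISSING:
                self.store.pop(key, None)
            else:
                self.store[key] = previous


if __name__ == "__main__":
    journal = UndoJournal({"homepage": "https://example.org"})
    journal.write("homepage", "http://malicious.example")
    journal.write("run_at_boot", "dropper.exe")
    print(len(journal.entries))  # one journal entry per recorded write
    journal.undo_all()
    print(journal.store)
```

The journal grows by one entry per recorded write, so when most programs on the system have to be treated as untrusted, the storage cost grows with overall system activity.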