Computer viruses have been detected using a technique known as integrity checking. In integrity checking, the antivirus scanner computes a hash for each executable file on a computer and stores the hashes in a database. Should any of those files change, its hash will change too, and the integrity checker alerts the user. Such an integrity checking scheme can detect a very high number of new viruses. Unfortunately, the integrity checker is extremely susceptible to false positives. For example, if the user legitimately updates an executable file to a new version, or if the file has been infected with a virus and then repaired, the integrity checking software may very well declare the presence of a virus even though no virus is actually present. The present invention is able to avoid these types of false positives.
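The integrity checking scheme described above, and its false positive problem, can be illustrated with a minimal sketch. SHA-256 is assumed as the hash function, and the function names and the file name `app.exe` are hypothetical; none of these are specified in the text.

```python
import hashlib
import os
import tempfile


def file_hash(path):
    """Return the SHA-256 hex digest of a file's full contents."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def build_baseline(paths):
    """Record one hash per tracked file (the integrity database)."""
    return {path: file_hash(path) for path in paths}


def changed_files(baseline):
    """Return the tracked files whose current hash differs from the baseline."""
    return [path for path, digest in baseline.items() if file_hash(path) != digest]


def demo():
    """Show that a legitimate upgrade is reported just like an infection."""
    with tempfile.TemporaryDirectory() as folder:
        exe = os.path.join(folder, "app.exe")  # hypothetical executable
        with open(exe, "wb") as f:
            f.write(b"program v1.0")
        baseline = build_baseline([exe])
        before = changed_files(baseline)  # nothing has changed yet

        # The user legitimately installs a new version of the program.
        with open(exe, "wb") as f:
            f.write(b"program v1.1")
        after = [os.path.basename(p) for p in changed_files(baseline)]
        return before, after
```

Calling `demo()` reports no changes before the upgrade and flags `app.exe` afterward. The checker cannot distinguish the legitimate upgrade from an infection, because any change to the file changes the full-file hash.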
A second problem with integrity checking is that the database of integrity (hash) data must be built ab initio on the user's computer. If the computer is already infected when the integrity checker is first used, the integrity data will contain hashes of infected files. This is a false negative problem, since the integrity checker is unable to detect this type of existing infection on the computer. It can detect only new infections introduced after the initial integrity database has been built. In one embodiment of the present invention, a database, e.g., a typical antivirus definition file, is pre-populated with version and hash information and then shipped to the user. Consequently, existing infections as well as future infections on the user's computer can be detected. This is another benefit of the present invention over the integrity checking method.
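The benefit of a pre-populated database can be sketched as follows. The table layout, the `classify` function, and the sample file name and version are hypothetical, and a plain full-content SHA-256 hash is used only for simplicity; the text does not specify the format of the definition file.

```python
import hashlib


def sha256_hex(data):
    """Return the SHA-256 hex digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical contents of a known-clean file, as seen by the vendor.
CLEAN_NOTEPAD = b"clean notepad 5.1 bytes"

# Hypothetical shipped definitions: (file name, version) -> hash of the
# clean file. The table is built by the vendor, not on the user's computer.
SHIPPED_DEFINITIONS = {
    ("notepad.exe", "5.1"): sha256_hex(CLEAN_NOTEPAD),
}


def classify(name, version, contents, definitions):
    """Compare a file on the user's computer against the shipped table."""
    expected = definitions.get((name, version))
    if expected is None:
        return "unknown"  # no entry was shipped for this file and version
    return "clean" if sha256_hex(contents) == expected else "suspect"


# A file that was already infected before the first scan is still flagged,
# because the reference hash never came from the user's computer.
already_infected = CLEAN_NOTEPAD + b"<viral code>"
status = classify("notepad.exe", "5.1", already_infected, SHIPPED_DEFINITIONS)
```

Here `status` is `"suspect"`, whereas a database built ab initio on the infected computer would have recorded the infected file's hash as the baseline.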
Microsoft Corporation has introduced a feature on newer versions of its Windows operating systems called System File Protection (SFP). SFP is used to prevent the user, when installing a new program, from overwriting protected system files with updated copies that will cause stability problems on the computer. SFP is described in “System File Protection and Windows Me”, “http://www.microsoft.com/hwdev/archive/sfp/WinME_sfpP.asp”, last updated Dec. 4, 2001, downloaded from the Internet Apr. 9, 2002; “Description of the Windows 2000 Windows File Protection Feature (Q222193)”, “http://support.microsoft.com/default.aspx?scid=kb;EN-US;q222193”, first published May 26, 1999, last modified Jan. 12, 2002, downloaded from the Internet Apr. 9, 2002; and “Software: Windows ME; Windows ME and System File Protection”, “http://www.wackyb.co.nz/mesfp.html”, last updated Mar. 11, 2002, downloaded from the Internet Apr. 9, 2002.
SFP checks the new file's version number to ensure that it is not older than the version of the file that is to be overwritten. If the version number of the new file is newer than or the same as the version number of the original file, the original file is replaced. This is done to ensure that outdated DLLs (Dynamic Link Libraries) do not accumulate on the computer. SFP also checks for inadvertent system file deletions, and replaces any protected system files that have been improperly deleted. Finally, SFP checks the file's version number and a full hash of the file to determine that the file has not been corrupted. If the version is recognized but the file's contents have changed in any way, the entire file is replaced from a backup copy. The user is not alerted.
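The SFP behavior described above can be approximated by the following sketch. It is not Microsoft's implementation: the `ProtectedFile` type, the function names, the use of version tuples, and the choice of SHA-256 are all assumptions made for illustration. The handling of an unrecognized version is also an assumption, since the text does not describe that case.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class ProtectedFile:
    version: tuple   # e.g. (5, 1); hypothetical representation
    contents: bytes


def may_replace(current_version, new_version):
    """Allow replacement only if the new file is not older than the current one."""
    return new_version >= current_version


def verify_or_restore(installed, known_hashes, backups):
    """Silently restore a recognized file whose full hash no longer matches.

    known_hashes maps version -> expected full-file hash;
    backups maps version -> backup contents.
    """
    expected = known_hashes.get(installed.version)
    if expected is None:
        return installed  # assumption: unrecognized versions are left alone
    if hashlib.sha256(installed.contents).hexdigest() == expected:
        return installed  # contents are intact
    # Any change at all triggers replacement of the entire file.
    return ProtectedFile(installed.version, backups[installed.version])


# Usage: a recognized file with altered contents is replaced from backup.
good = b"system dll 5.1"
known = {(5, 1): hashlib.sha256(good).hexdigest()}
backup_store = {(5, 1): good}
restored = verify_or_restore(ProtectedFile((5, 1), b"corrupted"), known, backup_store)
```

Because the check uses a hash of the whole file, every byte-level difference is treated as corruption, which is the property contrasted with the present invention in the list that follows.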
SFP differs from the present invention in that:
1) The stated intent of SFP is to address the problem of files that have been corrupted, not files that have been infected by viruses;
2) SFP generates a full file hash that is variant with respect to repair (that is, tracked files that have been corrupted and then repaired may still cause the system to trigger and identify the file as corrupted, even if the file was properly repaired), while the present invention uses a minimal code hash that is invariant to repair;
3) SFP uses just the version information to track files, while the present invention uses other infection-invariant components of the file, such as hashes of the data segment; and
4) SFP works only on an individual user's computer, whereas the present invention can be used on any computer, such as an e-mail gateway, a file server, a desktop computer, or a back end infrastructure computer (a computer at the headquarters facility of an antivirus software company where suspected viruses are tested).
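The contrast drawn in items 2) and 3) between a full file hash and a hash of an infection-invariant component can be illustrated with a toy example. The file layout, the fixed segment offsets, and the byte strings below are all hypothetical, and the sketch is not the hashing method of the present invention, which this section does not specify in detail.

```python
import hashlib

# Hypothetical fixed bounds of the data segment within the toy file image.
DATA_START, DATA_END = 16, 32


def full_hash(image):
    """Hash of the entire file, as a full-file integrity check would use."""
    return hashlib.sha256(image).hexdigest()


def data_segment_hash(image):
    """Hash of only the (assumed) data segment of the file."""
    return hashlib.sha256(image[DATA_START:DATA_END]).hexdigest()


# Toy file: 16 bytes of "code" followed by 16 bytes of "data".
original = b"C" * 16 + b"D" * 16

# Toy infection: viral code appended to the end of the file.
infected = original + b"VIRUSBODY"

# Toy repair: the viral code is removed, but two padding bytes remain,
# so the repaired file is not byte-identical to the original.
repaired = original + b"\x00\x00"

full_hash_changed = full_hash(repaired) != full_hash(original)
segment_hash_same = data_segment_hash(repaired) == data_segment_hash(original)
```

In this toy case the full hash of the repaired file differs from that of the original, so a full-hash check would still report a problem, while the data segment hash is unchanged by both the infection and the repair.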