1. Field of the Invention
This invention relates to data processing systems. More particularly, this invention relates to malware detection within data processing systems, such as, for example, detecting computer viruses, computer worms, computer Trojans, banned computer files and the like.
2. Description of the Prior Art
The threat posed by malware, such as computer viruses, is well known and is growing. Computer viruses are becoming more common, more sophisticated and harder to detect and counteract. Computer systems and software for counteracting malware typically operate by seeking to identify characteristics of known malware within a computer file being checked. A malware signature file typically contains data for identifying many thousands of different types of computer virus, Trojan, worm, etc., as well as some characteristics generally indicative of malware, against all of which a computer file will need to be checked. With the rapid increase in the number, complexity and size of computer files present on a computer and requiring checking, the amount of processing required, and accordingly the time needed, to conduct malware detection is disadvantageously increasing. In the case of an on-access scan, which is performed before access is allowed to a computer file, first scanning that computer file for the presence of malware can introduce a noticeable and disadvantageous delay in the responsiveness of the computer system. In the case of an on-demand scan, where the entire contents of a computer are checked for malware, this check can take many minutes to perform and can render the computer unusable for other purposes during this time.
One technique for speeding up malware detection that has previously been used is to scan only those types of file which are executable. Potentially executable file types were previously restricted to relatively few types, such as EXE file types and COM file types. However, with the advent of more complex files and structures within files, it is now difficult to safely assume that a particular file type cannot contain any executable content and accordingly cannot contain malware. Furthermore, as well as requiring a larger number of types of file, if not all file types, to be subject to scanning, the increased complexity of the structures within files results in more processing being required to unpack and unravel those structures in order to effectively detect any malware which may be present within those computer files.
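By way of illustration only, the file-type filtering technique described above can be sketched as follows. The function name `needs_scan` and the extension set are assumptions made for this example and are not taken from any particular scanner.

```python
from pathlib import PurePath

# Assumed for illustration: the short list of file types once treated as executable.
EXECUTABLE_EXTENSIONS = {".exe", ".com"}

def needs_scan(filename: str) -> bool:
    """Return True if the file's extension marks it as a potentially executable type."""
    return PurePath(filename).suffix.lower() in EXECUTABLE_EXTENSIONS

# A file whose type is not on the list is skipped entirely, even if its
# internal structure happens to carry executable content.
skipped = [name for name in ("setup.EXE", "report.doc") if not needs_scan(name)]
```

In this sketch `report.doc` is never scanned, which is the weakness noted above: the extension alone no longer indicates whether executable content may be present.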
It is known from U.S. Pat. No. 6,021,510 to provide an anti-virus accelerator in which, when a file is examined for the first time and found to be clean, a hash value for each scanned sector of that file is stored. Upon a subsequent attempt to scan that file, the file sectors which were examined in the initial scan can be examined again and their hash values recalculated and compared with the stored hash values. If the hash values match, then the sector can be considered to be unaltered and still clean.
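A minimal sketch of the general idea of storing and rechecking per-sector hash values is given below. It is not the implementation of the cited patent: the 512-byte sector size, the use of SHA-256, and the names `CleanFileCache`, `record_clean` and `is_unchanged` are all assumptions made for this example.

```python
import hashlib

SECTOR_SIZE = 512  # assumed sector size in bytes

def hash_sectors(data: bytes, sector_indices, sector_size: int = SECTOR_SIZE) -> dict:
    """Hash only the listed sectors of the file contents."""
    return {
        i: hashlib.sha256(data[i * sector_size:(i + 1) * sector_size]).hexdigest()
        for i in sector_indices
    }

class CleanFileCache:
    """Hypothetical store of sector hashes for files previously found clean."""

    def __init__(self):
        self._stored = {}  # file name -> {sector index: hash value}

    def record_clean(self, name: str, data: bytes, scanned_sectors) -> None:
        # Called after an initial scan has found the file clean.
        self._stored[name] = hash_sectors(data, scanned_sectors)

    def is_unchanged(self, name: str, data: bytes) -> bool:
        # Recalculate hashes for the same sectors and compare with the stored values.
        stored = self._stored.get(name)
        if stored is None:
            return False  # never scanned, so a full scan is needed
        return hash_sectors(data, stored.keys()) == stored
```

Under these assumptions, a change confined to a sector that was not examined in the initial scan goes unnoticed, because only the originally scanned sectors are rehashed.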
The paper “A Cryptographic Checksum For Integrity Protection” by F. Cohen, published in Computers & Security, Volume 6, 1987, pages 505-510, describes a cryptographic checksum technique for verifying the integrity of information in a computer system with no built-in protection.
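The general notion of checking integrity with a cryptographic checksum can be illustrated as follows. This sketch does not reproduce the construction of the cited paper; HMAC-SHA-256 from the Python standard library is swapped in as a stand-in, and the function names and the key handling are assumptions made for this example.

```python
import hashlib
import hmac

def keyed_checksum(key: bytes, data: bytes) -> str:
    """Compute a keyed checksum (HMAC-SHA-256, used here as a stand-in)."""
    return hmac.new(key, data, hashlib.sha256).hexdigest()

def verify(key: bytes, data: bytes, expected: str) -> bool:
    """Recompute the checksum and compare it with the stored value in constant time."""
    return hmac.compare_digest(keyed_checksum(key, data), expected)

# The checksum is recorded while the information is known to be good and is
# rechecked later; any alteration of the data changes the recomputed value.
key = b"example-secret-key"  # assumed to be kept where an attacker cannot read it
stored = keyed_checksum(key, b"program contents")
```

Because the checksum depends on a secret key, a party that alters the data without knowing the key cannot produce a matching checksum, which is what allows the check to be useful on a system that offers no protection of its own.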