1. Field of the Invention
The present invention is related to anti-malware technology, and more particularly, to testing signatures for anti-malware processing.
2. Description of the Related Art
Detection of viruses and malware has been a concern throughout the era of the personal computer. With the growth of communication networks such as the Internet and increasing interchange of data, including the rapid growth in the use of e-mail for communications, the infection of computers through communications or file exchanges is an increasingly significant consideration. Infections take various forms but are typically related to computer viruses, Trojan programs or other forms of malicious code (i.e., malware).
Recent incidents of e-mail-mediated virus attacks have been dramatic both for the speed of propagation and for the extent of damage, with Internet service providers (ISPs) and companies suffering service problems and a loss of e-mail capability. In many instances, attempts to adequately prevent file exchange or e-mail-mediated infections significantly inconvenience computer users. Hence, improved strategies for detecting and dealing with virus attacks are desired.
A conventional approach to detecting viruses is signature scanning. Signature scanning systems use sample code patterns extracted from known malware code and scan for the occurrence of these patterns in other program code. A primary limitation of the signature scanning method is that only known malicious code is detected; that is, only code that matches the stored sample signatures of known malicious code is identified as being infected. Viruses or other malicious code not previously identified, and viruses or other malicious code created after the latest update of the signature database, will not be detected.
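By way of illustration only, signature scanning and its primary limitation can be sketched as follows. This is a minimal sketch, not the scanner of any particular product: the signature names, the byte patterns, and the function `scan` are hypothetical and chosen purely for the example.

```python
from typing import Dict, List

# Hypothetical signature database: name -> byte pattern taken from known malware.
SIGNATURES: Dict[str, bytes] = {
    "Example.A": bytes.fromhex("deadbeef90909090"),
    "Example.B": b"\x4d\x5a\x00\x13\x37",
}


def scan(data: bytes, signatures: Dict[str, bytes] = SIGNATURES) -> List[str]:
    """Return the names of all signatures whose pattern occurs in data."""
    return sorted(name for name, pattern in signatures.items() if pattern in data)


# Code containing a stored pattern is reported.
known = b"\x00\x01" + bytes.fromhex("deadbeef90909090") + b"\xff"
# Code with no stored pattern is never reported, whether it is malicious or not.
unknown = b"\x00\x01\x02\x03\xff"

print(scan(known))    # ['Example.A']
print(scan(unknown))  # []
```

The second call returns an empty list for any input lacking a stored pattern, which is the limitation described above: code that is not yet in the signature database passes unnoticed.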
In addition, signature analysis fails to identify the presence of a virus if the signature is not aligned in the code as expected. Alternatively, the authors of a virus may obscure its identity by opcode substitution or by inserting dummy or random code into the virus functions. Nonsense code can be inserted that alters the signature of the virus to a sufficient extent that the virus becomes undetectable by a signature scanning program, without diminishing the ability of the virus to propagate and deliver its payload.
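The effect of inserted dummy code on a byte-pattern signature can be shown with a small sketch. The pattern below is a hypothetical signature made of two adjacent x86 instructions (`mov eax, 1` followed by `mov ebx, 2`); the surrounding bytes and the function name `matches` are likewise assumptions made for the example.

```python
# Hypothetical signature: the bytes of "mov eax, 1" followed by "mov ebx, 2".
PATTERN = bytes.fromhex("b801000000" "bb02000000")


def matches(data: bytes, pattern: bytes = PATTERN) -> bool:
    """Report whether the signature pattern occurs anywhere in data."""
    return pattern in data


# Original code: the two instructions are adjacent, so the pattern is present.
original = b"\x55" + PATTERN + b"\xc3"

# Same two instructions with a single NOP (0x90) inserted between them.
# The inserted byte does nothing at run time, but it breaks the byte pattern.
obfuscated = b"\x55" + PATTERN[:5] + b"\x90" + PATTERN[5:] + b"\xc3"

print(matches(original))    # True
print(matches(obfuscated))  # False
```

A single inserted byte is enough to defeat an exact pattern match in this sketch, while the instructions that carry out the work are unchanged.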
Another problem related to the use of signatures for malware detection is that the signatures need to be tested. Generating a signature requires a calculation employing a cryptographic hash algorithm (typically, the MD5 algorithm). Generating a signature using MD5 for a large file is a computationally intensive task requiring significant system resources. This problem is overcome by using only the key parts of the file and calculating a control value (CRC) over them to produce the file signature.
The key parts of a file can be the file size, a checksum of the file header, and checksums of the first and last code sections. The size and checksum of an overlay of the file can also be used. The file overlay is data appended to the end of the file and not described in the PE format header. The key portions of a typical file are illustrated in FIG. 1.
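A signature built from key parts can be sketched as follows. This is an illustration under stated assumptions and not the format used by any actual product: the `Layout` class, the byte offsets, and the tuple layout of the signature are hypothetical, no real PE parsing is performed, and CRC-32 from the Python standard library stands in for the control value.

```python
import zlib
from dataclasses import dataclass
from typing import Tuple

Span = Tuple[int, int]  # (start, end) byte offsets, end exclusive


@dataclass(frozen=True)
class Layout:
    """Hypothetical description of where the key parts of a file lie."""
    header: Span
    first_section: Span
    last_section: Span
    overlay_start: int  # everything from this offset to end of file is overlay


def crc(data: bytes, span: Span) -> int:
    """CRC-32 of one byte range of the file."""
    start, end = span
    return zlib.crc32(data[start:end])


def key_part_signature(data: bytes, layout: Layout) -> Tuple[int, ...]:
    """Combine the file size and key-part checksums into one signature tuple."""
    overlay = data[layout.overlay_start:]
    return (
        len(data),
        crc(data, layout.header),
        crc(data, layout.first_section),
        crc(data, layout.last_section),
        len(overlay),
        zlib.crc32(overlay),
    )


def patched(data: bytes, offset: int) -> bytes:
    """Return a copy of data with one byte flipped at offset."""
    copy = bytearray(data)
    copy[offset] ^= 0xFF
    return bytes(copy)


# Toy 1024-byte "file" with an assumed layout; not a real PE image.
layout = Layout(header=(0, 64), first_section=(64, 256),
                last_section=(768, 960), overlay_start=960)
original = bytes(range(256)) * 4
header_changed = patched(original, 10)   # inside the header
middle_changed = patched(original, 500)  # outside every key part

base = key_part_signature(original, layout)
print(key_part_signature(header_changed, layout) != base)  # True
print(key_part_signature(middle_changed, layout) == base)  # True
```

Only a fraction of the file is read, which is why this approach is cheaper than hashing the whole file. The last line also shows the cost: two different files that agree on every key part receive the same signature, which is one way a collision can arise.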
A conventional method of using the signatures is depicted in FIG. 2. Updates for AV database 210 are released in step 220. The updates are tested in step 230. Errors are corrected in step 240. Updates are released as a final version in step 250. Possible errors are analyzed in step 260. The process depicted in FIG. 2 takes several hours and requires a lot of resources for testing updates for collisions among terabytes of data produced during the anti-virus (AV) processing. Potentially the amount of data can be on the order of petabytes.
Typically, the AV processing is limited in time, since the updates must be released at least hourly. Thus, it is impossible to test the updates against all AV data. Therefore, only the marked portion 211 of the AV database 210 is used for testing purposes. Consequently, even after the errors are corrected and the updates are released, the probability of collisions remains high. In particular, collisions can occur with applications that are not contained in the AV database 210.
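The consequence of testing against only a portion of the data can be sketched as follows. The file names, signature strings, and the function `find_collisions` are hypothetical and serve only to illustrate the reasoning above.

```python
from typing import Iterable, List, Set, Tuple


def find_collisions(update: Set[str],
                    clean_objects: Iterable[Tuple[str, str]]) -> List[str]:
    """Return names of known-clean objects whose signature appears in the update."""
    return [name for name, signature in clean_objects if signature in update]


# Hypothetical collection of clean objects as (name, signature) pairs.
clean_collection = [
    ("notepad.exe", "sig-001"),
    ("calc.exe", "sig-002"),
    ("editor.exe", "sig-003"),
]

# Hypothetical update; "sig-003" wrongly matches a clean object.
update = {"sig-003", "sig-900"}

# Time permits testing only part of the collection.
tested_portion = clean_collection[:2]

print(find_collisions(update, tested_portion))    # []
print(find_collisions(update, clean_collection))  # ['editor.exe']
```

The partial test reports no collisions and the update would be released, although a collision exists in the untested remainder. A clean application that appears in no collection at all cannot be caught by either call.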
An effective conventional approach to malware detection uses so-called white lists, i.e., lists of signatures of known “clean” objects. In order to compare a suspect object against the white list, object signatures are generated and used. To remain effective, the white lists have to be constantly updated.
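A white list lookup can be sketched as follows. This is a minimal illustration that assumes the signature is an MD5 digest of the whole object; the sample contents and the function names `signature` and `is_known_clean` are hypothetical.

```python
import hashlib
from typing import Set


def signature(data: bytes) -> str:
    """Signature of an object, assumed here to be its MD5 digest in hex."""
    return hashlib.md5(data).hexdigest()


def is_known_clean(data: bytes, white_list: Set[str]) -> bool:
    """Report whether the object's signature is on the white list."""
    return signature(data) in white_list


# Hypothetical clean objects: two versions of the same program.
clean_v1 = b"known clean program, version 1.0"
clean_v2 = b"known clean program, version 1.1"

# The white list was built before version 1.1 was released.
white_list: Set[str] = {signature(clean_v1)}

print(is_known_clean(clean_v1, white_list))  # True
print(is_known_clean(clean_v2, white_list))  # False

# After the white list is updated, the new version is recognized.
white_list.add(signature(clean_v2))
print(is_known_clean(clean_v2, white_list))  # True
```

The middle result illustrates why the lists need constant updating: a legitimate new version is not recognized as clean until its signature has been added.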
When white lists are used, some false positive determinations are inevitably made. It is important to detect false positives, as they can cause almost as much harm as malware. For example, a legitimate component can be “recognized” by the AV software as malware, causing severe damage to the reputation of the AV software vendor, and annoyance and wasted time for many users.
Another scenario develops when malware is mistakenly considered to be a “clean” component and harms a system. Currently, when false positives are detected, signature testing is performed in order to correct the white lists and to avoid false positives in the future. However, signature testing is time consuming. By the time the signatures are tested and the white list is updated, undetected malware may already have caused harm on the affected systems.
U.S. Pat. No. 7,231,637 discloses distributing pre-release scanner updates from a server to network computers. However, signature testing is not disclosed. U.S. Pat. No. 7,334,005 also discusses providing security updates to users, but it does not use signatures.
It is apparent that improved techniques for testing signatures are desired. Accordingly, there is a need in the art for a method of providing tested signatures to users for effective anti-malware processing.