As the Internet continues to expand in both connectivity and number of users, the amount of malicious software (“malware”) on the Internet continues to increase at a significant rate. Malware, in the form of, for example, viruses, spyware, and worms, is essentially software code written to infiltrate and/or damage a computer system. In worst-case scenarios, malware can destroy important data, render a computer system virtually useless, and/or bring down a network of hundreds or thousands of computer systems. Recovering a computer system or network from a successful malware attack often requires considerable resources. Further, malware, while typically attacking computer systems connected to the Internet, can also spread from one computer system to another by, for example, a non-Internet based file transfer between computer systems.
In an effort to protect computer systems against malware, various companies design and offer anti-malware programs (also referred to herein as “security software”) (e.g., Norton Antivirus™ by Symantec Corporation). Generally, anti-malware programs use “signatures” and “heuristics” to detect malware. A signature of a particular type of malware is a binary pattern characteristic of that malware, and anti-malware programs rely on signatures to detect and identify specific malware. Stored signatures must be kept up to date for anti-malware programs to remain effective as malware evolves over time. Heuristics, by contrast, involve detecting behaviors that indicate the presence of malware. Such behavior may be identified from code that is running or from code patterns in files.
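By way of illustration only, signature-based detection as described above can be sketched as a search for known byte patterns within a file's contents. The signature names, the byte patterns, and the function `scan_bytes` below are hypothetical and are not taken from any actual anti-malware product; heuristic detection is not shown.

```python
from typing import Dict, List, Tuple

# Hypothetical signature database: name -> byte pattern.
# The names and patterns are made up for illustration only.
SIGNATURES: Dict[str, bytes] = {
    "Example.Worm.A": bytes.fromhex("deadbeef0102"),
    "Example.Virus.B": b"\x90\x90\xeb\xfe",
}


def scan_bytes(data: bytes,
               signatures: Dict[str, bytes] = SIGNATURES) -> List[Tuple[str, int]]:
    """Return (signature name, offset) for each signature found in data.

    Only the first occurrence of each pattern is reported.
    """
    hits = []
    for name, pattern in signatures.items():
        offset = data.find(pattern)  # -1 when the pattern is absent
        if offset != -1:
            hits.append((name, offset))
    return hits


if __name__ == "__main__":
    clean = b"hello world"
    infected = b"AAAA" + bytes.fromhex("deadbeef0102") + b"BBBB"
    print(scan_bytes(clean))
    print(scan_bytes(infected))
```

Because the sketch compares file contents against a fixed set of stored patterns, it can only recognize malware whose pattern is already in the set, which is why stored signatures must be kept up to date.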
Anti-malware programs detect malware by scanning one or more locations where malware may reside. At a minimum, a typical anti-malware program is capable of scanning the files stored on a hard disk of a computer system. However, as hard disk sizes continue to increase, the amount of time needed to scan the files on the hard disk increases commensurately. Thus, at least partly for this reason, there is a need to improve file scanning operations.
Further, those skilled in the art will note that a file is accessed differently when it is executed in a non-scanning situation than when it is scanned for malware. For normal execution of a file, an execution thread accesses the file at an “entry point” (typically, the first code portion of the file), which is in page n of the file. The order of accesses to the file's pages is then determined by the flow of the code (e.g., the first instruction at the entry point may invoke code that is in any other page). There are known technologies for optimizing file execution based on code execution flow through the file. However, such technologies are not necessarily optimized for malware scanning. During a scanning operation, an anti-malware program controls the flow through the file in order to best detect malware based on known malware signatures and heuristics. In other words, anti-malware programs scan a file by deterministically scanning portions of the file known to be susceptible to malware attacks. Such flow is typically different from the execution flow when the file is executed in a non-scanning scenario. For example, the anti-malware program may first scan the first and third code portions of the file even though normal execution of the file would result in execution flow from the first code portion to the second code portion of the file. Thus, there is a further need to improve file scanning operations in view of the fact that technology for optimizing execution of a file is not necessarily optimal for malware scanning of that file.
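The difference in access order described above can be illustrated with a minimal sketch that compares the pages touched during normal execution with the pages touched during a scan. The 4096-byte page size, the control-flow map `jumps`, the list of scanned regions, and the function names are all assumptions made for illustration; they do not represent any particular file format or anti-malware program.

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only


def page_of(offset, page_size=PAGE_SIZE):
    """Return the index of the page that contains the given file offset."""
    return offset // page_size


def execution_page_order(entry_offset, jumps):
    """Pages in first-touch order when following a hypothetical
    control-flow map {offset: next_offset} from the entry point."""
    order = []
    visited = set()
    offset = entry_offset
    while offset is not None and offset not in visited:
        visited.add(offset)
        page = page_of(offset)
        if page not in order:
            order.append(page)
        offset = jumps.get(offset)  # None ends the walk
    return order


def scan_page_order(regions):
    """Pages in first-touch order when a scanner inspects the given
    (offset, length) regions in the order the scanner chooses."""
    order = []
    for offset, length in regions:
        if length <= 0:
            continue
        first, last = page_of(offset), page_of(offset + length - 1)
        for page in range(first, last + 1):
            if page not in order:
                order.append(page)
    return order


if __name__ == "__main__":
    # Three code portions, one per page; execution flows 1st -> 2nd -> 3rd.
    jumps = {0: 4096, 4096: 8192}
    print(execution_page_order(0, jumps))               # [0, 1, 2]
    # The scanner inspects only the first and third code portions.
    print(scan_page_order([(0, 100), (8192, 100)]))     # [0, 2]
```

In this sketch, a layout or prefetch order tuned for the execution sequence would bring in the second page, which the scan never reads, before the third page, which it does.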