An unfortunate reality of operating a computer, especially one connected to a network, is that the computer is constantly under attack. These attacks come in a variety of forms including, but not limited to, computer viruses, worms, computer exploits (i.e., abusing or misusing legitimate computer services), adware or spyware, and the like. While the mechanism of operation for each of these various computer attacks is quite distinct, in general, they are all designed to carry out some unauthorized, usually unwelcomed, often destructive, activity on the computer. For purposes of the present invention, these attacks will be generally referred to hereafter as malware.
As malware is a reality for computers generally, and for network computers in particular, various tools have been devised and deployed to prevent malware from carrying out its malicious intent on a computer. These tools include firewalls, proxies, and security settings on vulnerable applications. However, the most commonly used tool in protecting a computer against malware is antivirus software.
As those skilled in the art will appreciate, most antivirus software operates as a pattern recognition service. In particular, when a file is received by a computer, irrespective of whether the file is an executable, word processing document, image, or the like, the antivirus software protecting that computer “analyzes” the file to determine whether it is known to be malware. The antivirus software “analyzes” the file by generating a hash value, referred to as a signature, for the file. This signature is generated such that it is extremely unlikely that another file will have the same signature, and is therefore considered unique to that file. Once the signature is generated, the signature is then compared against other signatures of known malware in a so-called signature file. Thus, if the file's generated signature matches a signature of known malware in the signature file, the antivirus software has discovered the file to be malware and takes appropriate action.
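By way of illustration only, the signature comparison described above might be sketched as follows. The use of SHA-256 as the hash function, the names generate_signature and is_known_malware, and the in-memory set standing in for the signature file are all assumptions made for this sketch; they are not taken from any particular antivirus product.

```python
import hashlib


def generate_signature(data: bytes) -> str:
    """Return a hash value (the "signature") for a file's contents.

    SHA-256 is an assumption of this sketch; the text only requires a hash
    for which two different files are extremely unlikely to collide.
    """
    return hashlib.sha256(data).hexdigest()


def is_known_malware(data: bytes, signature_file: set) -> bool:
    """Compare the file's signature against the signatures of known malware."""
    return generate_signature(data) in signature_file


# Hypothetical signature file holding the signature of one known sample.
known_sample = b"placeholder bytes standing in for a known malicious file"
signature_file = {generate_signature(known_sample)}

known_flagged = is_known_malware(known_sample, signature_file)
unknown_flagged = is_known_malware(b"bytes of some other file", signature_file)
```

In this sketch only the file whose signature already appears in the signature file is flagged; any file whose signature is absent passes the check, whether or not it is actually malicious.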
Unfortunately, signature recognition requires that the malware be previously known (and identified) in order to protect the computer from it. Thus, antivirus software is not a time-zero protection, i.e., it does not protect the computer from malware as soon as the malware is released on the network (time-zero). Instead, a vulnerability window exists between the time that a new, unknown malware is released and the time that antivirus software is able to protect a computer from the new malware.
FIG. 1 is a block diagram of an exemplary timeline 100 illustrating the vulnerability window associated with current antivirus software's signature recognition. As shown in FIG. 1, at some point in time, as indicated by event 102, a malicious party releases a new, unknown malware onto a network, such as the Internet. Obviously, once the new, unknown malware is released, computers connected to the network are at risk or vulnerable. Hence, the vulnerability window is opened.
While the actual time for detecting a new malware on a network depends on numerous factors, including the virulence of the new malware, according to available statistics, it generally takes between four hours and three days for the antivirus software community, i.e., antivirus software providers, to detect or become aware of the new malware. Once the malware is detected, as indicated by event 104, the antivirus community can begin to identify it. In addition to generating a signature for the new malware, identifying the malware also typically involves researching and determining the ultimate effect of the malware, determining its mode of attack, identifying system weaknesses that are exposed by the attack, and devising a plan to remove the malware from an infected computer.
After having identified the malware, which typically takes approximately four hours (at least for signature identification), an antivirus provider will post an updated signature file on its download service, as indicated by event 106. Unfortunately, computers (either automatically or at the behest of the computer user) do not immediately update their signature files. It typically takes between four hours and one week for most computers to update their signature files, as indicated by event 108. Of course, it is only after the updated signature file is downloaded onto a computer that the antivirus software can defend the computer from the new malware, thereby closing the vulnerability window 110. Indeed, depending on individual circumstances, such as when the computer owner is on vacation, updating a computer with the latest signature files can take significantly longer than one week.
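As a rough worked example, the stage durations quoted above can be summed to estimate the length of the vulnerability window. The sketch below assumes that the three stages occur strictly one after another and uses only the figures given in the text; the stage names and the function vulnerability_window are hypothetical. The upper figure is not a hard limit, since, as noted, updating can take significantly longer than one week.

```python
from datetime import timedelta

# (shortest, longest) duration of each stage, as quoted in the text.
DETECTION = (timedelta(hours=4), timedelta(days=3))        # event 102 to event 104
IDENTIFICATION = (timedelta(hours=4), timedelta(hours=4))  # event 104 to event 106
UPDATE = (timedelta(hours=4), timedelta(weeks=1))          # event 106 to event 108


def vulnerability_window(*stages):
    """Sum the shortest and longest durations of sequential stages."""
    shortest = sum((stage[0] for stage in stages), timedelta())
    longest = sum((stage[1] for stage in stages), timedelta())
    return shortest, longest


shortest, longest = vulnerability_window(DETECTION, IDENTIFICATION, UPDATE)
print(shortest)  # 12:00:00
print(longest)   # 10 days, 4:00:00
```

Under these assumptions, the window lasts from roughly twelve hours to a little over ten days.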
As can be seen, a new, unknown malware has anywhere from several hours to several weeks to wreak havoc on the network community, unchecked by any antivirus software. Antivirus software is not time-zero protection. The good news is that most computers are protected before a given malware attempts to attack them. Unfortunately, some are exposed during the vulnerability window and are infected by the malware. To most users, especially those who rely heavily upon their computers, this vulnerability window is entirely unacceptable.
Those skilled in the art will readily recognize that it is important to generate a signature that uniquely identifies a file, so that the signature can be used to identify malware. Sophisticated algorithms and mathematics are involved in computationally generating a signature that positively identifies a file and, at the same time, does not identify any other file. Unfortunately, in order to generate a signature that uniquely identifies the file, the algorithms used are extremely sensitive to the contents of the file. Any modification to a file will cause the signature generation algorithm to generate a different signature than for the original file. In other words, a simple, cosmetic change to a known malware will cause the signature generation algorithm to return an entirely different signature. Thus, a cosmetic change to a known malware (i.e., one identified by its signature in a signature file) is usually sufficient to enable the modified malware to escape detection, at least until the modified malware has been recognized, and its signature generated and stored in a signature file.
The problem of malware generally is compounded by the fact that malware is often embedded in user-modifiable files. For example, malware may be disguised in and distributed as an executable script embedded within a word processing document. In these cases, the malware portion (i.e., the embedded script) is entirely unrelated to the editable portion of the document. Thus, modifications, small or large, to the data area of the word processing document will cause the complete malware file to yield a different signature than its original, while the embedded malicious script remains unaffected. These user-modifiable files include, but are not limited to, word processing documents, spreadsheets, images, HTML documents, and the like. Furthermore, malware creators, in order to stay ahead of antivirus software detection, have begun creating self-modifying malware: documents that randomly modify some portion of the file in order to remain undetected by antivirus software. Clearly, then, in many cases, it is very difficult to stay ahead of the malware that is released, especially when malware must be known in order to be stopped.
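The effect described above can be illustrated with a toy example. The container layout below (a data area, a marker, then an embedded script) is purely hypothetical and far simpler than any real document format, and SHA-256 is again an assumed choice of hash. The sketch shows that a one-character edit to the data area changes the whole-file signature even though the embedded script is byte-for-byte identical.

```python
import hashlib

# Hypothetical marker separating the editable data area from the script.
MARKER = b"\n--EMBEDDED-SCRIPT--\n"


def generate_signature(data: bytes) -> str:
    """Hash the complete file contents (SHA-256 is an assumed choice)."""
    return hashlib.sha256(data).hexdigest()


def build_document(data_area: bytes, embedded_script: bytes) -> bytes:
    """Assemble a toy document: editable data area, marker, embedded script."""
    return data_area + MARKER + embedded_script


def extract_script(document: bytes) -> bytes:
    """Return the embedded script portion of a toy document."""
    return document.split(MARKER, 1)[1]


script = b"placeholder standing in for an embedded script"
original = build_document(b"Dear customer,", script)
modified = build_document(b"Dear customer!", script)  # cosmetic edit only

signature_changed = generate_signature(original) != generate_signature(modified)
script_unchanged = extract_script(original) == extract_script(modified)
```

Here signature_changed and script_unchanged are both true: a signature file containing only the signature of the original document would not match the modified one.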
Of course, as mentioned above, newly-released malware is not always immediately identifiable by any signature. For this reason, many computer users restrict the locations that they visit on the Internet to trusted or known locations, i.e., locations with which they are reasonably confident that the available content is malware-free. In this manner, cautious users minimize their exposure to malware. Unfortunately, once a file is downloaded onto a user's computer, it is assumed that the file is safe for use (e.g., display, execution, editing, etc.). However, the mere presence of a file on a computer system does not mean that the file is safe. Just as with visiting only trusted Internet locations, it would be beneficial if a user could, a priori, know the location from which certain content has been obtained. Armed with the knowledge of the content's origin, a user can be cautious with regard to acting upon a file (e.g., executing or displaying a file, installing a module on a computer, and the like). Accordingly, as files and/or content are obtained, they could be tagged with origin information. Still further, it would be beneficial if a computer system could identify the location from which a file or content has been obtained and act upon it according to its trustworthiness as identified in a set of predetermined rules, a white-list of trusted sites, and/or a black-list of untrustworthy sites.
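One way such origin tagging and rule-based handling might look is sketched below. Everything in the sketch is an assumption made for illustration: the TaggedFile record, the host names, the three outcomes ("allow", "block", "prompt"), and the rule that the black-list takes precedence over the white-list are hypothetical and are not prescribed by the foregoing description.

```python
from dataclasses import dataclass
from urllib.parse import urlparse

# Hypothetical lists of trusted and untrustworthy sites.
WHITE_LIST = {"downloads.example.com"}
BLACK_LIST = {"malware.example.net"}


@dataclass(frozen=True)
class TaggedFile:
    """A file name together with the location it was obtained from."""
    name: str
    origin: str


def tag_with_origin(name: str, url: str) -> TaggedFile:
    """Record origin information at the time the file is obtained."""
    return TaggedFile(name=name, origin=url)


def classify(tagged: TaggedFile,
             white_list: set = WHITE_LIST,
             black_list: set = BLACK_LIST) -> str:
    """Decide how to treat a file based on the host it came from."""
    host = urlparse(tagged.origin).hostname or ""
    if host in black_list:   # assumed rule: the black-list takes precedence
        return "block"
    if host in white_list:
        return "allow"
    return "prompt"          # unknown origin: leave the decision to the user


trusted = tag_with_origin("setup.exe", "https://downloads.example.com/setup.exe")
untrusted = tag_with_origin("free.exe", "http://malware.example.net/free.exe")
unknown = tag_with_origin("notes.doc", "http://files.example.org/notes.doc")

decisions = [classify(f) for f in (trusted, untrusted, unknown)]
```

In this sketch a file from an unlisted site is neither trusted nor rejected outright; the decision is deferred, which reflects the cautious behavior described above.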