It is known in the art that many tens of thousands of new malicious or otherwise undesirable software programs are discovered each day. These programs can compromise the security of general computing devices. Possible security violations include, but are not limited to, the theft of data from the system, the usurping of the system for other nefarious purposes (such as sending spam email), and, in general, the remote control of the system for other malicious actions.
One popular technique in the art for detecting malicious software comprises the following steps:
a. Establishing through some independent means that the application is malicious (e.g., by manually analyzing it). This step is typically carried out by a vendor of anti-malware technology.
b. Constructing a signature for this piece of software. A signature comprises a set of characteristics that can be used to identify that piece of software (and pieces of software that are related to it). One example of a signature is a cryptographic hash or fingerprint. A hash is a mathematical transformation that takes the underlying binary contents of a software application and produces a relatively short string, with the idea being that two different applications will, with overwhelmingly high probability, have distinct fingerprint values. Common functions for performing this fingerprinting or hashing step include SHA-256, SHA-1, MD5, and others. A signature can also include a set of strings that are contained in the file in question.
c. Publishing this signature so that it is accessible to end-users operating a general purpose computing device.
d. Having the device cross reference the files it contains against the published signatures to determine if there is a match.
e. Applying a set of steps or a given policy if the fingerprints match (e.g., blocking the installation of the application, removing it from the system if it is already installed, etc.).
The above technique is geared towards situations when the signature was known ahead of time (i.e., before an actual piece of malicious or unwanted software arrived on an actual end-user system). In some cases, a piece of malware may have already infiltrated a system, and only subsequent to its infiltration will there be new evidence to suggest that the file was malicious.
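By way of illustration only, the fingerprinting, cross-referencing, and policy steps of the technique described above can be sketched as follows. This is a minimal sketch rather than any vendor's actual implementation: the function names `fingerprint`, `find_matches`, and `apply_policy`, the in-memory representation of files as byte strings, and the single "block" policy are all assumptions made for the example. SHA-256 is used because it is one of the hash functions named above.

```python
import hashlib


def fingerprint(contents: bytes) -> str:
    """Return the SHA-256 fingerprint of a file's binary contents."""
    return hashlib.sha256(contents).hexdigest()


def find_matches(files: dict[str, bytes], published: set[str]) -> list[str]:
    """Cross reference each file against the published signatures.

    `files` maps a (hypothetical) file name to its contents;
    `published` is the set of fingerprints released by the vendor.
    """
    return [
        name
        for name, contents in files.items()
        if fingerprint(contents) in published
    ]


def apply_policy(matches: list[str]) -> dict[str, str]:
    """Apply a simple illustrative policy: block every matching file."""
    return {name: "block" for name in matches}


# The vendor constructs and publishes a signature for a known-bad sample.
published_signatures = {fingerprint(b"known malicious payload")}

# The end-user device checks the files it contains.
device_files = {
    "installer.exe": b"known malicious payload",
    "notes.txt": b"meeting at noon",
}
actions = apply_policy(find_matches(device_files, published_signatures))
```

Note that a hash-only signature of this kind matches exact binary contents only; the string-based signatures mentioned above would require a different matching step.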
Aside from that, an anti-malware vendor might initially deem a software application to be malicious, but later garner new intelligence to determine that the application was, in fact, clean (i.e., this determination was made in error and the particular application is actually benign). Even if a vendor has such new intelligence, it would need to cross reference that intelligence against all the files that it knows about to identify the files on which an error was made. Even then, there is no easy way for the vendor to retroactively undo its mistakes on end-user systems without forcing users to scan their entire system for threats or clean files each time new intelligence on threats or clean files is discovered. Such an approach is prohibitively expensive, especially considering the large number of files on a given end-user system as well as the rate at which new intelligence can be gathered.
There is, accordingly, a need in the art to develop methods, components, and systems for intelligently rescanning the files a vendor knows about to determine whether any of them are potentially malware (or can be determined to be conclusively clean). The naïve approach is to cross reference every file against every known signature. This approach is, however, expensive to carry out, since a vendor might have a copious number of files and a large amount of file data. Instead, one improved approach would be to identify a subset of files that were initially marked as non-malicious, but now appear to have a higher propensity of being malicious, thereby making them good candidates for re-examination. Along these lines, analogous methods can be applied to files that were initially deemed malicious, but now appear to have a higher propensity of actually being benign.
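The idea of selecting a subset of files for re-examination, rather than rescanning everything, can be sketched as follows. The passage above does not specify how a file's propensity of being malicious is estimated, so this example simply assumes that each file record carries a hypothetical `suspicion` score between 0 and 1 that is updated as new intelligence arrives. The `FileRecord` type, the `rescan_candidates` function, and both threshold values are likewise assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class FileRecord:
    sha256: str       # fingerprint identifying the file
    verdict: str      # earlier determination: "clean" or "malicious"
    suspicion: float  # hypothetical updated score in [0, 1]


def rescan_candidates(
    records: list[FileRecord],
    raise_threshold: float = 0.7,
    lower_threshold: float = 0.3,
) -> list[FileRecord]:
    """Select files whose earlier verdict now looks doubtful.

    A file marked clean becomes a candidate when its suspicion score
    is high; a file marked malicious becomes a candidate when its
    suspicion score is low. All other files are left alone.
    """
    candidates = []
    for record in records:
        if record.verdict == "clean" and record.suspicion >= raise_threshold:
            candidates.append(record)
        elif record.verdict == "malicious" and record.suspicion <= lower_threshold:
            candidates.append(record)
    return candidates


known_files = [
    FileRecord("aa11", "clean", 0.05),      # still looks clean: skip
    FileRecord("bb22", "clean", 0.85),      # now looks suspicious: re-examine
    FileRecord("cc33", "malicious", 0.95),  # still looks malicious: skip
    FileRecord("dd44", "malicious", 0.10),  # now looks benign: re-examine
]
to_reexamine = [r.sha256 for r in rescan_candidates(known_files)]
```

Under these assumptions, only the files whose earlier determination is in doubt are subjected to the more expensive re-examination, which is the cost saving the improved approach is aiming for.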