Consumers and businesses increasingly rely on computers to store sensitive data. Consequently, malicious programmers seem to continually increase their efforts to gain illegitimate control of, and access to, others' computers. Computer programmers with malicious motivations have created and continue to create viruses, Trojan horses, worms, and other programs meant to compromise computer systems and data belonging to other people. These malicious programs are often referred to as malware.
Security software companies are combating the growing tide of malware by creating and deploying malware signatures (e.g., hashes that identify malware) to their customers on a regular basis. For example, security software companies may send malware signature updates to their customers multiple times a day. By frequently updating malware signatures, security software companies may help their customers secure their computers against new and changing threats.
Each time a customer receives a malware signature update, the customer's computer may need to rescan numerous files to ensure that no malware is running on the computer. Consequently, customers' computers may take a performance hit each time they receive a malware signature update. Performance loss on customers' computers and networks increases as the size and frequency of signature updates increase. The performance loss may result in a negative customer experience.
Security software developers have tried to decrease the time and network traffic required to perform security scans by skipping known good files (e.g., files that are known to be free from malware). Before skipping a file, the security software typically must identify that the file is legitimate and free from malware. Security software developers have implemented at least two different methods to identify known good files to reduce security scan times.
In a first method, a client machine may keep a database of hashes of known good files. When performing a malware scan, the client machine may query the database to identify known good files. The client machine may then skip the known good files, which may allow the scan to complete more quickly. However, maintaining a database of hashes of known good files may not be ideal. Maintaining the database may require frequent updates that increase network traffic. Also, the database may become large and may not provide the hoped-for efficiencies.
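As a minimal sketch of this first method, the skip logic might look like the following. The sketch assumes SHA-256 as the hash function and uses an in-memory set in place of the client-side database; neither choice is specified above, and the function names `sha256_of_file` and `select_files_to_scan` are hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Iterable, List, Set


def sha256_of_file(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def select_files_to_scan(paths: Iterable[Path], known_good: Set[str]) -> List[Path]:
    """Keep only files whose hash is NOT in the local known-good set.

    `known_good` stands in for the client-side database (an assumption
    made for illustration); files whose hash appears in it are skipped.
    """
    return [path for path in paths if sha256_of_file(path) not in known_good]
```

In this sketch the set of known good hashes must still be delivered to and refreshed on the client, which corresponds to the update traffic and database growth noted above.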
In a second method, a client machine may scan a drive. The client machine may compute hashes for the files stored on the drive and may transmit the hashes to a server. The server may then determine whether the hashes correspond to known good files. This technique also has drawbacks. Sending file hashes to a server may create unnecessary client-server communications and may consume too much network bandwidth. Furthermore, many consumer Internet connections are asymmetric, with the upload bandwidth being much less than the download bandwidth. Therefore, uploading a large number of hashes from a customer's computer to a security software server may be a slow, resource-consuming process.
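The second method can be sketched in the same spirit. The example below is an illustration only: it assumes SHA-256 hashes and a JSON-encoded payload, represents the server as a local function rather than a real network service, and uses the hypothetical names `build_upload_payload` and `server_classify`.

```python
import hashlib
import json
from typing import Dict, Set


def build_upload_payload(file_contents: Dict[str, bytes]) -> bytes:
    """Client side: hash every file and serialize the hashes for upload."""
    hashes = {
        name: hashlib.sha256(data).hexdigest()
        for name, data in file_contents.items()
    }
    return json.dumps(hashes).encode("utf-8")


def server_classify(payload: bytes, known_good: Set[str]) -> Dict[str, bool]:
    """Server side: report, per file name, whether its hash is known good."""
    hashes = json.loads(payload.decode("utf-8"))
    return {name: digest in known_good for name, digest in hashes.items()}
```

Under these assumptions each file contributes at least 64 hexadecimal characters to the payload, so the amount of data uploaded grows in proportion to the number of files on the drive, which is the cost on low-upload-bandwidth connections that the paragraph above describes.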