Computer viruses are program code that usually causes malicious and often destructive results. All computer viruses are self-replicating. More precisely, computer viruses include any form of self-replicating computer code which can be stored, disseminated, and directly or indirectly executed. Computer viruses can be disguised as application programs, functions, macros, electronic mail attachments, and even applets, or can be hidden in hypertext links.
Computer viruses travel between machines via infected media or over network connections, disguised as legitimate files or messages. The earliest computer viruses infected boot sectors and files. Over time, computer viruses evolved into numerous forms and types, including cavity, cluster, companion, direct action, encrypting, multipartite, mutating, polymorphic, overwriting, self-garbling, and stealth viruses, such as described in “MCAFEE.com: Virus Glossary of Terms,” NETWORKS ASSOCIATES TECHNOLOGY, Inc., (2000), the disclosure of which is incorporated by reference. Most recently, macro viruses have become increasingly popular. These viruses are written in macro programming languages and are attached to document templates or sent as electronic mail attachments.
Historically, anti-virus solutions have reflected the sophistication of the viruses being combated. The first anti-virus solutions were stand-alone programs for identifying and disabling viruses. Eventually, anti-virus solutions grew to include specialized functions and parameterized variables that could be stored in a data file. During operation, the data file was read by an anti-virus engine operating on a client computer. Finally, the specialized functions evolved into full-fledged anti-virus languages for defining virus scanning and cleaning instructions, including removal and disablement.
Presently, most anti-virus companies store the anti-virus language code for each virus definition in data files. For efficiency, the source code is compiled into object code at the vendor site. The virus definitions, including the object code, are then stored in the data files. To speed virus detection, the virus definitions are organized for efficient retrieval, often as unstructured binary data.
Anti-virus companies discover new computer viruses on a daily basis and must periodically distribute anti-virus software updates. Each update augments the data file with new computer virus definitions, as well as replacing or deleting old virus definitions. Over time, however, the size of the data files tends to become large, and the files can take excessive amounts of time to download. Long download times are particularly problematic on low bandwidth connections or in corporate computing environments having a large user base.
Consequently, one prior art approach to decreasing anti-virus data file downloading times determines and transfers only the changes between old and new data files. The anti-virus company first compares old and new data files and forms a binary delta file. The delta file is downloaded by users and a patching utility program converts the old data file into the new data file by replacing parts of the binary data file. While this approach can often decrease the amount of data to be downloaded, the sizes of the delta files are arbitrary and vary greatly, depending upon the differences in binary data. In the worst case, the old and new data files are completely different and the delta file effectively replicates the new data file, thereby saving no download time.
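By way of illustration only, the binary delta approach described above can be sketched as follows. This is a minimal sketch and not the format used by any particular vendor: the function names make_delta, apply_delta, and delta_payload_size and the copy/data operation layout are assumptions introduced for the example, and Python's standard difflib module stands in for a production binary differencing tool.

```python
from difflib import SequenceMatcher


def make_delta(old: bytes, new: bytes):
    """Vendor side: build a list of operations that rebuild `new` from `old`.

    Hypothetical layout: ("copy", start, end) reuses bytes already present
    in the old data file; ("data", payload) carries bytes that must be
    downloaded.
    """
    ops = []
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))
        elif tag in ("replace", "insert"):
            ops.append(("data", new[j1:j2]))
        # "delete" emits nothing: the old bytes are simply not copied.
    return ops


def apply_delta(old: bytes, ops) -> bytes:
    """Client side: the patching utility converts the old file into the new one."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            out += old[op[1]:op[2]]
        else:
            out += op[1]
    return bytes(out)


def delta_payload_size(ops) -> int:
    """Number of new bytes the delta has to carry."""
    return sum(len(op[1]) for op in ops if op[0] == "data")
```

When the old and new files share most of their content, delta_payload_size stays small. When the two files have nothing in common, the delta consists of a single data operation holding the entire new file, which is the worst case noted above.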
While the use of delta files can affect throughput, changing the format of data files to avoid the use of delta files, particularly in a corporate computing environment, would create a further concern with respect to maintaining backward compatibility. Any data file format change would necessitate replacing the existing data files on fielded client computers at potentially high cost due to downloads and installation.
Therefore, there is a need for an approach to efficiently distributing virus definitions that allows updating in a backward compatible manner. Preferably, such an approach would maintain virus definitions as indexed records in a database management system, coupled with the ability to convert the virus definitions between formats. Such an approach would allow efficient virus definition updating while preserving existing data file formats.
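As a rough illustration of what such an approach might look like, the sketch below keeps virus definitions as indexed records and converts them to a flat binary layout on demand. Everything in it is an assumption made for the example: the defs table, the open_store, apply_update, and export_flat helpers, and the record header layout are hypothetical, and an in-memory SQLite database stands in for the database management system.

```python
import sqlite3
import struct


def open_store() -> sqlite3.Connection:
    """Create an in-memory store of definitions indexed by identifier."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE defs ("
        "id INTEGER PRIMARY KEY, name TEXT NOT NULL, code BLOB NOT NULL)"
    )
    return db


def apply_update(db, upserts, deletes):
    """Apply a record-level update: add or replace some rows, delete others."""
    with db:  # commits on success, rolls back on error
        db.executemany(
            "INSERT OR REPLACE INTO defs (id, name, code) VALUES (?, ?, ?)",
            upserts,
        )
        db.executemany("DELETE FROM defs WHERE id = ?", [(i,) for i in deletes])


def export_flat(db) -> bytes:
    """Convert the records to a hypothetical flat binary data file.

    Each record is a little-endian header (id, name length, code length)
    followed by the name bytes and the compiled definition bytes.
    """
    out = bytearray()
    rows = db.execute("SELECT id, name, code FROM defs ORDER BY id")
    for def_id, name, code in rows:
        name_bytes = name.encode("utf-8")
        out += struct.pack("<IHI", def_id, len(name_bytes), len(code))
        out += name_bytes + code
    return bytes(out)
```

Under these assumptions, an update carries only the records that changed, so its size tracks the number of added, replaced, or deleted definitions, while export_flat regenerates a data file in the existing layout for client software that expects it.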