The present invention relates to anti-virus protection, and more particularly to a virus detection system in assembly-like code.
Computer viruses are executable files or attachments often hidden or disguised as legitimate files or messages. More precisely, computer viruses include any form of self-replicating computer code which can be stored, disseminated, and directly or indirectly executed by unsuspecting clients. Viruses travel between machines over network connections or via infected media and cause malicious and sometimes destructive results. Viruses can be executable program or macro code disguised as application programs, functions, macros, electronic mail attachments, and even applets and hypertext links.
The earliest computer viruses infected boot sectors and files. Over time, computer viruses evolved into numerous types, including cavity, cluster, companion, direct action, encrypting, multipartite, mutating, polymorphic, overwriting, self-garbling, and stealth viruses. Recently, macro viruses have become popular. These viruses are written as scripts in macro programming languages and are attached to documents and electronic mail attachments.
Historically, anti-virus solutions have reflected the sophistication of the viruses being combated. The first anti-virus solutions were stand-alone programs for identifying and disabling viruses. Eventually, anti-virus solutions grew to include special purpose functions and parameterized variables that could be stored in data files read by the anti-virus engine. Over time, the special purpose functions evolved into specialized anti-virus languages for defining virus scanning and cleaning instructions, the latter including removal and disablement.
The data files store virus definitions. Each virus definition includes object code executed by an anti-virus engine on each client. As new computer viruses are discovered daily, each data file must be periodically updated to add new computer virus definitions, and replace or delete old virus definitions. Over time, data files tend to become large and can take excessive amounts of time to download. Long download times are particularly problematic on low bandwidth connections or in corporate computing environments having a large user base. Data files are also often platform-dependent and updates must be hard-coded into each different type of data file.
Upgrading anti-virus engines in a corporate computing environment can require considerable effort and time. Each anti-virus engine is limited to performing only those operations defined in the associated anti-virus language. Consequently, any changes or extensions to the language typically require the patching or replacement of the engine and can consume considerable resources in debugging and testing. In addition, anti-virus engines are implemented for specific computing environments, generally dependent on the type and version of operating system. Changes or upgrades to an anti-virus engine, therefore, must be propagated across all computing platforms and can present critical portability issues.
One prior art approach avoids the need to patch or replace the anti-virus engine by including the entire engine as part of the data files. Each new virus definition accordingly results in a new engine. However, such an approach to upgrading is slow and bandwidth-intensive. Moreover, including an anti-virus engine as part of a computer virus definition data file is misleading, because it subverts the security policies that control software download and installation.
Wireless and other thin client devices present further challenges. Typically, anti-virus engines and their associated signature files are large, making them impractical to store in the memory of thin client devices. Further, thin client devices typically do not have the computing power of the personal computers and other devices for which traditional anti-virus software is written.
Therefore, there is a need for an approach to providing a flexible and extensible anti-virus solution that avoids the limitations of a special purpose anti-virus language and the limited capabilities of the corresponding anti-virus engine. Preferably, such an approach would provide an anti-virus engine capable of supporting new functionality not originally anticipated.
What is further needed is a methodology for providing such a flexible and extensible anti-virus solution for use on thin client devices, including wireless devices. Further, the solution should include an anti-virus engine and signature file having smaller file sizes and requiring less computing power than existing solutions.
Also needed is a way to add new capabilities to a scanning system without requiring bandwidth-intensive and time consuming engine updates.
A system, method and computer program product are provided for programmable scanning for malicious content on a wireless client device. Initially, an anti-virus program having an instruction set is assembled in a programmable assembly-like computing language. The anti-virus program is implemented in a wireless client device. A scan for malicious code is performed on the wireless client device utilizing the anti-virus program. Note that this can include scanning a memory of the device as well as an inbound or outbound data stream traversing a communication port of the client device.
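As a rough illustration of what scanning with an assembly-like instruction set might look like, the following sketch interprets a tiny program against a byte buffer. The opcodes (SEEK, CMPB, JNE, FOUND, CLEAN), the step limit, and the two-byte marker are all hypothetical and chosen only for this example; the source does not enumerate the actual instruction set.

```python
from typing import List, Tuple

# Hypothetical instruction format for illustration: (opcode, operand).
Instruction = Tuple[str, int]


def run_scan(program: List[Instruction], data: bytes) -> bool:
    """Interpret a tiny assembly-like program; return True if it reports a detection."""
    pc = 0        # program counter (index into program)
    ptr = 0       # pointer into the scanned data
    flag = False  # result of the most recent comparison
    steps = 0
    while pc < len(program):
        steps += 1
        if steps > 10_000:  # guard against runaway programs
            raise RuntimeError("step limit exceeded")
        op, arg = program[pc]
        pc += 1
        if op == "SEEK":      # move the data pointer to an absolute offset
            ptr = arg
        elif op == "CMPB":    # compare the byte at ptr with arg, then advance ptr
            flag = ptr < len(data) and data[ptr] == arg
            ptr += 1
        elif op == "JNE":     # jump to instruction index arg if the compare failed
            if not flag:
                pc = arg
        elif op == "FOUND":   # stop and report a detection
            return True
        elif op == "CLEAN":   # stop and report no detection
            return False
        else:
            raise ValueError(f"unknown opcode: {op}")
    return False


# Example program: report a detection if bytes 0xDE 0xAD appear at offset 2.
PROGRAM: List[Instruction] = [
    ("SEEK", 2),
    ("CMPB", 0xDE),
    ("JNE", 6),
    ("CMPB", 0xAD),
    ("JNE", 6),
    ("FOUND", 0),
    ("CLEAN", 0),
]
```

The same interpreter could in principle be fed a memory image or a buffered data stream, since it only needs a sequence of bytes.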
Some benefits of using programmable assembly-like code for anti-virus scanning include its flexibility, speed and size, as will become apparent upon a reading of the description that follows. Assembly-like anti-virus detection language is highly efficient, in both performance and size, compared to traditional detection languages. Because the engine executes on a simple yet highly programmable instruction set, it is smaller and faster. Further, the virus signature file can be potentially smaller since it contains compiled/interpreted code from assembly source, not a high-level script or programming language such as C. The size can be further reduced by merging scan information for multiple types of malicious code. Instead of containing instructions such as "look for virus x for each X," scanning according to one embodiment is performed using an instruction such as "Look for all patterns in X, and declare x if found." This helps by eliminating non-infected files quickly and reduces size requirements by merging the signature information.
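The difference between scanning once per virus and scanning once for all patterns can be sketched as follows. This is only one simple way to merge signature information (grouping patterns by first byte); the signature names and byte patterns are invented for illustration and the source does not prescribe this particular data structure.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical signatures, for illustration only.
SIGNATURES: Dict[str, bytes] = {
    "Example.A": b"\x90\x90\xeb\xfe",
    "Example.B": b"\x90\xcc\xcc",
    "Example.C": b"EVIL",
}

MergedTable = Dict[int, List[Tuple[bytes, str]]]


def merge_signatures(sigs: Dict[str, bytes]) -> MergedTable:
    """Group all patterns by first byte so a single pass can test every signature."""
    merged: MergedTable = {}
    for name, pattern in sigs.items():
        merged.setdefault(pattern[0], []).append((pattern, name))
    return merged


def scan_merged(data: bytes, merged: MergedTable) -> Optional[str]:
    """Walk the data once; return the first signature name that matches, else None."""
    for i, byte in enumerate(data):
        candidates = merged.get(byte)
        if candidates is None:
            continue  # most bytes of a clean file are skipped here
        for pattern, name in candidates:
            if data.startswith(pattern, i):
                return name
    return None


MERGED = merge_signatures(SIGNATURES)
```

In this sketch a clean file is rejected after a single pass over its bytes, regardless of how many signatures the table holds.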
According to one embodiment, the simpler instruction set in the programmable assembly-like computing language is based on instructions from an existing anti-virus program (which includes any engine and/or signature file for detecting any type of malicious code). Preferably, signature information of the pre-existing anti-virus program is merged into a single instruction in the programmable assembly-like computing language. By providing a less-complex scan engine and providing functionality via the signature file, flexibility is enhanced, which is ideal for wireless applications.
According to another embodiment, the instruction set is capable of implementing the functionality of a deterministic finite automaton (DFA) in a programmable assembly-like computing language. This allows detection of multiple viruses at the same time without having to scan for them individually. In such an embodiment, the machine begins with a pointer into the input stream and a start state. Based on what byte is found at the pointer, the machine moves to a specified state. For each transition, the pointer is moved forward to the next byte. The machine ends with a stop state that identifies which infection, if any, was found. The DFAs for several types of malicious code can be combined into a single DFA that scans for all such types of malicious code at the same time.
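A minimal sketch of such a combined automaton follows. The source does not say how the individual automata are combined; this example uses the well-known Aho-Corasick construction as one possible technique and materializes a full transition table so that every input byte causes exactly one state transition. Pattern names and contents are hypothetical.

```python
from collections import deque
from typing import Dict, List, Optional, Tuple

Dfa = Tuple[List[List[int]], List[Optional[str]]]


def build_dfa(patterns: Dict[str, bytes]) -> Dfa:
    """Combine several byte patterns into one automaton (Aho-Corasick construction)."""
    # Step 1: build a trie; state 0 is the start state.
    children: List[Dict[int, int]] = [{}]
    verdict: List[Optional[str]] = [None]  # verdict[s] names the pattern found at s
    for name, pattern in patterns.items():
        state = 0
        for b in pattern:
            if b not in children[state]:
                children.append({})
                verdict.append(None)
                children[state][b] = len(children) - 1
            state = children[state][b]
        verdict[state] = name

    # Step 2: breadth-first pass that fills a complete transition table.
    n = len(children)
    delta = [[0] * 256 for _ in range(n)]
    fail = [0] * n  # longest proper suffix of a state that is also a trie prefix
    queue = deque()
    for b, nxt in children[0].items():
        delta[0][b] = nxt
        queue.append(nxt)
    while queue:
        s = queue.popleft()
        if verdict[s] is None:
            verdict[s] = verdict[fail[s]]  # inherit a match that ends inside s
        for b in range(256):
            if b in children[s]:
                nxt = children[s][b]
                delta[s][b] = nxt
                fail[nxt] = delta[fail[s]][b]
                queue.append(nxt)
            else:
                delta[s][b] = delta[fail[s]][b]
    return delta, verdict


def scan(data: bytes, dfa: Dfa) -> Optional[str]:
    """Advance one state per input byte; stop at the first accepting state."""
    delta, verdict = dfa
    state = 0
    for byte in data:
        state = delta[state][byte]
        if verdict[state] is not None:
            return verdict[state]
    return None


# Hypothetical patterns, for illustration only.
DFA = build_dfa({"Example.A": b"abcd", "Example.B": b"bcx"})
```

The full table costs 256 entries per state, which trades memory for a constant amount of work per byte; a memory-constrained device might prefer a sparser encoding.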
The wireless client device can be a wireless telephone, a personal digital assistant, a handheld computer including a Blackberry-type device or PocketPC, a pager, etc. The instruction set preferably includes instructions for cleaning infected data. Such instructions can include instructions for deleting an item, truncating a file, copying bytes from one location to another, and/or overwriting bytes in a stream. The anti-virus program includes a signature file used by an anti-virus engine to identify malicious code. The signature file is preferably compiled utilizing the programmable assembly-like computing language. This allows the signature file to be smaller than it would be if the signature file were compiled from C. Preferably, the signature file includes an identifier uniquely identifying an instance of malicious code, a malicious code detection section comprising object code providing operations to detect the identified computer virus in the wireless client device, and an extension sentence comprising object code providing reusable operations implemented in the programmable assembly-like computing language.
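The cleaning operations named above (deleting an item, truncating a file, copying bytes, overwriting bytes) can be sketched as a small interpreter over an in-memory buffer. The operation names, tuple layout, and the convention of returning None for a deleted item are assumptions made for this example and are not taken from the source.

```python
from typing import List, Optional, Tuple


def apply_cleaning(data: bytes, ops: List[Tuple]) -> Optional[bytes]:
    """Apply hypothetical cleaning operations; return None if the item is deleted."""
    buf = bytearray(data)
    for op in ops:
        kind = op[0]
        if kind == "DELETE":
            # Delete the whole item; nothing remains to clean.
            return None
        elif kind == "TRUNCATE":
            # Keep only the first `length` bytes.
            _, length = op
            del buf[length:]
        elif kind == "COPY":
            # Copy `count` bytes from offset `src` to offset `dst`.
            _, src, dst, count = op
            if src + count > len(buf) or dst + count > len(buf):
                raise ValueError("COPY out of range")
            buf[dst:dst + count] = buf[src:src + count]
        elif kind == "OVERWRITE":
            # Replace bytes starting at `offset` with `payload`.
            _, offset, payload = op
            if offset + len(payload) > len(buf):
                raise ValueError("OVERWRITE out of range")
            buf[offset:offset + len(payload)] = payload
        else:
            raise ValueError(f"unknown cleaning op: {kind}")
    return bytes(buf)
```

For instance, `apply_cleaning(b"HEADERvirusTAIL", [("TRUNCATE", 6)])` keeps only the first six bytes of the buffer.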
A method for programmable scanning for malicious content on a thin client device is also provided. An anti-virus engine is assembled in a programmable computing language. The anti-virus engine is installed on a thin client device. A signature file is also assembled in a programmable computing language, the signature file containing an identifier uniquely identifying a computer virus and a virus detection section comprising object code providing operations to detect the identified computer virus on the thin client device. The signature file is also installed on the thin client device. The anti-virus engine is initiated for scanning for malicious code on the thin client device utilizing the signature file.
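To make the shape of such a signature file concrete, the sketch below serializes entries that each hold a virus identifier and a detection section of object code. The binary layout (a 4-byte big-endian identifier, a 2-byte length, then the object code) is purely an assumption for illustration; the source does not specify the file format.

```python
import struct
from typing import List, Tuple

# Hypothetical entry header: 4-byte big-endian virus id, 2-byte code length.
HEADER = struct.Struct(">IH")

Entry = Tuple[int, bytes]  # (virus identifier, detection object code)


def pack_signatures(entries: List[Entry]) -> bytes:
    """Serialize entries into a single signature-file blob."""
    out = bytearray()
    for virus_id, code in entries:
        out += HEADER.pack(virus_id, len(code))
        out += code
    return bytes(out)


def unpack_signatures(blob: bytes) -> List[Entry]:
    """Parse a blob produced by pack_signatures back into entries."""
    entries: List[Entry] = []
    offset = 0
    while offset < len(blob):
        virus_id, length = HEADER.unpack_from(blob, offset)
        offset += HEADER.size
        code = blob[offset:offset + length]
        if len(code) != length:
            raise ValueError("truncated detection section")
        entries.append((virus_id, code))
        offset += length
    return entries
```

An engine on the device would then iterate over the unpacked entries and execute each detection section against the data being scanned.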
In one embodiment, an extension sentence is added to the signature file. The extension sentence includes object code providing reusable operations implemented in the programmable computing language. In another embodiment, the anti-virus engine utilizes a deterministic finite automaton (DFA) for pattern matching. Preferably, the DFAs for several types of malicious code are combined into a single DFA that scans for those types of malicious code simultaneously. The thin client device can be a wireless telephone, a personal digital assistant, a handheld computer, a pager, etc. The signature file preferably includes instructions for cleaning infected data. Such instructions can include instructions for deleting an item, truncating a file, copying bytes from one location to another, and/or overwriting bytes in a stream.