The present invention relates in general to storage of transient message packets and, in particular, to a system and method for providing a multi-tiered hierarchical transient message store accessed using multiply hashed unique filenames.
Computer viruses, or simply "viruses," are executable programs or procedures, often masquerading as legitimate files, messages or attachments, that cause malicious and sometimes destructive results. More precisely, computer viruses include any form of self-replicating computer code which can be stored, disseminated, and directly or indirectly executed by unsuspecting clients. Viruses travel between machines over network connections or via infected media and can be executable code disguised as application programs, functions, macros, electronic mail (email) attachments, images, applets, and even hypertext links.
The earliest computer viruses infected boot sectors and files. Over time, computer viruses became increasingly sophisticated and diversified into various genres, including cavity, cluster, companion, direct action, encrypting, multipartite, mutating, polymorphic, overwriting, self-garbling, and stealth viruses, such as described in "Virus Information Library," http://vil.mcafee.com/default.asp?, Networks Associates Technology, Inc., (2001), the disclosure of which is incorporated by reference. Macro viruses are presently the most popular form of virus. These viruses are written as scripts in macro programming languages, which are often included with email as innocuous-looking attachments.
The problems presented by computer viruses, malware, and other forms of bad content are multiplied within a bounded network domain interfacing to external internetworks through a limited-bandwidth service portal, such as a gateway, bridge or similar routing device. The routing device logically forms a protected enclave within which clients and servers exchange data, including email and other content. All data originating from or being sent to systems outside the network domain must pass through the routing device. Maintaining high throughput at the routing device is paramount to optimal network performance.
Routing devices provide an efficient solution to interfacing an intranetwork of clients and servers to external internetworks. Most routing devices operate as store-and-forward packet routing devices, which can process a high volume of traffic transmitting across the network domain boundary. These devices can be coupled to specialized antivirus systems that intercept transient messages at the network domain boundary to guard against the introduction of messages containing viruses, malware and other forms of bad content.
To ensure minimal effect on packet throughput, antivirus systems typically stage the intercepted messages in an intermediate store or queue pending processing by the antivirus system. The intermediate store, however, can itself become a bottleneck at the network boundary: when processing lags behind message arrival, staged messages accumulate, packet delivery is delayed, and network performance can degrade.
One particular form of antivirus system combines packet screening and content scanning using functionally separate modules respectively to screen the contents of message header fields and to scan the contents of each message body and any attachments, including embedded attachments. Screened messages are staged in an intermediate message queue pending scanning. As the screener processes transient messages at a higher rate than the antivirus scanner, the message queue can potentially become saturated with screened messages and cause delay in packet delivery.
In addition, the actual messages staged in the intermediate message store are physically stored as individual files using the file system supported by the host upon which the antivirus system operates. File naming conventions and directory structures and capacities, though, are system-dependent and can vary greatly between different operating system platforms. Accordingly, each antivirus system must be customized to operate within the confines of each specific file system. As well, limitations in file names and directory capacity can rapidly be exceeded in a high packet throughput environment.
Therefore, there is a need for an approach to providing a portable intermediate storage structure for staging transient message packets intercepted at a network domain boundary. Preferably, such an approach would allow rapid message storage and retrieval using a unique file naming scheme.
There is a further need for an approach to supporting an extensible message queuing structure. Preferably, such an approach would allow dynamic and flexible capacity resizing.
The present invention provides a system and method for efficiently staging transient message packets in a portable intermediate message store. Incoming message packets are intercepted and screened for readily-discoverable characteristics indicative of an infected message. A unique filename is generated for each screened message and a pair of index node and storage node identifiers are calculated from the unique filename. The identifiers are stored in a unique filename table associated with the message. The message is physically stored in a hierarchical message store using the index node and storage node identifiers for subsequent retrieval and scanning.
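The naming and table-lookup steps just summarized can be illustrated with a minimal Python sketch. It is not the claimed implementation: the use of a UUID as the unique filename, the SHA-256 digest as the source of both identifiers, the node counts, and the names `new_unique_filename`, `node_identifiers` and `FilenameTable` are all assumptions made for illustration.

```python
import hashlib
import uuid
from typing import Dict, Tuple

# Assumed fan-out values; the source does not specify node counts.
NUM_INDEX_NODES = 16
NUM_STORAGE_NODES = 64


def new_unique_filename() -> str:
    """Return a filename that is unique per message (assumption: a random UUID)."""
    return uuid.uuid4().hex


def node_identifiers(filename: str) -> Tuple[int, int]:
    """Derive an (index node, storage node) identifier pair from the filename."""
    digest = hashlib.sha256(filename.encode("utf-8")).digest()
    # Disjoint digest bytes feed the two identifiers so that they vary independently.
    index_id = int.from_bytes(digest[:4], "big") % NUM_INDEX_NODES
    storage_id = int.from_bytes(digest[4:8], "big") % NUM_STORAGE_NODES
    return index_id, storage_id


class FilenameTable:
    """Hypothetical in-memory table mapping each unique filename to its identifiers."""

    def __init__(self) -> None:
        self._entries: Dict[str, Tuple[int, int]] = {}

    def register(self, filename: str) -> Tuple[int, int]:
        """Compute and record the identifier pair for a screened message."""
        ids = node_identifiers(filename)
        self._entries[filename] = ids
        return ids

    def lookup(self, filename: str) -> Tuple[int, int]:
        """Return the recorded identifier pair for later retrieval and scanning."""
        return self._entries[filename]
```

Because the identifiers are a pure function of the filename, the table in this sketch is only a cache: a scanner holding the filename alone could recompute the same pair.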
An embodiment of the present invention provides a system and method for providing a multi-tiered hierarchical transient message store accessed using multiply hashed unique filenames. A hierarchical message store is maintained. The hierarchical message store is logically structured with a plurality of storage nodes. Each storage node is dependently linked to one of a plurality of index nodes. Each index node is dependently linked to a root node. An incoming message is intercepted at a network domain boundary and assigned a unique filename. An index hash of the unique filename, corresponding to one such index node, and a storage hash of the unique filename, corresponding to one such storage node, are generated. The message is stored in the hierarchical message store at the one such index node and the one such storage node.
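One way to picture the root, index and storage tiers is as three directory levels on the host file system. The Python sketch below assumes that mapping; the class name `HierarchicalStore`, the labeled SHA-256 hashes, the default node counts and the two-digit hexadecimal directory names are hypothetical choices, not details taken from the specification.

```python
import hashlib
import os
import tempfile


class HierarchicalStore:
    """Sketch of a root -> index node -> storage node layout using directories."""

    def __init__(self, root: str, index_nodes: int = 16, storage_nodes: int = 16) -> None:
        self.root = root
        self.index_nodes = index_nodes
        self.storage_nodes = storage_nodes

    @staticmethod
    def _bucket(label: bytes, filename: str, buckets: int) -> int:
        # A distinct label per tier yields two different hashes of one filename.
        digest = hashlib.sha256(label + filename.encode("utf-8")).digest()
        return int.from_bytes(digest[:4], "big") % buckets

    def path_for(self, filename: str) -> str:
        """Map a unique filename to <root>/<index node>/<storage node>/<filename>."""
        index_id = self._bucket(b"index:", filename, self.index_nodes)
        storage_id = self._bucket(b"storage:", filename, self.storage_nodes)
        return os.path.join(self.root, f"{index_id:02x}", f"{storage_id:02x}", filename)

    def store(self, filename: str, payload: bytes) -> str:
        """Write the message under its index and storage nodes, creating them on demand."""
        path = self.path_for(filename)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "wb") as handle:
            handle.write(payload)
        return path

    def retrieve(self, filename: str) -> bytes:
        """Recompute the path from the filename alone and read the message back."""
        with open(self.path_for(filename), "rb") as handle:
            return handle.read()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        store = HierarchicalStore(root)
        store.store("msg-0001", b"Subject: hello\r\n\r\nbody")
        print(store.retrieve("msg-0001"))
```

Spreading files over index and storage directories keeps any single directory small, which is one plausible way to stay within the per-directory capacity limits discussed in the background. Short hexadecimal directory names are also likely to be valid on most file systems.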
A further embodiment provides a system and a method for providing a multi-tiered hierarchical transient message store accessed using multiply hashed unique filenames. A unique filename identifying an incoming message packet intercepted entering a bounded network domain is generated. An index checksum is calculated from the unique filename using a seed value associated with an index level in a hierarchical message store. A storage checksum is calculated from the unique filename using a seed value associated with a storage level in the hierarchical message store. The incoming message packet is stored in an index node in the index level and in a storage node that is in the storage level and dependent on the index node. The index node and storage node are respectively indexed by the index checksum and the storage checksum.
Still other embodiments of the present invention will become readily apparent to those skilled in the art from the following detailed description, wherein are described embodiments of the invention by way of illustrating the best mode contemplated for carrying out the invention. As will be realized, the invention is capable of other and different embodiments and its several details are capable of modifications in various obvious respects, all without departing from the spirit and the scope of the present invention. Accordingly, the drawings and detailed description are to be regarded as illustrative in nature and not as restrictive.
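The per-level seeding described above can be sketched as follows. The specification does not name a checksum algorithm here, so this example substitutes a salted BLAKE2b digest from Python's standard `hashlib` as a stand-in for a seeded checksum; the seed values, bucket counts and the names `seeded_checksum` and `locate` are hypothetical.

```python
import hashlib
from typing import Tuple

# Hypothetical seed values, one per level of the hierarchy.
INDEX_SEED = 0x1D0F
STORAGE_SEED = 0xA5A5


def seeded_checksum(filename: str, seed: int, buckets: int) -> int:
    """Reduce a filename to a bucket number using a level-specific seed.

    Stand-in technique: BLAKE2b with the seed supplied as its salt,
    truncated to 32 bits, then reduced modulo the number of nodes.
    """
    salt = seed.to_bytes(8, "big")  # BLAKE2b accepts a salt of up to 16 bytes
    digest = hashlib.blake2b(filename.encode("utf-8"), digest_size=4, salt=salt).digest()
    return int.from_bytes(digest, "big") % buckets


def locate(filename: str, index_buckets: int = 16, storage_buckets: int = 256) -> Tuple[int, int]:
    """Return the (index checksum, storage checksum) pair for a unique filename."""
    return (
        seeded_checksum(filename, INDEX_SEED, index_buckets),
        seeded_checksum(filename, STORAGE_SEED, storage_buckets),
    )
```

A salted cryptographic digest is used in place of a simple seeded CRC because CRC values computed from the same input with two different seeds are linearly related, so the storage checksum would tend to track the index checksum rather than spread messages independently within each index node.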