1. Field of the Invention
This invention relates to the field of data processing systems. More particularly, this invention relates to the malware scanning of received messages.
2. Description of the Prior Art
It is known to provide malware scanners that scan received messages for malware such as computer viruses, worms, Trojans, banned files, banned words, banned images and the like. An example of such a malware scanner is one in which a MIME message received by an e-mail system is scanned to see if it contains malware of any of the above-mentioned types. The MIME message format is widely used to transfer e-mail messages. It is common for e-mail messages to contain one or more attached files, and these attached files often constitute the malware against which the malware scanner is designed to protect the system. The MIME message format divides the total message into different portions, separated by predetermined tags, with each attachment carried in encoded form within its own portion. When malware scanning such a MIME message, the entire MIME message must be processed to identify the tags which separate the different portions of the message, and those separate portions must then be decoded and malware scanned as required. Whilst the MIME message format is highly adaptable and flexible, it presents a difficulty to malware scanners in that a disadvantageously large processing requirement is imposed by the need to traverse the entire MIME message to identify all its portions and then decode those portions prior to scanning.
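The traverse-then-decode workload described above can be illustrated with a minimal sketch using Python's standard `email` package. The function names `extract_parts` and `build_sample` are hypothetical and chosen only for illustration; the sketch does not represent any particular scanner, and the scanning step itself is omitted.

```python
from email import message_from_bytes
from email.message import EmailMessage


def extract_parts(raw: bytes):
    """Parse the entire message, then decode every non-container portion.

    Returns a list of (content_type, decoded_bytes) tuples that a scanner
    could then inspect. The whole message must be parsed before any
    portion is available for scanning.
    """
    msg = message_from_bytes(raw)
    parts = []
    for part in msg.walk():
        if part.is_multipart():
            # Container portions hold only other portions; skip them.
            continue
        # decode=True reverses the transfer encoding (e.g. Base64).
        parts.append((part.get_content_type(), part.get_payload(decode=True)))
    return parts


def build_sample() -> bytes:
    """Build a small MIME message with one binary attachment (test data)."""
    msg = EmailMessage()
    msg["Subject"] = "sample"
    msg.set_content("hello")
    msg.add_attachment(
        b"\x00\x01binary payload",
        maintype="application",
        subtype="octet-stream",
        filename="a.bin",
    )
    return msg.as_bytes()


if __name__ == "__main__":
    for content_type, data in extract_parts(build_sample()):
        print(content_type, len(data))
```

In this sketch both costs are visible: the parser must read the complete message to locate the boundaries, and each portion must then be decoded before its original bytes can be examined.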
Another disadvantage of MIME messaging is that the payload data is encoded. Thus, a computer file being transferred within a MIME message is encoded into a new form which is included within the message and requires decoding by the receiver in order to recover the original computer file. This is inefficient in terms of the increased computer processing required. Furthermore, certain computer files may be in a form that is highly compressed and the encoding may make them disadvantageously larger. Furthermore, digital signatures and other security measures may be disrupted by the encoding and decoding imposed by the MIME message format.
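The size penalty can be illustrated with a short sketch. It assumes Base64, the transfer encoding commonly used for binary attachments in MIME messages; the text above does not name a specific encoding. The function name `encoded_size` is hypothetical, and random bytes stand in for a highly compressed file.

```python
import base64
import os


def encoded_size(payload: bytes) -> int:
    """Return the size of the payload after Base64 encoding (no line breaks)."""
    return len(base64.b64encode(payload))


if __name__ == "__main__":
    # Random bytes approximate an already-compressed file: they cannot be
    # shrunk further, so the encoding overhead is pure loss.
    original = os.urandom(3000)
    encoded = base64.b64encode(original)
    print(len(original), "bytes before encoding")
    print(len(encoded), "bytes after encoding")
    # The receiver must decode to recover the original file.
    assert base64.b64decode(encoded) == original
```

Base64 maps every 3 input bytes to 4 output characters, so the encoded form is roughly one third larger before any line breaks are added.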
In order to address the above problems of the MIME message format that arise through the encoding and decoding of computer files, a new message format, the DIME format, has been proposed. In this message format, computer files are embedded within the message in their native binary form without encoding. As the binary sequence within the embedded data is no longer controlled by the message format, tags can no longer be reliably used to separate different portions of the message, since a computer file may as a matter of chance contain a particular sequence of bytes that corresponds to a tag and that would be inappropriately interpreted as a division between different portions of the message. Instead, the DIME format breaks the message down into a plurality of data records, each having a header including data indicating the length of that data record, such that the message can be read and broken down into its respective data records at the receiver.
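The difference between tag-delimited and length-prefixed framing can be sketched as follows. The record layout used here, a 4-byte big-endian length followed by the raw payload, is a simplified assumption for illustration only and is not the actual DIME header layout. The names `pack_records`, `unpack_records` and `TAG` are hypothetical.

```python
import struct

TAG = b"--boundary"  # hypothetical separator tag, for comparison only


def pack_records(payloads):
    """Frame each payload as: 4-byte big-endian length, then raw bytes."""
    return b"".join(struct.pack(">I", len(p)) + p for p in payloads)


def unpack_records(message: bytes):
    """Split a message into records using only the length headers."""
    records = []
    offset = 0
    while offset < len(message):
        if offset + 4 > len(message):
            raise ValueError("truncated record header")
        (length,) = struct.unpack_from(">I", message, offset)
        offset += 4
        if offset + length > len(message):
            raise ValueError("truncated record payload")
        records.append(message[offset:offset + length])
        offset += length
    return records


if __name__ == "__main__":
    # The first payload happens to contain the tag's byte sequence.
    payloads = [b"abc" + TAG + b"--def", b"\x00\xff"]

    # Tag-based splitting misreads the embedded bytes as a division.
    print(len(TAG.join(payloads).split(TAG)), "portions found by tag splitting")

    # Length-based framing recovers the original payloads unchanged.
    print(unpack_records(pack_records(payloads)) == payloads)
```

Because the receiver relies on the declared length rather than on the payload contents, unencoded binary data can contain any byte sequence without being misinterpreted as a division between records.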