With the advent of the Internet, email has become prevalent in digital communications. For example, email messages are exchanged on a daily basis to conduct business, to maintain personal contacts, to send and receive files, etc. Unfortunately, undesired email messages have also become prevalent with increased email traffic. These email messages frequently include unsolicited advertisements, commonly referred to as “junk mail” or “spam.” In some cases, these email messages contain software viruses that seek to adversely impact computer functions.
Some users may have email accounts that they never use, or use less frequently over time. In accordance with a user agreement, an Internet service provider (ISP) cannot access or close abandoned email accounts until a period of time has passed (e.g., after one year). During that time, the account may continue to accumulate spam. Because spam messages are often image files or contain attachments that are larger than standard email text files, spam messages tend to consume a disproportionate amount of resources. The ISP is responsible for storing all of the received messages on ISP servers, thereby wasting storage system resources and potentially increasing operating costs.
Currently, software applications exist which remove some of the spam or junk mail from a recipient's email account, thereby reducing mail box clutter. Some of these applications remove email messages that contain a particular text string or character(s) or types of content (e.g., large image files) that may indicate that the email message is spam or junk mail. Email messages that are determined to be spam or junk mail are then either removed (e.g., permanently deleted, stored in a recycle bin, etc.) or stored in a designated folder (e.g., “trash” folder, “junk” folder, etc.).
One type of email message filtering application compares a signature associated with an email message to a list of signatures that identify email messages known to include unwanted content (e.g., spam, a virus, etc.). If there is a signature match, the email message containing the unwanted content is discarded. If the signature of the email message does not match a signature in the list (e.g., because the email message has not been identified as including unwanted content), the email message is presumed to be legitimate and is allowed to be stored in the subscriber's mail system inbox.
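By way of illustration only, the signature comparison described above may be sketched as follows. The sketch assumes that a signature is a SHA-256 digest of the raw message bytes; actual filtering applications may compute signatures differently, and the names `signature`, `filter_message`, and `known_bad` are hypothetical.

```python
import hashlib
from typing import Set


def signature(message: bytes) -> str:
    # Assumption: the signature is a SHA-256 digest of the raw message bytes.
    return hashlib.sha256(message).hexdigest()


def filter_message(message: bytes, known_bad: Set[str]) -> str:
    # A signature match means the message is known to include unwanted
    # content and is discarded; otherwise it is presumed legitimate.
    if signature(message) in known_bad:
        return "discard"
    return "deliver"


# Hypothetical signature list containing one known spam message.
spam = b"Buy cheap watches now!"
known_bad = {signature(spam)}

spam_result = filter_message(spam, known_bad)
legit_result = filter_message(b"Meeting moved to 10am", known_bad)
```

In this sketch, a message whose signature is absent from the list is delivered regardless of its actual content, which is the presumption of legitimacy described above.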
The algorithms that compare a signature associated with an email message to a list of signatures that identify email messages known to include unwanted content are executed on-the-fly (i.e., in real time or near real time) when the email message enters a gateway or other element coupled to a mail server. However, in some operational situations and with certain types of algorithms, there may not be enough time to thoroughly scan each email message. Thus, an email message may not be accurately identified as containing unwanted content before the email message is forwarded to a mail box.
Furthermore, the signature list may not include signatures for all email messages that include unwanted content. For example, a signature may not be included in the signature list because the unwanted content has been recently generated and the signature list has not been updated by the time the email message is sent to the recipient. Thus, an email message that includes unwanted content may be delivered to a recipient's mail box because the signature for that email message is not included in the signature list. The signature list may be subsequently updated to include the signature. However, the email message has already been delivered to a recipient's mail box. Thus, it is too late for the unwanted content to be filtered from the recipient's email account in the usual manner.
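The timing gap described above may be illustrated with a minimal sketch. It assumes, for illustration only, that signatures are SHA-256 digests and that filtering occurs solely at delivery time; `known_bad`, `mailbox`, and `deliver` are hypothetical names.

```python
import hashlib


def signature(message: bytes) -> str:
    # Assumption: the signature is a SHA-256 digest of the raw message bytes.
    return hashlib.sha256(message).hexdigest()


known_bad = set()   # signature list, not yet updated
mailbox = []        # recipient's mail box


def deliver(message: bytes) -> bool:
    # Filtering happens only at delivery time.
    if signature(message) in known_bad:
        return False
    mailbox.append(message)
    return True


new_spam = b"Recently generated unwanted content"

# The message arrives before its signature is in the list, so it is delivered.
delivered = deliver(new_spam)

# The signature list is updated afterwards.
known_bad.add(signature(new_spam))

# A later copy is blocked, but the earlier copy remains in the mail box.
blocked_later = not deliver(new_spam)
still_in_mailbox = new_spam in mailbox
```

Because the filter in this sketch runs only when a message arrives, updating the list has no effect on messages that have already been stored.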
Therefore, what is needed is a way to detect unwanted digital content that was not detected by conventional mail or message filters.