The invention relates generally to computer systems, and deals more particularly with a technique to reduce the impact of spam on a mail server.
The Internet is well known today, and comprises a vast number of user computers and servers interconnected via routers, firewalls and networks. One role of the Internet is to provide a medium to exchange e-mail. A common problem today is “spam”, where a source server sends commercial e-mails via the Internet to numerous (thousands, even millions of) user computers via the users' mail servers. Each mail server provides an e-mail transfer function for multiple user computers. The spam clogs the Internet, the mail servers and the mail boxes of the user computers. Some mail servers may be so busy handling the spam that they have little time to handle or transfer legitimate e-mail. As a result, the legitimate e-mail is handled very slowly.
It is known that an enterprise's intranet can be protected with a firewall. The firewall may be located between the enterprise's mail server and the Internet. The firewall can be programmed to block e-mail from source IP addresses of likely spam servers. Spam detectors and filters, such as the “Spam-Assassin™” (trademark of Apache Software Foundation) program, are well known today. Typically, the spam detector and filter are installed at an edge router or a firewall for a mail server. The spam detector reviews incoming e-mail and calculates a “spam likelihood score” for each e-mail based on its characteristics and the weight of each characteristic. These characteristics include (a) key words characteristic of marketing material, such as “free” and “real-estate”, (b) whether the e-mail is HTML type, (c) whether the e-mail is malformed HTML type (which is more characteristic of marketing material than carefully written HTML), (d) whether the e-mail text omits the first or last name of the intended recipient, (e) whether the subject line is blank or has certain words characteristic of marketing, (f) whether the identity listed in the “from” field matches the location of the source IP address, (g) whether the e-mail includes colors, (h) whether the e-mail has some text in larger font than ordinarily used for noncommercial e-mail, and (i) whether the text is similar to or identical with other e-mails from the same source. A known spam detector can also consider whether multiple, similar e-mails (i.e. the same or substantially the same text or the same subject) are addressed to multiple different recipients/users and originate from the same source IP address. The spam detector would ignore e-mails sent from known, legitimate sources, such as e-mails from employees of the same corporation to which the e-mails are sent; these e-mails are not considered to be spam.
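The weighted scoring described above can be illustrated with a minimal sketch. The characteristic names, the weight values and the function `spam_likelihood_score` are hypothetical, chosen only for illustration; they are not taken from any particular spam detector.

```python
from typing import Dict, Set

# Hypothetical weights, one per characteristic (a) through (i) listed above.
# Real detectors tune such weights; these values are illustrative only.
WEIGHTS: Dict[str, float] = {
    "marketing_keywords": 2.0,    # (a) words such as "free"
    "is_html": 0.5,               # (b)
    "malformed_html": 1.5,        # (c)
    "omits_recipient_name": 1.0,  # (d)
    "suspicious_subject": 1.0,    # (e) blank or marketing-like subject
    "from_mismatch": 2.0,         # (f) "from" identity vs. source IP location
    "uses_colors": 0.5,           # (g)
    "large_font": 0.5,            # (h)
    "similar_to_others": 2.5,     # (i) near-duplicate text from same source
}


def spam_likelihood_score(
    characteristics: Set[str],
    source_ip: str,
    legitimate_sources: Set[str],
) -> float:
    """Sum the weights of the characteristics detected in one e-mail.

    E-mails from known, legitimate sources are ignored (scored 0.0).
    """
    if source_ip in legitimate_sources:
        return 0.0
    return sum(WEIGHTS[c] for c in characteristics if c in WEIGHTS)
```

For example, under these assumed weights an e-mail showing marketing key words and malformed HTML from an unlisted source would score 2.0 + 1.5 = 3.5, while the same e-mail from a listed source would score 0.0.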
The legitimate sources may be found in a list supplied by a system administrator, and accessible to the spam detector. If the spam likelihood score exceeds a predetermined upper threshold, then the e-mail is very likely to be spam. In such a case, the spam detector reads the IP address of the sender, and then blocks subsequent e-mails from the same IP address and/or e-mail address by creating a corresponding spam filter rule. Each spam filter rule may specify a source IP address and/or e-mail address from which e-mail will not be accepted. The spam filter rule is enforced at the firewall or router, or at a gateway server in the absence of a firewall or router. The spam filter rule may remain in effect indefinitely or for a predetermined amount of time, but rules can be removed periodically when there are too many to handle efficiently.
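The threshold test and the rule lifecycle described above might be sketched as follows. The class `SpamFilter`, the threshold value, the rule limit and the choice to drop the oldest rules first are all assumptions made for illustration; the known technique does not prescribe them.

```python
from dataclasses import dataclass
from typing import Dict, Optional

UPPER_THRESHOLD = 5.0  # hypothetical "predetermined upper threshold"


@dataclass
class FilterRule:
    source_ip: str
    expires_at: Optional[float]  # None means the rule is in effect indefinitely


class SpamFilter:
    def __init__(self, ttl_seconds: Optional[float] = None, max_rules: int = 1000):
        self.ttl_seconds = ttl_seconds
        self.max_rules = max_rules
        self.rules: Dict[str, FilterRule] = {}  # keyed by source IP, oldest first

    def observe(self, source_ip: str, score: float, now: float) -> None:
        """Create a blocking rule when the score exceeds the upper threshold."""
        if score > UPPER_THRESHOLD:
            expires = None if self.ttl_seconds is None else now + self.ttl_seconds
            self.rules[source_ip] = FilterRule(source_ip, expires)

    def is_blocked(self, source_ip: str, now: float) -> bool:
        """Return True if an unexpired rule exists for this source IP."""
        rule = self.rules.get(source_ip)
        if rule is None:
            return False
        return rule.expires_at is None or now < rule.expires_at

    def prune(self, now: float) -> None:
        """Drop expired rules, then the oldest rules if still over the limit."""
        self.rules = {
            ip: r for ip, r in self.rules.items()
            if r.expires_at is None or now < r.expires_at
        }
        while len(self.rules) > self.max_rules:
            del self.rules[next(iter(self.rules))]
```

In this sketch a rule is keyed by source IP address only; a rule keyed by sender e-mail address could be handled the same way with a second dictionary.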
A problem with the foregoing spam blocking technique is that some e-mails are erroneously presumed to be spam based on their spam likelihood score or other factors. For example, a CEO or customer of a corporation may send an e-mail from an unrecognized computer to a large number of employees of the same corporation. In such a case, the known spam detector may presume the e-mail to be spam, and block it.
An object of the present invention is to better manage suspected spam which may actually be legitimate e-mail.