1. Field of Invention
This invention relates to electronic mail, specifically an improved method for filtering electronic mail.
2. Prior Art
Electronic mail, the electronic equivalent of paper-based letters and memoranda, is a widely used means of written communication. Its primary advantages over other forms of written communication are its short delivery time and its low cost. These two factors contribute greatly to electronic mail's current popularity.
Unfortunately, this popularity comes at a price. Because electronic mail is easy, quick, and inexpensive to send, people tend to create more of it. This forces the receiver to read and sort ever-increasing quantities of mail, a task that can take a considerable amount of time and effort.
To reduce this burden, methods for automatically analyzing and filtering incoming electronic mail have been developed.
Filtering
Most filtering methods build upon the premise that the receiver is not equally interested in all types of subject matter. Such methods perform some type of content based filtering on electronic mail. Content based filters examine the address of the sender of the electronic mail, the subject of the electronic mail, or the body of the electronic mail in order to decide what action to take.
One of the most popular stand-alone packages available, and one of the best examples of the state of the art, is a package called Procmail. Procmail, created by Stephen R. van den Berg, is an autonomous electronic mail processor. Once properly configured with a set of patterns and their associated actions, it goes to work on incoming electronic mail. Procmail examines each piece of electronic mail. It looks for a sequence of characters matching any one of the set of user-defined patterns. If such a sequence is found, Procmail performs the associated action on the piece of electronic mail.
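The pattern-and-action scheme described above can be illustrated with a minimal sketch. This is not Procmail's configuration syntax or its implementation; the names `Rule`, `file_into`, `discard`, and `process`, the example address, and the example patterns are all hypothetical and chosen only for illustration.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Rule:
    """A user-defined pattern paired with the action to perform on a match."""
    pattern: "re.Pattern[str]"
    action: Callable[[str], str]


def file_into(folder: str) -> Callable[[str], str]:
    """Build an action that files the message in the named folder."""
    def action(message: str) -> str:
        return f"filed in {folder}"
    return action


def discard(message: str) -> str:
    """Action that throws the message away."""
    return "discarded"


# Hypothetical rule set; a real user would supply their own patterns.
RULES: List[Rule] = [
    Rule(re.compile(r"^From:.*newsletter@example\.com", re.MULTILINE),
         file_into("newsletters")),
    Rule(re.compile(r"make money fast", re.IGNORECASE), discard),
]


def process(message: str, rules: List[Rule] = RULES) -> str:
    """Apply the action of the first rule whose pattern occurs in the message."""
    for rule in rules:
        if rule.pattern.search(message):
            return rule.action(message)
    # No pattern matched, so the message is delivered normally.
    return "delivered to inbox"
```

In this sketch the first matching rule wins, which is one possible design choice; a filter could equally apply every matching rule in turn.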
An article by Arensburger and Rosenfeld, entitled "To Take Arms Against a Sea of Email", published in Communications of the ACM, Vol. 38, No. 3, March 1995, describes a method for filtering electronic mail. Their scheme, nicknamed Jeeves, uses the MH mail system and a Perl package to filter electronic mail based on who sent it and to whom it was addressed.
Examples of inventions claiming filtering functionality include:
U.S. Pat. No. 5,377,354 to Scannell et al. describes a method of prioritizing electronic mail based on stored rules. The system relies on keywords chosen by the user which, when found in the body of a piece of electronic mail, provide the basis for prioritization.
U.S. Pat. No. 5,555,346 to Gross et al. describes a system for triggering events in response to electronic mail messages. The events could include the filtering of electronic mail messages.
U.S. Pat. No. 5,613,108 to Morikawa describes a method of storing a data file contained in an electronic mail message in a folder chosen by classifying the data file according to specific data in that message.
U.S. Pat. No. 5,619,648 to Canale et al. describes a technique for reducing the amount of electronic mail received by a user of an electronic mail system. Their solution involves adding, on the sender's side, non-address information to the electronic mail. The receiver's electronic mail filter has access to a model of the user and his or her preferences. The electronic mail filter uses the non-address information and the model information to determine whether or not the electronic mail should be provided to the user.
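The last approach in the list above, in which sender-supplied non-address information is compared against a model of the user, can be sketched roughly as follows. This is an illustrative assumption about how such a comparison might look, not the technique claimed by Canale et al.; the `Message` and `UserModel` types, the `topics` field, and the `should_deliver` function are hypothetical.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, Set


@dataclass
class Message:
    sender: str
    body: str
    # Hypothetical non-address information attached on the sender's side.
    topics: FrozenSet[str] = frozenset()


@dataclass
class UserModel:
    # Hypothetical model of the receiver: the topics he or she cares about.
    interests: Set[str] = field(default_factory=set)


def should_deliver(message: Message, model: UserModel) -> bool:
    """Deliver only if the sender's topics overlap the receiver's interests."""
    return bool(message.topics & model.interests)
```

Note that a filter of this kind is only as good as the sender-supplied information it relies on.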
Junk Electronic Mail
A major annoyance associated with conventional mail is junk mail. Junk mail is typically unsolicited, is often distributed in large quantities, and is, by definition, of little interest to most of its recipients. Unfortunately, electronic mail systems suffer from their own version of the same.
Junk electronic mail is similar in spirit to junk mail, junk phone calls, and junk faxes. In each case the receiver receives unwanted material or solicitations from another party. Junk electronic mail, however, has two significant advantages over the others. First, electronic mail is easy to send in large quantities. Off-the-shelf software can automate much of this process as well as assemble target mailing lists and handle responses. Second, electronic mail is inexpensive, especially when compared with the cost of bandwidth-hungry network activities such as file transfers and image data transfers.
Junk mail (including junk electronic mail) depends for its success on its ability to satisfy two requirements. The first requirement is broad distribution. Broad distribution is necessary because typically only a small number, perhaps 1 or 2 percent, of the targeted recipients ever respond. The second requirement is low incremental cost. Because so few recipients respond, distribution costs will quickly consume profits--unless those distribution costs are small.
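The interaction of these two requirements can be made concrete with a small calculation. The following sketch assumes a simple model in which every response yields the same revenue and every message costs the same amount to send; the function names and all of the figures in the usage note are hypothetical.

```python
def campaign_profit(recipients: int,
                    response_rate: float,
                    revenue_per_response: float,
                    cost_per_message: float) -> float:
    """Profit of a mailing under the simple model described in the lead-in."""
    revenue = recipients * response_rate * revenue_per_response
    cost = recipients * cost_per_message
    return revenue - cost


def break_even_cost(response_rate: float, revenue_per_response: float) -> float:
    """Highest per-message cost at which the mailing does not lose money."""
    return response_rate * revenue_per_response
```

For example, with 100,000 recipients, a 1 percent response rate, and an assumed revenue of 10 per response, the break-even cost is about 0.10 per message. At an assumed cost of 0.50 per piece the mailing loses about 40,000, while at an assumed cost of 0.001 per message it earns about 9,900.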
Junk electronic mail is successful because it satisfies both of these requirements extremely well. Any successful solution to the problem of junk electronic mail must address this fact.
A solution is necessary. The author already receives more junk electronic mail than non-junk electronic mail on any given day.
Filtering Junk Electronic Mail
Although originally designed to sort and categorize electronic mail, electronic mail filters (such as those cited above) are now being applied to the problem of detecting and rejecting junk electronic mail. Existing filters, however, have seen only moderate success. In large part, this is because they all suffer from several flaws:
First, content based filtering is difficult to do correctly.
As Canale et al. say in their background, "A problem with all such filters is that sorting for another person is difficult, even for a human being, and if a filter is going to be useful, it cannot do much worse than a human would".
This flaw shows itself in the following dilemma: content based filtering will sooner or later either allow through a piece of electronic mail it should have rejected, or worse, reject a piece of electronic mail it should have allowed through. Systems that depend entirely on content based filtering are especially vulnerable. A successful solution to the problem of junk electronic mail must either dramatically improve upon current content based filtering methods, or it must reduce the significance of the role content based filtering plays.
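The dilemma can be seen in even a very small example. The sketch below assumes a naive keyword-counting filter with a rejection threshold; the keyword list, the sample messages, and the function names are hypothetical and do not represent any of the filters cited above.

```python
from typing import List, Tuple

# Hypothetical keywords that the filter treats as signs of junk.
SUSPICIOUS = ("free", "offer", "winner")


def score(text: str) -> int:
    """Count occurrences of suspicious keywords in the message text."""
    words = text.lower().split()
    return sum(words.count(keyword) for keyword in SUSPICIOUS)


def errors(samples: List[Tuple[str, bool]], threshold: int) -> Tuple[int, int]:
    """Return (junk allowed through, wanted mail rejected) for a threshold.

    A message is rejected when its score is at least the threshold.
    """
    junk_allowed = sum(1 for text, is_junk in samples
                       if is_junk and score(text) < threshold)
    wanted_rejected = sum(1 for text, is_junk in samples
                          if not is_junk and score(text) >= threshold)
    return junk_allowed, wanted_rejected


# Hypothetical labeled messages: (text, is_junk).
SAMPLES = [
    ("free offer you are a winner", True),
    ("limited time deal act now", True),
    ("lunch is free on friday", False),
    ("meeting notes attached", False),
]
```

With these samples, no threshold avoids both kinds of error: a threshold low enough to catch the second junk message also rejects wanted mail, and a threshold high enough to spare the lunch announcement lets junk through.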
Second, none of the prior art cited above encourages the sender to voluntarily limit distribution.
Filtering electronic mail on the receiver's side is much more expensive and time consuming than simply limiting its distribution on the sender's side. Because electronic mail is both easy and inexpensive to send, the sender has no incentive to identify those receivers likely to be interested in the electronic mail and to voluntarily limit its distribution to those individuals. On the contrary, the existence and use of content based filtering encourages the opposite behavior: in order to meet response goals, the sender must broaden his or her distribution. An effective solution to the problem of junk electronic mail must instead encourage the sender to narrow that distribution.
Third, none of the prior art cited above provides a mechanism by which the sender can influence the receiver side filtering process by rating his or her own electronic mail.
Research has shown that an electronic mail filtering system can effectively use information provided by the sender to more efficiently rate a piece of electronic mail. The fatal difficulty has always been honesty. The information provided by the sender must honestly and reliably convey information about the importance of the piece of electronic mail to the receiver. A successful solution will result in a system in which the receiver can trust the sender's evaluation of the importance of the electronic mail.
This invention provides an improved method and apparatus for filtering electronic mail that addresses each of the flaws mentioned above.