As more companies and individuals rely on email systems to communicate, there has been a rapid increase in spam and other undesirable email messages. Spam email messages are unsolicited email messages that are not of interest to the recipient and are usually sent to many recipients at one time. Examples of other undesirable email messages include computer viruses, which are programs or pieces of computer code that are loaded onto a user's computer without the user's knowledge and run against the user's wishes; phishing attacks, in which an email is sent to a user falsely claiming to be from an established legitimate enterprise in an attempt to scam the user into surrendering private information that will be used for identity theft; and pharming attacks, in which an email is sent requesting the user to visit a web site that appears legitimate, but the user is actually redirected to a web site where the user is encouraged to surrender private information that will be used for identity theft. Accordingly, methods have been developed to detect spam and other undesirable emails before they reach a user.
In one of the models in use today, an underlying classification engine maps each email to a value on a proprietary ordinal scale. In such prior models, the scale offers a finite number of buckets. To actually filter email, each value on that scale is mapped to an output device, such as the inbox, a junk email folder on the client side, a junk email folder on the server side (also called a quarantine folder), or the recycle bin. When using the scale, emails with low scores are mapped to the inbox and emails with high scores are mapped to one of the three spam containers.
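The score-to-destination mapping described above can be sketched as follows. This is only an illustrative sketch: the ten-bucket scale (0 through 9), the cut-off values, and the names `Destination`, `ROUTING`, and `route` are assumptions made for the example and are not taken from any particular prior model.

```python
from enum import Enum


class Destination(Enum):
    INBOX = "inbox"
    CLIENT_JUNK = "junk email folder (client side)"
    QUARANTINE = "junk email folder (server side, quarantine)"
    RECYCLE_BIN = "recycle bin"


# Hypothetical ordinal scale with ten buckets, 0 (least spam-like) to 9.
SCALE_MAX = 9

# Hypothetical routing table: (highest score routed here, destination).
# Entries are ordered by ascending upper bound and cover the whole scale.
ROUTING = [
    (3, Destination.INBOX),
    (5, Destination.CLIENT_JUNK),
    (7, Destination.QUARANTINE),
    (SCALE_MAX, Destination.RECYCLE_BIN),
]


def route(score: int) -> Destination:
    """Map a score on the ordinal scale to its output destination."""
    if not 0 <= score <= SCALE_MAX:
        raise ValueError(f"score {score} is outside the scale 0..{SCALE_MAX}")
    for upper_bound, destination in ROUTING:
        if score <= upper_bound:
            return destination
    raise AssertionError("routing table does not cover the scale")
```

With these assumed cut-offs, low scores such as `route(2)` go to the inbox, while higher scores go to one of the three spam containers.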
Spam filtering, however, is not an exact science. Mapping emails with middle scores is a matter of tradeoffs. Directing more score values to the spam destinations reduces the amount of spam that reaches the user, but it also filters out additional emails that should have reached the user. The table below illustrates the tradeoff.
                          Email is valid             Email is spam
Email considered valid    Good                       Spam in inbox
(and sent to inbox)
Email considered spam     Valid email filtered out   Good
(and filtered out)
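The tradeoff in the table above can be made concrete with a small sketch that counts the four outcomes for a given cut-off score. The sample scores and labels below are invented purely for illustration, and the names `tally` and `SAMPLE` and the rule "scores at or above the threshold are filtered" are assumptions of the example, not part of any described model.

```python
from collections import Counter


def tally(emails, threshold):
    """Count the four outcomes of the table for one cut-off score.

    emails: iterable of (score, is_spam) pairs.
    An email is filtered out when its score is >= threshold.
    """
    counts = Counter()
    for score, is_spam in emails:
        filtered = score >= threshold
        if filtered and is_spam:
            counts["spam filtered out"] += 1          # good
        elif filtered:
            counts["valid email filtered out"] += 1   # error
        elif is_spam:
            counts["spam in inbox"] += 1              # error
        else:
            counts["valid email in inbox"] += 1       # good
    return counts


# Made-up (score, is_spam) pairs, for illustration only.
SAMPLE = [
    (1, False), (2, False), (4, False), (5, True),
    (5, False), (6, True), (8, True), (9, True),
]

for threshold in (6, 4):
    print(threshold, dict(tally(SAMPLE, threshold)))
```

On this invented sample, lowering the threshold from 6 to 4 removes the one spam message from the inbox but filters out two valid emails, which is the tradeoff the table describes.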