Undesired email, commonly referred to as SPAM, is generally defined as bulk unsolicited email, typically sent for commercial purposes. SPAM is a significant problem for email administrators and users. At best, SPAM consumes resources on email systems, takes up email account holders' time to review and delete, and is generally frustrating and troublesome. At worst, SPAM can include malicious software and can damage software, systems and/or stored data.
Session Initiation Protocol (SIP) based voice communications are also subject to undesired messages, and such undesired messages are also referred to herein as SPAM. While not yet common, voice-related SPAM is expected to become a common problem as more users migrate from plain old telephone service (POTS) to SIP-based voice communications. For example, it is possible to send unsolicited commercial messages to every voice mailbox at an organization, consuming system resources and wasting users' time in reviewing and/or deleting the SPAM messages.
Much work has been undertaken in recent years to combat the growing problem of SPAM. One of the methods used to date to reduce undesired email SPAM is Bayesian filtering, wherein the content of each received email is examined for specified content to form a statistical decision as to whether the email constitutes SPAM. A message which is deemed to be SPAM can be flagged as such and/or directed to a selected storage folder or deleted from the system. While such filters do recognize many SPAM messages, the originators of SPAM constantly change their messages in attempts, often successful, to fool the filters.
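By way of illustration only, the statistical decision made by a Bayesian filter can be sketched as a naive Bayes classifier over word counts. The class name, the whitespace tokenization, the add-one smoothing and the training messages below are all assumptions chosen for this example and do not describe any particular filter.

```python
import math
from collections import Counter


class NaiveBayesFilter:
    """Hypothetical word-count naive Bayes filter, for illustration only."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        # label is "spam" or "ham" (a bona fide message)
        self.message_counts[label] += 1
        self.word_counts[label].update(text.lower().split())

    def spam_probability(self, text):
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_messages = sum(self.message_counts.values())
        log_scores = {}
        for label in ("spam", "ham"):
            # Class prior, smoothed so an untrained class never yields log(0)
            score = math.log((self.message_counts[label] + 1) / (total_messages + 2))
            total_words = sum(self.word_counts[label].values())
            for word in text.lower().split():
                count = self.word_counts[label][word]
                # Add-one smoothing (an assumption of this sketch)
                score += math.log((count + 1) / (total_words + len(vocab) + 1))
            log_scores[label] = score
        # Convert the two log scores into a normalized probability of SPAM
        peak = max(log_scores.values())
        weights = {k: math.exp(v - peak) for k, v in log_scores.items()}
        return weights["spam"] / (weights["spam"] + weights["ham"])


spam_filter = NaiveBayesFilter()
spam_filter.train("buy cheap pills now", "spam")
spam_filter.train("cheap pills offer", "spam")
spam_filter.train("meeting agenda for monday", "ham")
spam_filter.train("lunch on monday", "ham")
```

Because such a filter scores only the words it was trained on, an originator who rewords a message with unseen vocabulary shifts the score toward the prior, which is one way the filters can be fooled.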
Co-pending U.S. patent application Ser. No. 11/357,164 to Fogel, filed Feb. 21, 2006 and entitled “System and Method For Providing Security For SIP-Based Communications”, describes a security appliance and some methods which can be useful to reduce the occurrence of voice SPAM, and the contents of this application are incorporated herein by reference.
Another method commonly employed to date is the use of blacklists, which identify IP addresses from which messages deemed to be undesired have previously been received and which deem all subsequent messages from those IP addresses to be undesired messages. While blacklists can be effective, they suffer from being very coarse-grained in that they do not distinguish between messages sent from a bona fide user at an IP address and SPAM sent by SPAM originators from that same IP address.
Instead, once the IP address has been identified and blacklisted as being an IP address used to originate SPAM, messages from the bona fide users will no longer be accepted at systems which have blacklisted the IP address. As many Internet Service Providers (ISPs) host multiple email and/or SIP domains at a single IP address, blacklisting that one IP address can affect a large number of bona fide users.
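The coarse granularity of an IP blacklist can be illustrated with a minimal sketch. The function names, sender addresses and IP address below are hypothetical; the IP address is taken from a range reserved for documentation.

```python
# Hypothetical IP blacklist: the only key is the sending IP address.
blacklist = set()


def report_spam(ip_address):
    """Blacklist an IP address from which SPAM has been received."""
    blacklist.add(ip_address)


def accept_message(sender, ip_address):
    """Accept a message unless its originating IP address is blacklisted.

    The sender is deliberately ignored, as a blacklist keyed on IP
    addresses cannot tell one user at an address from another.
    """
    return ip_address not in blacklist


SHARED_ISP_IP = "192.0.2.10"  # assumed to host several domains

accepted_before = accept_message("alice@example.org", SHARED_ISP_IP)
report_spam(SHARED_ISP_IP)  # SPAM arrives from another user at the same IP
accepted_after = accept_message("alice@example.org", SHARED_ISP_IP)
```

In this sketch the bona fide sender is accepted before the report and rejected afterwards, even though that sender never originated SPAM.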
More recently, reputation-based techniques have been employed to assist in identifying undesired messages. Such reputation-based techniques comprise database systems which maintain statistics for an IP address and these statistics are compiled from the output of other anti-SPAM systems, such as the above-mentioned Bayesian filter or SIP systems. The statistics indicate the frequency with which SPAM is transmitted from the IP address and can include other information such as whether the sending IP address is a static or dynamic address.
Reputation-based techniques rely upon an analysis of the past activity from an IP address to provide an indication of a likelihood that a new message sent from that IP address is SPAM.
When a message is received at an email server or SIP proxy, the reputation for the originating IP address is checked in the database, and the “reputation” (i.e., the statistics compiled) for that IP address can be used as one of the inputs to an anti-SPAM process.
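A reputation lookup of this kind can be sketched as follows. The record fields, class names and the use of a simple SPAM ratio are assumptions made for illustration; an actual reputation database may compile different statistics.

```python
from dataclasses import dataclass


@dataclass
class IpReputation:
    """Hypothetical statistics compiled for one IP address."""
    total_messages: int = 0
    spam_messages: int = 0
    is_dynamic: bool = False  # static versus dynamic address

    @property
    def spam_ratio(self):
        # Fraction of past messages judged to be SPAM; 0.0 with no history
        if self.total_messages == 0:
            return 0.0
        return self.spam_messages / self.total_messages


class ReputationDatabase:
    def __init__(self):
        self._records = {}

    def record_verdict(self, ip_address, was_spam, is_dynamic=False):
        """Fold the output of another anti-SPAM system into the statistics."""
        record = self._records.setdefault(ip_address, IpReputation())
        record.total_messages += 1
        record.spam_messages += int(was_spam)
        record.is_dynamic = is_dynamic

    def lookup(self, ip_address):
        """Return the reputation for an IP address, or an empty record."""
        return self._records.get(ip_address, IpReputation())


database = ReputationDatabase()
for verdict in (True, True, True, False):
    database.record_verdict("192.0.2.10", was_spam=verdict)

reputation = database.lookup("192.0.2.10")
```

The resulting `spam_ratio` would be supplied as one input among several to the anti-SPAM process rather than used as a verdict on its own.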
Another reputation-based technique for emails is disclosed in the paper, “Sender Reputation in a Large Webmail Service”, by Bradley Taylor, presented at CEAS 2006—Third Conference on Email and Anti-Spam, Jul. 27-28, 2006, Mountain View, Calif. This technique creates a reputation for each domain from which an email message is received (the domains being authenticated through other means) and uses the created reputation as an input to a SPAM detection process.
While reputation-based techniques can be an improvement over blacklisting, they do suffer from some of the same problems. In particular, they suffer from a lack of granularity which can result in all messages from an IP address, or all messages from a domain, being identified as SPAM because SPAM has previously been sent from that IP address or domain. As mentioned above, this can result in a large number of bona fide users being adversely affected by the activities of a few originators of SPAM.
It is desired to have a reputation-based system and method for determining a likelihood that a message is undesired which permits finer granularity in tracking reputations.