Unsolicited Commercial Email (UCE), also commonly referred to as spam, is widely considered a detriment to productivity, and a potential source of undesired or inappropriate material in office and home environments.
UCE can be severely limited if a user's email address is removed from the lists maintained to transmit UCE. Though there has been a great deal of discussion about legislative remedies forcing UCE transmitters to allow users to get their addresses removed from these lists, legislative actions are inherently restricted to a single jurisdiction and will likely simply move the transmitters out of the legislative jurisdiction. There is a wide appreciation that a technical mechanism to prevent email addresses from being harvested from websites, mailing lists, Usenet newsgroups, and other such public places is required.
One method that senders of UCE use to build their mailing lists is to harvest email addresses published on websites and provided to Usenet newsgroups. Email addresses given to mailing lists are commonly available in archives that are placed on websites to allow people to search through previous discussions. Email addresses can be identified on web pages by identification of their common structure. This is typically done by an automated harvesting program designed to scour world wide web pages for email addresses, much as search engines scour the same web pages to build an index. Providing email addresses to outside entities, either via mailing lists or web pages, is essential for conducting business, but also provides the senders of UCE with addresses to which UCE can be sent. Many users try to obfuscate email addresses included in web pages using HTML escape codes, the word “at” instead of the symbol “@”, and other similar techniques. UCE list builders have adapted to this obfuscation, and harvesting programs now recognize the popular obfuscation methods, defeating the obfuscation.
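By way of illustration only, the recognition of addresses by their common structure, including two of the popular obfuscations, can be sketched as follows. The patterns and the function name `harvest` are assumptions made for this example and are not taken from any particular harvesting program; real harvesters are likely to recognize many more variations.

```python
import html
import re

# Hypothetical pattern for the common local-part@domain structure.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")

# Hypothetical pattern for the "name at domain.tld" obfuscation.
AT_WORD_RE = re.compile(
    r"([A-Za-z0-9._%+-]+)\s+at\s+([A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+)",
    re.IGNORECASE,
)

def harvest(page_text):
    """Return the addresses found in page_text after undoing simple obfuscation."""
    text = html.unescape(page_text)       # undo HTML escape codes such as &#64;
    text = AT_WORD_RE.sub(r"\1@\2", text)  # undo the word "at" used for "@"
    return EMAIL_RE.findall(text)

page = "Contact &#106;ane&#64;example.com or bob at example.org."
found = harvest(page)
```

Both obfuscated addresses in `page` are recovered, which is the sense in which such obfuscation is said to be defeated.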
To combat UCE, many organizations have deployed filters to identify UCE at the recipient mail server. Upon identification as UCE, a message is typically either deleted or identified as UCE to allow the user's mail client to sort the message into an appropriate folder. Unfortunately, because filters are not perfect, users often must review the identified messages to determine if there have been any false-positive results. Additionally, some UCE passes through the filters without being identified, and must be manually deleted by the user.
Known filters typically rely upon an origination address in the message header, names or addresses of the transmitting mail server, real time black hole lists and sophisticated heuristic analysis of the subject line to identify UCE. Senders of UCE attempt to bypass filters by forging headers, using nondescript subject lines and by finding so-called open relay mail servers which do not appear on real time black hole lists.
It is also well understood that many filters increase their false positive rate as the filter is tuned to increase the overall identification rate. Many users consider this highly ineffective in a commercial setting, as any message from a previously unknown source can be a potential business opportunity. As a result, many messages identified as UCE must still be examined. This is particularly so where the filter is too aggressive in its structure, such as a filter based on a black hole list, which will identify all messages from sites on the list as UCE regardless of their sender or content.
There is a general consensus among those skilled in the art that filtering is inexact and as a result it is possible to bypass filters by changing strategy and message structure.
Filters operate in conjunction with mail servers by scanning the header and body of incoming messages. A conventional mail server receives messages on a predefined port, typically port 25 for Simple Mail Transfer Protocol (SMTP) traffic. These messages have both an envelope section and a payload. The envelope of a message indicates the recipient of the message, while the payload has both a message header and body. The message header typically contains information regarding the sender, and some of the routing information associated with the message. The message header includes a destination address that need not be the same as the address provided on the envelope. This mismatch of addresses is most commonly used for blind copy (bcc) fields on messages to allow a message to be sent so that the recipient is unaware of the full extent of the address list. A receiving mail server typically routes received messages based on the address provided in the envelope.
In most instances, the address on the envelope corresponds to an account hosted by the mail server, but in other cases the address corresponds to an account hosted by a second mail server, and the first mail server simply transmits the message to the second mail server. This technique is well known and is employed when a mail server stores a list of forwarding addresses for mail redirection services, or in the event that the user to whom a message is addressed has provided a forward file.
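The distinction between the envelope address and the header address can be sketched as follows. This is a minimal illustration, not a mail server: the function name `accept` and all addresses are assumptions made for the example, and the envelope recipient is simply passed in as a string rather than received over SMTP.

```python
from email import message_from_string

def accept(envelope_rcpt, payload):
    """Route on the envelope address; report the header address for comparison."""
    msg = message_from_string(payload)  # parses the header and body of the payload
    return {"routed_to": envelope_rcpt, "header_to": msg["To"]}

# Payload whose header names only the visible recipient.
payload = (
    "From: sender@example.org\n"
    "To: list@example.com\n"
    "Subject: Quarterly update\n"
    "\n"
    "Body text.\n"
)

# A blind-copied recipient appears on the envelope but not in the header.
result = accept("hidden.recipient@example.net", payload)
```

Here the message is routed to the envelope address even though the header names a different destination, as in the blind copy case.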
The ability of a mail server to route mail to another server based on the address in the message envelope has been exploited to provide users with pseudonymous email addresses, more popularly referred to as aliases. Aliases are provided by mail redirection services, and by some private mail servers to allow users to have multiple email addresses. Mail addressed to any of the aliases may be routed to a single mail account. This allows a user to provide unique addresses in different environments and then have his mail client perform routing and filtering on the basis of the address that the message was sent to. Using these techniques, a user can provide one address on a web page, and another address to a mailing list, and thus sort the messages on the basis of the address to which the message was sent.
While users can opt to delete an alias, if the address was provided to a mailing list the user must then subscribe to the mailing list using the new alias, or the mailing list messages cannot be routed to the user. In the case where an alias has been posted on a website, the website must be altered to direct viewers to the proper email address. Because of the rate at which senders of UCE are able to find addresses posted on common websites, or on websites hosting mailing list archives, it is common that an address posted in this fashion will start to receive increasingly large volumes of UCE within a week. The overhead associated with updating web pages, and changing mailing list subscription information on such a frequent basis renders this process inefficient. As a result the use of aliases has not been adopted as a common UCE control mechanism.
FIG. 1 illustrates a known alias routing system. Alias generator 50 provides an alias associated with an email address that is typically provided to a system connected to Internet 104. The alias information is provided to mail server 52, which stores the routing associated with the generated alias in routing list 58. Upon receiving a mail message over SMTP port 56, mail server 52 looks up the alias to which the message is addressed in routing list 58 and redirects the message to that address using routing engine 60. The address associated with the alias can be on the same mail server or a completely different server. One skilled in the art will appreciate that other functionality is offered by these servers depending on how they are administered. A more detailed description of how conventional alias generators operate is provided below.
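The lookup and redirection performed against the routing list can be sketched as follows. This is an illustrative assumption of how such a lookup might be structured; the class name `AliasMailServer`, the domain `mail.example.com`, and the action labels are hypothetical and do not describe any particular server.

```python
LOCAL_DOMAIN = "mail.example.com"  # hypothetical domain hosted by this server

class AliasMailServer:
    def __init__(self):
        # Routing list: alias address -> address the alias is associated with.
        self.routing_list = {}

    def add_alias(self, alias, destination):
        """Store the routing supplied for a newly generated alias."""
        self.routing_list[alias] = destination

    def route(self, envelope_rcpt):
        """Look up the envelope address and decide where the message goes."""
        destination = self.routing_list.get(envelope_rcpt)
        if destination is None:
            return ("reject", None)  # address is not a known alias
        domain = destination.rsplit("@", 1)[1]
        # Same server: deliver locally. Different server: relay onward.
        action = "deliver" if domain == LOCAL_DOMAIN else "relay"
        return (action, destination)

server = AliasMailServer()
server.add_alias("a1@alias.example.net", "user@mail.example.com")
server.add_alias("a2@alias.example.net", "user@other.example.org")
```

In this sketch, `server.route("a1@alias.example.net")` yields a local delivery, while `server.route("a2@alias.example.net")` yields a relay to a different server.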
Alias address management and creation is presently offered by a number of vendors. Typically, these alias generators are provided to users via either a website or a standalone application executed on the user's computer. From the perspective of a user, conventional alias generators are complex to use. Upon being prompted for an email address in a web based form, a user must either initiate a second instance of the browser, go to a generation page, request a newly generated alias, and then copy and paste it into the form, or the user must launch a standalone application, request an alias, and then copy and paste it into the form. In both these cases, the user is required to cut and paste, or copy-type, the alias into the web based form. Additionally, if information about whom the address was provided to is to be kept, the user must provide that information to the generator. This interaction with a different application typically increases the likelihood that the user will not fully use the functionality of the alias manager, thus diminishing its effectiveness.
Presently, standalone applications are available that work in conjunction with an alias server to provide a degree of alias management. Upon registering for the service, the user provides to a server a root, upon which all aliases will be based. This root is also provided to the standalone application. The server then associates the root with an email address. When the user generates an alias using the standalone application, the root is used, though it may be obfuscated in the process, to generate an alias. When email is sent to the alias, the server decodes the username portion of the email address and determines the address that it should be redirected to. This process does not require the standalone application to interact with the server, as all aliases based upon a single root can be mapped back to the original root, which is associated with a predefined email address. This approach reduces the computational load on the server, but in order to guarantee that the alias is not reused, the standalone application must have a used alias list, which would be reset if the application ever had to be re-installed. Further, the used alias list cannot be easily shared between two systems, creating difficulty for users that have more than one machine, such as a laptop for travel and a desktop for in office use. However, one weakness of the system is that in order to reduce the computational load on the server, there is no interaction with the standalone application, and all aliases that can be mapped back to the root name are forwarded to the email address provided. Thus, if a UCE transmitter can discern the root name through either inspection, or through comparing a series of provided aliases, the security of the system is compromised, and the user is then subject to the receipt of UCE.
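The root-based approach and its weakness can be sketched as follows. The encoding shown (base32 of the root joined to a tag) is an assumption chosen purely for illustration and is not the scheme of any particular vendor; the names `ROOTS`, `make_alias`, and `resolve` and all addresses are likewise hypothetical.

```python
import base64

# Server side: root -> predefined email address (hypothetical data).
ROOTS = {"jsmith": "john.smith@example.com"}

def make_alias(root, tag, domain="alias.example.net"):
    """Client side: build an alias from the root without contacting the server."""
    raw = f"{root}:{tag}".encode("ascii")
    local = base64.b32encode(raw).decode("ascii").rstrip("=").lower()
    return f"{local}@{domain}"

def resolve(alias):
    """Server side: decode the username portion and map the root to an address."""
    local = alias.split("@", 1)[0].upper()
    local += "=" * (-len(local) % 8)  # restore the stripped base32 padding
    try:
        raw = base64.b32decode(local).decode("ascii")
    except ValueError:                # not valid base32 or not ASCII text
        return None
    root = raw.split(":", 1)[0]
    return ROOTS.get(root)

issued = make_alias("jsmith", "shop1")
# A party that has discerned the root can mint an alias that was never issued.
forged = make_alias("jsmith", "never-issued")
```

Because the server keeps no list of issued aliases, `resolve(forged)` returns the same address as `resolve(issued)`, which is the compromise described above.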
To overcome these difficulties, web-based alias generators eliminate the standalone component of the above described system, and store the alias list at the server. This allows the mail server that receives an email addressed to an alias to determine if the alias is valid prior to forwarding the message. Additionally, it allows the operator of the server to provide the user with the ability to use multiple computers without running the risk of creating non-unique aliases. Uniqueness in the addresses is important, as it allows a user to uniquely identify an address with a website. This allows the user to determine the source of UCE. Web based generation systems still require the user to load a new web page to generate an alias, and then require the alias to be copied into the web based form. To allow for alias management, some web based alias generators alter messages addressed to the alias, so that when they are relayed to the correct email address a hyperlink is embedded within the message. This hyperlink allows a user to disable the alias upon determining that it is being used to transmit UCE. By embedding the management link, the user is provided with a mechanism for deleting an alias seamlessly, as the link can be directed to an HTML page that transmits an alias termination request to a server, and then is closed using a JavaScript™ command. However, the embedding of a management link causes problems when the body of an email message is digitally signed to authenticate the contents and prevent tampering. Though the embedding of a management link provides a seamless management functionality, it is incompatible with signed message bodies, and does not provide the user with a seamless alias creation mechanism.
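A server-side alias list of the kind described, with validity checking, a recorded source for each alias, and alias termination, can be sketched as follows. The class name `AliasRegistry`, its methods, and the random alias format are assumptions made for this example only.

```python
import secrets

class AliasRegistry:
    def __init__(self, domain="alias.example.net"):
        self.domain = domain
        self.aliases = {}  # alias -> {"forward_to", "source", "active"}

    def create(self, forward_to, source):
        """Generate a unique alias and record the site it was given to."""
        while True:
            alias = f"{secrets.token_hex(6)}@{self.domain}"
            if alias not in self.aliases:  # uniqueness is checked at the server
                break
        self.aliases[alias] = {
            "forward_to": forward_to,
            "source": source,
            "active": True,
        }
        return alias

    def resolve(self, alias):
        """Return the forwarding address only if the alias is valid and active."""
        entry = self.aliases.get(alias)
        if entry is None or not entry["active"]:
            return None
        return entry["forward_to"]

    def disable(self, alias):
        """Terminate an alias; return the source it was originally given to."""
        entry = self.aliases.get(alias)
        if entry is None:
            return None
        entry["active"] = False
        return entry["source"]

registry = AliasRegistry()
alias = registry.create("user@example.com", source="shop.example")
```

Calling `registry.disable(alias)` after UCE arrives both stops forwarding and identifies the site to which the alias was given.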
It is, therefore, desirable to provide a method of reducing the incidence of receipt of UCE, while reducing the amount of user interaction in the process.