A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.
The present invention relates generally to electronic mail (e-mail) systems and, more particularly, to improved methodology for processing an e-mail message sent to a predefined mailing list (specifying multiple recipients).
Today, electronic mail or “e-mail” is a pervasive, if not the predominant, form of electronic communication. FIG. 1 illustrates the basic architecture of a typical electronic mail system. At a high level, the system includes a mail server connected over a network to various e-mail “clients,” that is, the individual users of the system. More specifically, the system 10 includes one or more clients 11 connected over a network to at least one Message Transfer Agent (MTA) 12a. Communication occurs through a standardized protocol, such as SMTP (Simple Mail Transfer Protocol) in the context of the Internet.
A typical e-mail delivery process is as follows. In the following scenario, Larry sends e-mail to Martha at her e-mail address: martha@example.org. Martha's Internet Service Provider (ISP) uses an MTA, such as provided by Sendmail® for NT, available from Sendmail, Inc. of Emeryville, Calif. (With a lower case “s,” “sendmail” refers to Sendmail's MTA, which is one component of the Sendmail® for NT product.)
1. Larry composes the message and chooses Send in Microsoft Outlook Express (a “mail user agent” or MUA). The e-mail message itself specifies one or more intended recipients (i.e., destination e-mail addresses), a subject heading, and a message body; optionally, the message may specify accompanying attachments.
2. Microsoft Outlook Express queries a DNS server for the IP address of the host providing e-mail service for the destination address. The DNS server, which is a computer connected to the Internet running software that translates domain names, returns the IP address, 127.118.10.3, of the mail server for Martha's domain, example.org.
3. Microsoft Outlook Express opens an SMTP connection to the mail server running sendmail at Martha's ISP. The message is transmitted to the sendmail service using the SMTP protocol.
4. sendmail delivers Larry's message for Martha to the local delivery agent, which appends the message to Martha's mailbox. By default, the message is stored in:
C:\Program Files\Sendmail\Spool\martha.
5. Martha has her computer dial into her ISP.
6. Martha chooses Check Mail in Eudora.
7. Eudora opens a POP3 (Post Office Protocol version 3, defined in RFC 1725) connection with the POP3 (incoming mail) server. Eudora downloads Martha's new messages, including the message from Larry.
8. Martha reads Larry's message.
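The message parts named in step 1 (recipients, subject heading, body, optional attachments) can be illustrated with a minimal Python sketch built on the standard library's email package. The function name compose, Larry's address, and the sample subject and body are hypothetical and serve only as illustration; the sketch builds the message and does not transmit it.

```python
from email.message import EmailMessage


def compose(sender, recipients, subject, body, attachments=()):
    """Build a message carrying the parts listed in step 1 (hypothetical helper)."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)   # one or more destination addresses
    msg["Subject"] = subject            # subject heading
    msg.set_content(body)               # plain-text message body
    for filename, data in attachments:  # optional attachments as (name, bytes) pairs
        msg.add_attachment(
            data,
            maintype="application",
            subtype="octet-stream",
            filename=filename,
        )
    return msg


# Larry's address is an assumption; only Martha's appears in the scenario.
plain = compose("larry@example.com", ["martha@example.org"],
                "Lunch?", "Are you free at noon?")
with_file = compose("larry@example.com", ["martha@example.org"],
                    "Report", "See attached.", [("report.bin", b"\x00\x01")])

# Step 3 would hand the message to the MTA over SMTP, for example with
# smtplib.SMTP(host, 25).send_message(plain); that call is not run here.
```

Adding an attachment converts the message into a multipart container, which is why the optional attachments travel with the same recipient and subject headers as the body.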
The MTA, which is responsible for queuing up messages and arranging for their distribution, is the workhorse component of electronic mail systems. The MTA “listens” for incoming e-mail messages on the SMTP port, which is generally port 25. When an e-mail message is detected, the MTA handles the message according to configuration settings, that is, the settings chosen by the system administrator, in accordance with relevant standards such as Request For Comment documents (RFCs). Typically, the mail server or MTA must temporarily store incoming and outgoing messages in a queue, the “mail queue.” Actual queue size is highly dependent on one's system resources and daily volumes.
MTAs, such as the commercially-available Sendmail® MTA, perform three key mail transport functions:
Routes mail across the Internet to a gateway of a different network or “domain” (since many domains can and do exist in a single network)
Relays mail to another MTA (e.g., 12b) on a different subnet within the same network
Transfers mail from one host or server to another on the same network subnet
To perform these functions, the MTA accepts messages from other MTAs or MUAs, parses addresses to identify recipients and domains, resolves aliases, fixes addressing problems, copies mail into a queue on its hard disk, tries to process long and hard-to-pass messages, and notifies the sender when a particular task cannot be successfully completed. The MTA does not store messages (apart from its queue) or help users access messages. It relies on other mail system components, such as message delivery agents, message stores and mail user agents (MUAs), to perform these tasks. These additional components can belong to any number of proprietary or shareware products (e.g., POP3 or IMAP servers, Microsoft Exchange, IBM Lotus Notes, Netscape, or cc:Mail servers, or the like). Because of its central role in e-mail systems, however, the MTA often serves as the “glue” that makes everything appear to work together seamlessly.
For further description of e-mail systems, see e.g., Sendmail® for NT User Guide, Part Number DOC-SMN-300-WNT-MAN-0999, available from Sendmail, Inc. of Emeryville, Calif., the disclosure of which is hereby incorporated by reference. Further description of the basic architecture and operation of e-mail systems is available in the technical and trade literature; see e.g., the following RFC (Request For Comments) documents:
currently available via the Internet at the disclosures of which are hereby incorporated by reference. RFCs are numbered Internet informational documents and standards widely followed by commercial software and freeware in the Internet and UNIX communities. The RFCs are unusual in that they are floated by technical experts acting on their own initiative and reviewed by the Internet at large, rather than formally promulgated through an institution such as ANSI. For this reason, they remain known as RFCs even once they are adopted as standards.
Often when sending e-mail, a distribution or “mailing list” is employed to facilitate the process of sending an e-mail message to a group of people. For instance, instead of addressing an e-mail message to individual members of a recurring group, a user can instead simply define a mailing list to comprise those members. For example, the user could define a “Marketing” mailing list that specifies members of the marketing department of the user's company. Once defined, the mailing list can be used in the recipient field for an e-mail message, in lieu of listing individual members. A message sent to this distribution list goes to all recipients listed. Typically, e-mail systems provide graphical user interface facilities for managing (e.g., adding and deleting) names in a mailing list.
As expected, as a particular list grows larger, it becomes progressively more resource intensive and time consuming to manage and process. Although the foregoing example of a mailing list for a marketing department may comprise a comparatively small group of recipients (e.g., fewer than 100), a mailing list can in fact specify an extremely large group of recipients. Consider, for instance, a mailing list defined for customer support (e.g., “North American Users”) for a large software company. As another example, ISPs (Internet Service Providers) typically support many domains, many lists within each domain, and many users for each list. In such a case, a given mailing list may in fact specify many thousands or even millions of recipients, leading to a very large volume of mailing list traffic. Accordingly, there is great interest in improving the management and processing of mailing lists so that e-mail sent to mailing lists, particularly large ones, is processed in an efficient manner.
In an electronic mail system, the task of processing a mailing list usually falls to a Mailing List Manager or “MLM”, such as MLM 13 of the e-mail system of FIG. 1. Upon receiving an e-mail message sent to a predefined mailing list, the system's MTA hands off the message, with the name of the list, to the system's MLM. After checking the message, the MLM enumerates the individual recipients for the list and hands the message with a list of the specific intended recipients (i.e., with the names/e-mail addresses of the specific intended recipients attached) back to the MTA for redistribution. For instance, if the message had a mailing list specifying 100 recipients, the MLM would, after finishing its work, post the message back to the MTA with each of the 100 recipients specified. Here, the MLM opens a connection (e.g., a “pipe” in UNIX, that is, a direct data feed) to the MTA. The MTA is responsible for queuing up the message and arranging for its distribution to all of the various recipients.
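The hand-off just described (MTA passes the message and list name to the MLM; the MLM enumerates recipients and posts the message back) can be sketched in a few lines of Python. This is an illustrative sketch only: the list contents, the names expand_list and mlm_handle, and the post_to_mta callback standing in for the pipe to the MTA are all assumptions, not part of any actual MLM.

```python
# Hypothetical list definitions; a real MLM would load these from its own store.
MAILING_LISTS = {
    "marketing@example.org": [
        "ann@example.org",
        "bob@example.org",
        "ann@example.org",   # duplicate entry, removed during expansion
        "cy@example.org",
    ],
}


def expand_list(list_name, lists=MAILING_LISTS):
    """Enumerate the individual recipients of a list, dropping duplicates in order."""
    return list(dict.fromkeys(lists[list_name]))


def mlm_handle(message, list_name, post_to_mta):
    """Expand the list, then post the message back to the MTA with every recipient.

    post_to_mta is a stand-in for the connection (e.g., a pipe) to the MTA.
    """
    recipients = expand_list(list_name)
    post_to_mta(message, recipients)
    return recipients


posted = []
mlm_handle("Quarterly update", "marketing@example.org",
           lambda msg, rcpts: posted.append((msg, rcpts)))
```

Note that the MTA receives one message with the full recipient list attached, which is the situation the following paragraphs identify as a bottleneck for large lists.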
Without further enhancement to this basic process of handling an e-mail message with a large mailing list, the MLM is handing a substantial amount of work to the MTA to do, with no real intelligence. For instance, for a message sent to a predefined mailing list of 1000 recipients, the MLM is handing to the MTA a list of 1000 tasks to do in sequence, that is, 1000 messages to queue and distribute. At the same time, MTAs tend not to be very good at parallel delivery of a single message. Therefore, the approach commonly employed by MTAs is to do the tasks in series, one at a time. However, that approach incurs the penalty of increased delivery time. Accordingly, there is much interest in increasing the speed of message delivery by the MTA, so that total delivery time for messages is decreased. To date, existing systems have failed to adequately address this problem and, as a result, system performance in such a scenario is poor.
One approach, such as was attempted by another MLM called “Listmanager,” is to take the message and break it into multiple copies of the same message, each with a subset of the main recipient list; that is, the set of recipients is divided into “n” roughly equal pieces, and each such piece gets a copy of the entire message being distributed. Although this offers some degree of improvement in parallelism of delivery, it also takes up more disk space as “n” copies of a (possibly large) message are placed into the queue. As “n” increases, delivery parallelism improves but increasing resource consumption causes the overall performance to degrade. The balance can be quite delicate, if not perilous. Listmanager is also somewhat bound to using the Sendmail® open source MTA, thereby limiting user choice in selecting which vendor's MTA and MLM best serve the needs of a given environment. Accordingly, a better solution is desirable.
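The trade-off in this splitting approach can be made concrete with a short Python sketch. It is not Listmanager's actual code; the function names split_recipients and queue_bytes and the sample sizes are assumptions chosen for illustration.

```python
def split_recipients(recipients, n):
    """Divide recipients into n roughly equal pieces (sizes differ by at most one)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    base, extra = divmod(len(recipients), n)
    pieces, start = [], 0
    for i in range(n):
        size = base + (1 if i < extra else 0)  # the first `extra` pieces get one more
        pieces.append(recipients[start:start + size])
        start += size
    return pieces


def queue_bytes(message_size, n):
    """Disk space consumed when each of the n pieces carries a full message copy."""
    return message_size * n


recipients = [f"user{i}@example.org" for i in range(10)]  # hypothetical addresses
pieces = split_recipients(recipients, 3)
sizes = [len(p) for p in pieces]        # 4, 3 and 3 recipients
cost = queue_bytes(5_000_000, 8)        # eight copies of a 5 MB message
```

Queue usage grows linearly with n while each piece shrinks only as 1/n, which is the delicate balance the passage describes.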
An electronic mail (“e-mail”) system includes one or more clients connected over a network to at least one Message Transfer Agent (MTA), that is, the program responsible for delivering e-mail messages. Upon receiving a message from a Mail User Agent or another MTA, the MTA stores the message temporarily in local storage, analyzes the recipients, and either delivers the message (local addressee) or forwards it to another MTA (routing). Communication occurs through a standardized protocol, such as SMTP (Simple Mail Transfer Protocol) in the context of the Internet. Often when sending e-mail, a distribution or “mailing list” is employed to facilitate the process of sending an e-mail message to a group of people. For instance, instead of addressing an e-mail message to individual members of a recurring group, a user can instead simply define a mailing list to comprise those members. Upon receiving an e-mail message sent to a predefined mailing list, the system's MTA hands off the message, with the name of the list, to the system's Mailing List Manager or MLM. After checking the message (e.g., privacy checking and verification that the message is legitimate for distribution), the MLM enumerates the individual recipients for the list and hands the message with a list of the specific intended recipients (i.e., with the names/e-mail addresses of the specific intended recipients attached) back to the MTA for redistribution. In this fashion, a mailing list can be used in the recipient field for an e-mail message, in lieu of listing individual members, so that a message sent to this distribution list goes to all recipients listed. However, as a particular mailing list grows larger, it becomes progressively more resource-intensive and time-consuming to manage and process.
An electronic mail system of the present invention includes methodology for processing messages sent to mailing lists, particularly large ones, in an efficient manner. The solution of the present invention is to include an “Injector” component. Here, an electronic mail system constructed in accordance with the present invention includes an MLM connected to an MTA through an Injector. At a high level, the purpose of the Injector is to inject messages into the MTA, or multiple MTAs.
For a given mailing list, the system processes each recipient address as follows. The MLM, acting through the Injector, posts the address to a first MTA. If that MTA successfully processes the address, it responds with a “success” result, which may be passed back through the Injector to the MLM. If, on the other hand, that MTA is not successful, then the address is passed off to a second MTA. Again, if that MTA is successful, it will indicate that success back to the MLM; otherwise, the address is then passed off to the next MTA. At this point, the Injector tries to assign each address to an available MTA before the body of the message goes anywhere. In other words, the Injector attempts to find a “home” for each address. Only after all addresses have been assigned to an outgoing MTA does the actual message data get handed off.
The foregoing sequence continues until either the address for the given recipient is successfully processed by one of the MTAs or all of the available MTAs have been exhausted. In the event that all of the available MTAs fail, the address is then ultimately passed on to the fallback MTA, which will indicate initial success and assume any responsibility for queuing the message for that recipient. From the perspective of the MLM, therefore, each recipient of the mailing list has been successfully handled. In an exemplary configuration, each of the remote or external MTAs would reside on relatively powerful server machines. Since the fallback MTA is configured to never reject an address, all addresses should initially be processed by the available remote MTAs, until those MTAs are exhausted. In that manner, the fallback MTA is reserved for only those addresses rejected by all of the remote MTAs. Once all addresses have been assigned to exactly one MTA, fallback or otherwise, the body of the message is passed to the MTAs for delivery to the recipients assigned to each.
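The address-assignment sequence described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the Injector's actual implementation: the MTA class, its offer and send_body methods, and the use of a capacity limit to model an MTA rejecting an address are all hypothetical.

```python
class MTA:
    """Hypothetical stand-in for a connection to one MTA.

    capacity models the point at which the MTA starts rejecting addresses;
    None means it never rejects, which is how the fallback MTA behaves.
    """

    def __init__(self, name, capacity=None):
        self.name = name
        self.capacity = capacity
        self.assigned = []
        self.body = None

    def offer(self, address):
        """Return True ("success") if this MTA accepts the address."""
        if self.capacity is not None and len(self.assigned) >= self.capacity:
            return False
        self.assigned.append(address)
        return True

    def send_body(self, body):
        self.body = body


def inject(addresses, body, remote_mtas, fallback):
    """Find a home for every address first, then hand off the message body."""
    for address in addresses:
        for mta in remote_mtas:          # try each remote MTA in turn
            if mta.offer(address):
                break
        else:                            # every remote MTA rejected the address
            if not fallback.offer(address):
                raise RuntimeError("fallback MTA must never reject an address")
    for mta in [*remote_mtas, fallback]:
        if mta.assigned:                 # body goes only to MTAs that hold addresses
            mta.send_body(body)


remotes = [MTA("remote-a", capacity=2), MTA("remote-b", capacity=1)]
fallback = MTA("fallback")
inject(["r1", "r2", "r3", "r4", "r5"], "message body", remotes, fallback)
```

In this run the two remote MTAs fill up first and only the remaining addresses reach the fallback, so each address ends up assigned to exactly one MTA before any message data moves.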
During system operation, the remote MTAs and fallback MTA are specified by a configuration file for the Injector. Before performing address distribution, the Injector attempts to establish a connection with each of the remote MTAs. Any MTA that is down can be immediately detected by the Injector as that MTA will be unable to establish a connection. In such a case, the Injector adjusts the address distribution so that it is spread among the remaining MTAs (i.e., those able to successfully establish a connection). The underlying design provided by the present invention affords flexibility to incorporate any combination of internal and/or external MTAs, as desired by a system administrator for a given deployment.
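The connection check and redistribution described above might look like the following Python sketch. The host names, the function names can_connect, live_mtas and spread, and the round-robin policy are assumptions; the text says only that addresses are spread among the MTAs that connect successfully, without naming a policy or a configuration file format.

```python
import socket


def can_connect(host, port=25, timeout=2.0):
    """Return True if a TCP connection to the MTA's SMTP port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def live_mtas(configured_hosts, probe=can_connect):
    """Keep only the configured MTAs that currently accept a connection."""
    return [host for host in configured_hosts if probe(host)]


def spread(addresses, hosts):
    """Spread addresses across the given hosts (round-robin is an assumed policy)."""
    if not hosts:
        raise ValueError("no MTA available")
    plan = {host: [] for host in hosts}
    for i, address in enumerate(addresses):
        plan[hosts[i % len(hosts)]].append(address)
    return plan


# Hypothetical hosts as they might be read from the Injector's configuration file.
configured = ["mta1.example.org", "mta2.example.org", "mta3.example.org"]

# A fake probe simulates mta2 being down, so the sketch runs without a network.
live = live_mtas(configured, probe=lambda host: host != "mta2.example.org")
plan = spread(["a@example.org", "b@example.org", "c@example.org"], live)
```

Passing the probe as a parameter keeps the distribution logic separate from the network check, so the same code can be exercised against real MTAs by omitting the probe argument.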
By dividing work among available MTAs, the system of the present invention is able to achieve optimal distribution of workload for the system. In the event of a failure at one of the MTAs, that MTA's task may instead be distributed to the remaining MTAs, that is, a load-balancing technique is applied to handle the MTA failure. Further, since the Injector decouples the MLM from the MTA, the MLM can be ignorant of the interface for the MTA, thus allowing the MLM to remain constant.