The present invention relates generally to electronic mail (e-mail) systems and, more particularly, to improved methodology for distributed processing and storage of e-mail messages.
Today, electronic mail or “e-mail” is a pervasive, if not the most predominant, form of electronic communication. FIG. 1 illustrates the basic architecture of a typical electronic mail system 10. At a high level, the system includes a mail server connected over a network to various e-mail “clients,” that is, the individual users of the system. More specifically, the system 10 includes one or more clients 11 connected over a network to at least one Message Transfer Agent (MTA) 12a. Communication occurs through a standardized protocol, such as SMTP (Simple Mail Transfer Protocol), in the context of the Internet.
A typical e-mail delivery process is as follows. In the following scenario, Larry sends e-mail to Martha at her e-mail address: martha@example.org. Martha's Internet Service Provider (ISP) uses an MTA, such as provided by Sendmail® for NT, available from Sendmail, Inc. of Emeryville, Calif. (With a lower case “s,” “sendmail” refers to Sendmail's MTA, which is one component of the Sendmail® for NT product.)
    1. Larry composes the message and chooses Send in Microsoft Outlook Express (a “mail user agent” or MUA). The e-mail message itself specifies one or more intended recipients (i.e., destination e-mail addresses), a subject heading, and a message body; optionally, the message may specify accompanying attachments.
    2. Microsoft Outlook Express queries a DNS server for the IP address of the host providing e-mail service for the destination address. The DNS server, which is a computer connected to the Internet running software that translates domain names, returns the IP address, 127.118.10.3, of the mail server for Martha's domain, example.org.
    3. Microsoft Outlook Express opens an SMTP connection to the mail server running sendmail at Martha's ISP. The message is transmitted to the sendmail service using the SMTP protocol.
    4. sendmail delivers Larry's message for Martha to the local delivery agent. It appends the message to Martha's mailbox. By default, the message is stored in:
            C:\Program Files\Sendmail\Spool\martha.
    5. Martha has her computer dial into her ISP.
    6. Martha chooses Check Mail in Eudora, an MUA.
    7. Eudora opens a POP (Post Office Protocol version 3, defined in RFC1725) connection with the POP3 server at Martha's ISP. Eudora downloads Martha's new messages, including the message from Larry.
    8. Martha reads Larry's message.
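By way of illustration only, steps 1 through 3 above may be sketched in Python using the standard library. The sender address larry@example.com, the subject and body text, and the helper names compose, recipient_domain, and submit are assumptions introduced for this sketch and do not appear in the scenario. The DNS query of step 2 is left out, and submit is defined but not called, since it requires a reachable mail server.

```python
from email.message import EmailMessage
import smtplib


def compose(sender, recipient, subject, body):
    """Step 1: build a message with recipient, subject heading, and body."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def recipient_domain(address):
    """Return the domain part of an address; step 2 would look this up in DNS."""
    return address.rsplit("@", 1)[1].lower()


def submit(msg, mail_host, port=25):
    """Step 3: transmit the message to the MTA over SMTP (not called here)."""
    with smtplib.SMTP(mail_host, port) as smtp:
        smtp.send_message(msg)


# Hypothetical sender and text; only Martha's address comes from the scenario.
msg = compose("larry@example.com", "martha@example.org", "Hello", "Hi Martha.")
domain = recipient_domain(msg["To"])
```

In this sketch the MUA would pass the mail host resolved for the domain to submit; steps 4 through 8 take place on the server side and in Martha's own MUA.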
The MTA, which is responsible for queuing up messages and arranging for their distribution, is the workhorse component of electronic mail systems. The MTA “listens” for incoming e-mail messages on the SMTP port, which is generally port 25. When an e-mail message is detected, it handles the message according to configuration settings, that is, the settings chosen by the system administrator, in accordance with relevant standards such as the Internet Engineering Task Force's Request For Comment documents (RFCs). Typically, the mail server or MTA must temporarily store incoming and outgoing messages in a queue, the “mail queue”, before attempting delivery. Actual queue size is highly dependent on one's system resources and daily volumes.
MTAs, such as the commercially-available Sendmail® MTA, perform three key mail transport functions:
    Routing mail across the Internet to a gateway of a different network or “domain” (since many domains can and do exist in a single network)
    Relaying mail to another MTA (e.g., 12b) on a different subnet within the same network
    Transferring mail from one host or server to another on the same network subnet
To perform these functions, an MTA accepts messages from other MTAs or MUAs, parses addresses to identify recipients and domains, resolves address aliases, fixes addressing problems, copies mail to and from a queue on its hard disk, tries to process long and hard-to-deliver messages, and notifies the sender when a particular task cannot be successfully completed. The MTA does not store messages (apart from its queue) or help users access messages. It relies on other mail system components, such as message delivery agents, message stores, and mail user agents (MUAs), to perform these tasks. These additional components can belong to any number of proprietary or shareware products (e.g., POP or IMAP servers, Microsoft Exchange, IBM Lotus Notes, Netscape, cc:Mail servers, or the like). Because of its central role in e-mail systems, however, the MTA often serves as the “glue” that makes everything appear to work together seamlessly.
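The address parsing and alias resolution mentioned above may be illustrated with a minimal Python sketch. It is not a description of how sendmail is implemented. The alias table, the local domain, the address bob@example.net, and the function names resolve and classify are all hypothetical.

```python
from email.utils import getaddresses

# Hypothetical alias table and local domain, for illustration only.
ALIASES = {"postmaster": ["martha"], "staff": ["martha", "larry"]}
LOCAL_DOMAIN = "example.org"


def resolve(local_part, seen=None):
    """Expand an alias into mailbox names, guarding against alias loops."""
    seen = set() if seen is None else seen
    if local_part in seen:
        return []
    seen.add(local_part)
    targets = ALIASES.get(local_part)
    if targets is None:
        return [local_part]  # not an alias: treat as a mailbox name
    expanded = []
    for target in targets:
        for mailbox in resolve(target, seen):
            if mailbox not in expanded:
                expanded.append(mailbox)
    return expanded


def classify(header_value):
    """Split recipients into local mailboxes and addresses for other domains."""
    local, remote = [], []
    for _name, addr in getaddresses([header_value]):
        user, _, domain = addr.rpartition("@")
        if domain.lower() == LOCAL_DOMAIN:
            for mailbox in resolve(user.lower()):
                if mailbox not in local:
                    local.append(mailbox)
        else:
            remote.append(addr)
    return local, remote


local, remote = classify("Staff <staff@example.org>, bob@example.net")
```

Here the local list would be handed to a delivery agent, while the remote list would be routed or relayed to another MTA.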
For further description of e-mail systems, see e.g., Sendmail® for NT User Guide, Part Number DOC-SMN-300-WNT-MAN-0999, available from Sendmail, Inc. of Emeryville, Calif., the disclosure of which is hereby incorporated by reference. Further description of the basic architecture and operation of e-mail systems is available in the technical and trade literature; see e.g., the following RFC (Request For Comments) documents:
    RFC821     Simple Mail Transfer Protocol (SMTP)
    RFC822     Standard for the Format of ARPA Internet Text Messages
    RFC974     Mail Routing and the Domain System
    RFC1123    Requirements for Internet Hosts -- Application and Support
    RFC1321    The MD5 Message-Digest Algorithm
    RFC1725    Post Office Protocol version 3 (POP)
    RFC2033    Local Mail Transfer Protocol (LMTP)
    RFC2045    Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies
    RFC2060    Internet Message Access Protocol (IMAP), Ver. 4, rev. 1
    RFC2068    Hypertext Transfer Protocol -- HTTP/1.1
currently available via the Internet (e.g., at ftp://ftp.isi.edu/in-notes), the disclosures of which are hereby incorporated by reference. RFCs are numbered Internet informational documents and standards widely followed by commercial software and freeware in the Internet and UNIX communities. The RFCs are unusual in that they are floated by technical experts acting on their own initiative and reviewed by the Internet at large, rather than formally promulgated through an institution such as ANSI. For this reason, they remain known as RFCs even once they are adopted as standards.
Traditional electronic mail (e-mail) systems today are based on monolithic, single-machine configurations, such as a single computer having multiple hard disks. E-mail services are simplest to configure and maintain on a single machine. Such a service, by definition, has a single point of failure. At the same time, however, fairly sophisticated multi-computer hardware is increasingly available. For instance, it is possible to link multiple UNIX machines, each running a POP daemon (background process), via a high-speed (e.g., gigabit) network to other computers that, in turn, are connected to disk farms. Despite those advances in computer hardware, there has been little effort to date to implement an e-mail system in a distributed fashion, that is, employing a set of machines with a set of disks that cooperate over a network.
Traditional systems have limits in their robustness due to their non-distributed nature. First, traditional systems have difficulty scaling. In a single-machine implementation, scaling the service to meet increased demand involves purchasing faster hardware. This solution has its limits, however. There is usually a strong correlation between e-mail server workload and the importance of 24×7×365 availability. A single server, however large and fast, still presents a single point of failure. At some point on the scalability curve, it becomes impossible or cost-prohibitive to buy a computer capable of handling additional workload. Accordingly, the present-day monolithic systems provide little in the way of scalability or fault tolerance, nor are such systems able to benefit from increased performance afforded by distributed hardware.
As another problem, traditional systems cannot add or remove resources on-the-fly. For single-server environments, additional capacity comes in the form of adding CPUs, RAM, and/or disk resources. Most hardware must be taken out of service during these upgrades, which usually must be done late at night when workloads are light. Eventually, the next upgrade must come in the form of a complete replacement or “forklift upgrade.” Adding computing resources in a multi-server environment can be much more difficult. As an example, in “active/standby” clusters (usually just a pair of machines), specification of the standby machine is easy: buy a machine just like the “active” one. These pairs have 50% of their equipment standing idle, however, waiting for the “active” server to fail. Some models allow both servers to be active simultaneously: in the event one fails, the other takes over responsibility for both workloads. While resource utilization rates are much higher in such “active/active” schemes, they are much more difficult to plan for. Each machine must now remain at least 50% idle during peak workload; in the event of failure, the surviving machine must have enough resources to handle two machines' worth of work.
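The 50% figure for an “active/active” pair follows from simple arithmetic, which the following Python sketch works through. The extension to more than two machines, the assumption that a failed machine's work is spread evenly over identical survivors, and the function name max_safe_utilization are all assumptions of this sketch rather than statements from the discussion above.

```python
from fractions import Fraction


def max_safe_utilization(nodes, tolerated_failures=1):
    """Highest per-machine peak load that survivors can still absorb.

    Assumes identical machines and an even spread of the failed
    machines' work over the survivors (an assumption of this sketch).
    """
    if nodes <= tolerated_failures:
        raise ValueError("need more machines than tolerated failures")
    return Fraction(nodes - tolerated_failures, nodes)


pair = max_safe_utilization(2)  # active/active pair: each machine at most half busy
four = max_safe_utilization(4)  # hypothetical four-machine group
```

Under these assumptions a pair may run each machine at no more than one half of capacity, whereas a group of four could run each at up to three quarters while still tolerating one failure.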
Traditional systems also cannot guarantee immediate and consistent data replication. The problem of consistent data replication is so difficult that most commercial systems do not even try to replicate e-mail message data across servers. Instead, most gain their data redundancy via disk storage using either RAID-1 (disk mirroring) or RAID-4/5 (parity-based redundancy). In the event that a server's CPU, memory, disk controller, or other hardware fails, the data on those redundant disks are unavailable until the hardware can be repaired. To overcome this limitation, a small handful of setups may use dual-ported SCSI or Fibre Channel disks shared with a hot-standby machine or use remote database journaling (again for hot-standby mode). However, some industries, particularly where government regulations and industry-standard practices are the main driving forces, have strong requirements for data availability, redundancy, and/or archiving.
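For readers unfamiliar with parity-based redundancy, the following Python sketch shows the general XOR technique by which a lost block can be rebuilt from the surviving blocks and a parity block. It is a simplified illustration of the idea behind RAID-4/5, not a model of any particular disk array; the block contents and the function name xor_blocks are made up for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three hypothetical data blocks, as if striped across three disks.
data = [b"mail", b"box1", b"box2"]
parity = xor_blocks(data)  # would be stored on a further disk

# Suppose the disk holding data[1] is lost: rebuild it from the rest.
rebuilt = xor_blocks([data[0], data[2], parity])
```

The sketch also makes the limitation noted above concrete: the parity protects against the loss of a disk, but all of the blocks sit behind one server, so they are equally unreachable if that server's other hardware fails.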
Ideally, one could implement an e-mail system in a distributed manner, with resistance against single points of failure. Here, the information stored on any one computer is not essential to the continued normal operation of the system. Thus, for example, if one of the servers were to fail, the system would continue to function normally despite that failure. Distributed e-mail systems have been slow in coming, due to the difficulty of ensuring efficient, fault-tolerant operation in such systems. Currently, “semi-distributed” implementations exist, though. For example, the Intermail system (by Software.com of Santa Barbara, Calif.) employs a back-end database for storing mail information. Although Intermail employs multiple machines, it is not truly distributed. Instead, the machines are employed more as proxy servers, rather than as control logic for a central message store. With that approach, however, such a system cannot store information redundantly in an efficient manner.
What is needed is a distributed e-mail system that is maximally redundant, yet resource-efficient. Moreover, such a system should provide fault-tolerant operation, thereby guaranteeing system reliability. The present invention fulfills this and other needs.