1. Field of the Invention
The present invention relates generally to the field of computer software and client-server applications. In particular, it relates to the organization and storage of e-mail message data accessed by clients over a computer network.
2. Discussion of Related Art
The accelerated growth of network computing in the 1990s has been accompanied by an increasingly prevalent form of communication most commonly referred to as "e-mail." As more individuals, whether at home, in corporations, small businesses, academic institutions, or government organizations, have access to computers connected to some type of computer network, electronic mail is quickly becoming (and in many settings already is) a preferred mode of communication. People find e-mail an efficient and effective way to communicate whether they are sending a simple one-time message or carrying on a long-term discussion or conversation.
While e-mail has been used for years within large entities such as corporations and universities for sending messages over the entity's internal networks, typically based on proprietary formats and protocols, the Internet is bringing e-mail out of the realm of large enterprises and into the mainstream. Because the Internet is a publicly accessible, global computer network, it is increasingly being used for its e-mail capability. In addition, the Internet Protocol (IP) and the Internet's communication layers (TCP/IP) are being used, instead of proprietary formats and protocols, to develop computer networks known as intranets within private entities. This approach allows, for example, a corporation or university to have an internal computer network that is compatible with the Internet and has all the features of the Internet, including Web sites, the ability to hyperlink, and, of course, the ability to send and receive e-mail.
The explosive growth of the Internet and the growing attraction of intranets have led to a proliferation of e-mail messages. Typically, e-mail messages are received and stored on network servers or on the hard drives of client and stand-alone machines. There is a growing tendency and, in many cases, a need to save e-mail messages electronically and to retrieve them easily when desired. For example, this can be important in a research setting where messages containing ideas, comments, or analysis are sent among researchers, possibly in different countries, over a period of several years. It is foreseeable that a particular message sent two years ago between two researchers who are no longer available will have to be retrieved. Of course, this capability could also be an important and useful feature in a business environment or in other settings.
The proliferation of e-mail and the increasing number of messages being saved, coupled with the growing demand for retrieving saved messages, have exposed problems with current indexing schemes and message storage. There is a growing trend to save messages on servers instead of on client machines. A mail server acts as a central repository of messages and has the advantages of being backed up regularly, being maintained by an administrator, and being repaired quickly (in most cases) when broken (e.g., when it crashes). Thus, when a user makes a request, the request is handled by the server and the requested data is delivered to the client.
The composition of an e-mail message today can vary widely, as can the type of request. In a simple case, an e-mail message can contain, in addition to required headers, a simple message portion consisting of a few lines of text. On the other hand, an e-mail message can have several attachments that may include complex graphics, audio segments, video, animation, text having special encoding (e.g., if in a non-Latin-based language), and even other entire e-mail messages.
Requests for messages can also vary in terms of depth and breadth of information desired. For example, a request can be for the entire content of a single message sent several years ago between two individuals. Or, a request can be for a list of recipients of any message sent regarding a particular subject in the last 24 hours, the list distinguishing those recipients who have opened the message(s) from those who have not. In sum, e-mail messages and requests for e-mail message data have grown more complex, thereby exposing weaknesses of present mail servers in handling message storage and retrieval.
Most mail servers presently used for the type of message storage and retrieval discussed above are configured according to the Internet Message Access Protocol, or IMAP. IMAP is a collection of commands for manipulating messages and indexes for sorting and storing all the information associated with messages and actions performed on them. For an IMAP-configured server to take full advantage of IMAP, information related to users on the network and to messages, which includes message content and meta data regarding the message, must be stored in a manner that takes advantage of IMAP indexing. While IMAP servers store data according to IMAP indexing to some degree, none does so in a manner that optimizes quick, reliable, and non-contentious retrieval and storage of data.
Present IMAP servers experience contention problems and other inefficiencies resulting in poor performance. Although they handle message data as a collection of fields that make up a record, i.e., they are record-based, writing a new message to a user's inbox (the mailbox in which a user receives new mail) will very likely result in locking out the user from performing other operations on the inbox. The message stores of these IMAP servers were not designed to efficiently utilize the indexing available in IMAP. For example, a user may only desire information regarding certain fields (e.g., date, recipients, subjects, etc.) from all messages in a mailbox. IMAP servers are likely to retrieve more information than is needed to satisfy typical user requests for data. Thus, simply to get the number of messages sent to a particular user regarding a specific subject, an IMAP server may retrieve the entire content of all the messages in order to derive that number. Present IMAP servers also lack the strong integrity and consistency checking capabilities possible in IMAP.
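The inefficiency described above can be illustrated with a minimal sketch. It contrasts counting messages by scanning every full message with counting them from a small index that holds no message bodies. All names in it (Message, count_by_full_scan, SubjectIndex) are hypothetical and chosen only for illustration; they are not part of IMAP or of any existing mail server.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Message:
    recipient: str
    subject: str
    body: str  # potentially large: text, attachments, nested messages


def count_by_full_scan(messages, recipient, subject):
    """Count matching messages by reading every full message, body included."""
    bytes_touched = 0
    count = 0
    for msg in messages:
        # The whole body is read even though the count never uses it.
        bytes_touched += len(msg.body)
        if msg.recipient == recipient and msg.subject == subject:
            count += 1
    return count, bytes_touched


class SubjectIndex:
    """Hypothetical index keyed on (recipient, subject); holds no body data."""

    def __init__(self):
        self._counts = defaultdict(int)

    def add(self, msg):
        # Only two small header fields are recorded at delivery time.
        self._counts[(msg.recipient, msg.subject)] += 1

    def count(self, recipient, subject):
        # Answered from the index alone; no message body is touched.
        return self._counts.get((recipient, subject), 0)
```

In this sketch the full scan reads an amount of data proportional to the total size of the mailbox, whereas the index answers the same question from the header fields alone.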
Other mail server implementations require that an entire message be delivered or copied regardless of what type of information regarding the message is being requested. This problem is similar to one found in VARMAIL, an older file-based mail environment in the UNIX operating system, in which delivery of a message locked out all write operations to a mail folder. This default procedure caused the mail delivery system to be considerably slow. In addition, the VARMAIL environment required multiple copies of the same e-mail message to be stored in the client machine's memory.
Therefore, what is needed is a server-based message store partitioned such that indexes, message data, and user data are logically arranged to improve message storage and retrieval times and to maintain strong data integrity. The message store should reduce contention and allow users and the server to perform read and write operations on messages and mail folders concurrently. It would also be desirable to increase the level of specificity recognizable by the server when handling users' requests for data, so that the server retrieves only the data that was requested, with little extraneous data, thereby increasing retrieval speed and saving memory.
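As a purely illustrative sketch of the kind of partitioning described above, and not the design of the invention itself, the following assumes a store in which header fields and message bodies are kept in separate partitions, each guarded by its own lock. The class name PartitionedStore and its methods are hypothetical.

```python
import threading


class PartitionedStore:
    """Hypothetical store with separately locked index and body partitions."""

    def __init__(self):
        self._index = {}    # message id -> small header fields
        self._bodies = {}   # message id -> full message content
        self._index_lock = threading.Lock()
        self._body_lock = threading.Lock()
        self._next_id = 1

    def deliver(self, headers, body):
        # Write the bulky body first, under the body partition's own lock.
        with self._body_lock:
            msg_id = self._next_id
            self._next_id += 1
            self._bodies[msg_id] = body
        # Publish the headers last, so an indexed message always has a body.
        with self._index_lock:
            self._index[msg_id] = dict(headers)
        return msg_id

    def list_field(self, field):
        # Reads only the index partition; message bodies are never touched.
        with self._index_lock:
            return [h.get(field) for h in self._index.values()]

    def fetch_body(self, msg_id):
        with self._body_lock:
            return self._bodies[msg_id]
```

Because the two partitions use different locks, a request for a single header field across a mailbox is not blocked while a message body is being written, and it returns only the field that was asked for.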