1. Technical Field
The present invention relates to archiving email messages.
2. Discussion of the Related Art
Electronic message or email server systems can be configured to provide journaling of email messages (emails) that are sent and received by users of the server systems. Journaling of emails typically includes placing a separate copy of an email that is sent or received utilizing the server in a dedicated mailbox or database journal during the email delivery process. The email in the journal is a copy of the email that is distributed to the recipients, and may also contain additional information that is not available to the individual recipients, such as a listing of all email recipients in the email metadata (e.g., email addresses in the “To”, “Cc” and “Bcc” header fields) as well as resolved groups.
Examples of archiving emails include, without limitation, archiving emails from the journal for compliance reasons, and archiving emails from individual user mailboxes for space-saving reasons. Archiving of messages typically occurs in the following sequence of operations:
    Identifying one or more user mailboxes in which emails should be archived for space-saving purposes;
    Searching for and identifying messages that qualify for archiving (referred to as crawling);
    Extracting the messages in a particular user mailbox that qualify for archiving; and
    Storing the extracted messages in an archive.
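By way of illustration only, the sequence of operations above can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of any particular archiving system: the names Message, Mailbox and archive_mailboxes are hypothetical, the archive is modeled as a plain dictionary, and extraction is assumed to remove the message from the user mailbox.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List


@dataclass
class Message:  # hypothetical message record
    msg_id: str
    size_bytes: int


@dataclass
class Mailbox:  # hypothetical user mailbox
    owner: str
    messages: List[Message] = field(default_factory=list)


def archive_mailboxes(
    mailboxes: Iterable[Mailbox],
    qualifies: Callable[[Message], bool],
    archive: Dict[str, List[Message]],
) -> Dict[str, List[Message]]:
    """Crawl, extract and store qualifying messages for each mailbox."""
    # Step 1: the caller passes in the mailboxes already identified for archiving.
    for mailbox in mailboxes:
        # Step 2: crawl the mailbox to find messages that qualify.
        eligible = [m for m in mailbox.messages if qualifies(m)]
        eligible_ids = {m.msg_id for m in eligible}
        # Step 3: extract the qualifying messages (assumed here to free the space).
        mailbox.messages = [m for m in mailbox.messages if m.msg_id not in eligible_ids]
        # Step 4: store the extracted messages in the archive.
        archive.setdefault(mailbox.owner, []).extend(eligible)
    return archive
```

Note that the crawl in step 2 inspects every message in the mailbox, whether or not any message ends up qualifying.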
Archiving for compliance typically occurs within the journal immediately or soon after an email has been sent or received for a mailbox in the email server. Journals are typically crawled at short intervals, and all messages in the journal can be archived. Archiving for space-saving in the user mailboxes typically occurs based upon an elapsed time period and can also include other restrictions (e.g., only messages having a certain memory size are archived). A typical example of archiving a user mailbox might be that all messages in the mailbox that were received at least 4 weeks ago are archived if such messages still exist in the mailbox (i.e., the mailbox user has not already deleted such messages).
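A space-saving rule of this kind can be expressed as a simple predicate. The following is a sketch only; the function name qualifies_for_archiving, its parameters, and the default values (a 4-week minimum age and no size restriction) are assumptions chosen to mirror the example above.

```python
from datetime import datetime, timedelta


def qualifies_for_archiving(
    received_at: datetime,
    size_bytes: int,
    now: datetime,
    min_age: timedelta = timedelta(weeks=4),  # assumed default, per the example
    min_size_bytes: int = 0,                  # assumed default: no size restriction
) -> bool:
    """Return True if the message is old enough and large enough to archive."""
    old_enough = (now - received_at) >= min_age
    large_enough = size_bytes >= min_size_bytes
    return old_enough and large_enough
```

A message that the user has already deleted never reaches this check, since it is no longer present in the mailbox when the mailbox is crawled.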
The crawling operation can cause a significant load on the server and increase the expense of archiving emails. Thus, it is important to avoid crawling mailboxes that do not have enough messages that qualify for archiving.
Typical email archiving systems use a declarative approach (e.g., based on time or amount of content in a user mailbox) to determine when a mailbox should be searched for emails that need to be archived. For example, a crawling operation to determine which emails to archive might require that all mailboxes for a particular server are searched at a selected interval (e.g., every selected number of minutes, every selected number of days, etc.) so that every qualifying email for a particular user mailbox is archived within a selected timespan.
With a declarative approach to email archiving, a system administrator typically configures a schedule which is used to periodically check whether processing is necessary by searching each mailbox for emails that qualify for processing. Additionally, all mailboxes are typically treated the same, and the sequencing of mailboxes being processed can be random. This might result in certain user mailboxes not being processed for archiving of emails prior to exceeding a mailbox quota associated with such mailboxes. In addition, this can lead to inefficient conservation of memory space, since some user mailboxes may fill up more rapidly with email content than others. Furthermore, it is too difficult and time-consuming for a system administrator to attempt to configure a separate archiving schedule for different mailboxes based upon how different mailboxes are used.
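The drawback described above can be illustrated with a sketch of a single scheduled pass. The function scheduled_pass and the count_eligible callback are hypothetical names introduced for illustration; the sketch assumes only what is stated above, namely that every mailbox is crawled on each pass, in random order, without regard to how full it is.

```python
import random
from typing import Callable, Iterable, List, Optional, Tuple


def scheduled_pass(
    mailboxes: Iterable[str],
    count_eligible: Callable[[str], int],
    rng: Optional[random.Random] = None,
) -> Tuple[List[str], int, int]:
    """Run one pass of a fixed schedule: crawl every mailbox in random order."""
    rng = rng or random.Random()
    order = list(mailboxes)
    rng.shuffle(order)  # sequencing is random; no mailbox is prioritized
    crawled = 0
    wasted = 0
    for name in order:
        crawled += 1  # the crawl cost is paid for every mailbox
        if count_eligible(name) == 0:
            wasted += 1  # nothing qualified, so the crawl produced no benefit
    return order, crawled, wasted
```

For example, with three mailboxes of which only one holds qualifying messages, a pass crawls all three and two of those crawls find nothing to archive, while a mailbox close to its quota may be processed last.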