1. Technical Field
The present invention relates generally to data storage, and more specifically to the storage of bulk electronic mail messages and other data entities broadcast as multiple copies.
2. Description of the Related Art
Electronic mail has long been known to be a convenient medium for sending a message to multiple recipients. Even before the mass commercialization of the Internet in the mid-to-late 1990s, it was common for Internet users to subscribe to automated mailing lists for the purpose of conducting round-table discussions or distributing newsletters through electronic mail. In such a mailing list system, an electronic mail message sent to a designated mailing list address is duplicated and sent out to all of the users subscribed to the mailing list. A number of mailing list management programs exist for this purpose, such as the popular “LISTSERV” and “Majordomo” software packages. These programs take care of the subscription (and unsubscription) of users, distribution of messages to subscribers, and archival of list messages. Often a mailing list will post its archive on the Internet in the form of a web page to allow previous messages to be browsed or searched. With the rapid expansion of the Internet into businesses and homes, the number of electronic mail messages transmitted on a daily basis has grown at an astonishing rate. Not surprisingly, many of these messages are mass mailings, such as newsletters, mailing list discussions, and advertisements.
Because electronic mail is a person-to-person or point-to-point communications medium, each recipient of an electronic mail message receives an individual copy of the message at his/her local electronic mail server, where the message is stored at least until it is retrieved by the user using an electronic mail client, such as LOTUS NOTES or MICROSOFT OUTLOOK. In many instances, an electronic mail server will continue to store the message until the user deletes it, even if the message has been read by the user. Because the messages are kept in a central repository, the user has access to his/her entire mailbox of messages from any location. Typically, a single electronic mail server will serve a number of users, and in a large organization, the number of such users may be quite high. A tremendous amount of storage may be required to store messages for so large a number of users.
For mass-distributed messages, this problem is compounded, as multiple users of a single electronic mail server may each have a separate copy of the same message. In the case of “junk mail” messages or “spam,” some electronic mail systems apply a “spam filter” to delete or discard received spam. Spam filters can be a convenience for users, but are not an effective solution to the message storage problem, as many (if not most) mass-distributed messages are not spam and cannot simply be automatically deleted.
What is needed, therefore, is a method for reducing the storage burden of electronic mail systems and other similar software systems having a broadcast or multicast capability. The present invention provides a solution to this and other problems, and offers other advantages over previous solutions.