Electronic mail (email) has become an integral part of people's daily lives. Many forms of communication, personal or business, have been replaced by email exchanges. Emails are no longer limited to textual exchanges; many modern email systems integrate multi-modal communications with email. Thus, increasing amounts of textual, audio, video, and other forms of communication data are stored in individual mailboxes and central data storage facilities as part of vast email exchange networks.
In addition to local replication, email-related data is also commonly replicated in different locations. With hard disk sizes reaching terabytes, traditional RAID solutions are rendered impractical. Moreover, because geographical, political, and technical disturbances require geo-replication of data, email data is frequently replicated asynchronously to multiple physical locations in order to ensure data resiliency under various failure conditions. The asynchronous nature of such data resiliency solutions raises the challenge of how an application that pushes new data into an email repository can ensure that the new content has been committed to enough copies to guarantee data resiliency within the existing deployment.
Organizations and service providers typically have data resiliency policies (e.g., how frequently, in how many locations, and which portions of the data are to be replicated). Data replication solutions (e.g., log shipping, hardware-based replication solutions, etc.) commonly work independently of the applications that put new content into a mailbox (e.g., archival services, legal search tools, import/export-mailbox tools, etc.), and the two are unaware of each other.