1. Technical Field
The present invention relates generally to techniques for highly available, reliable, and persistent data storage in a distributed computer network.
2. Description of the Related Art
A need has developed for the archival storage of “fixed content” in a highly available, reliable and persistent manner that replaces or supplements traditional tape and optical storage solutions. The term “fixed content” typically refers to any type of digital information that is expected to be retained without change for reference or other purposes. Examples of such fixed content include, among many others, e-mail, documents, diagnostic images, check images, voice recordings, film and video, and the like. A storage approach based on a Redundant Array of Independent Nodes (RAIN) has emerged as the architecture of choice for creating large online archives for the storage of such fixed content information assets. By replicating data on multiple storage systems, which include multiple nodes, the archives can automatically compensate for node failure or removal. Typically, RAIN systems are delivered largely as hardware appliances designed from identical components within a closed system. The closed system may involve one or more storage systems connected over a network. To replicate data on multiple storage systems, archive systems of the prior art would send the entirety of the object payload, including the data content and associated metadata, to the other storage systems for replication. However, collisions may sometimes occur between a replicated object and another object on the replication target storage system.
A method of recovering a primary cluster is also known in which a replica cluster sends metadata of an object to be recovered to the primary cluster, and the primary cluster begins to accept access from a client of the primary cluster for the data associated with the metadata. In this method, the primary cluster to be recovered receives the metadata first, which allows a client to access the data, using a read-from-replica process, even though the content data associated with the metadata has yet to be transferred to the primary cluster. This method is described in U.S. Pat. No. 8,112,423, which is incorporated herein by reference.
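By way of illustration only, the metadata-first recovery described above might be sketched as follows. This is a minimal in-memory model, not the implementation of the referenced patent: the `Cluster` class, its `store`, `send_metadata_to`, and `read` methods, and the dictionary-based storage are all hypothetical names and structures assumed for this sketch.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Cluster:
    """Hypothetical in-memory stand-in for a storage cluster."""
    name: str
    metadata: Dict[str, dict] = field(default_factory=dict)
    content: Dict[str, bytes] = field(default_factory=dict)
    replica: Optional["Cluster"] = None  # where to read from if content is absent

    def store(self, key: str, data: bytes) -> None:
        # Store both the content data and its (assumed, minimal) metadata.
        self.metadata[key] = {"size": len(data)}
        self.content[key] = data

    def send_metadata_to(self, primary: "Cluster") -> None:
        # Recovery step: only metadata is sent to the primary cluster;
        # the content data is not transferred at this point.
        for key, meta in self.metadata.items():
            primary.metadata[key] = dict(meta)
        primary.replica = self

    def read(self, key: str) -> bytes:
        # An object is accessible as soon as its metadata is present.
        if key not in self.metadata:
            raise KeyError(key)
        if key in self.content:
            return self.content[key]
        # Content not yet transferred: read from the replica instead.
        if self.replica is None or key not in self.replica.content:
            raise LookupError(f"content for {key!r} is unavailable")
        return self.replica.content[key]


if __name__ == "__main__":
    replica = Cluster("replica")
    replica.store("obj1", b"hello")

    primary = Cluster("primary")       # primary cluster being recovered
    replica.send_metadata_to(primary)  # metadata arrives first

    # The client reads through the primary before any content is transferred.
    print(primary.read("obj1"))
```

In this sketch the primary cluster serves the read by fetching from the replica on demand and does not keep a local copy; when and how the content data is eventually transferred to the primary cluster is left out.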