1. Field of the Invention
The present invention relates in general to the field of information handling system network storage, and more particularly to a system and method for managing replication in an object storage system.
2. Description of the Related Art
As the value and use of information continue to increase, individuals and businesses seek additional ways to process and store information. One option available to users is information handling systems. An information handling system generally processes, compiles, stores, and/or communicates information or data for business, personal, or other purposes, thereby allowing users to take advantage of the value of the information. Because technology and information handling needs and requirements vary between different users or applications, information handling systems may also vary regarding what information is handled, how the information is handled, how much information is processed, stored, or communicated, and how quickly and efficiently the information may be processed, stored, or communicated. The variations in information handling systems allow for information handling systems to be general or configured for a specific user or specific use such as financial transaction processing, airline reservations, enterprise data storage, or global communications. In addition, information handling systems may include a variety of hardware and software components that may be configured to process, store, and communicate information and may include one or more computer systems, data storage systems, and networking systems.
Large-scale object storage systems, such as the DX6000 developed by Dell Inc., store information in a network “cloud” by using a universally unique identifier (UUID) token to store and retrieve the information. To prevent data loss, object storage systems may provide content replication between independent network locations, such as with many-to-many replication. In some instances, an application provides redundancy across network sites via multi-site writes, while in other cases, the storage subsystem provides redundancy across network sites by replicating objects at different network sites. Object storage systems protect against data loss by using RAID, RAIN, or content replica-based policy storage to address data redundancy challenges at each network site location. With a content replica-based storage policy subsystem, a content addressed storage (CAS) policy typically replicates content based upon the UUID of the content and a cluster level policy that sets the number of replicas. For example, with a typical replica policy, each cluster replicates each object at least twice at each independent network site. Creating redundant copies of the same object increases storage costs by consuming storage space, but provides greater protection against the potential data loss that arises when only one copy is maintained.
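By way of illustration only, the following minimal sketch shows how a UUID token and a cluster level replica policy might interact. The `Cluster` class, its in-memory nodes, and the token-based placement rule are hypothetical stand-ins chosen for this example; they do not describe the DX6000 or any particular product.

```python
import uuid


class Cluster:
    """Hypothetical in-memory stand-in for the storage cluster at one site."""

    def __init__(self, name, node_count, replica_count=2):
        if replica_count > node_count:
            raise ValueError("replica_count cannot exceed node_count")
        self.name = name
        self.replica_count = replica_count  # cluster level replica policy
        # Each node maps a UUID token to the stored content.
        self.nodes = [dict() for _ in range(node_count)]

    def write(self, content):
        """Store content and return the UUID token used to retrieve it."""
        token = uuid.uuid4()
        # Assumed placement rule: derive the first node from the token,
        # so the object is not tied to a fixed physical device.
        start = token.int % len(self.nodes)
        for offset in range(self.replica_count):
            node = self.nodes[(start + offset) % len(self.nodes)]
            node[token] = content
        return token

    def read(self, token):
        """Return the content from any node that still holds a replica."""
        for node in self.nodes:
            if token in node:
                return node[token]
        raise KeyError(token)

    def replicas(self, token):
        """Count how many nodes currently hold the object."""
        return sum(1 for node in self.nodes if token in node)


cluster = Cluster("site-a", node_count=4, replica_count=2)
token = cluster.write(b"payload")
```

In this sketch the caller keeps only the token, and the cluster decides where the two replicas reside, so the loss of one node leaves the object readable from the other.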
Although cluster storage advantageously improves data security and flexibility, one difficulty with content addressed storage in a “cloud” network environment is managing the number of replicas where storage of a particular object is not tied to a physical storage device. Because each object written to object storage is assigned a UUID token for content object access, content objects can be distributed and re-distributed to enable load balancing. Keeping multiple replicas at each site of network storage adds significant costs because, once replication is completed, the independent sites maintain no correlation between copies of the same object held at different sites. Hence, if different independent sites replicate content to each other with two or more copies at each site, the number of replicas grows exponentially, increasing total storage requirements. By comparison, applications that have no binding between sites and have a replica count set to one for a site can experience silent data loss. For example, if the application is keeping a single replica at a remote site and a storage system failure occurs that results in a lost or corrupted replica, the failure may go unnoticed until the application attempts to access the data. End users of a content addressed storage system face the difficult choice of reducing costs by having one replica per site and accepting the risk of data loss, or accepting increased costs by having multiple replicas of content at each site in order to reduce the risk of data loss. For example, in one common configuration, two copies of a content object are maintained at a source site directly accessed by an application, with two copies at each replica site, so that the number of replicas, and with it the storage required for a given set of data, grows exponentially.
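To make the cost side of this trade-off concrete, the following sketch tallies copies for the configuration just described. It rests on a simplifying assumption that is not stated in the description above: each site holds a fixed number of copies, and copies received at a replica site are not replicated onward to other sites. The function names and the 1 GiB object size are hypothetical.

```python
def total_copies(source_copies, replica_sites, copies_per_replica_site):
    """Copies of one object across the source site and all replica sites.

    Assumes a fixed per-site copy count and no onward re-replication
    between replica sites.
    """
    return source_copies + replica_sites * copies_per_replica_site


def storage_required(object_bytes, source_copies, replica_sites,
                     copies_per_replica_site):
    """Raw bytes consumed by every copy of a single object."""
    return object_bytes * total_copies(
        source_copies, replica_sites, copies_per_replica_site
    )


GIB = 1024 ** 3

# Two copies at the source site and two at a single replica site.
redundant = storage_required(GIB, 2, 1, 2)

# One copy per site: lower cost, but exposed to silent data loss.
minimal = storage_required(GIB, 1, 1, 1)

print(redundant // GIB, "GiB versus", minimal // GIB, "GiB")
```

Under these assumptions, a 1 GiB object consumes 4 GiB with two copies at each of two sites, compared with 2 GiB when each site keeps a single copy.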