The present invention relates to a system and method for content addressable storage and, more particularly, but not exclusively, to a system and method for reducing the amount of data transfer during replication in a content addressable storage system.
Content addressable storage (CAS), also referred to as associative storage, is a mechanism for storing information that can be retrieved based on its content rather than on its storage location. CAS is typically used for storage and retrieval of fixed content and for archiving or permanent storage.
In content addressable storage, the system records a content address, a key that uniquely identifies the information content. The output of a hash function applied to the data content is typically used as this key, and the quality of the system depends on the quality of the hash function. Too weak a hash function may lead to collisions, in which different content items are assigned the same key.
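By way of illustration only, the relationship between content and content address can be sketched as follows. The sketch assumes SHA-256 as the hash function and an in-memory dictionary as the store; the class name `ContentStore` and its methods are hypothetical and are not taken from any particular CAS system.

```python
import hashlib


class ContentStore:
    """Toy in-memory content addressable store (illustrative assumption)."""

    def __init__(self):
        # Maps content address (hex digest) -> stored bytes.
        self._blobs = {}

    def put(self, data: bytes) -> str:
        # The content address is derived from the data itself,
        # not from where the data happens to be stored.
        address = hashlib.sha256(data).hexdigest()
        # Identical content maps to the same address, so it is kept once.
        self._blobs.setdefault(address, data)
        return address

    def get(self, address: str) -> bytes:
        # Retrieval is by content address only.
        return self._blobs[address]

    def __len__(self) -> int:
        return len(self._blobs)


store = ContentStore()
first = store.put(b"fixed content")
second = store.put(b"fixed content")  # same content, same address
```

In this sketch, storing the same content twice yields a single entry, and a collision in the hash function would cause two different items to share one address, which is why the strength of the hash function matters.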
A typical CAS storage space has access nodes, through which input and output are handled, and storage nodes for permanent storage of the data. CAS metadata allows for content addressing and retrieval within the system.
The storage space often needs to be backed up, and thus a replica of the source space is constructed at a destination location. The source and destination spaces are often not physically co-located, and communication between the two may be subject to bandwidth limitations and latency. For the purpose of replication, nodes at the source storage space are required to provide reliable copies of input data to the destination, and any system has to allow for failures at one location or another in a way that takes account of the latency within the system. It is furthermore desirable to avoid unnecessary data transfer in view of the limitations on bandwidth.
It is noted that the CAS storage space considered here is a system that is internally a CAS system but looks like a standard data storage block to external applications. That is to say, CAS metadata such as hashes are not generally available outside the storage block.
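One way to picture such an arrangement, offered purely as an assumption for illustration, is a facade that accepts reads and writes by logical block address while keeping the hash-based bookkeeping private. The class name `BlockFacade`, its methods, and the two internal mappings are hypothetical.

```python
import hashlib


class BlockFacade:
    """Looks like address-based block storage; is content addressed inside."""

    def __init__(self):
        # Internal CAS metadata, never exposed through read() or write().
        self._address_to_hash = {}  # logical block address -> digest
        self._hash_to_data = {}     # digest -> block contents

    def write(self, lba: int, data: bytes) -> None:
        digest = hashlib.sha256(data).digest()
        # Identical blocks share one stored copy.
        self._hash_to_data.setdefault(digest, data)
        # Overwritten content is not reclaimed in this simplified sketch.
        self._address_to_hash[lba] = digest

    def read(self, lba: int) -> bytes:
        # The caller supplies only an address and receives only data.
        return self._hash_to_data[self._address_to_hash[lba]]

    def unique_blocks(self) -> int:
        # Diagnostic helper for the example only.
        return len(self._hash_to_data)


device = BlockFacade()
device.write(0, b"x" * 8)
device.write(7, b"x" * 8)  # same content at a different address
```

An external application using `read` and `write` in this sketch never sees a hash, although the two writes above result in a single stored copy internally.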