In the field of data storage, enterprises have used a variety of techniques to store the data that their software applications use. At one time, each individual computer server within an enterprise running a particular software application (such as a database or e-mail application) would store data from that application on any number of attached local disks. Although this technique was relatively straightforward, it led to storage manageability problems because the data was stored in many different places throughout the enterprise.
Currently, enterprises typically use a remote data center or data centers, each hosting a storage platform on which to store the data, files, etc., of the enterprise. A computer server of an enterprise may now host many different server applications, each application needing to store its data to the data center or centers. Accompanying this huge amount of data is the metadata that helps to describe the data including the location of the data, replication information, version numbers, etc. Because of the quantity of metadata, the techniques used to store it, and the need to access the metadata, numerous problems have arisen.
For example, because the metadata for a particular write operation or for a particular block of data may be stored on two or more metadata nodes, it is important that this metadata be synchronized between nodes, i.e., the metadata for a particular write should be the same on each metadata node. Various events can affect synchronization: a time delay between the storage of metadata on nodes or between data centers in a distributed storage system can allow an application to inadvertently read old metadata; a metadata node failure means either that metadata may not be read from that node or that the node will contain stale metadata when it comes back online; a disk failure of a metadata node also prevents reading of the correct metadata; a data center failure prevents reading of metadata from that center or means that a strong read of metadata cannot be performed; and metadata may become corrupted in other ways.
Synchronization should be performed quickly, but many prior art techniques do not perform synchronization fast enough or are disadvantageous for other reasons. By way of example, the Dynamo database used by Amazon, Inc. uses a Merkle Tree to identify differences between two metadata sources. A Merkle Tree is built by hashing pieces of input metadata, then hashing the resulting hashes into higher-level hashes, and so on recursively, until a single hash is generated, namely the root of the Merkle Tree. Maintaining such a structure requires additional meta-metadata, and keeping it updated as the metadata is modified is computationally intensive.
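By way of illustration only, the Merkle Tree construction described above may be sketched as follows. This is a minimal sketch under stated assumptions and is not the implementation used by Dynamo: the choice of SHA-256, the duplication of the last hash when a level has an odd number of nodes, the requirement that both sources hold the same number of metadata pieces, and the names merkle_levels, merkle_root and differing_leaves are all hypothetical choices made for this example.

```python
import hashlib


def _h(data: bytes) -> bytes:
    # SHA-256 is an assumption; any cryptographic hash would serve.
    return hashlib.sha256(data).digest()


def merkle_levels(chunks):
    """Return every level of the tree: leaf hashes first, root level last."""
    if not chunks:
        return [[_h(b"")]]
    level = [_h(c) for c in chunks]
    levels = [level]
    while len(level) > 1:
        # Assumption: an odd node is paired with a copy of itself.
        if len(level) % 2:
            level = level + [level[-1]]
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def merkle_root(chunks):
    """Return the single hash at the top of the tree."""
    return merkle_levels(chunks)[-1][0]


def differing_leaves(levels_a, levels_b):
    """Walk down from the roots, following only subtrees whose hashes differ.

    Assumes both trees were built over the same number of pieces.
    """
    top = len(levels_a) - 1
    if levels_a[top][0] == levels_b[top][0]:
        return []
    suspects = [0]
    for depth in range(top - 1, -1, -1):
        next_suspects = []
        for i in suspects:
            for child in (2 * i, 2 * i + 1):
                if (child < len(levels_a[depth])
                        and levels_a[depth][child] != levels_b[depth][child]):
                    next_suspects.append(child)
        suspects = next_suspects
    return suspects


# Hypothetical metadata pieces held by two nodes; piece 3 has diverged.
node_a = [b"m0", b"m1", b"m2", b"m3", b"m4"]
node_b = [b"m0", b"m1", b"m2", b"m3-stale", b"m4"]
print(differing_leaves(merkle_levels(node_a), merkle_levels(node_b)))  # [3]
```

In this sketch, changing any one piece of metadata requires recomputing every hash on the path from that leaf to the root, and every level of hashes must be stored alongside the metadata, which reflects the maintenance cost noted above.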
The Rsync tool is used to synchronize files between two servers. It uses less metadata than a Merkle Tree-based approach, but, because it is oblivious to the data that it is synchronizing, it cannot be used in a bidirectional manner. In other words, it cannot be used to merge changes introduced at the two ends of the synchronization. The Rsync tool requires one of the servers to have the final version of the file, which is then copied to the other server. The technique known as Remote Differential Compression (RDC), available from Microsoft Corporation, minimizes the copying of data by using data not in the file being synchronized but already present at the destination. As with the Rsync tool, however, RDC synchronization is unidirectional.
Accordingly, a synchronization technique for metadata is desirable that is faster, uses less meta-metadata, less storage, and less bandwidth, and allows for bidirectional synchronization.