Field of the Invention
Embodiments presented herein generally relate to distributed storage and, more specifically, to read operations in a tree-based distributed file system.
Description of the Related Art
Managing a file system generally requires managing a large amount of metadata about objects in the file system. Such metadata includes information such as file sizes, directory structures, file permissions, physical storage locations of the files, and the like. In order to back up file system information, it is desirable to frequently and quickly “clone” or “snapshot” the metadata stored for a given file system. However, due to the manner in which that metadata is typically stored, it is often difficult to clone the metadata for a file system frequently and quickly.
Frequently, such metadata is stored in a “flat” data store such as a NoSQL store (NoSQL stands for “Not Only Structured Query Language”). In such a “flat” store, each item in the store can be accessed directly from a starting object (e.g., a root node). Quickly cloning the metadata stored in a flat store is difficult because each item in the store needs to be copied. Because the number of metadata entries can grow very large (e.g., millions of entries), copying the metadata becomes very time-consuming, which prevents the file system from being quickly cloned.
One could avoid copying each node in a flat store by simply creating a root node copy that includes a reference to the original root node. Modifications to the metadata would then be made by creating new entries corresponding to those modifications, and updating the pointers from the root node copy to point to the new entries.
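By way of illustration only, the root-node-copy approach described above may be sketched as follows. The class name FlatRoot, its methods, and the example keys are hypothetical and are not taken from any particular data store. The sketch further assumes that the original root node is no longer modified once it has been cloned.

```python
class FlatRoot:
    """Hypothetical root node of a flat metadata store."""

    def __init__(self, parent=None):
        self.entries = {}     # key -> metadata entry held directly by this root
        self.parent = parent  # reference to the root this one was cloned from

    def clone(self):
        # Constant-time clone: no entries are copied, only a reference is kept.
        return FlatRoot(parent=self)

    def put(self, key, value):
        # A modification creates a new entry in this root; the entry in the
        # original root is left untouched and is simply shadowed.
        self.entries[key] = value

    def get(self, key):
        # Walk from this root back toward the original root. Returns the
        # entry and the number of parent references that were followed.
        root, hops = self, 0
        while root is not None:
            if key in root.entries:
                return root.entries[key], hops
            root, hops = root.parent, hops + 1
        raise KeyError(key)


original = FlatRoot()
original.put("/a", {"size": 1})
original.put("/b", {"size": 7})

snapshot = original.clone()       # no metadata entries are copied here
snapshot.put("/a", {"size": 2})   # modification recorded only in the copy
```

In this sketch, a read of "/b" through the copy follows one parent reference before finding the entry, while a read of "/a" is satisfied by the copy itself.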
One issue with doing this, however, is that after many such cloning operations, the access time to nodes in the original flat store becomes very high, since this approach generates long chains of root nodes. Eventually, in order to reduce the access times, a coalescing operation can be performed, where each item in the original data store is copied to each root node, so that each root node has a full set of metadata entries. However, because the number of metadata entries can be quite high, as described above, such coalescing operations result in at least some of the cloning operations requiring a large amount of time to complete.
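By way of illustration only, the chain growth and the coalescing operation described above may be sketched as follows. The names Root, chain_length, and coalesce, as well as the entry counts, are hypothetical. The sketch coalesces a single root node; coalescing every root node in a chain would repeat the same copying for each root node.

```python
class Root:
    """Hypothetical root node that refers back to the root it was cloned from."""

    def __init__(self, parent=None):
        self.entries = {}
        self.parent = parent


def chain_length(root):
    # Number of parent references between this root and the original root.
    n = 0
    while root.parent is not None:
        root, n = root.parent, n + 1
    return n


def coalesce(root):
    # Copy every entry reachable through the chain into `root`, so that
    # `root` holds a full set of entries. Nearer roots are visited first,
    # so their entries shadow those of farther roots.
    copied = 0
    ancestor = root.parent
    while ancestor is not None:
        for key, value in ancestor.entries.items():
            if key not in root.entries:
                root.entries[key] = value
                copied += 1
        ancestor = ancestor.parent
    root.parent = None  # the chain is no longer needed for lookups
    return copied


base = Root()
base.entries = {f"/f{i}": i for i in range(1000)}

latest = base
for generation in range(5):             # five successive clones
    latest = Root(parent=latest)
    latest.entries["/f0"] = f"gen{generation}"

depth_before = chain_length(latest)     # grows by one with every clone
copies_made = coalesce(latest)          # proportional to the number of entries
```

The number of copies made by coalesce grows with the number of metadata entries rather than with the number of modifications, which is the cost the passage above identifies.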