Hierarchical organization of files and objects is well-known in the prior art. File systems and object storage systems often utilize nested directories (or folders), where each directory can hold other directories, files, or objects. Hierarchical organization is convenient and intuitive. In early computing systems, hierarchical organization of files was a necessity due to the size limitations of system memory. For example, it was not possible to store metadata for thousands of files at one time in system memory, but it was possible to store metadata for one level of a multi-level directory.
More recently, memory sizes have grown significantly, and hierarchical directories are no longer a necessity for file systems or storage servers. Some storage architectures now use a flat namespace. There are benefits to using a flat namespace instead of a hierarchical namespace. For example, a flat namespace is optimal for get operations. Web servers typically receive get requests with full URLs, rather than context-dependent URLs. Web servers use side-indexes to create flat name indexes while still working with hierarchical directories, which allows a long URL string to be looked up in a single step, whereas navigating hierarchical directories would involve iterative reads. For example, a URL can be resolved more quickly using one large index of 10,000 flat names than by navigating three layers to reach one of 100 directories with 100 files each.
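The difference between a single flat lookup and iterative directory reads can be illustrated with a minimal sketch. The path layout (`/site/dirNNN/fileNNN`), the function names, and the use of in-memory dictionaries are all assumptions made for illustration; they do not describe any particular web server or storage system.

```python
# Illustrative sketch only: compares one flat index probe against
# level-by-level traversal of nested directories. All names are hypothetical.

def build_tree(flat):
    """Build nested dicts (directories) from a flat {path: content} index."""
    root = {}
    for path, content in flat.items():
        parts = path.strip("/").split("/")
        node = root
        for part in parts[:-1]:
            node = node.setdefault(part, {})  # create directory if missing
        node[parts[-1]] = content
    return root

def lookup_flat(flat, path):
    """Resolve the full name with a single index probe."""
    return flat[path], 1

def lookup_hierarchical(root, path):
    """Resolve the name one directory level at a time, counting reads."""
    node = root
    reads = 0
    for part in path.strip("/").split("/"):
        node = node[part]  # one read per layer of the hierarchy
        reads += 1
    return node, reads

# 100 directories with 100 files each: 10,000 names in total.
flat_index = {
    f"/site/dir{d:03d}/file{f:03d}": f"content-{d}-{f}"
    for d in range(100)
    for f in range(100)
}
tree = build_tree(flat_index)

target = "/site/dir042/file007"
flat_result = lookup_flat(flat_index, target)          # ("content-42-7", 1)
tree_result = lookup_hierarchical(tree, target)        # ("content-42-7", 3)
```

In this sketch both lookups return the same content, but the hierarchical path requires three reads, one per layer, where the flat index requires one. On a distributed system each of those reads could become a separate network round trip.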
Nevertheless, humans still find organizing documents into folders to be quite useful. For example, URLs often refer to hierarchies of folders. Such folders typically were established by the authors of the website to organize their thinking.
What is needed is an object storage system that provides native support of hierarchical namespaces of any nesting level without changing the physical organization of an underlying object storage system to reflect the hierarchy. Reorganizing the actual storage to reflect hierarchical naming would be difficult for a distributed storage system because each layer of the hierarchical directory information would naturally end up on different storage servers. Iterating a hierarchical directory adds time even on a single storage system. Requiring extra network round trip times for each layer of a hierarchical name would add intolerable delay to resolving any object name. A desirable system would provide the benefits of a hierarchical namespace as well as the rapid execution benefits of a flat namespace.
In another aspect of the prior art, it is a general rule for network access storage services that a put transaction must not be acknowledged until the content is safe on persistent storage. This rule ensures that the loss of a storage server that accepted the put transaction, or the loss of a storage device on which the underlying data of the put transaction is to be stored, does not jeopardize the transaction during the period beginning with receipt of the put request and ending with storage of the content on persistent storage.
Storage servers typically write new content to a sufficient number of persistent storage locations to achieve the required durability for the transaction. These writes take time and delay completion of the transaction. Maintaining a hierarchical namespace typically requires even more persistent storage writes to be performed, further delaying completion of put transactions.
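The added write cost of maintaining a hierarchical namespace can be sketched as follows. This is a simplified model under stated assumptions: the `StorageDevice` class, the `put` function, the replica count of three, and the rule that every ancestor directory is updated on every replica are all hypothetical and chosen only to make the counting concrete.

```python
# Illustrative sketch only: counts the persistent writes that must finish
# before a put can be acknowledged. All classes and functions are hypothetical.

class StorageDevice:
    """Stand-in for a persistent storage device; records each write."""
    def __init__(self, name):
        self.name = name
        self.records = []

    def write(self, key, data):
        self.records.append((key, data))

def parent_directories(name):
    """Return every ancestor directory of an object name, root first."""
    parts = name.strip("/").split("/")
    return ["/"] + ["/" + "/".join(parts[:i]) + "/" for i in range(1, len(parts))]

def put(devices, name, data, required_copies, hierarchical):
    """Write the content (and, optionally, directory updates) to the required
    number of devices. The return models the acknowledgement, which happens
    only after every write has completed; the value is the write count."""
    if len(devices) < required_copies:
        raise ValueError("not enough devices for the required durability")
    targets = devices[:required_copies]
    writes = 0
    for device in targets:
        device.write(name, data)
        writes += 1
    if hierarchical:
        # Assumption: each ancestor directory is updated on every replica.
        for directory in parent_directories(name):
            for device in targets:
                device.write(directory, name)
                writes += 1
    return writes

def make_devices():
    return [StorageDevice(f"dev{i}") for i in range(3)]

flat_writes = put(make_devices(), "/a/b/c.txt", b"payload", 3, hierarchical=False)
hier_writes = put(make_devices(), "/a/b/c.txt", b"payload", 3, hierarchical=True)
```

Under these assumptions the flat put completes after 3 writes, while the hierarchical put needs 12 (3 for the content plus 3 for each of the directories `/`, `/a/`, and `/a/b/`), all of which sit on the path to the acknowledgement.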
What is further needed is an object storage system that stores a namespace manifest as an object that can be continuously updated and sharded while minimizing the amount of time required to perform and acknowledge a put transaction.