Enterprise storage architectures have undergone a fundamental shift: centralized service architectures have given way to distributed storage clusters. As businesses seek to increase storage efficiency, storage clusters built from commodity computers can deliver high performance, availability, and scalability for new data-intensive applications at a fraction of the cost of monolithic disk arrays. To unlock the full potential of storage clusters, data is replicated across multiple geographical locations, thereby increasing availability and reducing network distance from clients.
In such systems, distributed objects and references are dynamically created, cloned, and deleted in different clusters (using a multi-master model), and the underlying data replication layer maintains write-order fidelity, ensuring that all clusters end up with the same view of the data.
Some current visualization, multimedia, and other data-intensive applications use very large objects, typically hundreds of gigabytes or even terabytes. Such objects are usually uploaded into a distributed storage system in streaming mode: the object is split into chunks and each chunk is uploaded individually. This process can lead to long upload times, which may be exacerbated by client and/or server failures. Consequently, efficient streaming-mode uploading of distributed objects with consistency guarantees is becoming increasingly important for the storage industry, driven by the needs of large-scale systems that allow clients to connect to any cluster available at a given time.
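To make the streaming mode concrete, the following is a minimal sketch of chunked uploading with resumption after a failure. It is an illustration under stated assumptions, not the mechanism of any particular system: `InMemoryChunkStore`, `iter_chunks`, and `upload_streaming` are hypothetical names, a single in-memory dictionary stands in for one storage cluster, and replication between clusters is not modeled.

```python
import io
from typing import BinaryIO, Dict, Iterator, Tuple


def iter_chunks(stream: BinaryIO, chunk_size: int) -> Iterator[Tuple[int, bytes]]:
    """Yield (index, data) pairs read sequentially from the stream."""
    index = 0
    while True:
        data = stream.read(chunk_size)
        if not data:
            break
        yield index, data
        index += 1


class InMemoryChunkStore:
    """Hypothetical stand-in for one storage cluster (not a real API)."""

    def __init__(self) -> None:
        self.chunks: Dict[Tuple[str, int], bytes] = {}

    def has_chunk(self, object_id: str, index: int) -> bool:
        return (object_id, index) in self.chunks

    def put_chunk(self, object_id: str, index: int, data: bytes) -> None:
        # Storing by (object_id, index) makes a repeated upload of the
        # same chunk idempotent.
        self.chunks[(object_id, index)] = data

    def assemble(self, object_id: str) -> bytes:
        # Concatenate the stored chunks of one object in index order.
        indices = sorted(i for (oid, i) in self.chunks if oid == object_id)
        return b"".join(self.chunks[(object_id, i)] for i in indices)


def upload_streaming(
    store: InMemoryChunkStore,
    object_id: str,
    stream: BinaryIO,
    chunk_size: int,
) -> int:
    """Upload every chunk the store does not yet hold; return how many were sent."""
    sent = 0
    for index, data in iter_chunks(stream, chunk_size):
        if store.has_chunk(object_id, index):
            continue  # uploaded before an earlier failure, so skip it
        store.put_chunk(object_id, index, data)
        sent += 1
    return sent
```

A client that fails partway through can call `upload_streaming` again on the same object and only the missing chunks are sent. This sketch assumes the chunk size stays the same between attempts; a real system would also need to verify chunk integrity and coordinate the upload across clusters, which is outside the scope of this example.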