Existing remote file hosting services permit users to store data remotely and achieve a high degree of data access and data protection as well as essentially unlimited storage. However, the asynchronous nature of updates for files stored with these existing hosting services raises challenges for maintaining data consistency for frequently modified files, especially when the remote storage is virtualized locally. For example, at least partly because of scaling issues associated with Brewer's Conjecture, wide area binary large object (blob) stores trade off consistency against availability, such that a read of previously written data is not guaranteed to return the most recently written version of that data. Further, in some existing systems, both synchronous and asynchronous write operations must complete to the remote storage before they are reported as successful. That is, the write operations block until the data has been successfully delivered to the remote storage. In the event of loss of all physical paths to the remote storage, the write operations may simply hang until they abort.
Additionally, while the monetary cost of remote storage with the existing hosting services may be small, the cost per input/output (I/O) operation (e.g., bandwidth) may be high, at least because the charges for I/O operations do not scale linearly with the size of the transfer. For example, there is often a high base charge for the first byte transferred, and some existing hosting services do not support partial blob writes. Rather, these existing hosting services require the entire blob to be rewritten even when just a single byte changes.
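The effect of a base charge combined with whole-blob rewrites can be illustrated with a minimal sketch. The pricing structure, the names Pricing, op_cost, and update_cost, and all numeric values below are hypothetical and chosen only for illustration; they do not describe any particular hosting service.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Hypothetical pricing: a flat fee per operation plus a per-byte fee."""
    base_charge: float  # assumed fee incurred by every I/O operation
    per_byte: float     # assumed fee for each byte transferred


def op_cost(pricing: Pricing, nbytes: int) -> float:
    """Cost of one I/O operation that transfers nbytes."""
    return pricing.base_charge + pricing.per_byte * nbytes


def update_cost(pricing: Pricing, blob_size: int, changed_bytes: int,
                partial_writes_supported: bool) -> float:
    """Cost of updating changed_bytes within a blob of blob_size bytes.

    Without partial blob writes, the whole blob is transferred again.
    """
    transferred = changed_bytes if partial_writes_supported else blob_size
    return op_cost(pricing, transferred)


if __name__ == "__main__":
    pricing = Pricing(base_charge=1.0, per_byte=0.5)  # illustrative values
    # Doubling the transfer size does not double the cost.
    print(op_cost(pricing, 1), op_cost(pricing, 2))
    # Changing one byte of a 1000-byte blob, without and with partial writes.
    print(update_cost(pricing, 1000, 1, False))
    print(update_cost(pricing, 1000, 1, True))
```

Under these assumed values, a one-byte change to a 1000-byte blob costs far more when the service requires a full rewrite than when it accepts a partial write, and the base charge dominates the cost of small transfers.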
Further, the time required to complete each I/O operation with the existing hosting services may be significantly greater than with traditional enterprise-class storage devices. For example, access times over networks such as wide area networks are highly variable, with a heavy-tailed (fat-tailed) distribution.
As such, with some of the existing hosting services, unplanned or unexpected data usage patterns may lead to high data storage costs and reduced responsiveness.