The demand for data storage has been escalating rapidly because, as the amount of data such as digital media stored by users grows, so does their need to store digital media reliably over extended periods of time. Storage systems for digital media range from local storage media (e.g., CDs and backup tapes) and network storage systems (such as NAS or NAT) to cloud-based storage systems.
Network storage systems such as NAS and NAT provide users connected to a local area network with access to files through standard file sharing protocols (e.g., common internet file system (CIFS) or network file system (NFS)).
Cloud-based storage systems, also referred to as cloud storage services (CSS), provide mass storage through a web service interface available through the Internet. The storage infrastructure includes a distributed array of geographically distributed data centers connected to a plurality of clients through a wide area network (WAN).
FIG. 1 illustrates a storage system 100 designed to provide cloud storage services. The system 100 includes a distributed array of geographically distributed data centers 110-1 to 110-M (hereinafter referred to collectively as data centers 110 or individually as a data center 110, merely for simplicity purposes) connected to a plurality of clients 120-1 to 120-N (hereinafter referred to collectively as clients 120 or individually as a client 120, merely for simplicity purposes) through a wide area network (WAN) 130.
A data center 110 typically includes servers and mass storage that facilitate cloud storage services for the clients 120. Such services enable applications including, for example, backup and restore of data, data migration, data sharing, data collaboration, and so on. Cloud storage services are accessible from anywhere in the world. To this end, each client 120 implements a web services interface designed to at least synchronize data with the data centers 110. Applications enabled by the cloud storage services are typically not aware of the specifics of the services and the underlying data synchronization operations. A disadvantage of commercially available cloud storage services is that such services do not implement standard file sharing protocols (e.g., common internet file system (CIFS) or network file system (NFS)). Furthermore, accessing files stored in the cloud storage is typically slower than accessing files stored in local storage devices.
Although not shown in FIG. 1, the storage system 100 may include a plurality of cache servers to accelerate data storage and retrieval, as well as cloud agents allowing access to files remotely stored in the data centers 110. A cloud agent may be a hardware component, a software component, or a combination thereof, which is connected to or associated with a specific workstation, server, or other computing device. For example, a workstation agent may be software installed on a personal computer to integrate this workstation with the CSS and/or cloud integrated storage devices. As another example, a mobile device agent may be an application installed on a mobile device, such as a smartphone, acting to integrate the mobile device with the cloud storage system.
The cloud storage system can be utilized to share content between users. For example, in enterprises, data can often be shared between different departments, branches, and individual users. Each such entity that can save or share files is typically assigned different permission rules. Furthermore, each user may use a different type of device (node), each of which may be, but is not limited to, a PC, a smartphone, a storage appliance, a file server, and so on. Thus, a folder stored in the cloud storage (a data center 110) can be accessed by multiple different users from different geographical locations. In addition, a user can access the cloud storage from different locations and/or different devices associated with the user.
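By way of a non-limiting illustration, the assignment of different permission rules to entities sharing a folder may be sketched as follows. The class name `SharedFolder`, the action names `read` and `write`, and the example entity names are hypothetical and chosen only for this illustration; they do not describe any particular cloud storage service.

```python
from dataclasses import dataclass, field


@dataclass
class SharedFolder:
    """Hypothetical model of a folder stored in cloud storage."""
    path: str
    # Maps an entity (user, department, or branch) to its allowed actions.
    rules: dict[str, set[str]] = field(default_factory=dict)

    def grant(self, entity: str, *actions: str) -> None:
        """Add one or more allowed actions for the given entity."""
        self.rules.setdefault(entity, set()).update(actions)

    def allows(self, entity: str, action: str) -> bool:
        """Return True only if the entity was explicitly granted the action."""
        return action in self.rules.get(entity, set())


# Two entities share the same folder under different permission rules.
folder = SharedFolder("/projects/design")
folder.grant("engineering", "read", "write")
folder.grant("branch-office", "read")
```

In this sketch, an entity without an explicit rule is denied every action, which is one possible default among several.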
An essential requirement of a cloud storage system is to synchronize data between local devices and remote storage, between different devices of the same user, and among users that share the same content. Another essential requirement is to provide sufficient data throughput for storage and retrieval of data from any device and/or geographical location accessing the system.
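As a minimal sketch of the synchronization requirement, and not a description of any specific service, a local device and the remote storage may each keep a manifest mapping file paths to content hashes, and the two manifests may be compared to decide which files to transfer. The function names `digest` and `plan_sync`, the manifest layout, and the use of SHA-256 are assumptions made only for this illustration.

```python
import hashlib


def digest(data: bytes) -> str:
    """Return a hex SHA-256 digest identifying the file content."""
    return hashlib.sha256(data).hexdigest()


def plan_sync(local: dict[str, str], remote: dict[str, str]) -> dict[str, list[str]]:
    """Compare two manifests (path -> content hash) and classify each path.

    upload:   present only on the local device
    download: present only in the remote storage
    conflict: present on both sides with different content
    """
    upload = sorted(p for p in local if p not in remote)
    download = sorted(p for p in remote if p not in local)
    conflict = sorted(p for p in local if p in remote and local[p] != remote[p])
    return {"upload": upload, "download": download, "conflict": conflict}


local_manifest = {
    "a.txt": digest(b"alpha"),
    "b.txt": digest(b"local edit"),
    "c.txt": digest(b"new file"),
}
remote_manifest = {
    "a.txt": digest(b"alpha"),
    "b.txt": digest(b"remote edit"),
    "d.txt": digest(b"remote only"),
}
plan = plan_sync(local_manifest, remote_manifest)
```

Comparing hashes rather than file contents keeps the amount of data exchanged small, which also bears on the throughput requirement; how conflicts are resolved is left open in this sketch.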