The demand for data storage has been escalating rapidly: as the amount of data, such as digital media, stored by users grows, so does their need to store that data reliably over extended periods of time. Traditional backup solutions periodically copy data to, for example, backup tapes, compact discs (CDs), or other local storage media. However, such solutions are not optimal because the backup media are stored in a single location, and the media used for backup have typically been prone to failure.
Commercially available services referred to as cloud storage services (CSS) provide mass storage through a web service interface available over the Internet. The storage infrastructure includes an array of geographically distributed data centers connected to a plurality of clients through a wide area network (WAN). A data center typically consists of servers and mass storage that provide the cloud storage services to the clients. Such services enable applications including, for example, backup and restore of data, data synchronization, file sharing, and so on.
Cloud storage services are accessible to users from anywhere in the world, through a client implementing a web services interface designed to at least synchronize data with the data centers. However, such web services fail to provide standard file sharing protocols (e.g., common internet file system (CIFS) or network file system (NFS)). In addition, accessing files stored in the cloud storage from a local area network (LAN) is typically many times slower than accessing files that are simply stored in local storage devices on the same LAN.
FIG. 1 shows an exemplary diagram illustrating an infrastructure of a cloud storage service (CSS) 100 according to existing solutions. The CSS 100 includes a metadata database (MDB) 110, application servers 120, an object storage system 130, a client 140, and a scanner 150. The client 140 accessing the CSS 100 communicates with one or more of the application servers 120. The client 140 may be a storage appliance that provides access to the cloud storage service and enables storing locally saved data in the cloud storage service.
An object storage system 130 is a system of a cloud storage provider. The object storage system 130 includes a plurality of object storage devices. An object storage device (OSD) is a computer storage device that organizes data into flexible-sized data containers, called objects, instead of providing a block-oriented interface that merely reads and writes fixed-sized blocks of data. Each object saved in the object storage system 130 is identified by an object identifier (OID), which typically is then used to retrieve data from the system 130. Although not illustrated in FIG. 1, a plurality of object storage systems 130 may be included in the CSS 100, each of which belongs to a different storage provider, and which may or may not be co-located with the MDB 110. Furthermore, the CSS 100 may include other non-object storage systems such as file servers.
When a file is saved in the CSS 100, it is typically split into a number of data blocks, which may be of fixed or of variable size. A filemap is saved as an object of the object storage system 130. The filemap includes a list of block codes needed for later reconstruction of a split file. The data blocks are saved as objects (either one block per object, or multiple blocks per object) in the object storage system 130, while metadata of each block is kept in the MDB 110. The metadata may include a block size, a reference count, a block code, and an object identifier (OID). The OID is the block's location in the system 130, while the block code is derived from the block contents by means of a one-way hash function. A reference count is a parameter that maintains the number of filemaps that reference the data block. Each data block has its own reference count value saved in the MDB 110. Therefore, maintaining a correct MDB 110 is required for data persistency and to avoid data corruption.
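The storage layout described above can be illustrated with a minimal sketch. It is not the implementation of any particular CSS: the fixed block size, the choice of SHA-256 as the one-way hash function, the OID naming scheme, and the names `BlockMeta`, `save_file`, and `load_file` are all assumptions introduced for illustration, and plain dictionaries stand in for the MDB 110 and the object storage system 130.

```python
import hashlib
from dataclasses import dataclass

BLOCK_SIZE = 4  # hypothetical fixed block size in bytes, kept tiny for illustration


@dataclass
class BlockMeta:
    """Per-block metadata kept in the MDB (field names are assumptions)."""
    size: int
    ref_count: int
    oid: str


def block_code(block: bytes) -> str:
    # Assumption: SHA-256 serves as the one-way hash; the text names no function.
    return hashlib.sha256(block).hexdigest()


def save_file(content: bytes, mdb: dict, object_store: dict) -> list:
    """Split content into blocks, store unseen blocks, and return the filemap."""
    filemap = []
    for offset in range(0, len(content), BLOCK_SIZE):
        block = content[offset:offset + BLOCK_SIZE]
        code = block_code(block)
        if code not in mdb:
            oid = "obj-" + code[:12]  # hypothetical OID scheme
            object_store[oid] = block  # one block per object
            mdb[code] = BlockMeta(size=len(block), ref_count=0, oid=oid)
        filemap.append(code)
    # The reference count tracks how many filemaps reference a block,
    # so each distinct block is counted once per filemap.
    for code in set(filemap):
        mdb[code].ref_count += 1
    return filemap


def load_file(filemap: list, mdb: dict, object_store: dict) -> bytes:
    """Reconstruct a file by resolving each block code to its OID."""
    return b"".join(object_store[mdb[code].oid] for code in filemap)
```

In this sketch the filemap is simply returned to the caller, whereas in the arrangement described above it would itself be saved as an object. Reconstruction depends entirely on the MDB entries resolving to objects that actually exist, which is why a correct MDB matters.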
However, in certain instances, the MDB 110 may maintain incorrect information and pointers to data blocks stored in the storage system 130. That is, in such instances, the MDB 110 and the system 130 are out of synchronization. This may occur when, for example, the MDB 110 is recovered from a backup to an earlier version, when one of the object storage devices in the storage system 130 is restored from a backup to an earlier version, or when an object loss occurs in the system 130 due to a technical malfunction.
An MDB 110 that is out of synchronization may cause several problems including, for example, broken references, orphan objects, and incorrect reference counts. Broken references are blocks that are designated in the metadata contents saved in the MDB 110, but that do not exist in the system 130. Broken references cause data corruption. For example, a block A with a broken reference will be reported to the client 140 as if it is already saved in the CSS 100. Therefore, the client 140, when uploading a file which should contain block A, would in reality upload the file to the CSS 100 without block A. Thus, the new file would be stored with a missing block (block A) and yet the write operation would still be considered successful, thereby causing a silent corruption of data.
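The silent corruption scenario for block A can be reproduced in a few lines. The sketch below is an illustration only: the `upload` and `read` functions, the use of SHA-256, and the OID format are hypothetical, and dictionaries again stand in for the MDB 110 and the system 130.

```python
import hashlib


def block_code(block: bytes) -> str:
    # Assumption: SHA-256 as the one-way hash function.
    return hashlib.sha256(block).hexdigest()


def upload(blocks: list, mdb: dict, object_store: dict) -> list:
    """Deduplicating upload: only blocks the MDB does not list are sent."""
    filemap = []
    for block in blocks:
        code = block_code(block)
        if code not in mdb:
            oid = "obj-" + code[:12]  # hypothetical OID scheme
            object_store[oid] = block
            mdb[code] = oid
        # If the MDB already lists the code, the block is assumed stored
        # and is skipped, even when the referenced object is actually gone.
        filemap.append(code)
    return filemap


def read(filemap: list, mdb: dict, object_store: dict) -> list:
    """Return each block of a file, or None where the object is missing."""
    return [object_store.get(mdb[code]) for code in filemap]


# Broken reference: the MDB lists block A, but its object was lost.
block_a, block_b = b"block A", b"block B"
mdb = {block_code(block_a): "obj-lost"}
object_store = {}

filemap = upload([block_a, block_b], mdb, object_store)  # completes without error
restored = read(filemap, mdb, object_store)              # block A comes back as None
```

The upload completes without raising any error, yet only block B reaches the object store, so the loss surfaces only when the file is later read.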
An orphan object is an object holding a data block that is saved in the system 130 but has no corresponding metadata and/or pointers in the MDB 110. Orphan objects result in a waste of storage space, as such objects cannot be accessed by the client 140. Likewise, an incorrect reference count value, which may result from a broken reference, causes a resource leak, as data blocks cannot be deleted from the storage system 130.
In summary, an MDB 110 that is out of synchronization can significantly degrade the performance of the CSS 100 and cause data corruption as well as waste of storage resources. Therefore, a critical task in the CSS 100 is to re-synchronize (or reconcile) the MDB 110 with the object storage system 130.
A prior art solution for reconciling an MDB is to completely list the contents of an object storage device in the system 130 and compare the listing with the contents of the MDB 110 to identify broken references, orphan objects, and incorrect reference counts. The MDB 110 must be taken off-line until the scanning is completed; otherwise, new silent data corruption may occur as described above. That is, while the scan is running, data blocks can be neither saved to nor retrieved from the CSS 100. This process usually requires a prolonged time (e.g., hours or days) until completion, thereby causing a lengthy service disruption. The reconciliation of the MDB 110, as performed by prior art techniques, is carried out by the scanner 150. The scanner 150 is communicatively connected to the object storage system 130 and the MDB 110.
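The full-listing comparison described above can be sketched as follows. This is a simplified illustration and not the scanner of any actual system: the function name `full_scan`, the dictionary layout of the MDB entries, and the assumption that the listing contains only block objects (filemap objects are passed in separately, already decoded) are all hypothetical.

```python
from collections import Counter


def full_scan(mdb: dict, listed_oids: set, filemaps: list):
    """Compare a complete object listing against the MDB.

    mdb         -- block code -> {"oid": str, "ref_count": int}
    listed_oids -- OIDs of all block objects found by the complete listing
    filemaps    -- list of filemaps, each a list of block codes
    Assumes the MDB is off-line, so none of the inputs change during the scan.
    """
    # Broken references: metadata points at an object that is not stored.
    broken = {code for code, meta in mdb.items()
              if meta["oid"] not in listed_oids}

    # Orphan objects: stored objects that no metadata entry points at.
    referenced = {meta["oid"] for meta in mdb.values()}
    orphans = listed_oids - referenced

    # Recount references: one count per filemap that references the block.
    actual = Counter()
    for filemap in filemaps:
        actual.update(set(filemap))
    wrong_counts = {code: actual[code] for code, meta in mdb.items()
                    if meta["ref_count"] != actual[code]}

    return broken, orphans, wrong_counts
```

Every set built here requires walking the complete listing and the complete MDB, which is consistent with the hours or days such a scan is said to take when the MDB must remain off-line throughout.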
It would therefore be advantageous to provide an efficient solution for reconciling the MDB that does not require stalling the operation of the CSS.