It is known in the art to use cloud data storage technologies to store data at remote, abstract locations, and to subsequently retrieve the data, using communication devices connected to a network. To store data, clients typically connect to a cloud storage provider with the communication device, through an application, plugin, API, Web browser, or the like. The cloud storage provider stores the data (such as regular filesystem files (e.g. NTFS, EXT3/4, or the like), database records, objects, or the like) on its own servers or datacenters, using known storage technologies (such as mirroring, SAN, RAID or the like) to provide resiliency. When using cloud data storage technologies, storage and retrieval of the data are fully handled by the cloud storage provider, such that users need not be concerned with the material and security aspects of data storage, such as, for example and without being limitative, the hardware equipment, physical location, precise security measures or other requirements of the servers or datacenters where the data is physically stored.
Known cloud data storage technologies, however, tend to suffer from several drawbacks. For example, given that client data is stored directly on the infrastructure of the cloud storage provider, the cloud storage provider commonly has access to the data. Hence, clients must trust the security practices, technology and personnel of the cloud storage provider with regard to the confidentiality of the stored data. In some cases, it is possible to use encryption to improve security, but even then, the cloud storage provider is generally required to access at least some information. For example, in some instances, the cloud storage provider is required to have access to the encryption key to encrypt and decrypt the data. Even in cases where the cloud storage provider theoretically never has access to the encryption/decryption keys protecting the client data, the provider is still required to have access to some unencrypted metadata (e.g. file names, directory structures, modification dates, etc.) in order to locate and access the data upon request. Such metadata can include important privileged information (e.g. client list, nature of a contract, projects being worked on, etc.). Therefore, even when encryption is used, the client must have a certain level of trust in the security practices of the cloud storage provider.
Moreover, given the particularities of each cloud data storage provider, it is not convenient for clients to switch from one cloud data storage provider to another or to distribute their data over multiple cloud data storage providers. For example, in order to distribute data over multiple cloud data storage providers, a client is generally required to manually partition the data along some known boundaries (e.g. subdirectories, database key ranges, etc.) and assign specific portions to individual providers. Moreover, if a client subsequently wants to switch between cloud data storage providers (for the entirety or a portion of the data), the client is generally required to manually copy the corresponding data from one provider to the other and ensure that all users accessing that data now use the new provider. During such a copy of the data, updates must be suspended or directed to only one of the two providers to ensure integrity. Moving data from one provider to another can also cause failures and loss of metadata, as a result of the varying services and restrictions of each specific cloud data storage provider.
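By way of illustration only, and without being limitative, the manual partitioning along subdirectory boundaries described above can be sketched as follows. The provider names, directory names and function names used in this sketch are hypothetical and do not correspond to the interface of any particular cloud data storage provider.

```python
# Hypothetical, hand-maintained table assigning each top-level
# subdirectory to one cloud data storage provider.
ASSIGNMENTS = {
    "contracts": "provider_a",
    "projects": "provider_b",
}

# Hypothetical fallback for data outside any listed boundary.
DEFAULT_PROVIDER = "provider_a"


def provider_for(path):
    """Return the provider manually assigned to the path's top-level directory."""
    top_level = path.strip("/").split("/", 1)[0]
    return ASSIGNMENTS.get(top_level, DEFAULT_PROVIDER)


def partition(paths):
    """Group paths by the provider to which the client has assigned them."""
    plan = {}
    for path in paths:
        plan.setdefault(provider_for(path), []).append(path)
    return plan
```

In such an arrangement, switching a portion of the data to another provider would require the client to edit the table by hand, copy the corresponding data, and redirect every user accessing that data, which corresponds to the burden discussed above.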
One skilled in the art will understand that it is possible to use aggregators (i.e. agents which operate between the client and the cloud data storage providers) to alleviate some of the above-mentioned shortcomings. However, the use of aggregators once again raises confidentiality issues, given that the aggregators generally require access to the data and therefore need to be trusted on the same level as a cloud storage provider would be. In addition, in such a case, performance of the cloud-based storage services depends on the aggregator's services, and any service failure or performance problem of the aggregator will result in issues for all of the cloud-based storage services.
It will be understood that many of the above-mentioned issues also apply to local data storage or network data storage. For example, when a user wishes to switch from one personal hard disk to another, the user is required to copy all the data from one disk to the other while no updates are being performed. If multiple disks are used, the user is required to manually decide which files or directories should be stored on which disk. In addition, filesystem support, which often varies between platforms, can cause failures or result in a loss of metadata when moving or copying data between different filesystems.
In view of the above, there is a need for an improved system and method for structuring and storing data in a distributed environment, which would be able to overcome or at least minimize some of the above-discussed prior art concerns.