1. Field of the Invention
The present invention relates to distributed storage of data in a secure and fault-tolerant manner that enables recovery of such data from distributed data nodes using fault-tolerant recovery techniques.
2. Description of the Related Art
Data storage and retrieval technology requires the availability of data in a timely manner. Basic data storage techniques involve generating a copy of the original data as a backup: such backup systems include simultaneous copying to two or more storage locations (e.g., simultaneous copying to two hard drives), and archival of data. Data archival has progressed from tape backup systems, to backups using compact disc (CD-R) technology, etc.
Such storage and retrieval techniques are substantially inefficient in terms of processing requirements, disk space, and time constraints. For example, distributed storage systems typically maintain duplicate copies of data on multiple, disparate machines to avoid failures in the case that one or more nodes fail. The distribution of duplicate copies, also referred to as r-replication, copies the data, in whole, among R separate storage devices in a system. In case of a failure, any one of the remaining nodes may service a request for data.
The use of r-replication may be effective for closed storage services, such as servers having a Redundant Array of Inexpensive Disks (RAID), also referred to as RAID servers, or corporate mirroring servers. However, r-replication cannot be implemented efficiently in ad hoc or unreliable networks such as the Internet because each replication substantially increases the total storage requirements of the data; hence, typical implementations of r-replication use a minimal number of copies (e.g., a mirrored RAID 1 system typically uses only two copies (R=2)).
In particular, use of an r-replication system is extremely inefficient if a given storage device is available on average only fifty percent of the time: if two storage nodes each have a fifty percent availability, then the data is unavailable whenever both nodes are unavailable (twenty-five percent of the time, assuming the nodes fail independently), such that the aggregate guaranteed data availability is limited to seventy-five percent for two copies (R=2). By extension, in order to guarantee ninety-five (95) percent availability, five copies (R=5) of the data would be required, effectively limiting the storage capacity of a system to twenty percent of its total capacity. Further, the necessity of multiple read requests ordered sequentially to the duplicate storage devices substantially reduces the throughput of the system, especially each time a read request fails.
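The availability figures above can be reproduced with a short calculation. The following is a minimal sketch, assuming that each storage node is available independently with the same probability; the function names are illustrative and do not appear in the original text.

```python
def aggregate_availability(p: float, r: int) -> float:
    """Probability that at least one of r replicas is reachable.

    Assumes (for illustration only) that every node is available
    independently with the same probability p.
    """
    if not 0.0 < p <= 1.0 or r < 1:
        raise ValueError("need 0 < p <= 1 and r >= 1")
    # The data is unavailable only when all r replicas are unavailable.
    return 1.0 - (1.0 - p) ** r


def min_replicas(p: float, target: float) -> int:
    """Smallest number of copies R whose aggregate availability meets target."""
    if not 0.0 <= target < 1.0:
        raise ValueError("need 0 <= target < 1")
    r = 1
    while aggregate_availability(p, r) < target:
        r += 1
    return r


def usable_fraction(r: int) -> float:
    """Fraction of raw capacity left for unique data when every item is stored r times."""
    return 1.0 / r


if __name__ == "__main__":
    print(aggregate_availability(0.5, 2))       # two copies at fifty percent each
    r = min_replicas(0.5, 0.95)                 # copies needed for 95 percent
    print(r, usable_fraction(r))
```

Under these assumptions, four copies yield only 93.75 percent availability, which is why a fifth copy is needed to reach the ninety-five percent target.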
Another problem in using r-replication involves rogue nodes that maliciously or inadvertently return incorrect data to a requesting source (e.g., due to read/write errors or transmit/receive errors). Security against rogue nodes requires additional redundancy within the system, requiring an even higher number of duplicate copies to be added to the system.
Other problems associated with data storage involve large scale recovery of data, for example due to a disaster recovery scenario. Typical systems that rely on a centralized data store run the risk of complete data loss in the event the data storage is damaged or destroyed. Hence, conventional redundancy-based replication systems may be ineffective in the case that all the data stores are located within a specific geographic area having encountered a disaster (e.g., fire, etc.).
Still another problem associated with data storage involves the relative portability of data and the ability of users to access the data from different locations. One example involves an enterprise system having multiple offices, where a user moves from one office to another office. Conventional systems require a complete reconfiguration of the user's portable computer before access to any data store (e.g., e-mail) is possible.
The foregoing illustrates the difficulties encountered in providing an effective data recovery system in a disaster recovery scenario. Existing techniques for recovery of data stored on large-scale data servers have required deployment of complex backup storage techniques, including tape drive backup systems. Tape drive backup systems, however, often require that the server be taken “off-line” in order to back up the stored data. Moreover, the tape medium used to back up the data usually is stored at the premises alongside the server; hence, if a data center encountered a disaster similar in scale to the World Trade Center attack or a natural disaster such as an earthquake or hurricane, both the data servers and the tape backup would be lost.
Even in cases where the tape medium is stored at a secure location that survives the disaster, data recovery is still a substantial effort: new data servers and tape drive recovery systems must be purchased and installed at a new site, the tape medium must be recovered from its secure location and loaded into the tape drive recovery system such that the backup data can be loaded from the tape medium onto the new data server. As apparent from the foregoing, however, such a system suffers from the disadvantage that substantial delays still may be encountered in establishing the new data server, even extending to days or weeks depending on the availability and acquisition of the new site, the tape medium, the new data server, and personnel to deploy the new data server.
In addition to the inevitable delay encountered in establishing the new data server, the most obvious problem encountered by users logging onto the new data server is that much of the most critical data for the users will either be unavailable or dated based on the last backup onto the recovered tape medium. Consequently, users will have lost their most recent data unless they have made their own backups onto their personal computers.
Storage of data on personal computers (e.g., laptop computers) also may not provide an acceptable solution in disaster data recovery due to the inherent inability of existing computers to automatically merge file systems. In particular, storage of data files on a local computer limits the availability of the data files to other users; in contrast, storage of the data files on a data server limits the availability of the data files when the user is disconnected from the network. Further, existing technologies would require users to manually copy the data files stored locally onto the new data server, which may result in errors or incomplete copies; moreover, problems arise if different users accidentally overwrite newer files with older versions of data files.
FIG. 1 is a diagram illustrating a conventional (prior art) system 10 in a computer, for example a personal computer or a laptop computer. The prior art system 10 includes a file system 12, a file allocation table (FAT) 14, a network file system (NFS) module 16, a Server Message Block (SMB) driver 18 and a File Sharing module 20. As described below, the file system 12 is configured to access one of the modules 14, 16, 18, 20 for a file requested by an application process 22 based on the corresponding name of the file; hence, the file system 12, upon attempting to open a file, already is able to identify which of the modules 14, 16, 18, or 20 to use to open the file based on the file name. Consequently, external nodes have a different view of the data compared to local applications 22 accessing data via the file system.
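The name-based selection described for the prior art system 10 can be sketched as follows. This is an illustration only: the text states that the module is identified from the file name but does not say how, so the prefix rules below (a leading double backslash for SMB, "/nfs/" for NFS, "/shared/" for File Sharing) are hypothetical, as are the names PREFIX_RULES and select_module.

```python
# Hypothetical prefix rules; the source does not specify the naming scheme.
PREFIX_RULES = [
    ("\\\\", "SMB driver 18"),               # e.g. \\server\share\file (assumed)
    ("/nfs/", "NFS module 16"),              # assumed mount-style prefix
    ("/shared/", "File Sharing module 20"),  # assumed prefix
]

# Anything that matches no rule is treated as a local file (assumption).
DEFAULT_MODULE = "FAT 14"


def select_module(file_name: str) -> str:
    """Return the module the file system 12 would use, decided by name alone."""
    for prefix, module in PREFIX_RULES:
        if file_name.startswith(prefix):
            return module
    return DEFAULT_MODULE


if __name__ == "__main__":
    for name in ("\\\\server\\share\\a.txt", "/nfs/home/a.txt", "C:/docs/a.txt"):
        print(name, "->", select_module(name))
```

Because the choice is fixed by the name before the file is opened, a file reachable through one module is not visible through another, which is consistent with the differing views of the data noted above.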
The SMB driver 18 utilizes the SMB protocol for sharing files, printers, serial ports, and communications abstractions between computers. SMB is a client-server, request-response protocol. Note that if the SMB driver 18 were to employ caching, the SMB driver 18 would implement redirect objects 24 to send requests back to the file system 12, enabling the SMB driver 18 to reach the FAT 14 via the file system 12. The NFS protocol used by the NFS module 16 is a network file system protocol initially established according to RFC 1094.
The File Sharing module 20 is a Microsoft Windows service that enables access of remote files or directories on remote nodes. As new nodes are added to the network, the new nodes may appear as additional network elements; however, there is no means for automatically enabling a contribution of files from those newly added nodes into a collective organization of files.
Hence, there is no ability in the prior art for a piecemeal restoration based on incremental adding of clients to a network.