1. Field of the Invention
The present invention relates to methods of managing, validating, retrieving, and reconstructing digital data in a piecewise manner.
2. Background Art
Digital data storage management includes provisions for managing user data units originally received from a using system, validating each user data unit whenever it is retrieved, and maintaining a required level of data reliability. Such storage management is currently addressed via a number of mechanisms. These mechanisms include providing metadata useful for identifying the location of the original user data unit, verifying the correctness of the original data as it is retrieved, and providing additional data (i.e., redundant data) that can be used to recover (i.e., correct or recreate) any parts of the original data found to be missing or incorrect (either through outright loss or through damage of some kind). The metadata is generally managed separately from the data, but the data and the redundant data are most often managed via some version of Redundant Array of Independent/Inexpensive Disks (“RAID”) structures. Such RAID structures include RAID1 (mirroring), RAID3 or RAID5 (parity), or multiple-redundancy schemes, such as Reed-Solomon coding, placed into one of these RAID structures. In each case, the intent is to add metadata and some additional data (thus the term redundancy) to the storage system and to manage the additional data in such a way that loss of or damage to any part of the original user data is extremely unlikely to also result in loss of or damage to the redundant data. The redundant data is therefore available to reconstruct original user data that has been lost or damaged. The primary problem with these methodologies is a cost and performance tradeoff that users must accept. The tradeoff is measured in terms of both the granularity of the recovery options and the cost of the processes involved in the recovery of data. The granularity of recovery relates to the notion that data is received and managed in some blocked format.
For example, a user data unit is a set of data known at the user level outside the storage subsystem (e.g., a dataset or a data file) and communicated to the storage subsystem by an agreed-upon name. The user data unit has boundaries that are managed in the using system rather than the storage system. However, such a user data unit is received from the using system one small piece (e.g., one record or one 512-byte segment) at a time. The usual redundancy process is to create the metadata and the redundancy data for the overall envelope of the user data unit received and to associate them with the agreed-upon name. Therefore, it is necessary not only to have significant redundant data (e.g., in the case of mirroring, which is explained in more detail below, redundancy includes whole copies of files), but also to manage retrieval on the basis of utilizing this redundant data in a whole data unit context, because the metadata is also managed in that context.
Mirroring is the simplest process for providing redundant data, and it requires the simplest metadata, since that metadata is simply the location of an additional copy (or copies) of the data. Mirroring provides the highest performance option when the redundant data is placed in the same level of the storage hierarchy as the initial data, but it is the most expensive in terms of the capacity used and the network traffic needed to accomplish the writing, since the data must be sent to two different locations for storage. When the mirror data is placed in a lower level of the storage hierarchy (e.g., backup data placed on tape), the cost is reduced but the access time is increased. The other RAID options are less expensive than mirroring with respect to capacity utilization and network traffic for writing data, until a recovery operation is required during data retrieval. At the time of retrieval, if reconstruction is required and mirroring has been used, the retrieval is simply redirected to the alternative copy of the data. However, with the parity or multiple redundancy options of RAID3 or RAID5, a large amount of data must be accessed and provided to a reconstruction process. This results in a response time to access the data that is slower than for mirroring.
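The difference in recovery cost between mirroring and parity described above can be illustrated with a minimal sketch. It assumes a single-parity stripe computed by byte-wise XOR, equal-length blocks, and at most one missing block per stripe; the function names and the "blocks read" count are hypothetical and serve only to make the comparison concrete.

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks (single-parity assumption)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def read_mirrored(primary, mirror):
    """Return (data, blocks_read). A lost primary is simply redirected to the mirror."""
    if primary is not None:
        return primary, 1
    return mirror, 1

def read_parity(stripe, parity, index):
    """Return (data, blocks_read) for stripe[index].

    If the block is missing (None), it is rebuilt by XOR-ing every
    surviving data block in the stripe with the parity block.
    Assumes no other block in the stripe is missing.
    """
    if stripe[index] is not None:
        return stripe[index], 1
    survivors = [block for i, block in enumerate(stripe) if i != index]
    return xor_blocks(survivors + [parity]), len(survivors) + 1

# Hypothetical three-block stripe with one lost block.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)
damaged = [data[0], None, data[2]]
rebuilt, blocks_read = read_parity(damaged, parity, 1)
```

In this sketch the mirrored read touches one block whether or not the primary is lost, while the parity rebuild must read every surviving block in the stripe plus the parity block, which is the source of the slower response time noted above.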
One mechanism for determining whether a given unit of data is damaged and needs to be reconstructed is to evaluate digital signatures and/or hashes, which are metadata created and associated with the data as it is being stored. The failure of a given set of data to exhibit the correct digital signature or hash, when compared to the digital signature or hash generated at storage time, indicates that the data must be regenerated from redundant data. Again, such reconstruction typically requires that a large amount of data be accessed.
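The hash comparison described above might be sketched as follows. The choice of SHA-256 is an assumption (no particular hash or signature scheme is named here), and the record layout and function names are hypothetical.

```python
import hashlib

def store(data: bytes) -> dict:
    """Keep the data together with a hash computed at storage time."""
    return {"data": data, "digest": hashlib.sha256(data).hexdigest()}

def needs_reconstruction(record: dict) -> bool:
    """True when the retrieved data no longer matches its stored hash."""
    return hashlib.sha256(record["data"]).hexdigest() != record["digest"]

# Hypothetical usage: an intact record passes, a damaged one is flagged.
record = store(b"user data unit")
intact = needs_reconstruction(record)
record["data"] = b"user data unxt"  # simulate damage to one byte
damaged = needs_reconstruction(record)
```

Because the hash in this sketch covers the whole stored unit, a mismatch indicates only that something in the unit is wrong, not which part, so the whole unit would have to be regenerated from redundant data.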
Accordingly, there exists a need in the prior art for improved methods of managing and reconstructing data.