This invention relates to data-storage systems, and in particular, to reducing the number of orphan tracks on a disk.
In a typical data-storage system, the process of retrieving data from a disk in response to a user's request includes several distinct steps. First, the data-storage system identifies the location of the record containing the data sought by the user. Then, the data-storage system sets aside a portion of memory to receive the data. Only then does the data-storage system actually fetch the data itself.
In order to locate the data and set aside a portion of memory to receive it, the data-storage system must know the nature of the data sought by the user without actually having read the data itself. This is achieved by associating with each record a set of meta-data that describes certain characteristics of that record. This meta-data is typically stored on the disk, as part of the data record that it describes. Upon receiving a request from the user, the data-storage system reads this meta-data before reading the actual data.
Because the meta-data is stored on the disk, a request for data requires two separate disk accesses: a first disk access to read the meta-data, followed shortly by a second disk access to read the data itself. Because it requires movement of mechanical parts, each disk access operation introduces considerable delay. It is therefore undesirable to access the disk more often than necessary.
One method for reducing the delay is to eliminate the first read access by storing the meta-data in memory. In modern data-storage systems, however, the number of records has become so large that it is no longer practical to maintain copies of all meta-data in memory. Doing so would leave little or no memory available for users.
Commonly owned U.S. Pat. No. 6,330,655, entitled "Digital Data Storage Subsystem Including Directory for Efficiently Providing Formatting Information for Stored Records," teaches a data-storage system in which records are organized into tracks on a disk. To the extent that the records in a track share a common record format, the meta-data for all the records in that track can be represented in a compressed form. This compressed meta-data can then be maintained in memory without severely diminishing the amount of memory available to users. Although the compressed meta-data must still be decompressed before it is available for use, decompression, which is carried out entirely in memory, is much faster than retrieval of meta-data directly from a disk.
The foregoing patent defines a finite number of format families. Each format family represents a pattern of records on a track. If the records in a track are laid out in a pattern that corresponds to one of these format families, then the meta-data for all records on that track can be reconstructed by storing only the format family for that track and a few additional parameters that vary from one format family to the next.
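The idea of reconstructing per-record meta-data from a format family and a few parameters can be illustrated with a minimal sketch. Everything named here is hypothetical and is not taken from the referenced patent: the `UNIFORM` family, the `RecordMeta` fields, and the assumption that records are numbered consecutively from zero are chosen only to make the example concrete.

```python
from typing import List, NamedTuple, Optional


class RecordMeta(NamedTuple):
    """Hypothetical per-record meta-data: record number and data length."""
    record_number: int
    data_length: int


class CompressedTrack(NamedTuple):
    """Hypothetical compressed form: a family name plus its parameters."""
    family: str
    params: tuple


def compress_uniform(meta: List[RecordMeta]) -> Optional[CompressedTrack]:
    """Return a compressed form if the track fits the assumed UNIFORM family.

    The assumed family requires records numbered 0..n-1, all with the same
    data length. Returns None when the track does not fit.
    """
    if not meta:
        return None
    length = meta[0].data_length
    for i, m in enumerate(meta):
        if m.record_number != i or m.data_length != length:
            return None
    return CompressedTrack("UNIFORM", (len(meta), length))


def decompress(track: CompressedTrack) -> List[RecordMeta]:
    """Rebuild the full meta-data list from the compressed form, in memory."""
    if track.family == "UNIFORM":
        count, length = track.params
        return [RecordMeta(i, length) for i in range(count)]
    raise ValueError(f"unknown format family: {track.family}")
```

In this sketch a track of any number of equal-length records is held in memory as a family name and two integers, and a track that does not fit the pattern yields `None`, which corresponds to the case discussed next.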
In practice, most tracks on a disk have records that are laid out in a pattern corresponding to one of the several defined format families. Nevertheless, there exist tracks in which some or all of the records are laid out in a pattern that does not correspond to any known format family. Such tracks are referred to as "orphan tracks."
In an orphan track, the meta-data for at least some of the records on the track is not readily compressible. As a result, the meta-data for those records must remain on the disk. Access to those records on the orphan track is thus hampered by the need to perform two separate disk accesses: one to retrieve the meta-data and then another to retrieve the actual data requested by the user. For this reason, it is desirable to reduce the number of orphan tracks and the extent to which those orphan tracks include records whose meta-data patterns cannot be represented in compressed form.
The invention provides for periodically attempting to compress meta-data for records on an orphan track as new format families are added. If the compression is successful, the compressed representation of the meta-data for that track is maintained in memory. As a result, the orphan track loses its orphancy status and all records on that track become as easily accessible as the records on any other track for which the compressed meta-data is fully available in memory.
The invention includes a method for reducing latency associated with accessing a desired record from an orphan track by specifying a compressed representation for a first meta-data pattern. This first meta-data pattern corresponds to a newly-added format family. A second meta-data pattern, which is associated with the orphan track, is then inspected to see if it is consistent with the first meta-data pattern. If it is, then the orphan track belongs to the newly-added format family. A compressed representation of the second meta-data pattern is then generated and maintained in memory.
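The re-examination of orphan tracks against a newly-added format family might be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: meta-data is reduced to a list of record lengths, the `orphans` dictionary stands in for meta-data that would actually be read from disk, and the `alternating_family` matcher and `adopt_orphans` helper are hypothetical names.

```python
from typing import Callable, Dict, List, Optional, Tuple

Meta = List[int]  # assumption: one data length per record on the track
Matcher = Callable[[Meta], Optional[tuple]]


def alternating_family(meta: Meta) -> Optional[tuple]:
    """Hypothetical new family: record lengths alternate between two values.

    Returns the family parameters (count, first, second) on a match,
    or None if the track's meta-data is inconsistent with the pattern.
    """
    if len(meta) < 2 or meta[0] == meta[1]:
        return None
    first, second = meta[0], meta[1]
    for i, length in enumerate(meta):
        if length != (first if i % 2 == 0 else second):
            return None
    return (len(meta), first, second)


def adopt_orphans(
    family_name: str,
    matcher: Matcher,
    orphans: Dict[int, Meta],
    directory: Dict[int, Tuple[str, tuple]],
) -> List[int]:
    """Test every orphan track against a newly-added family.

    Tracks that match get a compressed entry in the in-memory directory
    and are removed from the orphan set. Returns the adopted track ids.
    """
    adopted = []
    for track_id in list(orphans):
        params = matcher(orphans[track_id])
        if params is not None:
            directory[track_id] = (family_name, params)
            del orphans[track_id]
            adopted.append(track_id)
    return adopted
```

For example, with `orphans = {7: [80, 4096, 80, 4096], 9: [80, 100, 300]}`, calling `adopt_orphans("ALTERNATING", alternating_family, orphans, {})` adopts track 7 and leaves track 9 as an orphan.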
When the orphan track is found not to belong to the newly-added format family as defined by the first meta-data pattern, the invention optionally includes saving the meta-data associated with the orphan track in an orphan table in memory. The meta-data saved in the orphan table is then periodically analyzed in an effort to identify new patterns of meta-data formats susceptible to compression.
Analysis of the meta-data collected in the orphan table can take place locally. However, such analysis is more likely to be fruitful when orphan tables from a plurality of data-storage systems are transmitted to a data-analysis node for analysis.
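One simple way a data-analysis node could look for recurring patterns across pooled orphan tables is to count identical meta-data sequences. The sketch below is only an assumption about what such analysis might involve: the table layout (track id mapped to a list of record lengths), the `candidate_patterns` name, and the `min_count` threshold are all hypothetical, and exact-match counting is a deliberately crude stand-in for real pattern discovery.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

# Assumption: an orphan table maps a track id to that track's record lengths.
OrphanTable = Dict[int, List[int]]


def candidate_patterns(
    tables: Iterable[OrphanTable], min_count: int = 2
) -> List[Tuple[Tuple[int, ...], int]]:
    """Pool orphan tables and report meta-data patterns that recur.

    A pattern seen on at least min_count orphan tracks, across all
    contributing systems, is returned as a candidate for a new format
    family, most frequent first.
    """
    counts: Counter = Counter()
    for table in tables:
        for meta in table.values():
            counts[tuple(meta)] += 1
    return [(p, n) for p, n in counts.most_common() if n >= min_count]
```

Pooling helps because a pattern that appears only once on each system, and so looks like noise locally, can still cross the threshold when tables from several systems are combined.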
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although methods similar or equivalent to those described herein can be used in the practice or testing of the present invention, suitable methods are described below. All patent applications and patents mentioned herein are incorporated by reference in their entirety. In case of conflict, the present specification, including definitions, will control. In addition, the examples described herein are illustrative only and not intended to be limiting.
Other features and advantages of the invention will be apparent from the following detailed description and the accompanying figures, in which: