This invention relates to data storage on disk drives and, more particularly, to a method and apparatus for retrieving data records stored on a storage medium utilizing a data record locator index stored in memory.
Large disk storage systems like the 3380 and 3390 direct access storage device (DASD) systems employed with many IBM mainframe computer systems are implemented utilizing many disk drives. These disk drives are specially made to implement a count, key, and data (CKD) record format on the disk drives. Disk drives utilizing the CKD format have a special “address mark” on each track signifying the beginning of a record on the track. After the address mark comes the three-part record, beginning with the “COUNT”, which serves as the record ID and also indicates the lengths of both the optional key and the data portions of the record, followed by the optional “KEY” portion, which in turn is followed by the “DATA” portion of the record.
Although this format gives the system and user some flexibility and freedom in the usage of the disk drive, this flexibility forces the user to use more complicated computer programs for handling and searching data on the disk. Since the disk drive track has no physical position indicator, the disk drive controller cannot tell which data is positioned under the read/write head at any given instant in time. Thus, before data can be read from or written to the disk drive, a search for the record must be performed by sequentially reading the record IDs contained in the count fields of the records on a track until a match is found. Even if cache memory is used, all the records to be searched must first be read into the cache before being searched. Since searching for the record takes much longer than the actual data transfer, the disk storage system spends a tremendous amount of time searching for data, which drastically reduces system performance.
Disk drives employing what is known as a Fixed Block Architecture (FBA) are widely available in small, high capacity packages. These drives, by virtue of their architecture, tend to be of higher performance than drives employing a CKD format. Such FBA drives are available, for example, from Fujitsu as 5.25-inch drives with 1 gigabyte or greater capacity.
The distinct advantage of utilizing many small disk drives is the ability to form a disk array. Thus a large storage capacity can be provided in a reduced amount of space, and storage redundancy can be provided in a cost effective manner. A serious problem arises, however, when trying to do a “simple” conversion of data from CKD formatted disks to FBA disks. Two schemes for such a conversion have been considered, neither of which provides an acceptable solution to the conversion problem. The first of such schemes involves placing every field (i.e., Count, Key, and Data) of the CKD formatted record into a separate block on the FBA disk drive. Although this scheme does not waste valuable disk space when CKD formatted records contain large amounts of data, the “Count” field, which is very short (8 bytes), occupies an entire block, which is typically at least 512 bytes. For example, a CKD formatted record containing 47K bytes of data could be converted to 95 FBA disk blocks, each 512 bytes in length. In such a conversion, one block would be used to store the count of the record while 94 blocks (47K bytes of data divided by 512 bytes per FBA disk block) would be used to store data, for a total of 95 blocks. However, search time for finding the desired record is still a problem since all the records must be sequentially searched.
For records having very short data lengths such as eight bytes, however, one full track, or 94 CKD formatted data records, would need 188 blocks on the FBA disk: 94 blocks for the count portion of the records and 94 blocks for the data portion of the records, even though each data record may only occupy 8 bytes of a 512 byte FBA block. Such a scheme may thus waste nearly 50% of the disk space on an FBA disk drive.
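The block counts quoted for the first conversion scheme follow from ceiling division. The sketch below reproduces them under stated assumptions: 512-byte blocks, "47K" read as 47 × 1024 bytes, and no key field. The function name `blocks_field_per_block` is illustrative and does not come from the source.

```python
import math

BLOCK_SIZE = 512  # bytes per FBA block, as in the examples above


def blocks_field_per_block(data_len: int, key_len: int = 0) -> int:
    """Blocks used by one CKD record when every field starts in its own block."""
    count_blocks = 1  # the 8-byte count field still occupies a whole block
    key_blocks = math.ceil(key_len / BLOCK_SIZE)    # 0 when there is no key
    data_blocks = math.ceil(data_len / BLOCK_SIZE)
    return count_blocks + key_blocks + data_blocks


# One large record: 47K bytes of data.
large = blocks_field_per_block(47 * 1024)

# One full track of 94 short records, 8 bytes of data each.
small_track = 94 * blocks_field_per_block(8)

# Fraction of those blocks that hold nothing but a count field.
count_share = 94 / small_track

print(large, small_track, count_share)
```

For the short-record track, half of the blocks carry only an 8-byte count field, which is the overhead the passage above points to.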
The second scheme for converting data from CKD to FBA drives involves starting each CKD record in a separate block and then writing the complete record in sequential blocks. Utilizing such a scheme, the first FBA block will contain the “count” portion of the record as well as the optional key portion and the start of the data portion of the record. This scheme, however, produces serious system performance degradation when data must be written to the disk, since before writing data to the disk, the entire record must first be read into memory, modified, and subsequently written back to the disk drive. Such a loss in system performance is generally unacceptable.
In accordance with a first aspect, the invention provides a data storage system including data storage, random-access memory, and at least one processor responsive to data storage access requests for access to data stored in the data storage. Each data storage access request specifies at least one data record of a logical track of data records. The data records of the logical track include variable-length data records. The data processor is programmed to access a record locator data structure in the random-access memory for locating a data storage area in the data storage allocated for storing data of the data record. The record locator data structure contains respective entries for data records having data stored in the data storage. At least some but not all of the entries have a data-length portion indicating data length of the data record of the entry when the data length of the record of the entry fails to match an expected data length.
In accordance with another aspect, the invention provides a data storage system including data storage, random-access memory, and at least one processor responsive to data storage access requests for access to data stored in the data storage. Each data storage access request specifies at least one count-key-data (CKD) record of a logical track of CKD records. The data processor is programmed to access a record locator data structure in the random-access memory for locating a data storage area in the data storage allocated for storing data of the CKD record. The record locator data structure contains entries for the CKD records having data stored in the data storage. At least some but not all of the entries have a respective record modifier portion when the count of the CKD record of the entry fails to match an expected count.
In accordance with yet another aspect, the invention provides a data storage system including data storage, random-access memory, and at least one processor responsive to data storage access requests for access to data stored in the data storage. Each data storage access request specifies at least one data record of a logical track of data records. The data processor is programmed to access a record locator data structure in the random-access memory for locating a data storage area in the data storage allocated for storing data of the data record. The record locator data structure contains entries for data records having data stored in the data storage. Each entry includes a fixed-length portion, and at least some but not all of the entries have respective variable-length portions, wherein the fixed-length portion of each entry includes an indication of whether or not the entry includes a variable-length portion.
In accordance with still another aspect, the invention provides a data storage system including data storage, random-access memory, and at least one processor responsive to data storage access requests for access to data stored in the data storage. Each data storage access request specifies at least one data record of a logical track of data records. The data processor is programmed to access a record locator data structure in the random-access memory for locating a data storage area in the data storage allocated for storing data of the data record. The record locator data structure contains respective entries for data records having data stored in the data storage. Each entry includes a fixed-length portion, and at least some but not all of the entries have respective variable-length portions. The fixed-length portions are stored sequentially in a first region of address locations of the random-access memory, and the variable-length portions are stored sequentially in a second region of address locations of the random-access memory.
In accordance with another aspect, the invention provides a method of maintaining a record locator data structure in random access memory of a data storage system. The data storage system also has data storage and at least one processor responsive to data storage access requests for access to data stored in the data storage. Each data storage access request specifies at least one data record of a logical track of data records. The data records of the logical track include variable-length data records. The data processor is programmed to access the record locator data structure in the random-access memory for locating a data storage area in the data storage allocated for storing data of the data record. The record locator data structure contains respective entries for data records having data stored in the data storage. At least some but not all of the entries have a data-length portion indicating data length of the data record of the entry. The method includes updating the record locator data structure to include a new entry for a new data record by establishing an expected data length for the new data record, and comparing the data length of the new data record to the expected data length for the new data record. If the data length of the new data record does not match the expected data length of the new data record, then a data-length portion is stored indicating the data length of the new data record of the new entry. If the data length of the new data record matches the expected data length of the new data record, then such a data-length portion is not stored.
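The update rule in this aspect (store a data-length portion only when the actual length differs from the expected length) can be sketched as follows. This is a minimal illustration, not the claimed implementation: the `Entry` class, the function names, and the choice of a single per-track expected length are all assumptions, since the text above does not say how the expected length is established.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    record_id: int
    # Present only when the record's length differs from the expected length.
    data_length: Optional[int] = None


def make_entry(record_id: int, data_length: int, expected_length: int) -> Entry:
    """Build a new entry, keeping the data-length portion only on a mismatch."""
    if data_length != expected_length:
        return Entry(record_id, data_length)
    return Entry(record_id)


def resolve_length(entry: Entry, expected_length: int) -> int:
    """Recover a record's length from its entry plus the expected length."""
    return expected_length if entry.data_length is None else entry.data_length


# Hypothetical track: most records are 4096 bytes, one is short.
EXPECTED = 4096  # assumed to be supplied by the caller
lengths = [4096, 4096, 100, 4096]
entries = [make_entry(i, n, EXPECTED) for i, n in enumerate(lengths)]
stored = [e.data_length for e in entries]
recovered = [resolve_length(e, EXPECTED) for e in entries]
```

Only the third entry carries a data-length portion, yet every length can still be recovered when the index is read.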
In accordance with yet another aspect, the invention provides a method of maintaining a record locator data structure in random access memory of a data storage system. The data storage system also has data storage and at least one processor responsive to data storage access requests for access to data stored in the data storage. Each data storage access request specifies at least one count-key-data (CKD) record of a logical track of CKD records. The data processor is programmed to access the record locator data structure in the random-access memory for locating a data storage area in the data storage allocated for storing data of the CKD record. The record locator data structure contains entries for the CKD records having data stored in the data storage. At least some but not all of the entries have a respective record modifier portion. The method includes updating the record locator data structure to include a new entry for a new CKD record by establishing an expected count for the new CKD record, and comparing the count of the new CKD record to the expected count for the new CKD record. If the count of the new CKD record does not match the expected count of the new CKD record, then a record modifier portion is stored indicating the count of the new CKD record. If the count of the new CKD record matches the expected count of the new CKD record, then such a record modifier portion is not stored.
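The same idea applied to CKD counts can be sketched as below. The prediction rule is an assumption made for illustration only: the expected count of a record is taken to be the previous record's count with the record number incremented, and the key and data lengths unchanged. The text above does not specify how the expected count is established, and the names `Count`, `expected_count`, and `build_modifiers` are hypothetical.

```python
from typing import List, NamedTuple, Optional


class Count(NamedTuple):
    cylinder: int
    head: int
    record: int
    key_length: int
    data_length: int


def expected_count(previous: Count) -> Count:
    # Assumed rule: same track and lengths as the previous record, next record number.
    return previous._replace(record=previous.record + 1)


def build_modifiers(counts: List[Count], first_expected: Count) -> List[Optional[Count]]:
    """One slot per record: None if the count was predictable, else the count itself."""
    modifiers: List[Optional[Count]] = []
    expected = first_expected
    for count in counts:
        modifiers.append(None if count == expected else count)
        expected = expected_count(count)  # predict the next record from this one
    return modifiers


track = [
    Count(5, 2, 1, 0, 4096),
    Count(5, 2, 2, 0, 4096),
    Count(5, 2, 3, 0, 80),   # data length changes here
    Count(5, 2, 4, 0, 80),   # predictable again from the record before it
]
modifiers = build_modifiers(track, first_expected=Count(5, 2, 1, 0, 4096))
```

Under this assumed rule, a run of uniform records needs a record modifier only at the point where the format changes.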
In accordance with still another aspect, the invention provides a method of maintaining a record locator data structure in random access memory of a data storage system. The data storage system also has data storage and at least one processor responsive to data storage access requests for access to data stored in the data storage. Each data storage access request specifies at least one data record of a logical track of data records. The data processor is programmed to access the record locator data structure for locating a data storage area in the data storage allocated for storing data of the data record. The record locator data structure contains entries for data records having data stored in the data storage. Each entry includes a fixed-length portion, and at least some but not all of the entries have respective variable-length portions. The method includes updating the record locator data structure to include a new entry for a new data record by determining whether or not to include a variable-length portion in the new entry for the new data record, and setting a value in the fixed-length portion of the new entry indicating whether or not the new entry includes a variable-length portion.
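One way to picture a fixed-length portion that indicates whether a variable-length portion exists is a packed word with a flag bit. The layout below is purely an assumption for illustration (a 16-bit big-endian word, the top bit as the flag, the low 15 bits as a block offset); the text above does not give field widths or contents.

```python
import struct
from typing import Tuple

HAS_VARIABLE = 0x8000  # assumed flag bit: entry has a variable-length portion
OFFSET_MASK = 0x7FFF   # assumed: low 15 bits hold a block offset


def pack_fixed(block_offset: int, has_variable: bool) -> bytes:
    """Pack the fixed-length portion of an entry into two bytes."""
    if not 0 <= block_offset <= OFFSET_MASK:
        raise ValueError("block offset does not fit in 15 bits")
    word = block_offset | (HAS_VARIABLE if has_variable else 0)
    return struct.pack(">H", word)


def unpack_fixed(raw: bytes) -> Tuple[int, bool]:
    """Return (block_offset, has_variable) from a packed fixed-length portion."""
    (word,) = struct.unpack(">H", raw)
    return word & OFFSET_MASK, bool(word & HAS_VARIABLE)


flagged = pack_fixed(10, True)
plain = pack_fixed(300, False)
```

Because the flag lives in the fixed-length portion, a reader can decide whether to look for a variable-length portion without inspecting anything else.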
In accordance with yet still another aspect, the invention provides a method of maintaining a record locator data structure in random access memory of a data storage system. The data storage system also has data storage and at least one processor responsive to data storage access requests for access to data stored in the data storage. Each data storage access request specifies at least one data record of a logical track of data records. The data processor is programmed to access the record locator data structure for locating a data storage area in the data storage allocated for storing data of the data record. The record locator data structure contains respective entries for data records having data stored in the data storage. Each entry includes a fixed-length portion, and at least some but not all of the entries have respective variable-length portions. The method includes updating the record locator data structure to include a new entry for a new data record by determining whether or not the new entry is to have a variable-length portion, and storing a fixed-length portion for the new data record in a next address location in a first region of address locations of the random-access memory. If the new entry is to have a variable-length portion, then the variable-length portion of the new entry is stored in a next address location in a second region of address locations of the random-access memory.
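The two-region arrangement can be sketched with two append-only lists standing in for the two regions of address locations. The class and method names are hypothetical, and the lookup strategy (counting earlier flagged entries to find the position in the second region) is an assumption; the text above describes only how the portions are stored.

```python
from typing import List, Optional, Tuple


class RecordLocator:
    def __init__(self) -> None:
        # First region: one fixed-length portion per record, in record order.
        self.fixed: List[Tuple[int, bool]] = []
        # Second region: variable-length portions, only for records that have one.
        self.variable: List[bytes] = []

    def add(self, block_offset: int, variable_portion: Optional[bytes] = None) -> None:
        """Append a new entry at the next location of each region it uses."""
        has_variable = variable_portion is not None
        self.fixed.append((block_offset, has_variable))
        if has_variable:
            self.variable.append(variable_portion)

    def lookup(self, index: int) -> Tuple[int, Optional[bytes]]:
        """Return (block_offset, variable_portion or None) for the record at index."""
        block_offset, has_variable = self.fixed[index]
        if not has_variable:
            return block_offset, None
        # Assumed scheme: position in the second region equals the number of
        # earlier entries that also have a variable-length portion.
        position = sum(1 for _, flag in self.fixed[:index] if flag)
        return block_offset, self.variable[position]


locator = RecordLocator()
locator.add(0)
locator.add(8, b"len=80")
locator.add(9)
locator.add(17, b"len=12")
```

Keeping the fixed-length portions contiguous means the entry for record n sits at a predictable position in the first region, while the second region grows only for the records that need extra information.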