Taking data storage to tape as an example, a host computer system typically writes data to a storage apparatus, such as a tape drive, on a per Record basis. Further, the host computer may separate the Records themselves using Record separators such as FILE MARKs or SET MARKs.
Typically, Records comprise user data, for example, the data which makes up word processor documents, computer graphics pictures or databases. In contrast, Record separators, such as FILE MARKs, are used by a host computer to indicate, for example, the end of one word processor document and the beginning of the next. In other words, Record separators typically separate groups of related Records.
Generally, the host computer determines Record length, and the order in which the Records and the Record separators are received, and, typically, the storage apparatus has no control over this.
By way of example, the diagram in FIG. 1(a) illustrates a logical sequence of user data and separators that an existing type of host computer might write to a tape storage apparatus. Specifically, the host computer supplies five fixed-length Records, R1 to R5, in addition to three FILE MARKs, which occur after R1, R2 and R5.
It is known for a storage apparatus such as a tape drive to receive host computer data, arrange the data Records into fixed-sized groups independently of the Record structure, and represent the Record structure, in terms of Record and FILE MARK position, in an index forming part of each group. Such a scheme forms the basis of the DDS (Digital Data Storage) data format standard for tape drives defined in ISO/IEC Standard 10777:1991 E. EP 0 324 542 describes one example of a DDS tape drive, which implements this scheme. Once the groups of data are formed, the tape drive stores the groups to tape, typically after applying some form of error detection/correction coding.
The diagram in FIG. 1(b) illustrates the organisation into DDS groups of the host computer data shown in FIG. 1(a). Typically, the host computer data Records are encoded or compressed to form a continuous encoded data stream in each group. FILE MARKs are intercepted by the tape drive, and information that describes the occurrence and position of the FILE MARKs in the encoded data stream is generated by the tape drive and stored in the index of the respective group. In the present example, Records R1, R2 and a part of Record R3 are compressed into an encoded data stream and are stored in the first group, and information specifying the existence and position in the encoded data stream of the Records and the first and second FILE MARKs is stored in the index of the first group. Then, the remainder of Record R3, and Records R4 and R5, are compressed into a continuous encoded data stream and are stored in the second group, and information specifying the existence and position in the encoded data stream of the Records and the third FILE MARK is stored in the index of the second group.
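By way of illustration only, the grouping described above might be sketched as follows. The sketch is not the DDS algorithm: it assumes no compression (the encoded data stream is simply the raw Record bytes), and the function names, the tuple-based index entries, the 4-byte Record length and the 10-byte group capacity are all hypothetical values chosen so that the result mirrors FIG. 1(b).

```python
def example_stream(record_len=4):
    """Host sequence of FIG. 1(a): R1 FM R2 FM R3 R4 R5 FM (hypothetical contents)."""
    rec = lambda n: ("REC", bytes([n]) * record_len)
    fm = ("FM", None)
    return [rec(1), fm, rec(2), fm, rec(3), rec(4), rec(5), fm]


def new_group():
    # "data" stands in for the encoded data stream, "index" for the group index.
    return {"data": bytearray(), "index": []}


def pack_groups(stream, capacity):
    """Fill fixed-capacity groups regardless of Record boundaries.

    Assumption: no compression. Index entries are hypothetical tuples:
    ("REC_START" | "REC_CONT", offset, length) or ("FM", offset).
    """
    groups = [new_group()]
    for kind, payload in stream:
        if kind == "FM":
            # A FILE MARK is not stored in the data stream; only its
            # position is noted in the index of the current group.
            g = groups[-1]
            g["index"].append(("FM", len(g["data"])))
            continue
        offset = 0
        while offset < len(payload):
            g = groups[-1]
            room = capacity - len(g["data"])
            if room == 0:
                groups.append(new_group())
                continue
            chunk = payload[offset:offset + room]
            tag = "REC_START" if offset == 0 else "REC_CONT"
            g["index"].append((tag, len(g["data"]), len(chunk)))
            g["data"] += chunk
            offset += len(chunk)
    return groups


groups = pack_groups(example_stream(), capacity=10)
```

With these assumed sizes, the first group holds R1, R2 and the first part of R3 together with two FILE MARK entries, and the second group holds the remainder of R3, R4 and R5 together with the third FILE MARK entry.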
FIG. 2 illustrates very generally the form of the indexes for both groups shown in FIG. 1(b). As shown, each index comprises two main data structures, namely a block access table (BAT) and a group information table (GIT). The number of entries in the BAT is stored in a BAT entry field in the GIT. The GIT also contains various counts, such as a FILE MARK count (FMC), which is the number of FILE MARKs written since the beginning of recording (BOR) mark, including any contained in the current group, and a Record count (RC), which is the number of Records written since the BOR mark, including any contained in the current group. The values for the entries in this simple example are shown in parentheses in FIG. 2. The GIT may contain other information, such as the respective numbers of FILE MARKs and Records which occur in the current group only.
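The relationship between the per-group counts and the running counts held in the GIT might be sketched as below. This is a minimal sketch only: the `GroupInfo` field names and the `build_gits` helper are hypothetical, the BAT entry counts are illustrative, and the sketch assumes that a Record split across two groups is counted in the group in which it ends, which the description above does not specify.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class GroupInfo:
    """Hypothetical model of a GIT; field names are illustrative only."""
    bat_entries: int          # number of entries in this group's BAT
    file_marks_in_group: int  # FILE MARKs in the current group only
    records_in_group: int     # Records counted in the current group only
    file_mark_count: int      # FMC: FILE MARKs since BOR, current group included
    record_count: int         # RC: Records since BOR, current group included


def build_gits(per_group: List[Tuple[int, int, int]]) -> List[GroupInfo]:
    """per_group holds (bat_entries, file_marks, records) for each group in order."""
    fmc = 0
    rc = 0
    gits = []
    for bat_entries, file_marks, records in per_group:
        fmc += file_marks
        rc += records
        gits.append(GroupInfo(bat_entries, file_marks, records, fmc, rc))
    return gits


# Assumed counts for the two groups of FIG. 1(b): R3 is counted in group 2,
# where it ends (an assumption of this sketch).
gits = build_gits([(5, 2, 2), (4, 1, 3)])
```

Under these assumptions the second group's running counts come out as three FILE MARKs and five Records, matching the totals supplied by the host in FIG. 1(a).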
The BAT describes, by way of a series of entries, the "structure" of a group in terms of the logical segmentation of the Record data held in the group and the position of each separator mark. The access entries in the BAT follow in the order of the contents of the group, and the BAT itself grows from the end of the group inwardly to meet the encoded data stream of the Record data.
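The inward growth of the BAT can be pictured with a fixed-size buffer in which Record data is appended from the front and table entries are written from the back. The following is a simplified sketch under stated assumptions: the `GroupBuffer` class, the 4-byte big-endian entry holding a length, and the omission of the GIT are all hypothetical simplifications and do not reproduce the DDS entry layout.

```python
import struct

ENTRY_SIZE = 4  # hypothetical fixed size of one BAT entry, in bytes


class GroupBuffer:
    """Fixed-size group: data grows upward from 0, BAT grows downward from the end."""

    def __init__(self, size: int):
        self.buf = bytearray(size)
        self.data_end = 0      # next free byte for the data stream
        self.bat_start = size  # lowest byte currently occupied by the BAT

    def free(self) -> int:
        # Space left between the data stream and the BAT.
        return self.bat_start - self.data_end

    def append(self, data: bytes) -> bool:
        """Store data plus one BAT entry; return False if they would collide."""
        if len(data) + ENTRY_SIZE > self.free():
            return False
        self.buf[self.data_end:self.data_end + len(data)] = data
        self.data_end += len(data)
        self.bat_start -= ENTRY_SIZE
        struct.pack_into(">I", self.buf, self.bat_start, len(data))
        return True

    def bat_entries(self):
        """Entries in content order: the first entry lies nearest the group end."""
        last = len(self.buf) - ENTRY_SIZE
        return [struct.unpack_from(">I", self.buf, off)[0]
                for off in range(last, self.bat_start - 1, -ENTRY_SIZE)]
```

In this sketch the group is full when the data stream and the table would overlap, at which point a writer would start a new group.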
In such a scheme, a tape drive accessing the stored data in response to a command from a host computer to read or write data relies on information in the index to locate the starting position of a particular Record or FILE MARK in the encoded data stream.
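As a rough illustration of such a lookup, the sketch below walks an ordered list of index entries and accumulates lengths to find where a given Record or FILE MARK falls in the group's data stream. The entry format (`("REC", length)` and `("FM",)`) and both function names are hypothetical, and the lengths are assumed to be lengths within the encoded data stream.

```python
def locate_record(entries, n):
    """Return the stream offset at which the n-th Record entry (0-based) starts."""
    offset = 0
    seen = 0
    for entry in entries:
        if entry[0] == "REC":
            if seen == n:
                return offset
            seen += 1
            offset += entry[1]
    raise IndexError("Record not in this group")


def locate_file_mark(entries, n):
    """Return the stream offset at which the n-th FILE MARK (0-based) occurs."""
    offset = 0
    seen = 0
    for entry in entries:
        if entry[0] == "REC":
            offset += entry[1]
        elif entry[0] == "FM":
            if seen == n:
                return offset
            seen += 1
    raise IndexError("FILE MARK not in this group")


# Assumed entries for the first group of FIG. 1(b), with 4-byte Records
# and a 2-byte first part of R3.
first_group = [("REC", 4), ("FM",), ("REC", 4), ("FM",), ("REC", 2)]
r3_start = locate_record(first_group, 2)
second_fm = locate_file_mark(first_group, 1)
```

Because every lookup starts from the index rather than from the data itself, the data stream need not contain any separator bytes.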