This invention relates to the storage of formatted data on a random access medium, and, more particularly, to the storage of data which may comprise any of multiple data storage formats, and to the storage of data which may comprise sequentially formatted data, such as that of magnetic tape, on a random access medium.
The typical primary data storage for a computer system comprises a random access magnetic disk drive with non-removable disks, and the typical backup data storage comprises a tape drive, an optical drive, or a tape or optical library, all employing removable data storage media, such as tape cartridges or optical disk cartridges. The random access of the magnetic disk drive provides quick access to data, and removable data storage media provide low cost, high reliability storage, and archivability for the data. Optical disk cartridges, once the cartridge is accessed, may provide a level of random access, albeit much slower than the high speed random access of a magnetic disk drive. Magnetic tape cartridges store data in a linear sequential format and access data by unwinding the tape from one reel and winding it onto another reel. Thus, the tape is moving linearly while the host is searching for data, and, once the beginning of a file is found, the remainder of the file is located in sequence from the beginning of the file. Additionally, tape drives may operate forwards and backwards and the host may initially find a file at the rear, and the remainder of the file is located in sequence from the rear. Extensions to a file may be located elsewhere on a tape and have to be accessed separately by the host. The records making up the file may be of variable length, but, so long as the tape must be moved linearly, the search for a specific record in the file may be conducted by reading the header of each record in the file as it is passed during the search, until the desired record is found. As an example, the IBM 3590 tape drives employ device blocks of one or more records, and each has a header identifying the device block. The device blocks of the tape are arranged in a hard linear sequential format without employing pointers in the headers.
The host manages the data, and conducts any searching and the read and write operations.
In many situations, it would be advantageous to provide the random access speed of a magnetic disk drive to information stored on magnetic tape or on optical disk. This was recognized, e.g., in U.S. Pat. No. 5,724,541, which provides subsequent random access to data initially stored in a sequential access storage medium, by copying data from the sequential access storage medium to a random access storage medium as corresponding blocks of data. Such an arrangement is essentially duplicative and has difficulties in that the established formats for the two types of data storage are very different.
U.S. Pat. No. 5,454,098 and U.S. Pat. No. 5,297,124 recognize that the formats are different. The '098 patent describes the emulation of a sequential data storage device, e.g., a magnetic tape “streamer”, on a random access device by transforming sequential access commands into random access commands to read or write a set of blocks in sequence. The patent emulates tape to “access data on a random access storage device with tape based management software”. The '124 patent describes a tape emulation system for a disk drive with a conversion mapping directory located on the outermost sectors of the disk. The patent emulates tape to instead use a disk drive while avoiding “hardware modifications” to a computer system previously having only a tape drive.
In the case of an optical disk, U.S. Pat. No. 4,775,969 describes an optical disk storage format for emulating a tape drive, having a high level directory providing a list of addresses for a plurality of embedded directories which are in close proximity to variable length records. The emulation of a tape drive is to replace a magnetic tape drive with an optical disk drive in a “‘plug-compatible’ manner”.
Thus, in each instance, random access to the linearly sequential data is not a prime consideration, and is not optimized.
Further, data which is to be employed as backup or is archived must have full integrity, as must the capability to access that data, since the data is not likely to be maintained in duplicate media. U.S. Pat. No. 6,128,698 relates to use of a disk drive with a removable disk for storing archival tape-based data and recognizes the need for integrity of the data. The patent describes a tape drive emulator for the drive with the removable disk, employing a buffer for recording compressed data in plural disk drive sectors in sequence with sufficient ECC bytes to recover a complete sector. However, access to the data is assumed to be without failure. It is possible, for example, that a power off situation interrupts a write such that a header or table is damaged or not written, while the data is saved by the host operating system.
Copending, co-assigned U.S. patent application Ser. No. 09/842,030, filed Apr. 26, 2001, describes, inter alia, alternative devices, such as a magnetic disk drive, mounted in portable cartridges which are similar to removable media (tape) cartridges, and a transfer station for providing data transfer with respect to the cartridges. The cartridges may be used in automated data storage libraries, stored in similar, or the same, storage shelves as the removable tape cartridges, and fetched by the library and inserted in the transfer station. An example of an automated data storage library comprises the IBM 3494 tape library. It would be advantageous to write and read the data of the random access medium, such as a disk drive, in the same manner as the data of the removable tape cartridges, and to access the data in an efficient random access manner. Additionally, it would be advantageous to provide a means for assuring access to data on a random access medium despite loss or failure of one or more mapping indexes. Further, it would be advantageous to store and allow access to data of multiple data storage formats on a random access medium. Still further, it would be advantageous to allow the writing and reading of data on a random access medium in a context other than that of the random access medium.
An object of the present invention is to provide mapping for multiple data storage formats on a random access medium.
A further object of the present invention is to provide mapping for a data storage format in a context dissimilar to that of the underlying random access medium, so that the data may be written and read in that context using normal commands.
Another object of the present invention is to provide direct high performance access to sequentially formatted data, such as in a tape format or an optical format, stored on a random access medium.
Still another object of the present invention is to provide error recovery information for restoring access to data storable in any of multiple formats.
Disclosed are methods, data storage file systems, and random access media, which may be removable, for providing a mapping structure for storing data of any of various formats on a random access medium, and for providing a structure for storing data of a linear sequential format on a random access medium.
In one embodiment, the medium, such as a magnetic disk drive, has a plurality of equal sized logical sectors as a smallest single writeable/readable unit, and the logical sectors are sequentially numbered.
A third level construct is recorded comprising at least one region for writing and reading data in one of the various formats, each region having an identification of the region in terms of specific format of the data. A second level construct is recorded comprising a global device block map having at least one global device block element for each region. Each global device block element identifies bounds of the data recorded in the region in terms of the sequentially numbered logical sectors. A first level construct is recorded comprising at least one format identifier having a pointer indicating the location and size of the second level construct in terms of the sequentially numbered logical sectors. Thus, random access is conducted efficiently to the defined regions, while the format of the data may be employed to write and read the data, without requiring extensive conversion, and the host system may employ commands normal to the expected format of the data. Further, the multiple levels provide alternate paths to the access information so that any damaged or non-written tables or headers may be repaired or recovered.
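By way of illustration only, the three levels described above may be sketched as simple data structures. The sketch below is a minimal model under stated assumptions and is not the recorded on-medium layout: the class names, the field names, the format labels "tape" and "optical", and the lookup helper find_region are all hypothetical and are not taken from the disclosure.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Region:
    """Third level: a region identified by the specific format of its data."""
    format_name: str  # hypothetical label, e.g. "tape" or "optical"


@dataclass
class GlobalDeviceBlockElement:
    """Second level entry: bounds of one region in logical sector numbers."""
    region: Region
    first_sector: int
    last_sector: int  # inclusive bound (an assumption of this sketch)


@dataclass
class FormatIdentifier:
    """First level: location and size of the second level construct."""
    map_first_sector: int
    map_sector_count: int


@dataclass
class Medium:
    format_id: FormatIdentifier
    block_map: List[GlobalDeviceBlockElement] = field(default_factory=list)

    def find_region(self, sector: int) -> Optional[GlobalDeviceBlockElement]:
        """Return the map element whose bounds contain the given sector."""
        for element in self.block_map:
            if element.first_sector <= sector <= element.last_sector:
                return element
        return None


# Example: a medium holding one tape-format and one optical-format region.
medium = Medium(
    format_id=FormatIdentifier(map_first_sector=1, map_sector_count=2),
    block_map=[
        GlobalDeviceBlockElement(Region("tape"), 10, 99),
        GlobalDeviceBlockElement(Region("optical"), 100, 199),
    ],
)
```

In this sketch a lookup proceeds from the format identifier to the global device block map and then to the region, so that only the bounds of a region, and not the data within it, need be understood in terms of the logical sectors.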
In another embodiment directed specifically to data in the linear sequential format, a construct is recorded comprising at least one region for writing and reading data in the linear sequential format. The data of the linear sequential format is organized in a stream of sequential device blocks of variable lengths for writing and reading. The region construct provides a pointer to each device block in the stream. Additionally, each device block has a device block header with a plurality of backwards references, each referencing a separate previous device block in the stream, one of the references to an immediately adjacent previous device block, at least one of the references to a closely adjacent previous device block, and at least one of the references to a distant previous device block. Both the region and device block pointers are in terms of device block sequence numbers, and in terms of the sequentially numbered logical sectors. Thus, direct random access may be made to the area of a desired device block, despite the variable lengths of the device blocks, thereby minimizing the number of operations.
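The use of the backwards references may likewise be sketched, again for illustration only. The sketch assumes reference distances of 1, 4 and 16 device blocks for the immediately adjacent, closely adjacent and distant references; the disclosure does not fix these values in this section. The names BACK_DISTANCES, build_stream and locate_backwards are hypothetical. Each reference carries both a device block sequence number and a logical sector number, as described above.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Assumed distances: immediately adjacent, closely adjacent, distant.
BACK_DISTANCES = (1, 4, 16)


@dataclass
class DeviceBlockHeader:
    seq: int                          # device block sequence number
    sector: int                       # first logical sector of the block
    back_refs: List[Tuple[int, int]]  # (seq, sector) of previous blocks


def build_stream(lengths_in_sectors: List[int]) -> List[DeviceBlockHeader]:
    """Lay out variable-length device blocks and fill in back references."""
    headers: List[DeviceBlockHeader] = []
    sector = 0
    for seq, length in enumerate(lengths_in_sectors):
        refs = [(seq - d, headers[seq - d].sector)
                for d in BACK_DISTANCES if seq - d >= 0]
        headers.append(DeviceBlockHeader(seq, sector, refs))
        sector += length
    return headers


def locate_backwards(by_sector: Dict[int, DeviceBlockHeader],
                     start: DeviceBlockHeader,
                     target_seq: int) -> Tuple[int, int]:
    """Follow back references from start to target_seq.

    Returns (sector of the target block, number of headers followed).
    """
    if not 0 <= target_seq <= start.seq:
        raise ValueError("target must not lie after the starting block")
    current, hops = start, 0
    while current.seq != target_seq:
        # Take the farthest reference that does not pass the target.
        _, sector = min(r for r in current.back_refs if r[0] >= target_seq)
        current = by_sector[sector]
        hops += 1
    return current.sector, hops


lengths = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9, 3, 2, 3, 8, 4]
headers = build_stream(lengths)
by_sector = {h.sector: h for h in headers}
result = locate_backwards(by_sector, headers[19], 2)
```

With these assumed distances, moving from device block 19 back to device block 2 reads two headers (block 3, then block 2) rather than seventeen, which is the sense in which the number of operations is reduced despite the variable block lengths.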
For a fuller understanding of the present invention, reference should be made to the following detailed description taken in conjunction with the accompanying drawings.