The invention relates to mass memory apparatus of the staging data type and more particularly to error control apparatus and methods for use therein.
Direct access storage such as disk storage devices (DASD) has many advantages when used in a data processing system. For example, it enables rapid access to a data record as opposed to moving a record tape to scan long sequential files. It is usually on-line when one needs it. It is reliable. But such direct access storage is expensive. Also, the number of disk drives attachable to a host CPU is usually limited. It is also inefficient because the amount of data in use on a single disk device or drive at one time is usually small.
On the other hand, tape storage has many advantages when used in a data processing system. Large quantities of data can be stored in a tape library. It is reliable and relatively inexpensive. But an entire tape file, perhaps of several reels, must be read and rewritten to obtain a few records that are needed for data processing. Processing must be sequential, which requires transaction files to be sorted before updating a master file. Time can be wasted in finding the proper reel to mount. Mounting the wrong reel of a multi-reel tape file causes rerun problems. Also, maintaining a tape library can be expensive.
To enhance data processing, a staging mass storage system (MSS) combines the better features of disk storage with the economy of tape storage. The storage capacity equals that of a large tape library. Data can be processed in a tape-like sequential manner or in the efficient direct access disk manner. Most important in an operating environment, the data is available to the processing system without the delays associated with finding the tape reel, mounting it, and returning it to the tape library after use. Such apparatus is addressed in a so-called "virtual direct access storage" mode as described in U.S. Pat. No. 3,670,307 and as implemented using the International Business Machines Corporation 3330 disk storage virtual volume addressing scheme. This scheme defines a logical address space as containing 100,000,000 data bytes--the storage capacity of one IBM 3330-type disk pack. Usage of this addressing scheme will become apparent. During the time the data is being processed, it is on-line on 3330 disk drives. When the data is not in use, it is stored on tape in a Mass Storage Facility (MSF).
A MOUNT virtual volume message given to a Mass Storage Control (MSC) initiates transfer of data from tape to disk. The MSC searches its tables, finds the location in the MSF where a data cartridge containing that data is stored, finds space on an available disk drive, reads the data from the data cartridge, and writes it on the disk drive.
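The sequence of steps just described can be illustrated by the following minimal sketch. It is not the MSC's actual table layout or program; the class name, the field names, and the identifiers used for volumes, cartridge locations, and drives are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class MassStorageControl:
    """Toy model of the MSC tables (all names are hypothetical)."""
    cartridge_of: dict      # volume id -> cartridge location in the MSF
    cartridge_data: dict    # cartridge location -> list of cylinder images
    free_drives: list       # staging drives that still have free space
    disk: dict = field(default_factory=dict)  # drive -> {volume: cylinders}

    def mount(self, volume):
        # Step 1: search the tables for the cartridge holding this volume.
        location = self.cartridge_of[volume]
        # Step 2: find space on an available disk drive.
        if not self.free_drives:
            raise RuntimeError("no staging space available")
        drive = self.free_drives[0]
        # Step 3: read the data from the data cartridge.
        cylinders = list(self.cartridge_data[location])
        # Step 4: write the data on the disk drive.
        self.disk.setdefault(drive, {})[volume] = cylinders
        return drive
```

In this sketch the cartridge contents are copied rather than moved, which reflects the statement that the original data remains on the data cartridge after staging.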
The disk packs to which the data stored in the data cartridges is written are called "staging packs," and the process of copying the data from the data cartridge onto the disk pack is called "staging." Data must be staged before it can be processed by a host. The data need be staged only once for multiple concurrent uses.
The process of writing the disk cylinders containing changed data back to the data cartridge is called "destaging." Since all the original data is still on the data cartridge, writing the changed data back results in the data cartridge having a complete updated data set. Data signals stored in disk storage that are not altered are never destaged.
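The rule that only changed cylinders are written back can be sketched as follows. This is an illustration under stated assumptions: the function name `destage`, the dictionary representation of a cartridge, and the explicit set of changed cylinder numbers are hypothetical and are not taken from the apparatus described.

```python
def destage(cartridge, staged, changed):
    """Write only the changed cylinders back to the cartridge.

    cartridge, staged: dict mapping cylinder number -> cylinder contents
    changed: set of cylinder numbers altered while on disk (assumed known)
    Returns the number of cylinders written.
    """
    for cylinder in sorted(changed):
        # Unaltered cylinders are skipped; the cartridge already holds them.
        cartridge[cylinder] = staged[cylinder]
    return len(changed)
```

Because the unaltered cylinders are still on the cartridge, the cartridge holds a complete updated data set once the changed cylinders have been written.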
Staging packs are divided into pages of storage. Each page consists of eight cylinders. There are 51 pages of staging space on one staging pack. When data is staged, it is written on whichever pages of space are available at the time. The data from a single data set does not necessarily go on consecutive pages of a staging pack, nor does it necessarily use only pages on a single staging disk drive.
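The page arithmetic and the scattered placement of pages can be sketched as follows. The two constants come from the figures given above; the function `allocate_pages` and the representation of free space as a list of (drive, page) pairs are hypothetical and serve only as an illustration.

```python
CYLINDERS_PER_PAGE = 8   # from the text
PAGES_PER_PACK = 51      # from the text


def allocate_pages(free_pages, cylinders_needed):
    """Take whichever free pages are available for a data set.

    free_pages: list of (drive, page_number) tuples, modified in place
    Returns the list of pages taken, which need not be consecutive
    or on a single drive.
    """
    # Ceiling division: a partly filled page still occupies a whole page.
    pages_needed = -(-cylinders_needed // CYLINDERS_PER_PAGE)
    if pages_needed > len(free_pages):
        raise RuntimeError("not enough staging space")
    taken = free_pages[:pages_needed]
    del free_pages[:pages_needed]
    return taken
```

With these figures, one staging pack provides 51 x 8 = 408 cylinders of staging space, and a data set of 20 cylinders occupies three pages.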
When the host computers attached to the MSS are of the IBM 370 type, the MSS responds to the OS/VS operating system of such 370 host machines in the virtual direct access storage mode. That is, the MSS looks like a large number of disk drives to the hosts. This means that the known 370 OS/VS programs for operating with the 3330 virtual volumes also operate with MSS. In this mode, OS/VS assigns a disk virtual volume to a system "unit." When a virtual volume is mounted in the MSS, it is also assigned to a unit address. Since, in MSS, a virtual volume can be as small as one page, a complete staging pack could mount 51 virtual volumes and therefore need 51 unit addresses. Because of this, the old idea of the unit address being a combination of channel, control unit, and device is modified. MSS uses a "virtual unit address" to designate the logical address of each virtual volume. Each virtual volume is assigned a virtual unit address to be used by MSS in staging data and in locating it on a staging pack. A group of virtual unit addresses is assigned to each group of real disk drives, termed a "staging drive group".
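The assignment of virtual unit addresses from the pool belonging to a staging drive group can be sketched as follows. The class, its methods, and the address strings are hypothetical; the sketch shows only that each mounted virtual volume holds one virtual unit address drawn from its group's pool, and makes no claim about how the MSS actually manages that pool.

```python
class StagingDriveGroup:
    """Toy model of one staging drive group (names are hypothetical)."""

    def __init__(self, virtual_unit_addresses):
        self.free = list(virtual_unit_addresses)  # addresses not in use
        self.mounted = {}                         # address -> volume id

    def mount(self, volume):
        # Each mounted virtual volume needs its own virtual unit address.
        if not self.free:
            raise RuntimeError("no virtual unit address available")
        address = self.free.pop(0)
        self.mounted[address] = volume
        return address

    def demount(self, address):
        # Release the address so another virtual volume can use it.
        del self.mounted[address]
        self.free.append(address)
```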
In operation, these virtual unit addresses are varied on-line and off-line just as other system units and real units are.
In an MSS destaging operation, data signals read from DASD units go through a buffer in a director 16 into the tape units DRD for a recording or write operation. The format of the data, as recorded on DASD, is imaged on the tape; that is, the data format includes the count, key, and data fields widely used on DASD. If a write occurred on DASD, a host CPU updated the data, which means that the count, key, and data are all changed. Further, control signals recorded at the beginning of a cylinder of data are probably also changed. In summary, the data format after a recording operation on DASD is entirely different from the data format prior to such writing operation. It is these revised or newly formatted data signals which have to be accurately recorded on the tape.
Each DASD cylinder includes a plurality of record tracks, one track on each recording surface. For example, in one DASD unit, 17 tracks constitute a cylinder, all of the tracks being at the same radial position on the respective record surfaces. When transferring data from the DASD cylinder to the tape, an error may occur anywhere within the cylinder. Generally, such errors occur only on one track. When such an error occurs, the DASD reading operation is aborted using known procedures. As a result, the signals recorded on the tape which correspond to the data signals supposedly recorded in the DASD cylinder contain partly the newly formatted data signals, plus a remainder (of unknown amount) of the old formatted tape signals. Since the control signals are always recorded at the beginning of the cylinder, the control information defining the signals following the error has already been destroyed; i.e., the data destaged before the DASD read error has overwritten the old formatted data. Hence, at the onset of a DASD read error, the tape has a portion of the newly formatted data plus an unknown portion of the old data which has had its control information completely obliterated. It is extremely important that the destaged data signals be in one format; otherwise, all of the data recorded on the tape in that particular portion becomes substantially meaningless.
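The failure just described can be illustrated by the following minimal sketch, which models the tape image of one cylinder as a list of track images and aborts the copy at a failing track. The function name, the list representation, and the `failing_track` parameter are hypothetical; the figure of 17 tracks is the example given above.

```python
TRACKS_PER_CYLINDER = 17  # example figure from the text


def destage_cylinder(tape, new_tracks, failing_track=None):
    """Overwrite the tape image of a cylinder track by track.

    tape: list of track images currently on tape (old format)
    new_tracks: list of track images read from DASD (new format)
    failing_track: index of the track whose DASD read fails, if any
    Returns True when the whole cylinder was written, else False.
    """
    for index, image in enumerate(new_tracks):
        if index == failing_track:
            # Read aborted: tracks from here on still hold old-format data,
            # while the old control information at the start is already gone.
            return False
        tape[index] = image
    return True
```

A failure at track 5, for instance, leaves the first five track images in the new format and the remaining twelve in the old format, which is the mixed-format condition described above.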
The present invention is most advantageously used with apparatus referred to above and as shown in FIG. 1. An MSS apparatus includes an MSF 10 having a tape cartridge store such as shown in Beach et al, supra. MSF 10 also includes a plurality of data recording devices (DRD) 12 (tape recorders) and associated data recording controls (DRC's) 13 (tape recorder controls) all constructed in accordance with the documents incorporated by reference. MSF 10 constitutes the data base memory portion of the MSS.
An intermediate storage level of MSS consists of a plurality of disk storage units (DASD) 14, associated DASD controllers 15, and storage controls or directors 16. Each director 16 includes a staging adapter portion for automatically moving data signals between MSF 10 and DASD 14 and 15. Moving data signals from MSF 10 to DASD 14 and 15 is termed "staging" (data promotion to a higher storage level), while moving data signals from DASD 14 and 15 to MSF 10 is termed "destaging" (data demotion to a lower storage level).
An MSC 17, a programmable computer, supervises and directs operations of MSS as will become more clear.
One programmable host computer is a so-called "primary" host 18. This computer, in a limited manner, supervises operation of MSS on behalf of all other connected host computers 19. Each host computer 19 has at least one channel connection to a storage director 16; such channel connections are in accordance with U.S. Pat. No. 3,400,372. Additionally, primary host 18 has a channel connection to MSC 17 for issuing commands and receiving MSS status signals, as will become more apparent. The MSC 17 acts as a control unit to primary host 18, all in accordance with U.S. Pat. No. 3,400,372. MSC 17 connections to MSF 10 controller 21 and to storage directors 16 are also in accordance with U.S. Pat. No. 3,400,372, wherein MSC 17 is a "host" or "CPU" and units 16 and 21 are the control units of U.S. Pat. No. 3,400,372. Controller 21 is as described in Beach et al, supra, and Carter et al T921,023, dated Apr. 16, 1974.
As described above, a problem presented in operating a multi-level or hierarchical MSS during destaging or data demotion from the DASD upper storage level to MSF 10 lower storage level is handling and recovery from DASD read errors. Each host CPU must have an opportunity to take recovery actions before such data is destaged to MSF 10 with an error. Recovery from DASD read errors is 99.5% successful by manually moving a disk pack from one disk drive to another disk drive. That is, 99.5% of the time the second disk drive successfully reads the moved disk pack. In a virtual addressing environment during MSS operations, moving disk packs can destroy addressability; that is, the data cannot be accessed by any host. When moving disk packs from one drive to another, the same channel address can be maintained even though the pack is on a different drive. In this manner, addressability is maintained.
In a real addressed system, storage equipment errors or checks are not readily propagated as data errors to data in other storage equipment at the same storage level. In a virtually addressed MSS, one storage unit may contain data from many diverse sources; hence, error conditions in one storage unit can affect widespread data sets, with more catastrophic effects than in real addressed storage. Such a situation calls for early detection and correction.