1. Field of the Invention
This invention generally relates to improvements in data storage devices for the input and output of information to a data processing system and, more particularly, to a fast write function for a data storage device. Generally, all data in write operations goes directly to the data storage devices as well as to cache in order to keep all data current on a nonvolatile medium. Using the fast write method, the data may be transferred to a nonvolatile storage without a concurrent update of the data storage device. In the fast write mode, the data is written to the cache and to the nonvolatile storage; the update of the actual data storage device transpires later. A status array is also employed to retain status and device identification information on a status track of each of the data storage devices and in another location, to provide global identification and management of interchangeable data storage devices. A journal log is also used to provide recovery capability in the event of system failure.
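The fast write flow described above can be sketched as follows. This is an illustrative sketch with hypothetical names, not the patented implementation: a write completes once the data reaches both the volatile cache and the nonvolatile storage (NVS); the backing data storage device (DASD) is updated later by a destage operation.

```python
class FastWriteStore:
    """Illustrative model of the fast write mode (hypothetical structure)."""

    def __init__(self):
        self.cache = {}   # volatile cache
        self.nvs = {}     # nonvolatile storage holding not-yet-destaged data
        self.dasd = {}    # backing data storage device

    def fast_write(self, track, data):
        # Write to cache and NVS only; the DASD update is deferred.
        self.cache[track] = data
        self.nvs[track] = data

    def destage(self, track):
        # Later, harden the update on DASD and release the NVS copy.
        self.dasd[track] = self.nvs.pop(track)

store = FastWriteStore()
store.fast_write("track7", b"payload")
assert "track7" not in store.dasd       # DASD not yet updated
store.destage("track7")
assert store.dasd["track7"] == b"payload"
```

Because the update survives in nonvolatile storage, the write can be acknowledged before the slower device access occurs, which is the performance advantage the fast write function provides.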
2. Description of the Prior Art
The past several years have seen the growth of on-line workstations, the evolution of distributed processing, and the acceleration of information technology in many new application areas. The result has been an increase in the rate of access and the use of on-line database systems and a growth in the requirement for storage capacity and increased reliability and flexibility in data storage devices.
To satisfy the performance demands of on-line systems, the main memory of a central processing unit (CPU) and the data storage devices (DASD) are supplemented with a directory-managed, high-speed buffer storage, or cache, that is continually updated to contain recently accessed data. The purpose of the cache is to reduce the access time associated with obtaining information from the slower-speed DASD by having the information in the high-speed cache.
The performance of a cache is characterized by its hit/miss ratio. A hit occurs when a READ request from the CPU finds the requested data in the cache, as contrasted with a miss, which means the data is not in the cache and must be read from DASD. A hit with respect to a WRITE request from the CPU occurs when the information can be written into a free location in the cache for later transfer to the DASD. If no additional space is available in the cache, a WRITE miss occurs and the data must be written to DASD in order to accommodate the new information.
The process of writing information from cache to DASD is called destaging, and the process of reading information from DASD to cache is called staging. Data destaged or staged between the cache and the DASD is managed by algorithms designed to keep the data most likely to be referenced next by the CPU in the cache. Two of the more popular algorithms that are used for this management are the least recently used (LRU) and most recently used (MRU) algorithms. The LRU algorithm is used to determine which information in cache has been used the least and is a good candidate for destaging. The MRU algorithm is used to determine the information that is used the most and is a good candidate to be staged.
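The LRU selection described above can be illustrated with a short sketch (hypothetical names, assuming an access-ordered cache directory, not the patent's actual structure): each reference moves a track to the most-recently-used end, so the least recently used entry is the natural destage candidate.

```python
from collections import OrderedDict

class Cache:
    """Illustrative LRU-ordered cache directory (hypothetical sketch)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()   # key -> data, least recently used first

    def reference(self, key, data):
        # A hit or a newly staged track moves to the most-recently-used end.
        if key in self.entries:
            self.entries.move_to_end(key)
        self.entries[key] = data

    def destage_candidate(self):
        # The least recently used entry is the best candidate for destaging.
        return next(iter(self.entries))

cache = Cache(capacity=3)
for k in ("a", "b", "c"):
    cache.reference(k, k.upper())
cache.reference("a", "A")          # "a" becomes most recently used
assert cache.destage_candidate() == "b"
```

The MRU end of the same ordering identifies the data most recently referenced, which is the data most worth keeping resident in the cache.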
One of the problems with the staged storage system described above is that changes that are written to the cache are not written immediately to DASD. A problem arises if a failure occurs between the cache and the DASD. This means that updates that reside in the cache cannot be posted to the DASD. The prior art has approached this integrity problem in various ways exemplified by Hoff, "Selective Journaling", IBM Technical Disclosure Bulletin, Vol. 18, pp. 61-2, June 1975; and Baird and Ouchi, "Synchronous Pair Block Recording", IBM Technical Disclosure Bulletin, Vol. 25, pp. 2053-6.
Baird's staged storage protection method produces an endless history file such as that required for auditing. The method employs an active file and a history file. The history file is updated each time a write is made to the active file to track the changes that are made. The disadvantage of this approach is that CPU processing is delayed until the actual write to the active and history file is accomplished on DASD. Hoff describes a journaling technique that tracks critical data files and keeps a duplicate copy of these files. The technique destages the journal information to tape at regular intervals to allow the journal area to be reused.
Another approach to assuring the integrity of information in staged storage systems is disclosed in U.S. Pat. No. 4,084,231 to Capozzi, which provides a hierarchical memory system consisting of multiple levels. The highest level is the CPU's main memory, and the lowest level is a tape unit for tracking changes to the memory system. A least recently first modified (LRFM) algorithm is employed to destage the information that was first modified longest ago. This is in sharp contrast to the more efficient LRU algorithm, which destages the information that has been least recently used by the CPU. One of the drawbacks of this system is the flat-file nature of the journals. The journals cannot be reused unless the complete journal is archived to a medium such as tape. This means that frequent archiving must transpire, or else a large amount of storage must be dedicated to the journal.
An additional approach to assuring the integrity of information in staged storage systems is disclosed in U.S. Pat. No. 4,507,751 to Gawlick, which teaches a journaling technique for a staged storage system that utilizes a buffer and a journal stored on a nonvolatile medium. Gawlick's journaling method uses a first-in-first-out (FIFO) approach to updates. This updating approach allows the use of a finite journal; however, it does not provide the necessary support for a random-update cache journal.
A final approach to assuring the integrity of information in staged storage systems is disclosed in U.S. Pat. No. 4,077,059 to Cordi et al. Cordi discloses a hierarchical memory system comprising multiple levels. Each level has a data store, a copy back store and a journal. When data changes are made at a memory hierarchy level, they are recorded in both the data store and the copy back store of that level, and a corresponding entry is made in the journal to track the update. The data changes are copied from the copy back store to the data store of the next lower level in the memory hierarchy at the appropriate times using an LRU algorithm. The journal entries are made in sequential locations of the journal in a FIFO fashion. The journal of a hierarchy level is used to control the order in which the changes are copied to the lower level in the memory hierarchy; it is not used for backup purposes to protect the integrity of the information.
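The Cordi-style level described above can be sketched as follows (hypothetical names, a minimal model rather than the patented design): each update goes to both the data store and the copy back store, a FIFO journal entry records the change, and the journal only orders the copy-back to the next lower level rather than serving as a recovery log.

```python
from collections import deque

class HierarchyLevel:
    """Illustrative model of one Cordi-style memory hierarchy level."""

    def __init__(self):
        self.data_store = {}
        self.copy_back_store = {}
        self.journal = deque()   # FIFO order of pending copy-backs

    def update(self, key, value):
        # Record the change in both stores and journal it.
        self.data_store[key] = value
        self.copy_back_store[key] = value
        self.journal.append(key)

    def copy_back(self, lower_level):
        # Copy the oldest journaled change down to the next lower level.
        key = self.journal.popleft()
        lower_level.update(key, self.copy_back_store.pop(key))

upper, lower = HierarchyLevel(), HierarchyLevel()
upper.update("x", 1)
upper.update("y", 2)
upper.copy_back(lower)            # "x" was journaled first, so it moves first
assert lower.data_store == {"x": 1}
```

Note that in this model the journal is consumed as changes propagate downward; nothing in it supports reconstructing lost data, which is the limitation the text identifies.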
The concepts of self-test, redundancy, cross-check verification between various sources of information and the like are also well known in the art, particularly with the advent of complex digital computer systems used for applications such as process control or supervisory control. An example of such a system is illustrated in U.S. Pat. No. 4,032,757 to Eccles, which uses a pair of channels to continually compare the events occurring in each computer. The constant comparison allows the additional processor to quickly take over control of the process if the other processor fails. The problem with this approach is the time that the extra processor needs to begin processing after a failure. In critical processes such as a nuclear power plant, any time lag could be unacceptable. Another approach is presented in U.S. Pat. No. 4,270,168 to Murphy et al., which uses a plurality of processors, self-checks and joint answer checking to assure that each processor can assume real-time utilization for another processor in the event of a failure. The increased reliability presented in these systems, however, applies to memory-resident applications and fails to address a large database application spanning many data storage devices.
A data processing system typically comprises a host processor or processors, a memory and various peripheral units. The peripheral units include terminals, printers, communications devices and DASD. We are concerned with the control that provides information from DASD to a database application residing in the host processor memory. Further, customers have come to expect an increase in performance to accompany an increase in reliability. Good examples of prior art approaches to this type of processing are presented in U.S. Pat. Nos. 3,999,163 to Levy et al., 4,067,059 to Derchak and 4,189,769 to Cook et al. These patents present various ways to enable a host to process information residing on DASD. While these patents describe production systems that readily lend themselves to database applications, they lack the capability of retaining status information when a power-off occurs in an environment designed to provide high availability of DASD information.
In a known data processing system, a memory control circuit connected to each of a plurality of memory banks selectively transfers the contents of an arbitrary first memory bank to a selected second memory bank, so that if the first memory bank is written, the circuit transfers its contents into the second memory bank, thereby preventing a loss of information. An example of such a system is illustrated in U.S. Pat. No. 3,866,182 to Yamada et al.
Various approaches to the efficient utilization of data storage devices through the usage of a cache are known, such as U.S. Pat. Nos. 4,504,902 to Gallaher et al., 3,938,097 to Niguette, III, 4,506,323 to Pusic et al., 4,530,054 to Hamstra et al., and 4,523,275 to Swenson et al. These approaches share the inability to perform logical write operations without updating the physical data storage device.
U.S. Pat. No. 4,574,346 to Hartung, one of the co-inventors of this application, discusses enhanced control of data usage in a cached data storage system having a volatile cache and a backing store. Data is moved to the backing store as it is no longer needed. Data that is already in the cache when a read request arrives is immediately serviced without data storage device access; however, write requests must still be serviced by accessing the physical data storage device.