Modern data processing systems typically comprise a host computer, consisting of an arithmetic and logic unit and a main memory unit for containment of data and instructions presently being processed, and long-term storage means for storage of data and processing instructions at other times. In systems using the IBM Corporation's equipment, the long-term storage means is connected to the host computer by means of a "channel." When the host desires a particular data set or record, it issues a command over the channel to the long-term storage means, which then locates and reads the data from whatever medium it is stored upon, e.g., magnetic disks or tape memory media, over the channel into the main memory of the host. The substantial length of time required to retrieve data from long-term storage limits the throughput or usage of the host computer. In particular, location of the beginning of the data set, e.g., physical juxtaposition of the location of the beginning of a record stored on disk to the read/write head, is time consuming. The actual reading of the data proceeds comparatively quickly. To minimize this loss of use of the host computer, the host will typically issue a series of requests for data and then perform other tasks while the data is being retrieved from long-term disk or tape memory. However, even when this "queueing" is performed there is substantial host computer computation time lost due to the time required for accessing data and the software overhead associated with the queueing process. This has remained an unsolved problem in the art and it is an object of the present invention to improve host computer throughput by reducing queueing times.
It has been proposed in the prior art that such queueing times be reduced by "staging" data physically stored surrounding all data which is the object of a SEEK command issued by a host, from a disk memory into a solid-state memory of much faster access speed. The solid-state memory is located external to the host, outboard of the channel from the host. Thus, when the host issues subsequent READ commands, the data sought may already be contained in the high speed solid-state memory and can be supplied to the host more or less instantaneously. However, if all data sets surrounding records accessed by the host are read into a solid-state memory external to the host as described above, the problem of queueing is not entirely eliminated, as then the channel and director usage time consumed while data is read into cache memory is added to the actual latency time required for the data set to be located on the disk and juxtaposed to the head.
Moreover, it will be appreciated that there are generally two ways in which data is accessed by a host computer. All the data in a given data set may be called for by the host at a given time, or the host may initiate a separate call for each portion of the data set as required. In the first case, addition of the cache memory to the system adds no performance improvement, as but a single latency time is required to satisfy each input/output request. In the second case, wherein each individual host instruction is part of a sequence of instructions typically directed to access successive portions of a physical record such as a tape or disk drive, latency time is consumed in responding to each portion of the data set. In this situation, the total latency time can be reduced to that of a single access operation if successive portions of the data set are read into a high speed solid-state cache. Subsequent requests for other portions of the data set can then be satisfied directly from solid-state memory without involving second and successive physical access operations. That is, if the data is cached in anticipation of a subsequent SEEK command, it will be available immediately. Accordingly, it is desirable that means be provided for determining which data requests made by a host computer are likely to be part of a sequence of such requests.
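The distinction between the two access patterns can be illustrated with a minimal arithmetic sketch. It is not taken from the disclosure: the function name `total_access_time`, its parameters, and the timing figures are hypothetical, and the model is idealized in that it ignores the channel and director time consumed by staging itself.

```python
def total_access_time(portions, latency_ms, transfer_ms, cached):
    """Estimate the time to satisfy `portions` successive requests for one data set.

    Hypothetical, idealized model:
      - every physical access pays one positioning latency (latency_ms);
      - every portion pays one transfer time (transfer_ms), cached or not;
      - with caching, only the first request causes a physical access, and
        the remaining portions are assumed to be staged already.
    """
    if portions <= 0:
        return 0.0
    physical_accesses = 1 if cached else portions
    return physical_accesses * latency_ms + portions * transfer_ms


if __name__ == "__main__":
    # Illustrative figures only: 25 ms positioning latency, 2 ms transfer per portion.
    # Second case: eight separate requests for successive portions.
    print(total_access_time(8, 25.0, 2.0, cached=False))  # 216.0
    print(total_access_time(8, 25.0, 2.0, cached=True))   # 41.0
    # First case: the whole data set is called for at once, so caching gains nothing.
    print(total_access_time(1, 25.0, 2.0, cached=False))  # 27.0
    print(total_access_time(1, 25.0, 2.0, cached=True))   # 27.0
```

Under these assumed figures the sequential case drops from eight latency periods to one, while the single-request case is unchanged, which is why it matters to recognize which requests belong to a sequence.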
It would not, of course, be impossible for the host computer to issue a signal indicating whether or not a particular data set called for is part of a sequence of such sets, and some systems now being announced will have this feature. This would, of course, simplify the decision as to whether or not to "stage" the subsequent record from the long-term data storage means into a cache memory. However, many existing computing systems of commercial importance (such as most of the IBM Corporation's line of computers) do not provide such a signal. Nor is it desirable to modify these computers, in particular their operating systems, in order to provide such a signal, as such modifications are difficult to implement correctly and are not popular with computer users.
Accordingly, it is desirable to render the caching of data function more efficient by using improved means and methods to determine whether a particular data request made by a host computer is part of a sequence of requests directed to the same data set (in which event the subsequent portion of the data set would be cached) while data which is not amenable to efficient caching is processed in the same manner as in the prior art.
It is a further object of the invention to provide a system in which sequential portions of a data set can be cached so as to improve throughput of a host computer system, without requiring modification to the host.
Yet another object of the invention is to provide a means and method for detecting whether or not a data record sought is part of a sequence of such records, wherein the means and method operates using information contained within the "channel program" processed by the storage director, whereby implementation of the method of the invention is rendered simple and relatively inexpensive.
A further object of the invention is to provide a method whereby an area in the cache assigned to a particular data set can be deallocated therefrom automatically so as to free storage space for reuse.