1. Field of the Invention
This invention relates generally to memory subsystems for the storage of data in data processing systems. More particularly, the invention relates to an improved system and method for detecting sequentiality in memory accesses.
2. Related Art
Data caching is a well-known technique for reducing delays in memory access due to mechanical limitations of a storage device. For example, in the case of a disk drive, plural disks rotate at a fixed speed past read/write heads which may either be stationary with respect to the disk or move radially back and forth with respect to the disk in order to juxtapose the heads to various portions of the disk surfaces. In either case, there is a finite average time (access time) required for a particular data record to be located and read from the disk. This access time includes the time for a head to move to the correct cylinder (seek time) and the time (rotational latency) required for the disk to rotate with respect to the head until the beginning of the particular record sought is juxtaposed to the head for reading or writing.
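As an illustration of how the two components of access time combine, the following minimal sketch adds an average seek time to an average rotational latency. It assumes, for illustration only, that the target record is equally likely to be anywhere on the track, so that the average latency is half of one revolution. The function name and the figures used are hypothetical and do not describe any particular drive.

```python
def average_access_time_ms(avg_seek_ms: float, rpm: float) -> float:
    """Return average access time = average seek time + average rotational latency.

    Assumption: the record's start position is uniformly distributed around
    the track, so the average latency is half of one full revolution.
    """
    rotation_ms = 60_000.0 / rpm       # duration of one revolution in milliseconds
    avg_latency_ms = rotation_ms / 2.0  # on average, wait half a revolution
    return avg_seek_ms + avg_latency_ms


# Hypothetical figures: 10 ms average seek, 6000 revolutions per minute.
print(average_access_time_ms(10.0, 6000))
```

With these illustrative figures, one revolution takes 10 ms, so the average latency is 5 ms and the average access time is 15 ms. An access satisfied from high speed system memory incurs neither component.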
Cache data storage avoids these inherent delays by storing records from frequently accessed tracks in a high speed system memory (e.g., solid-state RAM). The idea is simply to allow as many memory accesses as possible to immediately retrieve data from the high speed system memory rather than wait for the data to be transferred (or staged) from the slower disk storage device to the high speed system memory. To accomplish this task, data may be staged into the high speed system memory before data access is required (i.e., prestaged).
Clearly, the effectiveness of the cache data storage system is limited by the system's ability to anticipate the needs of future memory accesses and transfer those data records from disk storage to the high speed system memory prior to the memory access. These general considerations are described in Houtekamer et al., "MVS I/O Subsystems: Configuration Management and Performance Analysis", McGraw-Hill, 1993, which is hereby incorporated by reference in its entirety.
If a sequence of memory accesses is random in nature, the cache data storage system cannot anticipate future memory accesses. Accordingly, one method of anticipating future memory accesses is to identify sequential or near sequential memory accesses. Once a sequential or near sequential access is identified, future records/tracks in the sequence can be immediately prestaged into the high speed system memory in advance of future memory accesses.
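The notions of "near sequential" access and of prestaging the tracks that follow can be sketched as below. The text does not define how near is near enough, so the `max_gap` parameter, the function names, and the prestage depth are all assumptions introduced for illustration only.

```python
from typing import List


def is_near_sequential(prev_track: int, track: int, max_gap: int = 2) -> bool:
    """True if `track` equals `prev_track` or follows it by at most `max_gap` tracks.

    Assumption: "near sequential" means a small forward step; the threshold
    `max_gap` is a hypothetical tuning parameter.
    """
    return 0 <= track - prev_track <= max_gap


def tracks_to_prestage(track: int, depth: int) -> List[int]:
    """Return the next `depth` track numbers after `track`, in access order."""
    return list(range(track + 1, track + 1 + depth))


print(is_near_sequential(10, 11))   # a step of one track
print(is_near_sequential(10, 50))   # a large jump, treated as random
print(tracks_to_prestage(11, 3))    # candidate tracks to stage ahead
```

A larger `max_gap` tolerates skipped tracks at the cost of occasionally treating unrelated accesses as part of a sequence.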
To identify sequential access, two methods are generally used. In one method, the system utilizes "hints" provided by the host operating system to go into sequential mode. Examples of disk storage systems which rely on hints from the host operating system include the Storage Technology Corporation Iceberg RAID data storage subsystem and the International Business Machines Corporation Model 3390 data storage subsystem.
Other systems, however, promote software transparency by detecting sequential access independently of the contents of either the memory access request or the data record itself. In other words, no distinguishing mark is evident in either the memory access request or the data record to identify sequentiality. In these systems, the cache data storage system must provide some internal means for effectively distinguishing between sequentially and randomly accessed data records. An example of this type of system is the Symmetrix subsystem available from EMC Corporation, which requires multiple tracks to be accessed sequentially before sequentiality is identified.
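One generic way to detect sequentiality without host hints is to count consecutive track accesses and declare a sequence once the count reaches a threshold. The sketch below illustrates that general idea only; it is not the algorithm of any named product, and the class name, the `threshold` parameter, and its default value are hypothetical.

```python
from typing import Optional


class SequentialDetector:
    """Declare sequential access after `threshold` consecutive tracks are seen.

    Only the track number of each access is examined; no hint from the host
    and no marker in the request or the record is used.
    """

    def __init__(self, threshold: int = 3) -> None:
        self.threshold = threshold
        self.last_track: Optional[int] = None
        self.run_length = 0

    def observe(self, track: int) -> bool:
        """Record one access and return True while the run meets the threshold."""
        if self.last_track is not None and track == self.last_track + 1:
            self.run_length += 1   # the run continues onto the next track
        elif track != self.last_track:
            self.run_length = 1    # a jump starts a new run
        # A repeated access to the same track leaves the run unchanged.
        self.last_track = track
        return self.run_length >= self.threshold


detector = SequentialDetector(threshold=3)
print([detector.observe(t) for t in [7, 8, 9, 40]])
```

In this sketch the first two accesses of any sequence are never recognized as sequential, which is the cost of waiting for multiple tracks before committing cache space to prestaging.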
Generally, any internal sequential detection system involves design tradeoffs that include variations in cache memory size, processing overhead required to administer the sequential detect algorithm, number of tracks staged upon detection of sequentiality, etc. For example, the limited size of cache memory directly influences the number of tracks that may be speculatively staged into the high speed system memory. This follows since, generally, the cache memory is a costly resource that must be managed properly to increase system performance. As a further consequence, the limited size of cache memory increases the number of acceptable "misses" prior to finding sequentiality. In this context, a "miss" is an access to a record that has not been prestaged into the high speed system memory. Clearly, prestaging of tracks should not occur unless there exists a sufficient probability that those staged tracks will be the subject of future memory accesses. Finally, the overhead required to administer the sequential detection system must not become so burdensome that the performance of the cache data storage system itself is compromised.
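The tradeoff between misses incurred before detection and cache space spent on tracks that are never used can be illustrated with a small simulation. This is a simplified sketch under stated assumptions: the cache is unbounded, detection is a plain consecutive-track count, and the names `simulate`, `threshold`, and `depth` are hypothetical. It is not a model of any particular subsystem.

```python
from typing import Iterable, Optional, Tuple


def simulate(trace: Iterable[int], threshold: int, depth: int) -> Tuple[int, int]:
    """Return (misses, prestaged tracks never accessed) for a track trace.

    Assumptions: unbounded cache, no eviction, and `depth` tracks are
    prestaged whenever `threshold` consecutive tracks have been accessed.
    """
    cached = set()   # tracks resident in the cache
    unused = set()   # prestaged tracks that have not been accessed yet
    misses = 0
    last: Optional[int] = None
    run = 0
    for track in trace:
        if track in cached:
            unused.discard(track)   # a prestaged track paid off
        else:
            misses += 1
            cached.add(track)       # demand stage on a miss
        run = run + 1 if last is not None and track == last + 1 else 1
        last = track
        if run >= threshold:
            for t in range(track + 1, track + 1 + depth):
                if t not in cached:
                    cached.add(t)
                    unused.add(t)
    return misses, len(unused)


sequential = range(100, 120)
random_like = [5, 90, 17, 42]
print(simulate(sequential, threshold=3, depth=4))
print(simulate(random_like, threshold=3, depth=4))
print(simulate(random_like, threshold=1, depth=4))
```

On the sequential trace, the number of misses equals the threshold, since every access after detection finds its track already staged. On the random-like trace, a threshold of one stages four unused tracks after every access, while a threshold of three stages none, which is why a lower threshold is affordable only when cache space is plentiful.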
Thus, what is needed is a low overhead, accurate sequential detection system and method that minimizes the number of cache "misses".