1. Field of the Invention
This invention relates to computers and computer complexes, and to operating systems for controlling them. More particularly, this invention describes techniques for improved management of cached data that is sequentially accessed and shared.
2. Background Art
Improving performance by caching data in high speed memory is a common strategy used in many computer systems. In managing caches, two common techniques are page replacement algorithms and prefetching algorithms. Page replacement algorithms are used to eliminate data that is unlikely to be used in favor of data that is more likely to be used in the near future. Prefetching algorithms are used to bring data into the cache when it is likely to be used in the near future.
The Least Recently Used (LRU) algorithm is the cache management page replacement algorithm used in many previous systems. This algorithm assumes that records recently accessed will soon be reaccessed (temporal locality). This assumption is not adequate for sequential access patterns, which exhibit spatial locality, when a particular job reads a particular record only once. In this case, which is frequently found in batch processing, temporal locality within a job does not exist.
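The effect described above can be illustrated with a minimal sketch. It is an illustration only, not the implementation of any particular previous system: the `LRUCache` class, the `run` helper, the cache capacity of 8 and both access traces are hypothetical names and values chosen for the example.

```python
from collections import OrderedDict


class LRUCache:
    """Minimal LRU cache that only tracks which record ids are resident."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()  # oldest entry first, newest last
        self.hits = 0
        self.misses = 0

    def access(self, record_id):
        if record_id in self.pages:
            # Hit: mark the record as most recently used.
            self.pages.move_to_end(record_id)
            self.hits += 1
        else:
            # Miss: evict the least recently used record if the cache is full.
            self.misses += 1
            if len(self.pages) >= self.capacity:
                self.pages.popitem(last=False)
            self.pages[record_id] = True


def run(trace, capacity):
    """Replay a trace of record ids and return (hits, misses)."""
    cache = LRUCache(capacity)
    for record_id in trace:
        cache.access(record_id)
    return cache.hits, cache.misses


# Hypothetical traces: a job that reads 100 records once each, and a job
# that keeps returning to the same 4 records.
sequential_once = list(range(100))
repeated_set = [i % 4 for i in range(100)]

print(run(sequential_once, 8))  # (0, 100)
print(run(repeated_set, 8))     # (96, 4)
```

Under these assumptions, the job that reads each record once never hits in the cache, whereas the job with temporal locality hits on all but its first four accesses.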
When data is accessed sequentially, it may be possible to improve performance by prefetching the data before it is needed. This strategy is common in previous systems. Prefetching means that, in the event of a page fault, multiple physically adjacent records are fetched together with the record for which the fault occurred. Simple prefetching schemes may be ineffective since records are often unnecessarily prefetched. More sophisticated strategies, which use a priori knowledge obtained by analyzing program traces, accept user advice, or dynamically analyze the program reference behavior, can significantly improve performance. Prefetching can improve performance in two ways. First, the I/O (Input/Output) delay, and thus the response time of a job (transaction, query, etc.), can be reduced by caching data prior to the actual access. Second, the I/O overhead for fetching N physically clustered records is usually much smaller than N times the cost of bringing in one record. On the other hand, prefetching of records not actually needed increases the I/O overhead and may displace other pages which are about to be referenced.
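The trade-off can be made concrete with a small sketch of a simple prefetching scheme. It rests on several assumptions that are not taken from any particular system: the `simulate` function, the cost parameters `fixed_cost` and `per_record` (arbitrary units), the prefetch depth of 4 and the two traces are hypothetical, and the cache is left unbounded so that displacement of other pages is not modeled.

```python
def simulate(trace, depth, fixed_cost=10, per_record=1):
    """Replay a trace with fetch-on-fault prefetching of `depth` adjacent records.

    Returns (io_ops, fetched, wasted, cost). The cost model is an assumption:
    each I/O operation pays `fixed_cost` once plus `per_record` for every
    record it transfers.
    """
    cached = set()  # unbounded cache: eviction is deliberately not modeled
    used = set()
    io_ops = 0
    for record_id in trace:
        if record_id not in cached:
            # Page fault: one I/O brings in the record and its neighbors.
            io_ops += 1
            cached.update(range(record_id, record_id + depth))
        used.add(record_id)
    fetched = len(cached)
    wasted = len(cached - used)  # records prefetched but never referenced
    cost = io_ops * fixed_cost + fetched * per_record
    return io_ops, fetched, wasted, cost


sequential = range(100)       # hypothetical sequential job
strided = range(0, 400, 10)   # hypothetical job touching every 10th record

print(simulate(sequential, 1))  # (100, 100, 0, 1100)
print(simulate(sequential, 4))  # (25, 100, 0, 350)
print(simulate(strided, 1))     # (40, 40, 0, 440)
print(simulate(strided, 4))     # (40, 160, 120, 560)
```

With these assumed costs, prefetching cuts the cost of the sequential job from 1100 to 350 units, because the fixed cost is paid once per four records. For the strided job it raises the cost from 440 to 560 units, because three of every four records fetched are never referenced.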