Host processor systems may store and retrieve data using one or more data storage systems containing a plurality of host interface units (host adapters), disk data storage devices, and disk interface units (disk adapters), as well as a cache memory. Such data storage systems are provided, for example, by EMC Corporation of Hopkinton, Mass. and disclosed in U.S. Pat. No. 5,206,939 to Yanai et al., U.S. Pat. No. 5,778,394 to Galtzur et al., U.S. Pat. No. 5,845,147 to Vishlitzky et al., and U.S. Pat. No. 5,857,208 to Ofek. The host systems access the data storage systems through a plurality of channels provided therewith. Host systems provide data and access control information through the channels of the data storage systems and the data storage systems provide data to the host systems also through the channels.
The host systems do not address the disk data storage devices of the data storage system directly, but rather, access what appears to the host systems as a plurality of logical volumes. Logical locations within these volumes are mapped into physical locations on the disk data storage devices, but the logical volumes may be larger or smaller than the corresponding disk data storage devices, and may span multiple drives. A single logical location may also be mapped to multiple physical locations, when, for example, data mirroring is desired.
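By way of illustration only, the mapping of logical locations to physical locations may be sketched as follows. The sketch assumes a logical volume formed by concatenating fixed-size disks, with an optional mirror per disk; the names `PhysicalLocation`, `LogicalVolume`, `resolve`, and `mirrors` are hypothetical and are not taken from any of the referenced systems.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhysicalLocation:
    disk_id: int
    block: int


class LogicalVolume:
    """Hypothetical logical volume that concatenates several disks.

    Assumption: every disk contributes the same number of blocks, and a
    disk may optionally have one mirror disk holding the same blocks.
    """

    def __init__(self, disks, blocks_per_disk, mirrors=None):
        self.disks = list(disks)            # disk ids, in volume order
        self.blocks_per_disk = blocks_per_disk
        self.mirrors = dict(mirrors or {})  # primary disk id -> mirror disk id

    def resolve(self, logical_block):
        """Return every physical location that holds the logical block."""
        size = len(self.disks) * self.blocks_per_disk
        if not 0 <= logical_block < size:
            raise IndexError("logical block outside the volume")
        index, offset = divmod(logical_block, self.blocks_per_disk)
        primary = self.disks[index]
        locations = [PhysicalLocation(primary, offset)]
        if primary in self.mirrors:
            # A mirrored block maps to a second physical location.
            locations.append(PhysicalLocation(self.mirrors[primary], offset))
        return locations


# A volume spanning disks 0 and 1, where disk 1 is mirrored on disk 7.
volume = LogicalVolume(disks=[0, 1], blocks_per_disk=100, mirrors={1: 7})
unmirrored = volume.resolve(42)    # one physical location, on disk 0
mirrored = volume.resolve(150)     # two physical locations, on disks 1 and 7
```

In this sketch the host supplies only a logical block number; the choice of disk, and whether a mirror exists, is hidden behind `resolve`.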
Cache memory may be used to store frequently accessed data for rapid access. Typically, it is time-consuming to read or compute data stored in the disk data storage devices. However, once data is stored in the cache memory, future use can be made by accessing the cached copy rather than reading it from the disk data storage device, so that average access time to data may be made lower.
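The effect of the cache memory on average access time may be illustrated with the standard weighted-average estimate. The function below is a minimal sketch; the name `average_access_time` and the timing values in the example are assumptions chosen for illustration and are not measurements of any particular system.

```python
def average_access_time(hit_ratio, cache_time_ms, disk_time_ms):
    """Weighted average of the cache access time and the disk access time.

    hit_ratio is the fraction of requests served from the cache memory.
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * cache_time_ms + (1.0 - hit_ratio) * disk_time_ms


# Hypothetical timings: 0.1 ms for a cache access, 10 ms for a disk access.
no_cache = average_access_time(0.0, 0.1, 10.0)
mostly_cached = average_access_time(0.9, 0.1, 10.0)
```

Under these assumed timings, raising the hit ratio from 0 to 0.9 lowers the estimated average access time by roughly an order of magnitude.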
One technique for expediting read requests involves prefetching data units so that more data units will be available from cache memory rather than from disk storage. Typically, prefetching is implemented by reading data units in blocks in response to one or more requests to read a data unit. Since a request to read a specific data unit increases the likelihood that access to other, related data units will soon be required, the read request for the data unit may trigger a prefetch request to read related data units as well, particularly when a read request results in reading a data unit off-cache rather than from the cache memory.
Prefetching requires a significant number of cache slots to be available in the cache memory. When long sequences of data units are prefetched into the cache memory, other data units typically have to be removed from the cache memory in order to make room for the newly prefetched data units.
One problem with prefetching is that the data units that are prefetched are not necessarily going to be accessed, for example by a host processor. A possibility arises that the host processor will access them because they are adjacent to a data unit that it had required, but it is not a certainty that the host processor will require the prefetched data units.
Prefetching involves retrieving data units that the host may or may not need. At the same time, prefetching involves removing in-cache data units that still have some probability of being accessed. Therefore, prefetching raises the possibility that data units to which the host processor requires access may be replaced by data units to which the host processor does not and never will require access. It is therefore important to remove cached data that is not likely to still be required by the data storage system. Cache pollution is defined as the population of the cache memory with data units that are not required for re-access, for example, by a host processor.
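The trade-off described above may be illustrated with a small simulation. The sketch assumes a least-recently-used replacement policy and sequential prefetching, neither of which is prescribed by the foregoing description; the function name `simulate` and its parameters are hypothetical.

```python
from collections import OrderedDict


def simulate(requests, capacity, prefetch_depth):
    """Return (misses, polluted_slots) for a sequence of host read requests.

    Assumptions: LRU replacement, and on each miss the next
    `prefetch_depth` sequential units are prefetched. A slot counts as
    polluted if its unit was prefetched but never requested by the host.
    """
    slots = OrderedDict()   # unit -> True once the host has requested it
    misses = 0
    for unit in requests:
        if unit in slots:
            slots[unit] = True
            slots.move_to_end(unit)         # mark as most recently used
            continue
        misses += 1
        to_load = [(unit, True)]
        to_load += [(unit + k, False) for k in range(1, prefetch_depth + 1)]
        for candidate, requested in to_load:
            if candidate in slots:
                continue
            if len(slots) >= capacity:
                slots.popitem(last=False)   # evict the least recently used
            slots[candidate] = requested
    polluted = sum(1 for used in slots.values() if not used)
    return misses, polluted


# The host reads unit 10, then unit 50, then unit 10 again.
with_prefetch = simulate([10, 50, 10], capacity=4, prefetch_depth=3)
without_prefetch = simulate([10, 50, 10], capacity=4, prefetch_depth=0)
```

In this example the units prefetched alongside unit 50 displace unit 10 from the four-slot cache, so the second read of unit 10 misses and three slots end up holding units the host never requested. Without prefetching, the same request sequence misses only twice and leaves no polluted slots.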
As noted before, a read request for data units that are out-of-cache will take longer to execute than a request for data units that are in-cache. Therefore, it is preferable not to retrieve a given data unit e from its location off-cache if it can be read from an in-cache location. In addition, procedurally, a disk adapter will execute a read request before it completes a prefetch operation. Therefore, the disk adapter will execute the read request for the data unit e before it completes the prefetch operation in which the data unit e would have been retrieved.