The rapid growth of information service and data processing industries has resulted in a need for data storage systems to efficiently manage and store large amounts of data. Typically, a data storage system that serves this need includes a plurality of high-capacity high-performance disk drives along with some processing circuitry that can store data to and retrieve data from the disk drives. To achieve high-performance when either storing or retrieving data on behalf of computer systems coupled to the data storage system, some data storage systems include an internal cache memory system that serves as an electronic buffer for data being transferred into (i.e. written) or out of (i.e., read from) the disk drives that operate within the data storage system. Since disk drives are highly mechanical in nature, they provide relatively slow access to data as compared to a cache memory system that is fully electronic. As such, a data storage system can use an internal cache memory system as a high-speed storage area for data for which host computer systems require access.
Certain conventional data storage systems are equipped with logic or processing circuitry that attempts to efficiently utilize an internal memory system such as a cache according to various memory management techniques. Efficient use of a cache memory system is generally preferable because a typical cache memory system is limited in size and is often shared for use by all host computer systems that request access to data within the data storage system. For example, if two host computer systems are coupled to the data storage system and are attempting to concurrently read respective streams of data from the data storage system, the data storage system may use a memory management technique to buffer portions of the stream of data into the cache. One conventional memory management technique that attempts to optimally utilize an internal cache memory system in this manner is called “predictive caching.”
Predictive caching can detect a sequence of sequential or related access requests to sequentially located data locations within the data storage system. Upon detecting such a sequence, a predictive caching process in the data storage system can read ahead from a current physical storage location (e.g., from a most recent read operation) and can fill up a portion of the cache memory system with data that the predictive caching process “predicts” will be needed by future read requests. In this manner, predictive caching attempts to use data access pattern recognition techniques such as detecting multiple sequential reads from successively incremental physical disk locations to predict what data will be requested in the future. Based on this prediction, the predictive caching process can then access and cache data before the data storage system actually receives access requests for data.
To illustrate a more specific use of predictive caching, consider a scenario in which a host computer system is performing a sequential file access operation, such as might occur during a backup operation of a large file stored within the data storage system onto a tape. The sequential file access operation operates by reading the entire contents of the file byte by byte from physical data storage locations, starting at the beginning of the file and continuing to the end of the file. Without predictive caching, a data storage system would handle each read operation within the sequential file access operation as a separate and relatively slow access to data within the disk drives in the data storage system. However, a data storage system that uses predictive caching can detect the sequential pattern of data accesses and can read ahead from the current location in the sequence to place data into the cache memory system. Then, for subsequent read operations that occur during the sequential file access operation, the data storage system can attempt to service those read operations faster by using data from the cache memory system instead of having to service each read operation using disk drive processing circuitry that tends to be slower. By detecting access to sequential locations in the disk drives and reading data into the cache memory system before that data is requested, and by servicing future data access requests from the cache using that data, performance of the data storage system using predictive caching is somewhat increased.
Other conventional memory management techniques for use in a data storage system exist as well. For instance, some data storage systems, such as certain data storage systems of the Symmetrix line manufactured by EMC corporation of Hopkinton, Mass., include a cache control software interface which allows an application on a host computer system to externally control caching of data within the internal cache of the data storage system. For example, an application that routinely uses a specific file can include embedded calls to functions that are available within a cache control software interface which can instruct the data storage system to permanently cache the file within its internal cache memory system. In other words, specific cache control commands can be included within the software code of an application program. These commands use a cache control software interface provided by the data storage system to allow the application to specifically control how the data storage system is to perform memory management techniques on behalf of read and write requests (i.e., data access requests) from that application. After operating or executing a cache control call using such a software interface, as the application executes on the host computer system and requests access to the file from the data storage system, the data storage system can provide such access from its internal cache memory system instead of having to service such access requests via access to the disk drives in the data storage system.