Technical Field
This application generally relates to data storage, and more particularly to techniques used in connection with data prefetching operations in a data storage system.
Description of Related Art
Computer systems may include different resources used by one or more host processors. Resources and host processors in a computer system may be interconnected by one or more communication connections. These resources may include, for example, data storage devices such as those included in the data storage systems manufactured by EMC Corporation. These data storage systems may be coupled to one or more host processors and provide storage services to each host processor. Multiple data storage systems from one or more different vendors may be connected and may provide common data storage for one or more host processors in a computer system.
A host processor may perform a variety of data processing tasks and operations using the data storage system. For example, a host processor may perform basic system I/O operations in connection with data requests, such as data read and write operations.
Host processor systems may store and retrieve data using a storage device containing a plurality of host interface units, disk drives, and disk interface units. Such storage devices are provided, for example, by EMC Corporation of Hopkinton, Mass. and disclosed in U.S. Pat. No. 5,206,939 to Yanai et al., U.S. Pat. No. 5,778,394 to Galtzur et al., U.S. Pat. No. 5,845,147 to Vishlitzky et al., and U.S. Pat. No. 5,857,208 to Ofek. The host systems access the storage device through a plurality of channels provided therewith. Host systems provide data and access control information through the channels to the storage device and storage device provides data to the host systems also through the channels. The host systems do not address the disk drives of the storage device directly, but rather, access what appears to the host systems as a plurality of logical disk units, logical devices, or logical volumes (LVs). The logical disk units may or may not correspond to the actual disk drives. Allowing multiple host systems to access the single storage device unit allows the host systems to share data stored therein.
A data storage system may perform data prefetching operations. Data prefetching relates to obtaining data from a device prior to receiving an actual request for the data, such as a request from a host. Data prefetching techniques try to identify or recognize a pattern of I/O requests in a stream in order to try and predict what data will be requested next and prefetch data based on such prediction. One pattern is a sequential I/O stream. Data prefetching techniques may observe received I/O requests to try and identify a sequential I/O stream. A sequential I/O stream may be characterized as a sequence of I/O requests accessing data sequentially from the requester's point of view. A sequential I/O stream involves operating on one data portion, such as a track, immediately after the preceding one or more tracks of data in the stream. By identifying a usage pattern which is a sequential stream in connection with issued I/O requests, data prefetching techniques try and predict what data will be requested next and, accordingly, prefetch the data. For example, a data prefetching technique may observe a number of recently received I/O requests to try and identify a sequential I/O stream. If such a sequence is identified, the data prefetching technique may then obtain the next one or more data portions which are expected in the sequence prior to the data portions actually being requested.
Existing data prefetching implementations may have a problem recognizing sequential I/O streams due to the complexity of data storage configuration with multiple layers of logical device mappings in the data storage system, RAID striping, and the like. Not all of the information needed to recognize a sequential I/O stream may be available to the component in the data storage system performing the recognition and associated prefetch processing. In a data storage system such as by EMC Corporation, a backend disk adapter (DA) or director as included in a disk controller may read and write data to the physical devices. The DA may implement the data prefetching technique and perform processing to recognize a sequential I/O stream. The DA may only have access to information regarding the LV to physical device mappings and may otherwise not have access to information regarding other logical mappings and logical entities, as defined on the data storage system, which may be referenced in a host I/O request. As such, the DA may not be able to properly recognize a sequential I/O stream from the requester's (e.g., host's) point of view in order to trigger any appropriate prefetch processing. As an example, a data storage system may define a metavolume which consists of multiple LVs. The metavolume appears to the host as a single logical device that may be used in connection with the host's I/O requests. A host may issue I/O requests to consecutive tracks of data on the metavolume in which the consecutive tracks span two LVs. The foregoing may be a sequential I/O stream when evaluated across the two LVs in the context of the metavolume. However, the DA may not have knowledge regarding the metavolume and, thus, not recognize the foregoing sequential stream to trigger any appropriate prefetch processing.
Existing data prefetching techniques may also include inefficiencies. For example, in one existing implementation, the DA may maintain a list of information with an entry in the list for each I/O task the DA is servicing. In connection with determining whether to prefetch additional data subsequent to an initial prefetch, the DA may continuously evaluate each entry on the list to determine whether to perform additional prefetching for the associated task. Such polling of the list may be time consuming and reduce the amount of time and data storage system resources available to perform data prefetching.
As such, it may be desirable to utilize an efficient data prefetching technique with improved recognition of sequential patterns of I/O requests.