1. Field of the Invention
The present invention is directed to cached data storage systems.
2. Description of Related Art
Storage systems including storage devices such as disk drives, tape drives, etc., are used in many different types of computer or data processing systems to store data. Disk drives generally include one or more disks of a recording medium (e.g., a magnetic recording medium or an optical recording medium) on which information can be written for storage purposes, and from which stored information can be read. Large data storage systems may include on the order of one-hundred disk drives, with each disk drive including several disks. One such mass storage system is the SYMMETRIX line of disk arrays available from EMC Corporation of Hopkinton, Mass. The SYMMETRIX line of disk arrays is described in numerous publications from EMC Corporation, including the SYMMETRIX model 55XX product manual, P-N200-810-550, rev. F, February 1996.
In a data storage system, a host data processor typically is able to write data to and read data from particular storage locations in one or more of the data storage devices. To increase system performance, a cache may be interposed between the host data processor and the data storage device(s). In a cached system, when the host data processor writes data to a storage device, the data is stored temporarily in the cache before being destaged to the storage device in a manner that is asynchronous with and transparent to the host. Once the host data processor has written data to the cache, the host data processor can perform other tasks while the data storage system destages the data from the cache to the appropriate storage device(s). Because the host data processor can write data to the cache much faster than to the data storage devices, caching the data increases the data transfer efficiency of the system.
Similarly, in a cached system, when the host data processor reads data from a data storage device, it may actually read the data from the cache after the data has been transferred from the data storage device to the cache. When the host data processor requests a read from a data storage device, if the data is already in the cache, the host data processor can read the data immediately from the cache, increasing the performance of the system in performing such a read. When the data is not already in the cache, the data may first be transferred from the data storage device to the cache before the host data processor reads the data from cache.
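By way of illustration only, the cached write and read behavior described above may be sketched as follows. The class name `WriteBackCache`, its methods, and the use of a dictionary to stand in for a data storage device are assumptions made for this sketch and are not taken from any system described herein.

```python
class WriteBackCache:
    """Toy model of a cache interposed between a host and a storage device."""

    def __init__(self, device):
        self.device = device   # hypothetical device: dict of address -> data
        self.cache = {}        # data currently held in the cache
        self.pending = set()   # addresses written by the host, not yet destaged

    def write(self, addr, data):
        # The host write completes as soon as the data is in the cache.
        self.cache[addr] = data
        self.pending.add(addr)

    def read(self, addr):
        # On a miss, the data is first transferred from the device to the cache.
        if addr not in self.cache:
            self.cache[addr] = self.device[addr]
        return self.cache[addr]

    def destage(self):
        # Performed later, asynchronously with respect to the host.
        for addr in sorted(self.pending):
            self.device[addr] = self.cache[addr]
        self.pending.clear()


device = {0: "old", 1: "other"}
cache = WriteBackCache(device)
cache.write(0, "new")          # device[0] is still "old" at this point
cache.destage()                # device[0] now holds "new"
```

In this sketch the host sees the new data immediately on a subsequent read, while the device is updated only when `destage` runs.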
Data commonly is stored in a data storage system in units called “logical volumes,” and these logical volumes typically are divided into so-called “logical blocks.” Accordingly, the host data processor accesses data in the storage system using a logical volume address (LVA) and a logical block address (LBA). In some intelligent storage systems, a mapping is performed between the LVAs provided by the host and the actual physical locations where the corresponding data is stored. Thus, in such intelligent systems, the actual physical locations at which the logical blocks and logical volumes of data are stored in the data storage devices generally are not visible to the host data processor. That is, the host data processor needs only to specify LVAs and LBAs, and the data storage system controls how the logical volumes of data are mapped to and stored by the data storage devices. Each physical storage device (e.g., a disk drive) in the storage system may store a single logical volume. Alternatively, it is possible in many systems to configure each physical storage device to store two or more logical volumes, or to configure two or more storage devices to store a single logical volume.
FIG. 1 shows an exemplary prior art data storage system 101. As shown, the data storage system 101 includes data flow controllers 104a-b, data storage devices 106a-h, and a memory 102 that is globally accessible to the data flow controllers. The globally accessible memory 102 includes a cache 116 and a directory 108. Each of the data flow controllers 104a-b includes a direct memory access (DMA) machine, a bus interface device, and a processor (e.g., the DMA machine 109, the bus interface device 111, and the processor 107 shown in the data flow controller 104a). Each of the data storage devices 106a-h includes several storage locations (e.g., storage locations 110, 112, and 114 shown in the data storage device 106a). It should be understood that each data storage device 106 typically includes many more storage locations than are shown in FIG. 1. A data storage system such as that shown in FIG. 1 also typically includes many additional data storage devices and data flow controllers to permit large quantities of data to be stored by the system.
Using the exemplary storage system shown in FIG. 1, a host data processor (not shown) can write data to and read data from the data storage devices 106a-h via the cache 116 and the data flow controllers 104a-b. Using buses 103 and 105a-b, the data flow controllers 104a-b can direct the transfer of data between the cache 116 and storage locations (e.g., the storage locations 110, 112 and 114) in the data storage devices 106a-h. 
Data can be transferred between the cache 116 and the data storage devices 106a-h in units of any size. Commonly, however, data is transferred between these devices in logical blocks. A logical block may include, for example, five hundred and twelve bytes of data. Typically, the cache 116 is divided into a number of units called “slots” (not shown), with each slot being divided into several sections. Each section of a slot typically will have storage space for a single logical block of data and will therefore be referred to herein as a block-sized section. Each slot may be divided into a sufficient number of sections to provide storage space for a logical track of data, which may, for example, correspond to the amount of storage space provided by a physical track of a disk drive serving as one of the data storage devices 106a-h. Each slot may, for example, be divided into one hundred and twelve block-sized sections to create storage space for a logical track of data that is one hundred and twelve logical blocks long. Each logical volume stored by the system typically is divided into several logical cylinders, with each logical cylinder being divided into several logical tracks. Each logical cylinder may, for example, correspond to a physical cylinder (described below) of a disk drive serving as one of the data storage devices 106a-h. Before a logical block of data is written to the cache 116, a slot can be dynamically assigned to represent the logical track in which the logical block of data is included, and the logical block can be written to a block-sized section of the slot corresponding to the logical block's location within the logical track.
Each slot in the cache 116 may have a holder associated with it which contains information regarding the current contents of the slot. For example, the holder may contain information identifying: (1) the logical track that the slot is currently assigned to represent, and (2) the particular block-sized section(s) within the slot that contain logical blocks of data that have been written by the host data processor but that have not yet been destaged to one or more of the data storage devices 106a-h, i.e., those block-sized sections that currently contain logical blocks of write-pending data.
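The slot organization and holder contents described above may be sketched, by way of example only, as follows. The sketch uses the example figures given above (512-byte blocks, 112 block-sized sections per slot). The names `locate_block`, `Slot`, and `write_block` are hypothetical, and the sketch assumes that LBAs run consecutively through the logical tracks of a volume, which the description above does not state.

```python
BLOCK_SIZE = 512         # bytes per logical block (example figure from the text)
BLOCKS_PER_TRACK = 112   # block-sized sections per slot (example figure from the text)


def locate_block(lba):
    """Return (logical_track, section_index) for a logical block address.

    Assumption: LBAs run consecutively through the tracks of a volume.
    """
    return divmod(lba, BLOCKS_PER_TRACK)


class Slot:
    """A cache slot together with its holder information (hypothetical layout)."""

    def __init__(self, logical_track):
        self.logical_track = logical_track          # holder item (1)
        self.sections = [None] * BLOCKS_PER_TRACK   # block-sized sections
        self.write_pending = set()                  # holder item (2)

    def write_block(self, lba, data):
        track, section = locate_block(lba)
        if track != self.logical_track:
            raise ValueError("slot is assigned to a different logical track")
        if len(data) != BLOCK_SIZE:
            raise ValueError("data must be exactly one logical block")
        # Store the block at the section matching its position in the track
        # and record that the section now holds write-pending data.
        self.sections[section] = data
        self.write_pending.add(section)
        return section


slot = Slot(logical_track=2)
slot.write_block(230, bytes(BLOCK_SIZE))   # block 230 falls in track 2, section 6
```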
The directory 108 may contain a write-pending flag for each logical track of data stored by the system. For a write operation, after the host data processor (not shown) has transferred a logical block of data to a block-sized section of a slot of the cache 116, the write-pending flag for the logical track that includes that logical block of data can be set in the directory 108 to indicate that data for the logical track is currently stored in the cache 116 and has yet to be destaged to the data storage device 106. The processor in each data flow controller 104 (e.g., the processor 107) can periodically scan the directory 108 for write-pending flags that have been set for logical tracks that are stored by the storage devices 106 serviced by the data flow controller 104. In response to identifying a set write-pending flag for a particular logical track, the processor 107, by examining the holders of the various slots, can identify the slot(s) currently assigned to store those logical block(s) of the logical track that include write-pending data. Additionally, by examining the contents of the holder associated with the identified slot, the processor 107 can identify which block-sized sections of the slot store logical blocks of write-pending data.
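The two-stage lookup described above (directory scan, then holder examination) may be sketched as follows. This is a minimal illustration only: the dictionary-based directory, the holder dictionaries, and the function name `find_write_pending` are assumptions of the sketch rather than structures of the system of FIG. 1.

```python
def find_write_pending(directory, holders):
    """Return {logical_track: sorted write-pending section indexes}.

    directory: dict mapping logical track -> write-pending flag (bool)
    holders:   list of dicts, one per slot, with keys "track" and
               "pending_sections" (a set of section indexes)
    """
    result = {}
    for track, flag_set in directory.items():
        if not flag_set:
            continue                      # no write-pending data for this track
        for holder in holders:
            # Find the slot currently assigned to this logical track.
            if holder["track"] == track and holder["pending_sections"]:
                result[track] = sorted(holder["pending_sections"])
    return result


directory = {0: False, 7: True}
holders = [
    {"track": 7, "pending_sections": {3, 1}},
    {"track": 0, "pending_sections": set()},
]
pending = find_write_pending(directory, holders)   # {7: [1, 3]}
```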
Since the holder for each slot identifies the slot as storing data for a logical track including logical blocks of data (e.g., one hundred and twelve logical blocks) having sequential LBAs, each logical block of data that is written to a slot is stored in the slot according to its LBA. However, when each block of data is stored in one of the data storage devices 106a-h, it is stored according to a physical block address (PBA) which uniquely identifies the physical location in the data storage device at which the block of data is stored. Each LBA of a logical volume may be mapped (by one of the data flow controllers 104a-b) to any PBA(s) of the data storage devices 106a-h, so long as: (1) each LBA is mapped to at least one PBA, and (2) no two LBAs are mapped to the same PBA of the same data storage device.
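The two mapping conditions stated above can be checked mechanically. The following sketch assumes, purely for illustration, that a mapping is represented as a dictionary from each LBA to a list of (device, PBA) pairs; the function name `mapping_is_valid` is hypothetical.

```python
def mapping_is_valid(mapping):
    """Check the two conditions on an LBA-to-PBA mapping.

    mapping: dict of LBA -> list of (device, pba) tuples
    """
    owner = {}   # (device, pba) -> LBA that already claims that location
    for lba, locations in mapping.items():
        if not locations:
            return False                 # condition (1): at least one PBA per LBA
        for location in locations:
            if owner.setdefault(location, lba) != lba:
                return False             # condition (2): PBA shared by two LBAs
    return True


# One LBA may map to several PBAs, for example on two different devices.
ok = mapping_is_valid({0: [("106a", 1)], 1: [("106a", 2), ("106b", 2)]})
# Two LBAs may not map to the same PBA of the same device.
bad = mapping_is_valid({0: [("106a", 1)], 1: [("106a", 1)]})
```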
The cache 116 does not have sufficient storage capacity to store all of the information stored by the data storage devices 106a-h. Once the cache 116 is full, if data included in a logical track for which a cache slot is not currently assigned is to be written to the cache 116, then one of the currently-assigned cache slots needs to be reassigned to store the data for the new logical track. When a slot containing write-pending data is to be reassigned to another logical track, the write-pending data is first destaged to the appropriate storage device(s) 106 to ensure that the data is not lost.
To destage a single block of write-pending data from the cache 116 to the data storage device 106a, the processor 107 programs the DMA machine 109 to access the block-sized section of the cache 116 at which the block of write-pending data is stored, and the DMA machine 109 reads this block of data and makes it available to the bus interface device 111. The processor 107 also provides the bus interface device 111 with the PBA to which the block of data should be written, and instructs the bus interface device 111 to begin an input/output (I/O) operation to destage the block of data to the data storage device 106a. During the I/O operation, the bus interface device 111 provides information to the data storage device 106a indicating the PBA at which the block of data is to be stored, and transfers the block of write-pending data from the DMA machine 109 to the data storage device 106a. 
Storage locations in data storage devices 106a-h that have consecutive PBAs are considered to be “contiguous” storage locations, regardless of the physical arrangement of the storage medium on which the storage locations are disposed. Non-contiguous storage locations in data storage devices 106a-h do not have consecutive PBAs. For example, if the storage locations 110, 112 and 114 of data storage device 106a have PBAs of one, two and three, respectively, then the storage locations 110 and 112 are contiguous, storage locations 112 and 114 are contiguous, and storage locations 110 and 114 are non-contiguous.
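The definition of contiguity given above can be illustrated by grouping a set of PBAs into runs of consecutive addresses. The function name `contiguous_groups` is hypothetical and is used only for this sketch.

```python
def contiguous_groups(pbas):
    """Split PBAs into groups whose members have consecutive addresses."""
    groups = []
    for pba in sorted(set(pbas)):
        if groups and pba == groups[-1][-1] + 1:
            groups[-1].append(pba)   # extends the current contiguous group
        else:
            groups.append([pba])     # starts a new, non-contiguous group
    return groups


contiguous_groups([1, 2, 3])   # one group: locations 110, 112 and 114 together
contiguous_groups([1, 3])      # two groups: locations 110 and 114 are non-contiguous
```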
In the system shown in FIG. 1, when the data flow controller 104a detects that several blocks of data are to be destaged from the cache 116 to contiguous storage locations of the data storage device 106a, the data flow controller 104a may destage these blocks by initiating a single I/O operation. To accomplish this result, the bus interface device 111 indicates to the data storage device 106a: (1) the PBA at which the data storage device 106a should begin storing the several blocks of data, and (2) the total number of blocks of data that will be transferred during the I/O operation. When the SCSI architecture is used to implement the bus 105a, the bus interface device 111 can communicate this information to the data storage device 106a by transmitting a WRITE command (e.g., SCSI operational code “2A”) to the storage device 106a.
For example, if the storage locations 110, 112 and 114 have PBAs of one, two and three, respectively, then the data flow controller 104a may transfer three blocks of data from the cache 116 to the storage locations 110, 112 and 114 during a single I/O operation, as follows. First, the processor 107 can program the DMA machine 109 to make the three blocks of data available (in the proper sequence) to the bus interface device 111. Next, the processor 107 can cause the bus interface device 111 to communicate to the data storage device 106a (e.g., by issuing a WRITE command) that the first block of data being destaged is to be written to the storage location 110, and that a total of three blocks of data will be destaged during the I/O process. Finally, the processor 107 can cause the bus interface device 111 to transfer (in sequence) the three blocks of data to the data storage device 106a. 
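For the three-block example above, the information conveyed by a WRITE command with operational code 2A can be sketched as a ten-byte command descriptor block in the layout of the standard SCSI WRITE(10) command: operation code, a flags byte, a four-byte big-endian starting block address, a group byte, a two-byte transfer length in blocks, and a control byte. The function name `build_write10_cdb` is hypothetical, the flag, group, and control bytes are simply left at zero in this sketch, and the starting address is the value the description above calls the PBA.

```python
import struct

WRITE_10_OPCODE = 0x2A


def build_write10_cdb(start_block, num_blocks):
    """Build a 10-byte SCSI WRITE(10) command descriptor block."""
    if not 0 <= start_block <= 0xFFFFFFFF:
        raise ValueError("starting block address must fit in 32 bits")
    if not 1 <= num_blocks <= 0xFFFF:
        raise ValueError("transfer length must be between 1 and 65535 blocks")
    # Fields: opcode, flags, block address, group, transfer length, control.
    return struct.pack(">BBIBHB", WRITE_10_OPCODE, 0, start_block, 0, num_blocks, 0)


# Start at the block address of storage location 110 and transfer three blocks.
cdb = build_write10_cdb(start_block=1, num_blocks=3)
```

A single command of this form carries only one starting address and one length, which is why it can describe only one group of contiguous storage locations.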
In existing systems, such as that shown in FIG. 1, there are two known methods for destaging data from the cache 116 to non-contiguous groups of storage locations (wherein each group includes one or more contiguous storage locations) in a data storage device 106. Valid data may exist in the storage locations between the non-contiguous groups of storage locations to which data is to be destaged. Each of the known methods ensures that these intermediate storage locations are not overwritten with invalid data. A description of each of these two methods follows as it might be employed by the data flow controller 104a to destage data from the cache 116 to non-contiguous groups of storage locations of the data storage device 106a via a SCSI bus.
According to one of the two known methods, the data flow controller 104a performs a separate search of the directory 108 and initiates a separate SCSI I/O process to destage data to each of several non-contiguous groups of storage locations. Multiple searches of the directory 108 and multiple I/O processes therefore are required to destage the data according to this method. Because the directory 108 can include write-pending flags for a very large number of logical tracks of data (e.g., 61,440 logical tracks per logical volume), this multiple searching can be quite time-consuming. Also, the data flow controller 104a typically must arbitrate for and gain control of the bus 105a prior to performing each I/O process. Therefore, the time taken to destage data to “n” non-contiguous groups of storage locations includes: (1) the time taken to perform “n” searches of the directory 108, (2) the time taken to arbitrate for the bus “n” times, and (3) the time taken to perform “n” I/O processes, each of which transfers data (via the bus 105a) to one group of contiguous storage locations of the data storage device 106a.
To address the performance problems with the above-discussed method of destaging non-contiguous blocks of data, a second method has been developed. The second method involves only a single search of the directory 108, but still requires that the data flow controller 104a arbitrate twice for the bus, and requires two I/O processes to perform the destaging. According to this second known method, the data flow controller 104a first scans the directory 108 and identifies one or more write-pending flags for logical track(s) of data. Next, by scanning the cache slot holders (not shown) in the cache 116, the data flow controller 104a identifies the block-sized sections of one or more slots of the cache 116 at which blocks of write-pending data are stored.
The data flow controller 104a next causes the bus interface device 111 to arbitrate for the bus 105a to establish a first connection with the data storage device 106a, and to initiate a first I/O process during which blocks of data are read (via the bus interface device 111 and the DMA machine 109) from storage locations between the non-contiguous groups of storage locations in the data storage device 106a. The data is read to the slot(s) in the cache 116 in which the write-pending data is stored. Thus, any valid data that is present in the intermediate storage locations between the non-contiguous groups of storage locations is transferred to the block-sized sections of the cache slot(s) between the block-sized sections in which the write-pending data is stored.
The reading of data from these intermediate storage locations to the cache 116 can be accomplished by: (1) reading data from a single group of contiguous storage locations which includes the intermediate locations, as well as the storage locations for which write-pending data exists in the cache 116, and (2) writing only the data read from the intermediate storage locations to the cache 116 so that the write-pending data is not overwritten.
After the first I/O process has completed, the data flow controller 104a causes the bus interface device 111 to arbitrate a second time for control of the bus 105a to establish a second connection with the data storage device 106a. Once control of the bus is obtained, the data flow controller 104a initiates a second I/O process during which several blocks of data are destaged from the cache slot(s) in which the write-pending data is stored to a single group of contiguous storage locations in the data storage device 106a. This group of contiguous storage locations includes not only those non-contiguous storage locations for which write-pending data originally existed in the cache 116, but also the storage locations disposed between them.
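The two I/O processes of this second known method may be sketched as follows. The sketch is a simplified model only: the device is represented as a Python list indexed by PBA, the cache slot as a dictionary keyed by PBA (whereas the slots described above are organized by LBA), and the function name `destage_read_fill` is hypothetical.

```python
def destage_read_fill(device, slot, pending):
    """Destage write-pending blocks using one read and one contiguous write.

    device:  list of blocks indexed by PBA (stand-in for storage device 106a)
    slot:    dict of PBA -> data held in the cache slot
    pending: set of PBAs whose cached data is write-pending
    Returns the number of I/O processes used.
    """
    first, last = min(pending), max(pending)

    # First I/O process: read one contiguous span covering every pending
    # location, but copy only the intermediate blocks into the slot so that
    # the write-pending data is not overwritten.
    span = device[first:last + 1]
    for offset, data in enumerate(span):
        pba = first + offset
        if pba not in pending:
            slot[pba] = data

    # Second I/O process: write the whole span back as one contiguous group.
    device[first:last + 1] = [slot[pba] for pba in range(first, last + 1)]
    return 2


device = ["a", "b", "c", "d", "e"]
slot = {1: "B", 3: "D"}                      # write-pending data for PBAs 1 and 3
destage_read_fill(device, slot, {1, 3})      # device becomes a, B, c, D, e
```

The valid data at PBA 2 is preserved because it is rewritten with its own contents rather than skipped.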
The time taken to destage data to “n” non-contiguous groups of storage locations according to this second method therefore includes: (1) the time taken to perform a single search of the directory 108, (2) the time taken to twice arbitrate for and gain control of the bus 105a, and (3) the time taken to perform two separate I/O processes, i.e., the first I/O process to read the data from the storage locations between the non-contiguous groups of storage locations, and the second I/O process to destage the data from the cache 116 to the single group of contiguous storage locations.
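The time components enumerated for the two known methods can be compared with a simple additive model. The function names and all time values below are hypothetical and in arbitrary units; the sketch further assumes that each search, each arbitration, and each per-group I/O process of the first method takes the same time, which need not hold in practice.

```python
def method_one_time(n, t_search, t_arbitrate, t_io):
    """First known method: n searches, n arbitrations, n I/O processes."""
    return n * (t_search + t_arbitrate + t_io)


def method_two_time(t_search, t_arbitrate, t_read, t_write):
    """Second known method: one search, two arbitrations, a read and a write."""
    return t_search + 2 * t_arbitrate + t_read + t_write


# Illustrative figures only: four non-contiguous groups.
t1 = method_one_time(n=4, t_search=5, t_arbitrate=1, t_io=2)             # 32
t2 = method_two_time(t_search=5, t_arbitrate=1, t_read=3, t_write=3)     # 13
```

Under this model the second method's cost does not grow with the number of searches or arbitrations, although its read and write times depend on the size of the contiguous span being transferred.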
What is needed, therefore, is an improved method and apparatus for destaging data from a cache to two or more non-contiguous storage locations.
According to one aspect of the present invention, a method is disclosed for destaging data from a cache to at least one data storage device in a data storage system having a controller that controls data flow between the cache and the at least one data storage device. The method includes a step of: (a) during a single I/O operation between the controller and the at least one data storage device, destaging data from the cache to at least two non-contiguous storage locations of the at least one data storage device without overwriting at least one storage location disposed between the at least two non-contiguous storage locations.
According to another aspect of the invention, a method is disclosed for destaging data from a cache to at least one data storage device in a data storage system having a controller that controls data flow between the cache and the at least one data storage device via a bus that is shared by at least one system component in addition to the controller and the at least one data storage device. The method includes steps of: (a) establishing a communication link between the controller and the at least one data storage device via the bus; and (b) using the communication link established in step (a) to destage data over the bus from the cache to at least two non-contiguous storage locations of the at least one data storage device without overwriting at least one storage location between the at least two non-contiguous storage locations and without breaking the communication link.
According to another aspect of the invention, a method is disclosed for destaging data from a cache to at least one data storage device in a data storage system, wherein the at least one data storage device includes a plurality of storage locations. The method includes steps of: (a) transmitting the data from the cache to the at least one data storage device; and (b) transmitting information to the at least one data storage device identifying at least two storage locations of the at least one data storage device to which the data is to be written, and further identifying at least one storage location, disposed between the at least two storage locations, to which the data is not to be written.
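One way to picture the information transmitted in step (b) is as a starting address accompanied by a per-location write indicator. This is only one hypothetical encoding, chosen for illustration; the form of the information is not specified in this passage, and the function name `destage_with_mask` and the list-based device model are assumptions of the sketch.

```python
def destage_with_mask(device, start, write_mask, blocks):
    """Apply one transfer that writes some locations and skips others.

    device:     list of blocks indexed by storage location
    start:      first storage location covered by the transfer
    write_mask: list of bools, True where data is to be written,
                False where the existing contents are to be left alone
    blocks:     data for the True positions, in order
    Returns the number of I/O operations used.
    """
    if sum(write_mask) != len(blocks):
        raise ValueError("one data block is required per location to be written")
    data = iter(blocks)
    for offset, do_write in enumerate(write_mask):
        if do_write:
            device[start + offset] = next(data)
        # Locations marked False are skipped, so their contents survive.
    return 1


device = ["a", "b", "c", "d"]
destage_with_mask(device, 1, [True, False, True], ["B", "D"])   # a, B, c, D
```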
According to another aspect of the present invention, a data storage system includes: a cache; at least one data storage device including a plurality of storage locations; at least one communication link coupled between the cache and the at least one data storage device; and means for destaging data from the cache to at least two non-contiguous storage locations of the at least one data storage device during a single I/O operation over the at least one communication link without overwriting at least one storage location disposed between the at least two non-contiguous storage locations.
According to another aspect of the invention, a data storage system includes: a cache; at least one data storage device including a plurality of storage locations; at least one communication link coupled between the cache and the at least one data storage device; means, using the at least one communication link, for transmitting the data from the cache to the at least one data storage device; and means, using the at least one communication link, for transmitting information to the at least one data storage device identifying at least two storage locations of the at least one data storage device to which the data is to be written, and further identifying at least one storage location disposed between the at least two storage locations to which the data is not to be written.
According to yet another aspect of the invention, a data storage system includes: a cache; at least one data storage device; and a data flow controller, coupled between the cache and the at least one data storage device, configured to destage data from the cache to at least two non-contiguous storage locations of the at least one data storage device during a single I/O operation without overwriting at least one storage location disposed between the at least two non-contiguous storage locations.
According to another aspect of the invention, a data storage system includes: a cache; at least one data storage device including a plurality of storage locations; at least one communication link coupled between the cache and the at least one data storage device; and a data flow controller, coupled between the cache and the at least one data storage device, to destage data from the cache to the at least one data storage device and to transmit information to the at least one data storage device identifying at least two of the plurality of storage locations to which the data is to be written, and further identifying at least one of the plurality of storage locations disposed between the at least two of the plurality of storage locations to which the data is not to be written.