A storage array or disk array is a data storage device that includes multiple magnetic hard disk drives (HDDs) or similar persistent storage units. A storage array can allow large amounts of data to be stored in an efficient manner. A server or workstation may be directly attached to the storage array such that the storage array is local to the server or workstation. In such cases, the storage array is typically referred to as a direct-attached storage (DAS) system. Alternatively, a server or workstation may be remotely attached to the storage array via a storage area network (SAN). In SAN systems, although the storage array is not local to the server or workstation, the disk drives of the array appear to the operating system (OS) of the server or workstation to be locally attached.
DAS systems and SAN systems are often configured as Redundant Array of Inexpensive (or Independent) Disks (RAID) systems. RAID systems use storage redundancy in order to improve storage reliability and/or in order to improve input/output (I/O) performance. In general, RAID systems simultaneously use two or more magnetic HDDs, typically referred to as physical disk drives (PDs), to achieve greater levels of performance, reliability and/or larger data volume sizes. The phrase “RAID” is generally used to describe computer data storage schemes that divide and replicate data among multiple PDs. In RAID systems, one or more PDs are set up as a RAID virtual disk drive (VD). In a RAID VD, data might be distributed across multiple PDs, but the VD is seen by the user and by the OS of the server or workstation as a single disk.
In a DAS system that is configured as a RAID system, the DAS controller functions as a RAID controller. In such a system, the RAID controller uses a portion of its local memory as cache memory. The cache memory is used for temporarily storing data that is to be written to the PDs. One type of cache memory configuration that is used for this purpose is known as a write-back (WB) cache memory configuration. In WB cache memory configurations, write commands are typically completed as soon as the data is moved into cache memory. In such configurations, maintaining the integrity of the cached data can be a challenge in the event that a failover or failback event occurs, because the data, once cached, is committed to being written to the PDs. Consequently, steps should be taken to ensure that the occurrence of a failover or failback event does not result in the cached data becoming corrupted. Stated another way, the DAS system should provide cache coherency. In order to provide cache coherency, the cached data is typically duplicated in another memory device, as will now be described with reference to FIGS. 1-3.
FIG. 1 illustrates a block diagram of a typical DAS system 2 that implements RAID technology. The system 2 includes a server 3, a RAID controller 4, and a peripheral component interconnect (PCI) bus 5. The RAID controller 4 includes a central processing unit (CPU) 6, a memory device 7, and an I/O interface device 8. A portion of the storage space of memory device 7 is used as cache memory. Alternatively, the RAID controller 4 may include a separate memory device (not shown) for use as cache memory. The I/O interface device 8 is configured to perform data transfer in compliance with known data transfer protocol standards, such as the Serial Attached SCSI (SAS) and/or the Serial Advanced Technology Attachment (SATA) standards. The I/O interface device 8 controls the transfer of data to and from multiple PDs 9. The RAID controller 4 communicates via the PCI bus 5 with a server CPU 11 and a server memory device 12. The server memory device 12 stores data and software programs for execution by the server CPU 11.
During a typical write operation, the server CPU 11 sends a write request via the PCI bus 5 to the RAID controller 4. The CPU 6 of the RAID controller 4 causes the data to be temporarily stored in cache memory in the memory device 7 of the RAID controller 4. The data is subsequently transferred from the memory device 7 via the I/O interface device 8 to one or more of the PDs 9. The memory device 7 contains the core logic for performing the mapping between virtual addresses of the RAID VD and physical addresses of the PDs 9. The CPU 6 of the RAID controller 4 performs calculations in accordance with the RAID level of the system 2, such as parity calculations. In the event that the current RAID level of the system 2 uses parity, the I/O interface device 8 causes the parity bits to be stored in one or more of the PDs 9.
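By way of illustration only, the parity calculation mentioned above can be sketched as follows. The sketch assumes a single byte-wise XOR parity block per stripe, as is commonly used in RAID level 5; the function names `xor_parity` and `rebuild_missing` are hypothetical and do not describe the firmware of any particular RAID controller.

```python
def xor_parity(blocks):
    """Return the byte-wise XOR of a list of equally sized data blocks."""
    if not blocks:
        raise ValueError("at least one block is required")
    size = len(blocks[0])
    if any(len(block) != size for block in blocks):
        raise ValueError("all blocks in a stripe must have the same size")
    parity = bytearray(size)
    for block in blocks:
        for i, byte in enumerate(block):
            parity[i] ^= byte
    return bytes(parity)


def rebuild_missing(surviving_blocks, parity):
    """Recover the one missing data block of a stripe.

    XOR-ing the parity block with every surviving data block cancels the
    surviving data and leaves the missing block.
    """
    return xor_parity(list(surviving_blocks) + [parity])


if __name__ == "__main__":
    # Hypothetical stripe of three data blocks, one per PD.
    stripe = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
    parity = xor_parity(stripe)
    # Pretend the PD holding stripe[1] has failed.
    recovered = rebuild_missing([stripe[0], stripe[2]], parity)
    print(recovered == stripe[1])
```

Under this assumption, the loss of any single PD in a stripe can be tolerated, because the missing block is recoverable from the remaining data blocks and the parity block.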
During a typical read operation, the server CPU 11 sends a corresponding read request to the RAID controller 4 via the PCI bus 5. The RAID controller CPU 6, with use of the logic held in memory device 7, processes the request and, if the requested data is held in cache memory in the memory device 7, retrieves the requested data from cache memory of the memory device 7. If the requested data is not held in cache memory in the memory device 7, the RAID controller CPU 6 causes the requested data to be retrieved from the PDs 9. The retrieved data is transferred over the PCI bus 5 to the server CPU 11 to satisfy the read request.
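The read path described above may be sketched, purely as an illustrative model, as a cache lookup followed by a fallback to the PDs. The class `RaidControllerModel`, its dictionary-based cache, and the hit and miss counters are assumptions made for the sketch; the model does not place data read from the PDs into the cache, because the description above does not state that this occurs.

```python
class RaidControllerModel:
    """Toy model of the read path of the RAID controller 4 of FIG. 1."""

    def __init__(self, physical_disks):
        # physical_disks: dict mapping an address to the data stored on the PDs.
        self.disks = physical_disks
        # Cache memory held in the controller's memory device.
        self.cache = {}
        self.cache_hits = 0
        self.cache_misses = 0

    def read(self, address):
        """Return data from cache memory if present, otherwise from the PDs."""
        if address in self.cache:
            self.cache_hits += 1
            return self.cache[address]
        self.cache_misses += 1
        return self.disks[address]


if __name__ == "__main__":
    controller = RaidControllerModel({0: b"old-0", 1: b"old-1"})
    # Data for address 1 was written recently and is still held in cache.
    controller.cache[1] = b"new-1"
    print(controller.read(1))  # served from cache memory
    print(controller.read(0))  # served from the PDs
```

The model also shows why the cached copy matters: while newly written data for an address is held only in cache memory, the copy on the PDs is stale, so the cache must be consulted first.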
FIG. 2 illustrates a block diagram of a known shared DAS system 23 that includes a plurality of the RAID controllers 4 shown in FIG. 1 and the array of PDs 9 shown in FIG. 1, which is shared by the RAID controllers 4. In order to provide cache coherency in the shared DAS system 23, the data that is cached in the memory device 7 of one of the RAID controllers 4 is replicated, or mirrored, in the memory device 7 of one of the other RAID controllers 4 such that the RAID controllers 4 are paired in terms of cache mirroring. Replication of the cached data is represented in FIG. 2 by arrows 24. While this type of cache coherency technique is generally effective, if a failover or failback event occurs in both RAID controllers 4 of a given pair, the integrity of the cached data for that mirrored pair is compromised.
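The limitation of paired mirroring can be illustrated with a small model. The sketch below makes two simplifying assumptions: controllers are paired in list order, and a controller affected by a failover or failback event is treated as having lost the contents of its memory device. The helper names `paired_mirror_map` and `surviving_copies` are hypothetical.

```python
def paired_mirror_map(controllers):
    """Pair controllers in list order; each one mirrors to its partner."""
    if len(controllers) % 2 != 0:
        raise ValueError("paired mirroring needs an even number of controllers")
    mirror_map = {}
    for i in range(0, len(controllers), 2):
        first, second = controllers[i], controllers[i + 1]
        mirror_map[first] = [second]
        mirror_map[second] = [first]
    return mirror_map


def surviving_copies(mirror_map, failed):
    """Report, per controller, whether any copy of its cached data survives."""
    failed = set(failed)
    result = {}
    for owner, mirrors in mirror_map.items():
        holders = {owner, *mirrors}
        # The cached data survives if at least one holder has not failed.
        result[owner] = bool(holders - failed)
    return result


if __name__ == "__main__":
    mirrors = paired_mirror_map(["A", "B", "C", "D"])
    print(surviving_copies(mirrors, {"A"}))
    print(surviving_copies(mirrors, {"A", "B"}))
```

In this model, the loss of controller A alone leaves every cache recoverable, whereas the loss of both A and B leaves no surviving copy of the data cached by that pair, even though controllers C and D are unaffected.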
FIG. 3 illustrates a block diagram of the shared DAS system 23 shown in FIG. 2 in which cache coherency is provided by replicating the data cached in the memory device 7 of each of the RAID controllers 4 in the memory devices 7 of each of the other RAID controllers 4. Replication of the cached data is represented in FIG. 3 by arrows 24 and 25. While this type of cache coherency technique is generally effective, the physical implementation of such a technique is extremely complex and utilizes a large amount of bandwidth. In addition, as the system 23 is scaled out and larger numbers of RAID controllers 4 are added to the system 23, the complexity of the system 23 and the amount of bandwidth that is utilized for cache mirroring increase much faster than the number of RAID controllers 4, because each RAID controller 4 mirrors its cached data to every other RAID controller 4. For these reasons, this cache coherency solution is impractical in most cases.
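The scaling difference between the arrangements of FIGS. 2 and 3 can be made concrete by counting mirror copy streams. The sketch below assumes that each controller sends one copy of its cached data to each of its mirror targets, so that paired mirroring needs one stream per controller and full mirroring needs one stream per ordered pair of controllers. The function names are hypothetical, and the counts are a simplified model rather than a bandwidth measurement.

```python
def paired_mirror_streams(controller_count):
    """Copy streams when each controller mirrors to exactly one partner."""
    if controller_count < 2 or controller_count % 2 != 0:
        raise ValueError("paired mirroring needs an even count of at least 2")
    return controller_count


def full_mirror_streams(controller_count):
    """Copy streams when each controller mirrors to every other controller."""
    if controller_count < 2:
        raise ValueError("mirroring needs at least 2 controllers")
    return controller_count * (controller_count - 1)


if __name__ == "__main__":
    for count in (2, 4, 8, 16):
        print(count, paired_mirror_streams(count), full_mirror_streams(count))
```

Under these assumptions, a system of 16 controllers needs 16 copy streams with paired mirroring but 240 with full mirroring, and the memory of every controller must also hold the mirrored cache contents of all 15 others.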
Another solution to the cache coherency problem in a DAS system is to use a write-through (WT) cache configuration instead of a WB cache configuration. In a WT cache configuration, a write command is not completed until the data has been written to the PDs. However, using a WT cache configuration instead of a WB cache configuration generally degrades the I/O performance of the DAS system, and is therefore unsuitable for many storage applications in a competitive market. While the cache coherency problem can easily be dealt with using SAN controllers, such a solution is relatively expensive and, in many cases, prohibitively expensive to implement.
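The trade-off between the two cache configurations can be sketched with a toy model. The class `CachedVolume` and its attributes are hypothetical; the model assumes that a write-through write reaches the PDs before the command completes, whereas a write-back write completes once the data is in cache memory and reaches the PDs only on a later flush.

```python
class CachedVolume:
    """Toy model contrasting write-back and write-through caching."""

    def __init__(self, write_back):
        self.write_back = write_back
        self.cache = {}          # cache memory of the controller
        self.dirty = set()       # addresses cached but not yet on the PDs
        self.disk = {}           # contents of the PDs
        self.disk_writes_before_ack = 0

    def write(self, address, data):
        """Handle one write command and return once it is 'completed'."""
        self.cache[address] = data
        if self.write_back:
            # Complete immediately; the PD write is deferred to flush().
            self.dirty.add(address)
        else:
            # Write-through: the PD write happens before completion.
            self.disk[address] = data
            self.disk_writes_before_ack += 1
        return "completed"

    def flush(self):
        """Write all dirty cached data to the PDs."""
        for address in sorted(self.dirty):
            self.disk[address] = self.cache[address]
        self.dirty.clear()


if __name__ == "__main__":
    wb = CachedVolume(write_back=True)
    wb.write(0, b"x")
    print(0 in wb.disk)   # data is acknowledged but exists only in cache
    wt = CachedVolume(write_back=False)
    wt.write(0, b"x")
    print(0 in wt.disk)   # data is already on the PDs at acknowledgement
```

In the write-back case, acknowledged data exists only in cache memory until the flush, which is the window in which a failover or failback event can compromise it. In the write-through case no such window exists, but every write command waits for the PDs.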
Accordingly, a need exists for a DAS system that adequately protects the integrity of cached data and that overcomes the above-described limitations of known cache coherency solutions used in DAS systems.