1. Field of the Invention
This invention relates generally to caching within a data storage subsystem and in particular to controller element(s) used as intelligent central cache apparatus within multiple redundant controller data storage subsystems.
2. Discussion of Related Art
Modern mass storage subsystems are continuing to provide increasing storage capacities to fulfill user demands from host computer system applications. Due to this critical reliance on large capacity mass storage, demands for enhanced reliability are also high. Various storage device configurations and geometries are commonly applied to meet the demands for higher storage capacity while maintaining or enhancing reliability of the mass storage subsystems.
A popular solution to these mass storage demands for increased capacity and reliability is the use of multiple smaller storage modules configured in geometries that permit redundancy of stored data to assure data integrity in case of various failures. In many such redundant subsystems, recovery from many common failures can be automated within the storage subsystem itself due to the use of data redundancy, error codes, and so-called xe2x80x9chot sparesxe2x80x9d (extra storage modules which may be activated to replace a failed, previously active storage module). These subsystems are typically referred to as redundant arrays of inexpensive (or independent) disks (or more commonly referred to by the acronym RAID). The 1987 publication by David A. Patterson, et al., from University of California at Berkeley entitled A Case for Redundant Arrays of Inexpensive Disks (RAID), reviews the fundamental concepts of RAID technology.
RAID storage subsystems typically utilize a control module that shields the user or host system from the details of managing the redundant array. The controller makes the subsystem appear to the host computer as a single, highly reliable, high capacity disk drive. In fact, the RAID controller may distribute the host computer system supplied data across a plurality of the small independent drives with redundancy and error checking information so as to improve subsystem reliability.
In some RAID configurations a portion of data is distributed across a plurality of data disk drives and associated redundancy information is added on an additional drive (often referred to as a parity drive when XOR parity is used for the redundancy information). In such configurations, the related data so distributed across plurality of drives is often referred to as a stripe. In most RAID architectures, the xe2x80x9cwritexe2x80x9d operation involves both a write of the data to the data disk and also an adjustment of parity information. The parity information adjustment may involve the reading of other data in the same stripe and writing of the newly computed parity for the blocks of the stripe. This imposes a large xe2x80x9cwrite penaltyxe2x80x9d upon RAID systems (RAID levels 3-6), often making them slower than traditional disk systems in the typical write I/O operation.
Known RAID subsystems provide cache memory structures to further improve the performance of the RAID subsystem write operations. The cache memory is associated with the control module such that the storage blocks on the disk array are mapped to blocks in the cache. This mapping is also transparent to the host system. The host system simply requests blocks of data to be read or written and the RAID controller manipulates the disk array and cache memory as required.
It is taught in co-pending U.S. patent application Ser. No. 08/772,614 to provide redundant control modules sharing access to common storage modules to improve subsystem performance while reducing the failure rate of the subsystem due to control electronics failures. In such redundant architectures as taught by co-pending U.S. patent application Ser. No. 08/772,614, a plurality of control modules are configured such that they control the same physical array of disk drives. As taught by prior designs, a cache memory module is associated with each of the redundant control modules. Each controller will use its cache during control of the data storage volume which it accesses.
In this configuration, the controllers gain the advantage of being able to simultaneously handle multiple read and write requests directed to the same volume of data storage. However, since the control modules may access the same data, the control modules must communicate with one another to assure that the cache modules are synchronized. Other communications among the cooperating controllers are used to coordinate concurrent access to the common resources. Semaphore locking and related multi-tasking techniques are often utilized for this purpose. The control modules therefore communicate among themselves to maintain synchronization of their respective, independent cache memories. Since many cache operations require the controllers to generate these synchronization signals and messages or semaphore locking and releasing messages, the amount of traffic (also referred to as coordination traffic or cache coordination traffic) generated can be substantial. This coordination traffic imposes a continuing penalty upon the operation of the data storage subsystem by utilizing valuable bandwidth on the interconnection bus as well as processing overhead within the multiple control modules. If not for this overhead imposed by coordination traffic, the data storage subsystem would have more bandwidth and processing power available for I/O processing and would thus operate faster.
In such a configuration wherein each control module has its own independent cache memory (also referred to herein as decentralized cache), there is significant duplication of the circuits and memory that comprise the cache memory on each control module. This duplication increases the complexity (and therefore the cost of manufacture) of the individual control modules. A decentralized cache architecture subsystem is scaled up by addition of control modules, each with its own duplicated cache memory circuits. This added complexity (and associated costs) therefore makes simple scaling of performance problematic.
In view of the above it is clear that a need exists for an improved cache architecture for redundant control module data storage subsystems which improves data storage subsystem performance and scalability while reducing duplication and complexity of known designs.
The present invention solves the above and other problems, and thereby advances the useful arts, by providing an intelligent central cache shared among a plurality of storage controllers in a storage subsystem. An intelligent central cache is a cache cooperatively engaged with the control modules (storage controllers) to provide caching within the storage subsystem. Various functions are performed within the intelligent central cache including storage, generation, and maintenance of cache meta-data, stripe lock functions to enable coordinated sharing of the central cache features, and functions to coordinate cache flush operations among the plurality of attached control modules.
By contrast, a xe2x80x9cdumbxe2x80x9d (unintelligent) cache, though it may be a centralized resource, is one used merely as a memory bank, typically for myriad purposes within the data storage subsystem. The intelligent cache of the present invention shares with the attached controllers much of the control logic and processing for determining, for example, when, whether, and how to cache data and meta-data in the cache memory. Cache meta-data includes information regarding the type of data stored in the cache including indications that corresponding data is clean or dirty, current or old data, and redundancy (e.g., RAID parity) data or user related data. The intelligent central cache of the present invention generates, stores, and utilizes cache meta-data for making such determinations relating to the operation of the central cache independently of and/or cooperatively with the storage controllers of the subsystem. Furthermore, the intelligent central cache of the present invention coordinates the management of non-volatility in the cache memory by coordinating with the control modules the monitoring of battery backup status, etc.
The features of the central cache are made accessed by the plurality of controllers through an application program interface (API) via inter-process communication techniques. In particular, the control modules may request, via an API function, that information be inserted or deleted from the cache. Attributes are provided by the requesting controller to identify the type of data to be inserted (e.g., clean or dirty, new or old, user data or parity, etc.). Other API functions are used to request that the central controller read or return identified data to a requesting controller. Attribute data may also be so retrieved. API functions of the intelligent central cache also assist the controllers in performing cache flush operations (such as required in write-back cache management operations). An API function requests of the central cache a map identifying the status of data blocks in particular identified stripes. The requesting control module may then use this map information to determine which data blocks in the identified stripes are to be flushed to disk. Other API functions allow the central cache to perform cache flush operations independent of requests from the attached control modules. Still other API functions provide the low level stripe lock (semaphore management) functions required to coordinate the shared access by control modules to the central cache. Details of exemplary API operations are discussed below.
The preferred embodiment of the present invention includes a plurality of control modules interconnected by redundant serial communication media such as redundant Fibre Channel Arbitrated Loops (xe2x80x9cFC-ALxe2x80x9d). The disk array control modules share access to an intelligent central cache memory (also referred to herein as a caching controller or cache control module). The caching controller is cooperatively engaged with the control modules in the data storage subsystem (also referred to herein as controllers or as host adapters to indicate their primary function within the storage subsystem) to provide intelligent management of the cache. The controllers access the caching controller to perform required caching operations relating to an I/O request processed within the controller.
This centralized cache architecture obviates the need to exchange substantial volumes of information between control modules to maintain consistency between their individual caches and to coordinate their shared access to common storage elements, as is taught by co-pending U.S. patent application Ser. No. 08/772,614. Eliminating coordination traffic within the storage subsystem frees the processing power of the several controllers for use in processing of I/O requests. Further, the reduced bandwidth utilization of the interconnecting bus (e.g., FC-AL) allows the previously consumed bandwidth to be used for data storage purposes other than mere overhead communication.
The I/O request processing power in a storage subsystem in accordance with the present invention is easily scaled as compared to known systems. In the preferred ordinary control module (host adapter) in the subsystem. The caching controller is simply populated with significant cache memory as compared to the other controllers (host adapters) which are substantially depopulated of cache memory. One skilled in the art will recognize that a limited amount of memory on each host adapter may be used for staging or buffering in communication with the central cache. Or for example, a multi-tiered cache structure may utilize a small cache on each controller but the large cache is centralized in accordance with the present invention. The controllers of the present invention are therefore simplified as compared to those of prior decentralized cache designs wherein each controller has local cache memory. Additional controllers may be added to the subsystem of the present invention to thereby increase I/O processing capability without the added complexity (cost) of duplicative cache memory.
In addition, the central cache controller of the present invention, perse, may be easily scaled to meet the needs of a particular application. First, an additional cache controller is added in the preferred embodiment to provide redundancy for the centralized cache of the subsystem. The redundant cache controllers communicate via a separate communication link (e.g., an FC-AL link) to maintain mirrored cache synchronization. Secondly, additional cache controllers may be added to the subsystem of the present invention for purposes of enlarging the central cache capacity. The additional cache controllers cooperate and communicate via the separate communication link isolated to the cache controllers. A first cache controller may perform cache operations for a first segment of the cache (mapped to a particular portion of the disk array) while other cache controllers process other segments of the cache (mapped to other portions of the disk array). Mirrored cache controllers may be added to the subsystem associated with each of the segment cache controllers.
It is therefore an object of the present invention to improve data storage subsystem performance in a data storage subsystem having a plurality of controllers.
It is another object of the present invention to improve data storage subsystem performance by providing an intelligent central cache within the data storage subsystem.
It is still another object of the present invention to improve performance in a data storage subsystem having a plurality of storage controllers by providing an intelligent central cache accessible to the plurality of storage controllers.
It is a further object of the present invention to reduce the complexity of storage controllers in a data storage subsystem having a plurality of such storage controllers by providing an intelligent central cache shared by all such storage controllers.
It is yet a further object of the present invention to improve the scalability of a data storage subsystem having a plurality of storage controllers by obviating the need for local cache memory on each such storage controller and providing an intelligent central cache shared by all such storage controllers in the subsystem.
The above and other objects, aspects, features and advantages of the present invention will become apparent from the following detailed description and the attached drawings.