1. Field of the Invention
This invention relates in general to mass storage systems, and more particularly to a method and apparatus for using cache coherency locking to facilitate on-line volume expansion in a multi-controller storage system.
2. Description of Related Art
Typically, a computer stores data within storage devices such as hard disk drives, floppy drives, tape, compact disk, etc. Modern mass storage subsystems are continuing to provide increasing storage capacities to fulfill user demands from host computer system applications. Due to this critical reliance on large capacity mass storage, demands for enhanced reliability are also high. Various storage device configurations and geometries are commonly applied to meet the demands for higher storage capacity while maintaining or enhancing reliability of the mass storage subsystems. If a large amount of data requires storage, then multiple devices are connected to the computing system and utilized to store the data.
A popular solution to mass storage demands for increased capacity and reliability is the use of multiple smaller storage modules configured in geometries that permit redundancy of stored data to assure data integrity in case of various failures. In many such redundant subsystems, recovery from many common failures can be automated within the storage subsystem itself due to the use of data redundancy, error codes, and so-called “hot spares” (extra storage modules which may be activated to replace a failed, previously active storage module). These subsystems are typically referred to as redundant arrays of inexpensive (or independent) disks (or more commonly by the acronym RAID). The 1987 publication by David A. Patterson, et al., from the University of California at Berkeley, entitled A Case for Redundant Arrays of Inexpensive Disks (RAID), reviews the fundamental concepts of RAID technology.
There are five “levels” of standard geometries defined in the Patterson publication. The simplest array, a RAID 1 system, comprises one or more disks for storing data and a number of additional “mirror” disks for storing copies of the information written to the data disks. The remaining RAID levels, identified as RAID 2, 3, 4 and 5 systems, segment the data into portions for storage across several data disks. One or more additional disks are utilized to store error check or parity information.
A computing system typically does not require knowledge of the number of storage devices that are being utilized to store the data because another device, the storage subsystem controller, is utilized to control the transfer of data between the computing system and the storage devices. The storage subsystem controller and the storage devices together are typically called a storage subsystem, and the computing system is usually called the host because the computing system initiates requests for data from the storage devices. The storage controller directs data traffic from the host system to one or more non-volatile storage devices. The storage controller may or may not have an intermediate cache to stage data between the non-volatile storage device and the host system.
A caching storage controller is a device which is capable of directing the data traffic from a host system to one or more non-volatile storage devices and which uses an intermediate data storage device (a cache memory) to stage data between the non-volatile storage device and the host system. In general, the intermediate storage device includes RAM to allow a quicker access time to the data. Furthermore, it provides a buffer in which exclusive-or (XOR) operations can be completed for RAID 5 operations.
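By way of illustration only, the XOR operation mentioned above can be sketched as follows. This is a minimal sketch, not the method of any particular controller: the function names `xor_parity` and `rebuild_missing` and the sample block contents are hypothetical, and it assumes the blocks of a stripe are of equal length and are already staged in memory.

```python
from functools import reduce


def xor_parity(blocks):
    """Return the byte-wise XOR of a list of equally sized blocks.

    Hypothetical helper: models the parity computation a controller
    could perform on blocks held in its cache buffer.
    """
    if not blocks:
        raise ValueError("at least one block is required")
    size = len(blocks[0])
    if any(len(block) != size for block in blocks):
        raise ValueError("all blocks must be the same length")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks, parity):
    """Recover one lost block by XOR-ing the survivors with the parity block."""
    return xor_parity(list(surviving_blocks) + [parity])


# Illustrative data only: three data blocks of one stripe.
data = [b"\x01\x02\x03", b"\x10\x20\x30", b"\xff\x00\x0f"]
parity = xor_parity(data)

# Pretend the second block was lost and rebuild it from the rest.
recovered = rebuild_missing([data[0], data[2]], parity)
```

Because XOR is its own inverse, the same routine serves both to generate parity and to regenerate a single missing block, which is why a buffer in which the operands can be staged is useful.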
A multi-controller system is defined as a collection of controllers or caching storage controllers which work in a cooperative manner with each other. They provide the ability to recover from a controller failure by allowing multiple paths to a volume set. The volume set is a contiguous range of randomly accessible sectors of data. For practical purposes, the sector numbering typically starts at 0 and goes to N, where N is the total number of sectors available to the host system. A data extent is a range of data within a volume set delineated by a starting sector and an ending sector. The volume set is broken up into a number of data extents which are not required to be of equal size, but which may not overlap. These concepts are used in the discussion of the background and the detailed description of embodiments of the invention, and apply to both.
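The definitions of a data extent and of the non-overlap requirement can be illustrated with a short sketch. The names `DataExtent` and `extents_are_disjoint` are hypothetical and chosen only for this example; the sketch assumes that both the starting and ending sectors are inclusive, which the text above does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataExtent:
    """A range of sectors within a volume set (both bounds assumed inclusive)."""
    start: int
    end: int

    def __post_init__(self):
        if self.start < 0 or self.end < self.start:
            raise ValueError("extent must satisfy 0 <= start <= end")

    def overlaps(self, other):
        """Return True if the two extents share at least one sector."""
        return self.start <= other.end and other.start <= self.end


def extents_are_disjoint(extents):
    """Check that no two extents in the collection overlap.

    After sorting by starting sector it is enough to compare neighbours:
    if every extent ends before the next one starts, no pair can overlap.
    """
    ordered = sorted(extents, key=lambda extent: extent.start)
    return all(not a.overlaps(b) for a, b in zip(ordered, ordered[1:]))


# Illustrative layout only: extents of unequal size covering sectors 0 to 999.
layout = [DataExtent(0, 99), DataExtent(100, 149), DataExtent(150, 999)]
layout_ok = extents_are_disjoint(layout)

# Two extents that both claim sector 100 violate the requirement.
conflict_ok = extents_are_disjoint([DataExtent(0, 100), DataExtent(100, 200)])
```

In this sketch `layout_ok` is true and `conflict_ok` is false, reflecting that extents may differ in size but may not share a sector.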
Existing storage system control methodologies include incidental tasks that operate on user data, e.g., rebuilding volume set data to a spare storage device after a device failure, on-line expansion of a volume set, volume set parity checking, snapshot backup, volume set initialization, etc. Typically, in a dual active controller system, one controller acts as the master and performs the task itself, either locking the affected data to prevent access by the slave or holding the slave controller in reset during the task, while continuing to perform its primary mission of servicing user I/O requests. However, the performance of the system is diminished because of the time the master controller must devote to executing the task.
It is desirable to provide a method and apparatus wherein a volume set of storage devices is able to be expanded without taking the storage devices off line. It is also desirable for the data stored in the storage devices to be continuously accessible by multiple controllers during a volume set expansion. It is further desirable for data being expanded to be accessible to multiple controllers in at least some form during the data expansion process. It is desirable to provide a multi-controller relationship that permits data access to multiple controllers continuously and simultaneously during a volume storage device set expansion.
It can be seen that there is a need for a method and apparatus for using cache coherency locking to facilitate on-line volume expansion in a multi-controller storage system.