The present invention relates generally to a method and apparatus for managing access to shared data in data processing networks.
Many of today's mid to high-end computer systems (for example network servers and workstations) include mass storage devices configured as a redundant array in order to provide fast access to data stored on the devices and also to provide for data backup in the event of a device failure. These arrays are commonly made up of a number of magnetic disk storage devices, which are held in an enclosure and connected to the host system by an array controller unit which may take the form of either an array adapter located within the main processing unit of the computer system or alternatively a standalone array controller connected to the main processing unit. The interface between the main processing unit and the array often takes the form of one of the popular industry-standard protocols such as SCSI (Small Computer System Interface) or SSA (Serial Storage Architecture). In the following, the term 'controller' will be used to encompass both array adapters and outboard controllers.
Storage arrays of this type are commonly arranged according to one or more of the five architectures (levels) set out by the RAID Advisory Board. Details of these levels can be found in various documentation including in the 'RAID book' (ISBN 1-57398-028-5) published by the RAID Advisory Board. Three of these architectures (RAID levels 3, 4 and 5) are known as parity RAID because they all share a common data protection mechanism. Two of the parity RAID levels (4 and 5) are independent access parity schemes wherein a data stripe is made up of a number of data strips or blocks and a parity strip. Each data strip is stored on one member disk of the array. In RAID level 4, the parity strips are all stored on one member of the array. In RAID level 5, the parity strips are distributed across the member disks. In contrast with the parallel access schemes, an application I/O request in an independent access array may require access to only one member disk.
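By way of illustration only, the parity mechanism common to these levels can be sketched as a byte-wise exclusive-OR over the data strips of a stripe. The function names, the choice of the last member disk for RAID level 4 parity, and the rotation order used for RAID level 5 are assumptions made for this sketch and are not taken from any particular array implementation.

```python
from functools import reduce


def parity_strip(strips):
    """XOR equal-length strips byte by byte to form the parity strip."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*strips))


def rebuild_strip(surviving_strips, parity):
    """Recover a single lost data strip from the survivors plus the parity."""
    return parity_strip(list(surviving_strips) + [parity])


def parity_disk(stripe_index, num_disks, level):
    """Return the member disk assumed to hold the parity strip of a stripe.

    RAID 4: one fixed disk (assumed here to be the last member).
    RAID 5: parity rotates across members (rotation order is an assumption).
    """
    if level == 4:
        return num_disks - 1
    if level == 5:
        return (num_disks - 1 - stripe_index) % num_disks
    raise ValueError("only RAID levels 4 and 5 are modelled in this sketch")


strips = [b"\x01\x02", b"\x04\x08", b"\x10\x20"]
parity = parity_strip(strips)
# Losing the second strip: it can be regenerated from the others and the parity.
recovered = rebuild_strip([strips[0], strips[2]], parity)
```

Because the parity is a plain exclusive-OR, any one missing strip of a stripe can be regenerated from the remaining strips, which is why an interrupted write that leaves data and parity inconsistent is a concern for the schemes discussed below.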
To further improve reliability of the overall storage system, it is known to employ redundant controllers to provide ongoing access to data in the event one of the controllers fails. In some such architectures, one of the controllers is provided to control data transfer to and from the array of disks and the other controller is provided only as back-up in the event of failure of the main controller. Such an arrangement is wasteful of processing power.
In another, more sophisticated system, described in WO98/28685, a RAID storage subsystem comprises a plurality of controllers connected for communication with an array of disk storage devices. One of the plurality of controllers is designated as a primary controller with respect to a particular RAID logical unit (LUN) of the RAID subsystem. The primary controller is responsible for fairly sharing access to the common disk drives of the LUN among all the requesting RAID controllers. A controller desiring access to the shared disk drives of the LUN sends a message to the primary controller requesting an exclusive temporary lock of the relevant stripes of the LUN. The primary controller returns a grant of the requested lock when such exclusivity is permissible. The requesting controller then performs any required I/O operations to the shared devices and transmits a lock release to the primary controller when the operations have completed. In the event of a failure of a primary controller, another controller is described as being assigned to take its place.
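The request, grant and release exchange outlined above may be pictured with the following minimal sketch. It is not the implementation of WO98/28685; the class name, method names and the simple refuse-on-overlap policy are hypothetical and serve only to show that every lock decision passes through a single controller.

```python
class PrimaryController:
    """Hypothetical central lock holder for the stripes of one LUN."""

    def __init__(self):
        # stripe number -> name of the controller currently holding the lock
        self._locks = {}

    def request_lock(self, requester, stripes):
        """Grant an exclusive lock only if none of the stripes is locked."""
        if any(stripe in self._locks for stripe in stripes):
            return False
        for stripe in stripes:
            self._locks[stripe] = requester
        return True

    def release_lock(self, requester, stripes):
        """Release the stripes, ignoring any not held by the requester."""
        for stripe in stripes:
            if self._locks.get(stripe) == requester:
                del self._locks[stripe]


primary = PrimaryController()
first_grant = primary.request_lock("B", {1, 2})    # B obtains stripes 1 and 2
second_grant = primary.request_lock("C", {2, 3})   # refused: stripe 2 is held
primary.release_lock("B", {1, 2})
third_grant = primary.request_lock("C", {2, 3})    # now permissible
```

In this sketch the lock table exists only in the primary controller, so its state is unavailable to the other controllers if the primary fails.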
The designation of one of the controllers as a primary controller has the advantage of providing a centralized mechanism for permitting exclusive access to the array. However, such an arrangement has the disadvantage that it places an extra burden on the processing power of the primary controller, which may result in an imbalance in the work done by each controller. In addition, in the event of a failure of the primary controller, the work required to pass control to a replacement primary controller could be significant. Furthermore, if the primary controller were to fail while executing an I/O operation, then none of the other controllers would have knowledge of the state of that operation. In this situation, the only recourse would be to rebuild the entire array, which could lead to data loss.
Thus in data storage arrays in which a plurality of controllers are connected to a common shared array of storage devices, it would be desirable to provide a mechanism for allowing shared access to the devices which avoids some or all of the aforementioned disadvantages.
In order to address the above described deficiencies in the prior art, it is an object of the present invention to provide a mechanism that allows for shared access to storage devices by coordinating exclusive access to the storage devices.
According to a first aspect of the invention therefore, there is provided, in a data processing network including a plurality of array controllers connected for communication to each other and a plurality of data storage devices, a method for coordinating exclusive access by the plurality of controllers to a shared data region on said plurality of data storage devices. The method comprises the steps of: for each write operation to the shared data region, sending an exclusive access request from a controller desiring such access to all controllers having access to the shared data region; and, as part of granting exclusive access to the requesting controller, storing a non-volatile record of the write operation in each of the plurality of controllers having access to the shared data region.
When viewed from a second aspect the method comprises the steps of: at a first array controller, broadcasting an exclusive access request to all other array controllers having access to the shared data region and storing a non-volatile record of the write operation in the first controller; and at each controller receiving the exclusive access request, storing a non-volatile record of the write operation prior to sending an exclusive access grant to the first controller.
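The ordering of steps in the second aspect may be illustrated by the following minimal sketch, in which an in-memory dictionary stands in for the non-volatile store of each controller. The class and method names are hypothetical, arbitration between competing requests is omitted, and the clearing of records once the write completes is an assumption of the sketch rather than a step recited above.

```python
class ArrayController:
    """Hypothetical peer controller; nv_record stands in for non-volatile memory."""

    def __init__(self, name):
        self.name = name
        self.peers = []        # other controllers with access to the shared region
        self.nv_record = {}    # region -> name of the controller writing to it

    def handle_request(self, requester, region):
        # The record is stored BEFORE the grant is returned to the requester.
        self.nv_record[region] = requester
        return True

    def handle_release(self, region):
        # Assumption: the record is discarded once the write has completed.
        self.nv_record.pop(region, None)

    def write(self, region, do_write):
        # The requesting controller records the operation itself ...
        self.nv_record[region] = self.name
        # ... and broadcasts the exclusive access request to every peer.
        grants = [peer.handle_request(self.name, region) for peer in self.peers]
        if not all(grants):
            raise RuntimeError("exclusive access not granted")
        do_write(region)
        self.handle_release(region)
        for peer in self.peers:
            peer.handle_release(region)


def connect(controllers):
    """Make every controller a peer of every other controller."""
    for controller in controllers:
        controller.peers = [c for c in controllers if c is not controller]


a, b, c = ArrayController("A"), ArrayController("B"), ArrayController("C")
connect([a, b, c])

written = []
a.write(7, written.append)     # normal case: records are cleared afterwards


def failing_write(region):
    raise IOError("controller failed during the write")


try:
    a.write(9, failing_write)  # simulated failure part-way through a write
except IOError:
    pass
```

After the simulated failure, controllers B and C each still hold a record naming region 9, so the surviving controllers know which region may be inconsistent without any one of them having been designated in advance.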
Thus, in contrast with the technique disclosed in WO98/28685, a method for coordinating access is provided which does not involve assigning one of the controllers as a primary controller and sending all exclusive access requests to the primary controller. In accordance with the present invention, each controller is a peer and equivalent to all other controllers and, if it requires exclusive access, it sends a request to all controllers on the network. Those controllers which share access to the affected region then grant access to the requesting controller. Furthermore, each controller makes a non-volatile record of the operation (e.g. a parity-in-doubt record in the case of RAID 5 I/O), which allows the regeneration of parity data in the event that there is a power failure to the adapters.
Furthermore, the present invention avoids the aforementioned problem inherent in the technique of WO98/28685 when a RAID 5 I/O is performed by the primary controller. In this prior art, the primary controller does not communicate details of the I/O to another controller. Thus if the primary controller should fail, no other controller has knowledge of what I/O the old primary was performing. In the present invention, there is no primary controller and information on the RAID 5 I/O operations on each controller is communicated to the other controllers.
A preferred embodiment of the present invention will now be described, by way of example only, with reference to the accompanying drawings.