The present invention relates to a data storage system. More particularly, the present invention relates to a data storage system using a plurality of controllers together with a plurality of data unit arrays to create N-way data spans.
FIG. 1A depicts a data storage system 100 utilizing a single controller 106 as known in the art. The controller 106 is, for example, similar to the FFX controller architecture made by Mylex™ of Fremont, Calif. This controller 106 provides two disk channels (118 and 128) for connecting with two fibre disk loops (102 and 104) and one host channel 130 for communications with the host system. Having an additional disk channel 104 provides additional physical drive capacity to the data storage system 100. However, the controller 106, even with the additional disk channel 104, is unable to fully utilize the bandwidth provided by the host system 108.
Looking at FIG. 1A from a workload allocation and distribution standpoint for a redundant array of independent disks (RAID) write operation, the workload is as follows. Assuming the host channel 130 and each of the two disk channels have a bandwidth X, the controller 106 can sustain a maximum back end bandwidth of ½(X). This is due to the fact that a host write generates four times the back end traffic in a RAID 5 system. The controller 106 reads old data and old parity to perform a RAID 5 write, requiring two reads across the disk channel, for example 102. The write operation then consists of writing the new parity data and the host write data to a drive, for example 110, requiring two writes across the disk channel 102, thereby resulting in four I/O operations across the disk channel 102. In contrast to a controller having a single disk channel, which can sustain a maximum host bandwidth of ¼(X), the additional disk channel 104 allows the controller to increase the back end bandwidth to ½(X). Even so, the single controller is unable to fully utilize the host channel bandwidth.
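The arithmetic above can be checked with a short sketch. It assumes, as the text does, that every host write costs four equally sized back end I/O operations and that each disk channel has the same bandwidth X. The function name and its parameters are illustrative only and do not come from the described system.

```python
from fractions import Fraction


def max_host_write_bandwidth(disk_channels, channel_bandwidth,
                             host_channel_bandwidth,
                             backend_ios_per_host_write=4):
    """Upper bound on sustained host write bandwidth for one controller.

    Assumption: each host write costs `backend_ios_per_host_write` equally
    sized back end I/Os (read old data, read old parity, write new data,
    write new parity for RAID 5), spread evenly over the disk channels.
    """
    if disk_channels < 1 or backend_ios_per_host_write < 1:
        raise ValueError("counts must be positive")
    total_backend = disk_channels * channel_bandwidth
    backend_limit = total_backend / backend_ios_per_host_write
    # The host channel itself caps the result.
    return min(host_channel_bandwidth, backend_limit)


X = Fraction(1)  # bandwidth of one channel, in arbitrary units
one_channel = max_host_write_bandwidth(1, X, X)   # 1/4 of X
two_channels = max_host_write_bandwidth(2, X, X)  # 1/2 of X
```

Under these assumptions the single-channel case yields ¼(X) and the two-channel case yields ½(X), matching the figures given above, and in both cases the result stays below the host channel bandwidth X.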
Dual active controllers were implemented to circumvent a single point of failure problem that all known single active controllers, for example as depicted in FIG. 1A, exhibit. Dual active controllers are two controllers working together to provide a greater level of fault tolerance. Typically, each controller is connected to the other controller through a special communications channel as a means of detecting whether the alternate controller has malfunctioned or failed. In the event the alternate controller fails or malfunctions, the alternate controller is held in a state that allows it no interactions with the host system, and the surviving controller assumes all of the responsibilities of the alternate controller.
Dual active controllers provide two channels (158 and 160) of communications with the host system and thus provide faster servicing of host data requests. Also, each controller (152 and 154) works together with the other controller to ensure that all cached data requests are safe in the event of a controller failure. The dual active controller architecture 150 depicted in FIG. 1B provides greater fault tolerance by handling the failure of any single controller (152 or 154). In addition, the setup of this type of dual active controller architecture 150 is still very similar to the single controller setup (FIG. 1A) in that each controller works independently and does not distribute the work between the controllers. As a result, these types of configurations do not provide any load balancing.
However, the dual active controller architecture depicted in FIG. 1B suffers from several limitations. One limitation is that a single controller has a captive array of drives, usually the number of drives available in a single drive enclosure. Although enclosures (and the devices they contain) can be daisy chained together to provide more physical drive capacity, this does not address or provide any solution to the problems of controller redundancy or increased processing power. Adding additional disk storage subsystems, which still work independently, provides additional storage, but does not in itself add additional processing or data handling capabilities.
A further limitation associated with the dual active controller architecture 150 depicted in FIG. 1B is its lack of expandability. Traditionally, expandability is accomplished by providing an additional controller and a set of associated drives to a data storage system. The ideal situation would be to expand the capacity of the system drive to include the new physical drives in order to take advantage of the additional processing power provided by the new controller rather than merely using the added controller to support only the added drives.
Therefore, there remains a need to overcome the above-described limitations in the existing art, as well as other limitations, which need is satisfied by the inventive structure and method described hereinafter.
The present invention overcomes the identified problems by providing a data storage system in which multiple controllers are used in an N-way configuration to create N-way data spans. An exemplary embodiment of the data storage system includes a plurality of controllers including at least one master controller in a master/slave N-way controller topology. The master controller is coupled to a host system via a communications loop, and each controller is operatively coupled to one of a plurality of data unit arrays. The plurality of data unit arrays each include a plurality of disk units that are linked together. The linked disk units appear as a continuous logical unit and each data unit array forms a data span, such that the plurality of data unit arrays form N-way data spans. Each controller is adapted to transfer data between the data units and the master controller in response to instructions therefrom based on a data configuration. The data is then transferred between the master controller and the host system. In addition, the master controller is adapted to balance input/output (I/O) requests amongst the plurality of controllers and re-direct an I/O request directed to a failed or malfunctioning controller to an active controller. Together, the plurality of controllers and the plurality of data unit arrays appear as a continuous system drive to the host system.
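The balancing and re-direction behavior attributed to the master controller can be illustrated with a minimal sketch. The class names, the `pending` counter, and the least-loaded selection policy are assumptions made for illustration; the text does not specify how the master controller chooses a target, and the sketch further assumes that the controller receiving a re-directed request can reach the data involved.

```python
from dataclasses import dataclass


@dataclass
class Controller:
    """Hypothetical stand-in for one controller in the N-way topology."""
    name: str
    active: bool = True
    pending: int = 0  # outstanding I/O requests (illustrative metric)


class MasterController:
    """Sketch of a master that balances and re-directs I/O requests."""

    def __init__(self, controllers):
        self.controllers = list(controllers)

    def _least_loaded_active(self):
        candidates = [c for c in self.controllers if c.active]
        if not candidates:
            raise RuntimeError("no active controller available")
        # Assumed policy: pick the active controller with the fewest
        # outstanding requests; ties go to the first one listed.
        return min(candidates, key=lambda c: c.pending)

    def dispatch(self, preferred=None):
        """Route one I/O request and return the controller that received it.

        A request aimed at an active controller goes there directly. A
        request aimed at a failed controller, or at no controller in
        particular, goes to the least-loaded active controller.
        """
        if preferred is not None and preferred.active:
            target = preferred
        else:
            target = self._least_loaded_active()
        target.pending += 1
        return target


c0, c1, c2 = Controller("c0"), Controller("c1"), Controller("c2")
master = MasterController([c0, c1, c2])
first_three = [master.dispatch().name for _ in range(3)]

c1.active = False                        # simulate a controller failure
redirected = master.dispatch(preferred=c1)
```

In this sketch the first three requests are spread across the three controllers, and the request aimed at the failed controller is served by a surviving one instead.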
In a further embodiment, the data storage system includes a plurality of controllers in a peer-to-peer N-way controller topology. A peer-to-peer topology allows any one active controller to take over the functions of a failed or malfunctioning controller without any interruption to the host system or data loss. The plurality of controllers are each coupled to the host system via a communications loop and operatively coupled to each of a plurality of data unit arrays. Any one active controller is adapted to transfer data between the data units and the host system in response to instructions therefrom based on a data configuration. In addition, any one active controller is adapted to balance I/O requests amongst the plurality of controllers and re-direct an I/O request directed to a failed controller to an active controller. Together, the plurality of controllers and the plurality of data unit arrays appear as a continuous system drive to the host system. Alternatively, a master controller is added to the peer-to-peer N-way controller topology to create a hybrid (master/slave and peer-to-peer) controller topology.
Advantages of the invention include automatic copying of the host data to an alternate controller for data protection. In addition, if the spans are set up as a RAID level 0+5 or some other similar configuration, the workload is automatically distributed among the various controllers. (RAID 0+5 refers to a multiple RAID configuration in which data is transferred to the master or any one active controller in a RAID 0 format and written to the data units in a RAID 5 configuration.)
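One way to picture why a RAID 0+5 arrangement spreads the workload is to map logical blocks first to a span (RAID 0 striping across spans) and then to a disk within that span (RAID 5 with rotating parity). The following is a sketch only: the function `locate`, the block-granular striping, and the particular parity rotation are assumptions chosen for illustration and are not taken from the described system.

```python
from collections import Counter


def locate(block, spans, disks_per_span):
    """Map a logical block to (span, stripe, data_disk, parity_disk).

    Assumptions: consecutive blocks alternate across spans (RAID 0), and
    each span uses RAID 5 with parity rotating one disk per stripe,
    starting on the last disk.
    """
    if spans < 1 or disks_per_span < 3:
        raise ValueError("need at least 1 span and 3 disks per span")
    span = block % spans             # RAID 0: choose the span (controller)
    span_block = block // spans      # block index inside that span
    data_disks = disks_per_span - 1  # one disk per stripe holds parity
    stripe = span_block // data_disks
    parity_disk = (disks_per_span - 1 - stripe) % disks_per_span
    idx = span_block % data_disks
    # Skip over the parity disk when placing the data block.
    data_disk = idx if idx < parity_disk else idx + 1
    return span, stripe, data_disk, parity_disk


# Count how many of the first eight blocks land on each of two spans.
per_span = Counter(locate(b, spans=2, disks_per_span=3)[0] for b in range(8))
```

With two spans, consecutive blocks alternate between the spans, so under this assumed layout each controller serves half of a sequential workload.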