1. Field of the Invention
This invention broadly relates to storage systems having at least one controller that manages data transfers between a host computer system and one or more storage devices. In particular, the present invention relates to a storage system whose data transfer bandwidth is scalable independently of its I/O operation execution rate.
2. Description of the Related Art
Auxiliary storage devices, such as magnetic or optical disk arrays, are usually preferred for high-volume data storage. Many modern computer applications, such as high resolution video or graphic displays involving on-demand video servers, may depend heavily on the capacity of the host computer to perform in a data-intensive environment. In other words, the need to store data externally in relatively slower auxiliary data storage devices demands that the host computer system accomplish the requisite data transfers at a rate that does not severely restrict the utility of the application that necessitated the high-volume data transfers. Due to the speed differential between a host processor and an external storage device, a storage controller is almost invariably employed to manage data transfers to/from the host from/to the storage device.
The purpose of a storage controller is to manage the storage for the host processor, leaving the higher speed host processor to perform other tasks during the time the storage controller accomplishes the requested data transfer to/from the external storage. The host generally performs simple data operations such as data reads and data writes. It is the duty of the storage controller to manage storage redundancy, hardware failure recovery, and volume organization for the data in the auxiliary storage. RAID (Redundant Array of Independent Disks) algorithms are often used to manage data storage among a number of disk drives.
FIG. 1 shows a computer system 10 having a conventional storage controller 14 linking the host computer 12 with the external auxiliary storage device 16. The storage device 16 may include more than one disk drive and may also employ different RAID levels to store the data received from the host 12. The connecting links 13 and 15 may employ fibre channels, a SCSI (Small Computer System Interface) interface, an FC-AL (Fibre Channel Arbitrated Loop) interface, a HIPPI (High Performance Parallel Interface) interface, a USB (Universal Serial Bus) interface, an ATM (Asynchronous Transfer Mode) interface, a FireWire (High Performance Serial Bus) interface, an SSA (Serial Storage Architecture) interface or any other suitable interface standard for I/O data transfers.
As shown in FIG. 1, the conventional storage controller 14 receives every command, status and data packet during the host-requested data transfer. In other words, every binary information packet passes through the controller 14. An exemplary flow of packets during a data read operation through the conventional storage controller 14 is illustrated in FIG. 2. The data transfer protocol in FIG. 2 is typically a two-party point-to-point communication protocol, e.g. fibre channel protocol. The links 13 and 15 have been assumed to be fibre channels. However, the discussion of the general flow of packets depicted in FIG. 1 holds true for other interfaces as well.
Referring now to FIGS. 1 and 2 together, a read command identifying the storage controller 14 as its recipient (XID=A) is issued from the host 12 to the storage controller 14 over the link 13. The storage controller 14 performs the necessary command translation and transmits another command packet to the storage device 16 over the link 15. The command packet from the controller 14 identifies the storage device 16 as its intended recipient (XID=B) and functions to instruct the storage device 16 to initiate the necessary data transfer, i.e. to transmit the host-requested data (as identified by the command from the controller 14). The storage drive or storage device 16 accesses the requested data and transmits data and status packets to the controller 14 over the interface link 15. The status packet may indicate to the controller 14 whether the read operation was successful, i.e. whether the data read was valid. The controller 14 then inserts its own ID (XID=A) into the received data and status packets and forwards those packets to the host 12. This completes the read operation initiated by the host 12. In the event that the status signals from the storage device 16 indicate a faulty data read operation, the host 12 may reinitiate or abort the previous read transaction. In general, the packet transmission depicted in FIG. 2 is typical of a two-party point-to-point interface protocol, e.g. a fibre channel data transfer protocol.
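The read sequence described above may be sketched as follows. This is only an illustrative model, not the fibre channel protocol itself: the Packet class, the conventional_read function, and the dictionary standing in for the storage device 16 are hypothetical names introduced for the example. The sketch counts how many packets the conventional controller 14 must handle for a single read.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    kind: str          # "command", "data", or "status"
    xid: str           # "A" names the controller, "B" names the storage device
    payload: str = ""


def conventional_read(block: str, stored: dict) -> tuple:
    """Return (packets delivered to the host, packets handled by the controller)."""
    handled = 0

    # Host -> controller: read command addressed to the controller (XID=A).
    host_cmd = Packet("command", "A", block)
    handled += 1

    # Controller -> device: translated command addressed to the device (XID=B).
    dev_cmd = replace(host_cmd, xid="B")

    # Device -> controller: data and status packets.
    ok = dev_cmd.payload in stored
    from_device = [
        Packet("data", "B", stored.get(dev_cmd.payload, "")),
        Packet("status", "B", "ok" if ok else "error"),
    ]

    # Controller -> host: each packet is relabelled with XID=A and forwarded.
    to_host = []
    for pkt in from_device:
        handled += 1
        to_host.append(replace(pkt, xid="A"))
    return to_host, handled
```

In this model the controller handles the command, the data packet, and the status packet, so the data payload occupies controller resources on every read.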
Two parameters play a major role in measuring the performance of a storage system: (1) Input/Output (I/O) operations per second (iops), and (2) data transfer rate. Generally, the rate of execution of iops by a storage controller is governed by the type, speed and number of CPUs within the controller. However, the data transfer rate depends on the storage controller internal bandwidth that is dedicated for data transfer. Current storage systems have restricted scalability because of the storage controllers having a relatively inflexible ratio of CPU to bandwidth capability. In other words, as shown in FIGS. 1 and 2, the data transfer between the host and the storage is made dependent on the control functions (i.e., command and status packets) executed by the storage controller. This interdependence or interlocking of iops with the data transfer results in less efficient scalability of performance parameters. For example, in the conventional storage controller architectures, an increase in the data transfer bandwidth may unnecessarily, and sometimes quite expensively, require a similar increase in the number of CPUs residing within the controller.
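The interplay of the two parameters may be illustrated with a simple bottleneck model. The model and all figures used with it are assumptions made for illustration and do not appear in the source: the sustained data rate is taken to be the smaller of the rate implied by the command execution rate and the bandwidth of the data path. The function name deliverable_rate_mb_s is hypothetical.

```python
def deliverable_rate_mb_s(iops: float, transfer_kb: float, bandwidth_mb_s: float) -> float:
    """Sustained data rate in MB/s under a simple two-limit model (assumed).

    iops           -- I/O operations per second the controller CPUs can execute
    transfer_kb    -- size of each transfer in KB (1 MB = 1024 KB here)
    bandwidth_mb_s -- internal bandwidth dedicated to data transfer, in MB/s
    """
    demand_mb_s = iops * transfer_kb / 1024.0
    return min(demand_mb_s, bandwidth_mb_s)
```

With hypothetical figures of 10,000 iops and a 100 MB/s data path, 4 KB transfers are limited by the command rate, whereas 64 KB transfers are limited by the data path. A controller with a fixed ratio of CPU to bandwidth cannot be tuned for either workload without scaling both resources together.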
Therefore, it is desirable to have a storage controller where control functionality (as measured by the iops parameter) is scalable independently of the data transfer bandwidth (which determines the data transfer rate), and vice versa. It may be further desirable to achieve independence in scalability without necessitating a change in the existing interface protocol managing the host-controller-storage interface.
The problems outlined above are in large part solved by a storage system as disclosed herein. The storage system includes a storage controller whose architecture is organized into functional units. The control function and the data transfer functions are separated so as to allow independent scalability of one or both. In other words, the storage system may be viewed as including a combination of a control module and a data switch. The command and status information may go to/from the control module, but the data may be moved directly between the host and the storage via the data switch. Thus, the control module and, hence, control functions or iops are effectively separated from the physical data transfer path. This allows data paths to be sized to the required bandwidth independently of the rate of execution of control packets by the control module or of controller bandwidth. Similarly, the number of control modules may be chosen independently of the data transfer function to meet the iops requirement.
Broadly speaking, a computer system according to the present invention includes a storage controller coupled to the host computer and the storage device. The storage controller includes a switch that links the host computer with the storage device via a control path and a data path. In one embodiment, the control and the data paths may be at least partially physically separate from each other. The control module of the storage controller is coupled to the switch through the control path. Any data transfer command from the host computer is transmitted over the control path, and, hence, passes through the control module. However, data transferred to/from the storage device is over the data path only. Therefore, the storage controller accomplishes selective scalability of data bandwidth because the data is not routed through the control module. Instead, the switch directly transfers data between the host and the storage based on the routing information supplied by the control module.
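The separation of the control path from the data path may be sketched as follows. This is a minimal model under stated assumptions: the ControlModule and Switch classes, their methods, the first-in first-out handling of routes, and the endpoint names used in the example are hypothetical and are not taken from the source.

```python
from dataclasses import dataclass, field


@dataclass
class ControlModule:
    commands_seen: int = 0

    def route_for(self, command: dict) -> dict:
        """Process a host command and return routing information for the switch."""
        self.commands_seen += 1
        # For a read, data flows from the storage device to the host.
        return {"source": command["device"], "destination": command["host"]}


@dataclass
class Switch:
    control: ControlModule
    routes: list = field(default_factory=list)
    bytes_via_data_path: int = 0

    def submit_command(self, command: dict) -> None:
        # Control path: the command passes through the control module.
        self.routes.append(self.control.route_for(command))

    def transfer(self, payload: bytes) -> tuple:
        # Data path: the payload is delivered using the oldest pending route
        # (assumed FIFO order); the control module never touches it.
        route = self.routes.pop(0)
        self.bytes_via_data_path += len(payload)
        return route["destination"], payload
```

In this sketch the control module's workload grows with the number of commands, while the switch's workload grows with the number of bytes moved, so each may be sized separately.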
In one embodiment, the storage controller may include parity calculation logic to calculate and store parity information along with the data in the storage device. An even or an odd parity may be calculated. Other suitable error control logic, such as Error-Correcting Code (ECC) algorithms, may be employed. Parity calculation may depend on the RAID level selected by the control module. In a different embodiment, the control module may dynamically select one or more RAID levels for the data being written into the storage device. Alternatively, data may be stored among various disk drives in the storage device using a predetermined RAID level.
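One common way to compute parity across a stripe of data blocks is a byte-wise exclusive-OR, which yields even parity for each bit position. The source does not state which algorithm the parity calculation logic uses, so the following is only a sketch of that well-known technique; the function names xor_parity and rebuild are hypothetical.

```python
from functools import reduce


def xor_parity(blocks: list) -> bytes:
    """Byte-wise XOR of equal-length blocks (even parity per bit position)."""
    if not blocks or len({len(b) for b in blocks}) != 1:
        raise ValueError("need at least one block, all of equal length")
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def rebuild(surviving: list, parity: bytes) -> bytes:
    """Recover a single missing block from the surviving blocks and the parity."""
    return xor_parity(surviving + [parity])
```

Because XOR is its own inverse, combining the parity block with the surviving data blocks reproduces the one block that was lost, which is the property that parity-based RAID levels rely on for recovery from a single drive failure.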
According to another embodiment, the storage controller includes a cache memory for read or write data caching. The cache memory may provide high-speed data storage, especially during small data transfers. Therefore, data transfer latencies may be minimized by providing such a high-speed stand-by storage through a cache memory.
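The effect of read caching may be sketched as follows. The source does not specify a cache organization or replacement policy; the least-recently-used policy, the ReadCache class, and the dictionary standing in for the storage device are assumptions made for illustration.

```python
from collections import OrderedDict


class ReadCache:
    """Small read cache with least-recently-used eviction (assumed policy)."""

    def __init__(self, capacity: int, backing: dict):
        self.capacity = capacity
        self.backing = backing          # dictionary standing in for the storage device
        self.entries = OrderedDict()    # block id -> data, oldest first
        self.hits = 0
        self.misses = 0

    def read(self, block_id):
        if block_id in self.entries:
            # Hit: serve from the cache and mark the block as recently used.
            self.hits += 1
            self.entries.move_to_end(block_id)
            return self.entries[block_id]
        # Miss: fetch from the slower storage device and keep a copy.
        self.misses += 1
        data = self.backing[block_id]
        self.entries[block_id] = data
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)   # evict the least recently used block
        return data
```

A repeated read of the same small block is served from the cache without a second access to the storage device, which is where the latency benefit for small transfers arises.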
The storage controller according to the present invention may include a switch that allows independent scalability without any modifications or changes in the existing interface protocol. Thus, for example, the switch in the storage controller may implement necessary modifications in the data packets to comply with the existing interface data transfer protocol, e.g., the fibre channel data transfer protocol.
In an alternative embodiment, the interface protocol, e.g., the fibre channel protocol, may be implemented with a standard switch, but with a different messaging scheme. The present invention thus contemplates a computer system where the scalable storage controller is configured to operate with a standard controller switch. The control module may be configured to receive the data transfer command from the host via the standard switch. However, instead of translating the received command and forwarding it to the storage device, the control module may be configured to transmit the translated command or commands back to the host computer. The host computer may, in turn, retransmit this second set of commands provided by the control module directly to the storage device via the switch. Thus, the storage device receives commands directly from the host and responds directly to the host.
The message transfer scheme according to one embodiment may further include transmitting data transfer status information directly from the storage device to the host computer via the standard switch in the storage controller. The host computer, then, may send the transaction status to the control module, which, in turn, may respond with a final status packet to complete the data transfer cycle.
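The message sequence of this alternative embodiment may be sketched as an ordered log of (sender, receiver, message) entries. The function name alternative_read, the message labels, and the dictionary standing in for the storage device are hypothetical; the sketch models only the order and endpoints of the messages, all of which are assumed to travel via the standard switch.

```python
def alternative_read(block: str, stored: dict) -> list:
    """Return the ordered message log for one read under the alternative scheme."""
    log = []

    # 1. Host sends the data transfer command to the control module.
    log.append(("host", "control", "command"))

    # 2. Control module translates the command and returns it to the host
    #    instead of forwarding it to the storage device.
    translated = {"op": "read", "block": block}
    log.append(("control", "host", "translated_command"))

    # 3. Host retransmits the translated command directly to the storage device.
    log.append(("host", "storage", "translated_command"))

    # 4. Storage device responds directly to the host with data and status.
    ok = translated["block"] in stored
    log.append(("storage", "host", "data"))
    log.append(("storage", "host", "status:" + ("ok" if ok else "error")))

    # 5. Host reports the transaction status to the control module.
    log.append(("host", "control", "transaction_status"))

    # 6. Control module replies with a final status packet, completing the cycle.
    log.append(("control", "host", "final_status"))
    return log
```

No entry in the log has the control module and the storage device as its two endpoints, which reflects that in this scheme the storage device exchanges commands, data, and status with the host only.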