1. Field of the Invention
This invention relates to data storage systems, and more particularly to data storage systems having a storage device controller interposed between a host computer and one or more data storage devices wherein the controller manages the storage of data within the one or more storage devices.
2. Description of the Related Art
Auxiliary storage devices such as magnetic or optical disk arrays are usually preferred for high-volume data storage. Many modern computer applications, such as high-resolution video or graphic displays involving on-demand video servers, may depend heavily on the capacity of the host computer to perform in a data-intensive environment. In other words, the need to store data externally in relatively slower auxiliary data storage devices demands that the host computer system accomplish the requisite data transfers at a rate that does not severely restrict the utility of the application that necessitated the high-volume data transfers. Due to the speed differential between a host processor and an external storage device, a storage controller is almost invariably employed to manage data transfers between the host and the storage device.
The purpose of a storage controller is to manage the storage for the host processor, leaving the higher speed host processor to perform other tasks during the time the storage controller accomplishes the requested data transfer to/from the external storage. The host generally performs simple data operations such as data reads and data writes. It is the duty of the storage controller to manage storage redundancy, hardware failure recovery, and volume organization for the data in the auxiliary storage. Redundant array of independent disks (RAID) algorithms are often used to manage data storage among a number of disk drives.
FIG. 1 is a diagram of a conventional computer system 10 including a host computer 12 coupled to a storage controller 14 by an interconnect link 16, and two storage devices 18A-18B coupled to storage controller 14 by respective interconnect links 20A and 20B. Each storage device 18 may be, for example, a disk drive array or a tape drive. Links 16 and 20A-20B may include suitable interfaces for I/O data transfers (e.g., Fibre Channel, small computer system interface or SCSI, etc.). As evident in FIG. 1, all of the information involved in data transfers between host computer 12 and storage devices 18A-18B passes through storage controller 14. Storage controller 14 receives command, status, and data packets during the data transfer.
FIG. 2 is a diagram illustrating an exemplary flow of control and data packets during a data read operation initiated by host computer 12 of FIG. 1. Links 16 and 20A-20B in FIG. 1 may be Fibre Channel links, and the data transfer protocol of FIGS. 1 and 2 may be the Fibre Channel protocol. Referring to FIGS. 1 and 2 together, host computer 12 issues a read command packet identifying storage controller 14 as its destination (XID=H,A) via link 16. Storage controller 14 receives the read command and determines that two separate read operations are required to obtain the requested data; one from storage device 18A and the other from storage device 18B.
Storage controller 14 translates the read command from host computer 12 into two separate read commands, one read command for storage device 18A and the other read command for storage device 18B. Storage controller 14 transmits a first read command packet identifying storage device 18A as its destination (XID=A,B) via link 20A, and a second read command packet identifying storage device 18B as its destination (XID=A,C) via link 20B. Each read command packet instructs the respective storage device 18A or 18B to access and provide the data identified by the read command. Storage device 18A (ID=B) accesses the requested data and transmits a data packet followed by a status packet (XID=B,A) to storage controller 14 via link 20A. Storage device 18B (ID=C) accesses the requested data and transmits a data packet followed by a status packet (XID=C,A) to storage controller 14 via link 20B. Each status packet may indicate whether the corresponding read operation was successful (i.e., whether the data read was valid).
Storage controller 14 typically includes a memory unit, and temporarily stores data and status packets in the memory unit. Storage controller 14 then consolidates the data received from storage devices 18A-18B and processes the status packets received from storage devices 18A-18B to form a composite status. Storage controller 14 transmits the consolidated data followed by the composite status (XID=A,H) to host computer 12 via link 16, completing the read operation. In the event that the composite status indicates a read operation error, host computer 12 may ignore the consolidated data and initiate a new read operation. In general, the flow of packets depicted in FIG. 2 is typical of a two-party point-to-point interface protocol (e.g., the Fibre Channel protocol).
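The consolidation step performed by the conventional storage controller can be illustrated with a minimal Python sketch. The names DeviceReply and controller_read are hypothetical, and two details are assumptions not stated above: the consolidated data is formed by simple concatenation in device order, and the composite status is treated as successful only when every individual status is successful.

```python
from dataclasses import dataclass


@dataclass
class DeviceReply:
    data: bytes  # payload of the data packet from one storage device
    ok: bool     # status packet: True if the read was valid


def controller_read(replies):
    """Consolidate device replies as a conventional controller might."""
    # Every payload is held in controller memory before being forwarded,
    # so all data crosses the controller's internal bus.
    consolidated = b"".join(r.data for r in replies)
    # Assumed rule: the composite status is good only if all reads succeeded.
    composite_ok = all(r.ok for r in replies)
    return consolidated, composite_ok


replies = [DeviceReply(b"AAAA", True), DeviceReply(b"BBBB", True)]
data, ok = controller_read(replies)
```

Because every payload is buffered in the controller before it is forwarded, the sketch makes visible why the controller sits in the data path of every transfer.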
As indicated in FIG. 1, storage controller 14 includes multiple communication ports. In addition to the memory and the multiple communication ports, storage controller 14 also typically includes one or more central processing units (CPUs). The multiple communication ports and the CPUs may be coupled to a communication bus. The CPUs and the memory may be coupled to a common bus within storage controller 14, and the CPUs may access the memory via the bus.
Two parameters are commonly used to measure the performance of a storage system: (1) the number of input/output (I/O) operations per second (IOPS), and (2) the data transfer rate of the storage system. Generally, the rate of execution of I/O operations by a storage controller is governed by the type, speed and number of CPUs within the storage controller. The data transfer rate depends on the data transfer bandwidth of the storage controller. In computer system 10 described above, all of the data transferred between host computer 12 and storage devices 18A-18B is temporarily stored within the memory of storage controller 14, and thus travels through the bus of storage controller 14. As a result, the data transfer bandwidth of storage controller 14 is largely dependent upon the bandwidth of the bus of storage controller 14.
Current storage systems have restricted scalability because of the storage controllers having a relatively inflexible ratio of CPU to bandwidth capability. This is especially true if they are based on “off-the-shelf” microprocessors or computer systems. Usually the storage controller is designed to satisfy the majority of IOPS and data rate performance requirements with one implementation. This interdependence between IOPS and data transfer rate results in less efficient scalability of performance parameters. For example, in conventional storage controller architectures, an increase in data transfer rate may require both an increase in data transfer bandwidth and an increase in the number of CPUs residing within the controller.
It would thus be desirable to have a storage system wherein control functionality (as measured by the IOPS parameter) is scalable independently of the data transfer bandwidth (which determines the data transfer rate), and vice versa. It may be further desirable to achieve independence in scalability without necessitating a change in the existing communication protocol used within the storage system.
One embodiment of a transfer node is described, including a first channel port adapted for coupling to a host computer, a second channel port adapted for coupling to a storage controller and one or more storage devices, a central processing unit (CPU) coupled to the first and second channel ports, and a memory coupled to the CPU. The transfer node receives data routing information associated with a data transfer command from the storage controller via the second channel port, wherein the data transfer command directs a transfer of data between the host computer and the one or more storage devices. The transfer node stores the data routing information within the memory, and routes data associated with the data transfer command between the first and second channel ports using the data routing information stored within the memory. For example, when the data transfer command is a read command, the transfer node receives data associated with the data transfer command from the one or more storage devices via the second channel port, routes the data from the second channel port to the first channel port using the data routing information stored within the memory, and forwards the data to the host computer via the first channel port. As a result, the data associated with the data transfer command is routed between the host computer and the one or more storage devices such that the data does not pass through the storage controller, allowing independent scalability of a number of input/output operations per second (IOPS) and a data transfer rate of a storage system including the transfer node. Several embodiments of a computer system are described, wherein each embodiment of the computer system has a storage system including the transfer node coupled in series with a switch between the host computer and the one or more storage devices.
In one embodiment, the memory of the transfer node includes a first lookup table and a data buffer area, and the data routing information includes command identification information uniquely identifying the data transfer command and one or more pointers to data buffers within the data buffer area. The command identification information and the pointers are stored within the first lookup table. The transfer node is configured to store the data associated with the data transfer command within the data buffers using the pointers. When the data transfer command is a read command, the data is received from the one or more storage devices at the second channel port, and the transfer node routes the data from the data buffers to the first channel port when all of the data associated with the data transfer command has been received by the transfer node. On the other hand, when the data transfer command is a write command, the data is received from the host computer at the first channel port, and the transfer node routes the data from the data buffers to the second channel port when all of the data associated with the data transfer command has been received by the transfer node.
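The buffered embodiment can be sketched in Python as follows. This is an illustration under stated assumptions, not the claimed implementation: the class name BufferedTransferNode, the use of list indices as "pointers", and the byte count used to decide that all data has arrived are all hypothetical.

```python
class BufferedTransferNode:
    """Store-and-forward sketch of the first lookup table embodiment."""

    def __init__(self):
        self.lookup = {}    # command id -> list of buffer indices ("pointers")
        self.buffers = []   # data buffer area
        self.expected = {}  # command id -> total bytes expected (assumption)

    def store_routing_info(self, cmd_id, num_buffers, total_bytes):
        # Allocate buffers and record their pointers under the command id.
        start = len(self.buffers)
        self.buffers.extend(bytearray() for _ in range(num_buffers))
        self.lookup[cmd_id] = list(range(start, start + num_buffers))
        self.expected[cmd_id] = total_bytes

    def receive(self, cmd_id, slot, payload):
        """Store a payload; return all data once complete, else None."""
        pointers = self.lookup[cmd_id]
        self.buffers[pointers[slot]] += payload
        received = sum(len(self.buffers[p]) for p in pointers)
        if received < self.expected[cmd_id]:
            return None  # still waiting for the rest of the data
        # All data has arrived: hand it to the outgoing channel port.
        return b"".join(bytes(self.buffers[p]) for p in pointers)


node = BufferedTransferNode()
node.store_routing_info("cmd1", num_buffers=2, total_bytes=8)
first = node.receive("cmd1", 1, b"BBBB")   # incomplete, nothing forwarded
second = node.receive("cmd1", 0, b"AAAA")  # complete, data is released
```

The same structure would serve a write command, with the payloads arriving at the first channel port and leaving through the second.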
In a second embodiment, the memory includes a second lookup table, and the data routing information includes target header information and corresponding substitute header information. The target header information includes at least a portion of a header field of an expected data frame, and the substitute header information includes header information to be substituted for the header information of the expected data frame. The target header information and corresponding substitute header information are stored within the second lookup table. The transfer node is configured to: (i) compare header information of a data frame received at either the first channel port or the second channel port to the target header information within the second lookup table, (ii) replace the header information of the data frame with the substitute header information corresponding to the target header information if the header information of the data frame matches the target header information, and (iii) route the data frame to the other channel port.
For example, when the data transfer command is a read command, the transfer node compares header information of a data frame received at the second channel port to the target header information within the second lookup table, replaces the header information of the data frame with the substitute header information corresponding to the target header information if the header information of the data frame matches the target header information, and routes the data frame to the first channel port.
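The header substitution of the second embodiment can be sketched in Python as below. The Header, Frame, and HeaderSwapNode names are hypothetical, a header is reduced to a source and destination identifier purely for illustration, and forwarding an unmatched frame unchanged is an assumption. The example identifiers (B, A, H) follow the XID notation of FIG. 2 and are illustrative values only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Header:
    src: str  # simplified stand-in for the header fields that are compared
    dst: str


@dataclass
class Frame:
    header: Header
    payload: bytes


class HeaderSwapNode:
    """Cut-through sketch of the second lookup table embodiment."""

    def __init__(self):
        self.table = {}  # target header -> substitute header

    def store_routing_info(self, target, substitute):
        self.table[target] = substitute

    def route(self, frame, in_port):
        # (i) compare the received header with the stored target headers
        substitute = self.table.get(frame.header)
        if substitute is not None:
            # (ii) replace the header, leaving the payload untouched
            frame = Frame(substitute, frame.payload)
        # (iii) send the frame out of the other channel port
        out_port = 1 if in_port == 2 else 2
        return out_port, frame


node = HeaderSwapNode()
node.store_routing_info(Header("B", "A"), Header("A", "H"))
port, out = node.route(Frame(Header("B", "A"), b"data"), in_port=2)
```

In this sketch a frame sent by a storage device toward the controller leaves the first channel port carrying a header the host expects, without the payload being buffered or copied through the controller.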
In a preferred embodiment, the transfer node also receives the data transfer command from the host computer via the first channel port and forwards the data transfer command to the storage controller via the second channel port. In the preferred embodiment, the transfer node also receives a translated data transfer command from the storage controller via the second channel port and forwards the translated data transfer command to the one or more storage devices via the second channel port.
Several embodiments of a computer system are described, wherein each embodiment of the computer system has a storage system including a switch coupled to one or more storage devices, the above described transfer node coupled between a host computer and the switch, and a storage controller coupled to the transfer node. The storage controller may be coupled to the switch, or coupled directly to the transfer node. The transfer node receives a data transfer command from the host computer and forwards the data transfer command to the storage controller. The storage controller receives the data transfer command, generates a translated data transfer command and data routing information in response to the data transfer command, and forwards the translated data transfer command and the data routing information to the transfer node. The transfer node forwards the translated data transfer command to the one or more storage devices, stores the data routing information within the memory, and uses the data routing information to route data associated with the data transfer command between the host computer and the one or more storage devices such that the data does not pass through the storage controller. As a result, the storage controller is removed from a data path between the host computer and the one or more storage devices, allowing independent scalability of the IOPS and data transfer rate of the storage system.
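The division of labor between the storage controller and the transfer node can be summarized with a small Python model. Everything here is a simplified assumption for illustration: the class and field names are hypothetical, the switch is omitted, devices are modeled as dictionaries of blocks, and the translation rule (one read per listed device) is invented for the example.

```python
class StorageController:
    """Handles commands only; never touches payload data in this sketch."""

    def __init__(self):
        self.payload_bytes_seen = 0  # stays 0: data bypasses the controller

    def handle(self, command):
        # Translate one host read into one read per device (hypothetical layout).
        translated = [(dev, command["block"]) for dev in command["devices"]]
        routing_info = {"cmd_id": command["id"], "sources": command["devices"]}
        return translated, routing_info


class TransferNode:
    def __init__(self, controller, devices):
        self.controller = controller
        self.devices = devices  # device name -> {block number: bytes}
        self.memory = {}        # stored data routing information

    def read(self, command):
        # Forward the host command to the controller and receive the
        # translated commands plus the data routing information.
        translated, info = self.controller.handle(command)
        self.memory[info["cmd_id"]] = info
        # Forward the translated commands to the devices and route the
        # returned data toward the host without involving the controller.
        chunks = [self.devices[dev][block] for dev, block in translated]
        return b"".join(chunks)


controller = StorageController()
node = TransferNode(controller, {"dev0": {7: b"left"}, "dev1": {7: b"right"}})
result = node.read({"id": 1, "block": 7, "devices": ["dev0", "dev1"]})
```

The controller in this model sees only commands and routing metadata, which is the property that lets command handling and data bandwidth be scaled separately.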
One method for routing data within a storage system includes coupling a transfer node between a host computer and a switch, wherein the switch is coupled to one or more storage devices, and wherein the transfer node includes a memory for storing data routing information. A storage controller is coupled to the transfer node. Data routing information associated with a data transfer command is forwarded from the storage controller to the transfer node. The data routing information is stored within the memory of the transfer node, and used to route data associated with the data transfer command between the host computer and the one or more storage devices such that the data does not pass through the storage controller.
A second method for conveying data within a storage system having a storage controller and a transfer node includes the transfer node receiving a data transfer command from a host computer coupled to the transfer node. The transfer node conveys the data transfer command to the storage controller. The storage controller generates data routing information dependent upon the data transfer command, and conveys the data routing information to the transfer node. The transfer node receives data for the data transfer command from one or more storage devices, and forwards the data to the host computer so that the data is conveyed from the one or more storage devices to the host computer without being conveyed through the storage controller.
A third method for conveying data within the storage system described above includes the transfer node receiving a data transfer command from a host computer coupled to the transfer node. The transfer node conveys the data transfer command to the storage controller. The storage controller translates the data transfer command into one or more device data transfer commands, and conveys the device data transfer commands to the transfer node on the controller bus. The transfer node forwards the device data transfer commands to one or more storage devices. The transfer node receives data in response to the device data transfer commands and forwards the data to the host computer so that the data is conveyed from the one or more storage devices to the host computer without being conveyed through the storage controller.