The present invention relates to a disk controller for controlling a plurality of disk drives, and more particularly to a high reliability disk controller using connection-less type multiplex communication. The present invention further relates to a storage system which can expand the configuration scalably from small scale to large scale.
U.S. Pat. No. 6,601,134 and No. 2003046460 disclose a storage system. A disk sub-system (hereinafter simply called a “sub-system”) using magnetic disk drives as storage media has an input/output performance lower by three to four orders of magnitude than that of a main storage of a computer using semiconductor storage as storage media. A lot of effort has been put into reducing this difference, i.e., improving the input/output performance of the sub-system. One method of improving the input/output performance of the sub-system is to use a disk controller which controls a plurality of magnetic disk drives across which data is stored in a distributed manner.
For example, a conventionally known disk controller such as shown in FIG. 16 has a plurality of channel adapters 2100 which execute data transfer between a host computer and a disk drive; a plurality of cache memory adapters 2300 for temporarily storing data to be transferred between the host computer and disk drive; a plurality of control memory adapters 2301 for storing control information on the operation of the disk controller; and a plurality of switch adapters 2400 for establishing connections between the cache memory adapters and channel adapters. The channel adapters 2100 and cache memory adapters 2300 are interconnected by a data system inner network via the switch adapters 2400. The channel adapters 2100 and control memory adapters 2301 are interconnected by a control system inner network. With these network connections, all the channel adapters 2100 can access the cache memory adapters 2300 and control memory adapters 2301.
Each channel adapter 2100 has: data link engines (DLEs) 2110 for executing packet transfer in the data system inner network; DMA controllers (DMACs) 2120 for executing DMA transfer in the data system inner network; a selector 2115 for interconnecting DLEs 2110 and DMACs 2120; protocol engines (PEs) 2130 for controlling communication between the host computer and disk drive; ports 2140 for connection to the host computer or disk drive; DLEs 2210 for executing packet transfer in the control system inner network; DMACs 2220 for DMA transfer in the control system inner network; micro-processors (MPs) 2230 for controlling the operation of the disk controller; and a selector 2125 for interconnecting DMACs 2120 and PEs 2130 or MPs 2230.
The cache memory adapter 2300 and control memory adapter 2301 each have: DLEs 2310 for executing packet transfer in the data system inner network or control system inner network; DMACs 2320 for executing DMA transfer in each inner network; memory controllers (MCs) 2330; memory modules (MMs) 2340; a selector 2315 for interconnecting DLEs 2310 and DMACs 2320; and a selector 2325 for interconnecting DMACs 2320 and MCs 2330.
The switch adapter 2400 has: DLEs 2410 for executing packet transfer in the data system inner network; DMACs 2420 for executing DMA transfer in the data system inner network; and a selector 2430 for interconnecting DMACs 2420.
Data transfer between the adapters is realized by cooperative operations of DMACs in the respective adapters. As an example, an outline of the DMA transfer of data from the host computer to the cache memory adapter 2300 in the disk controller will be described with reference to FIGS. 18 and 19.
When a WRITE request is issued from the host computer via the connection port 2140, MP 2230 calculates an area of the cache memory adapter for temporarily storing WRITE data, and notifies DMAC 2120 in the channel adapter of the calculated result as a DMA list 2600. DMAC 2120 issues requests 2605 for acquiring paths to the cache memory adapters necessary for DMA transfer. Since the WRITE data is stored in a plurality of cache memory adapters (two cache memory adapters having DMAC 2321 and DMAC 2322) in order to improve the reliability, a plurality of path establishing requests are issued. After necessary paths are established, DMAC 2120 transfers the WRITE data to DMAC 2420 at the relay point switch, in accordance with the contents of the DMA list 2600. In this case, the WRITE data from the host computer is divided into units of a predetermined size and transferred unit by unit.
DMAC 2420 of the switch adapter 2400 generates DMA sub-requests 2611 and 2612 for DMACs 2321 and 2322 of the cache memory adapters, in accordance with the transfer requests sent from DMAC 2120 of the channel adapter 2100. In response to the requests 2611 and 2612, DMACs 2321 and 2322 return sub-statuses 2621 and 2622 which are the request completion notices. After DMAC 2120 of the channel adapter confirms the sub-statuses 2621 and 2622, it issues the next DMA sub-request. When the sub-statuses of all the DMA sub-requests are returned, DMAC 2120 issues release requests 2625 for the established paths to the cache memory adapters, and returns a completion status 2630 to MP 2230 to thereby complete the process for the DMA list 2600. During the DMA transfer, MP 2230 accesses the control memory adapter 2301 when necessary. In this case, similar DMA transfer is performed between DMAC 2220 of the channel adapter 2100 and DMAC 2320 of the control memory adapter 2301.
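The write sequence described above can be sketched in software as follows. This is only an illustrative model, not the hardware design: the class and function names, the `CHUNK_SIZE` value standing in for the predetermined size, and the use of a boolean as a sub-status are all assumptions made for the example.

```python
from dataclasses import dataclass, field

CHUNK_SIZE = 4  # hypothetical "predetermined size" in bytes (assumption)


@dataclass
class CacheDmac:
    """Stand-in for a cache memory adapter DMAC (e.g. 2321 or 2322)."""
    name: str
    memory: bytearray = field(default_factory=bytearray)

    def handle_sub_request(self, chunk: bytes) -> bool:
        # Store the chunk and return a sub-status (True = completed).
        self.memory.extend(chunk)
        return True


def switch_forward(chunk: bytes, targets: list[CacheDmac]) -> list[bool]:
    """Switch adapter DMAC: fan one transfer request out to every target."""
    return [t.handle_sub_request(chunk) for t in targets]


def channel_dma_write(data: bytes, targets: list[CacheDmac]) -> list[str]:
    """Channel adapter DMAC: process one DMA list and log each step."""
    log = [f"acquire path to {t.name}" for t in targets]
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        statuses = switch_forward(chunk, targets)
        # The next sub-request is issued only after every sub-status is back.
        if not all(statuses):
            raise RuntimeError(f"sub-request at offset {offset} failed")
        log.append(f"sub-request at offset {offset} confirmed")
    log += [f"release path to {t.name}" for t in targets]
    log.append("completion status returned to MP")
    return log
```

Calling `channel_dma_write(b"abcdefghij", [CacheDmac("2321"), CacheDmac("2322")])` leaves the same ten bytes in both cache stand-ins, which mirrors the duplicated storage of WRITE data for reliability.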
FIG. 17 shows the structure of a packet used by DMA transfer. A command packet 2520 has: an address field 2521 for indicating a targeting DMAC; an address field 2522 for indicating an initiating DMAC; memory address fields 2523 and 2524 for indicating memory addresses at which transfer data is stored; and an error check code 2525.
The path establishing request 2605 is issued by using the command packet 2520. A data packet 2530 has: an address field 2531 for indicating a targeting DMAC; an address field 2532 for indicating an initiating DMAC; transfer data 2533; and an error check code 2535. The DMA sub-request is issued by using the data packet 2530.
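As a rough illustration of the two packet layouts, the sketch below encodes them as byte strings. The field widths (16-bit DMAC addresses, 64-bit memory addresses) and the use of CRC-32 as the error check code are assumptions chosen for the example; the description does not specify them, and all names are hypothetical.

```python
import struct
import zlib
from dataclasses import dataclass


def _with_check_code(body: bytes) -> bytes:
    # Append a 32-bit error check code (CRC-32 is an assumption).
    return body + struct.pack(">I", zlib.crc32(body))


@dataclass
class CommandPacket:
    target_dmac: int      # address field 2521
    initiator_dmac: int   # address field 2522
    memory_addr_a: int    # memory address field 2523
    memory_addr_b: int    # memory address field 2524

    def encode(self) -> bytes:
        body = struct.pack(">HHQQ", self.target_dmac, self.initiator_dmac,
                           self.memory_addr_a, self.memory_addr_b)
        return _with_check_code(body)  # check code 2525


@dataclass
class DataPacket:
    target_dmac: int      # address field 2531
    initiator_dmac: int   # address field 2532
    payload: bytes        # transfer data 2533

    def encode(self) -> bytes:
        header = struct.pack(">HH", self.target_dmac, self.initiator_dmac)
        return _with_check_code(header + self.payload)  # check code 2535


def check_code_ok(packet: bytes) -> bool:
    """Recompute the check code over the body and compare with the trailer."""
    body, trailer = packet[:-4], packet[-4:]
    return struct.unpack(">I", trailer)[0] == zlib.crc32(body)
```

A receiver in this sketch would call `check_code_ok` on each incoming packet and reject any packet whose trailer does not match.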
FIG. 20 illustrates a transfer protocol for the path request command 2605 and DMA sub-request 2610. In order to facilitate a failure recovery process, processes are all executed by non-multiplex communication. Namely, after it is confirmed that the sub-status 2620 for the DMA sub-request 2610 is returned, the next DMA sub-request 2610 is issued.
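The cost of issuing one sub-request at a time can be estimated with a deliberately simplified timing model. The function below is an assumption made for illustration: it treats every sub-request as taking one fixed round trip and allows at most `window` requests to be outstanding, so `window=1` corresponds to the non-multiplex protocol described above.

```python
import math


def transfer_time(n_requests: int, round_trip: float, window: int = 1) -> float:
    """Estimated time for n sub-requests with at most `window` outstanding.

    Simplified model (assumption): requests are sent in batches of `window`,
    and each batch completes after one round trip.
    """
    if n_requests < 0 or window < 1:
        raise ValueError("n_requests must be >= 0 and window must be >= 1")
    return math.ceil(n_requests / window) * round_trip
```

Under this model, eight sub-requests with a round trip of 2.0 time units take 16.0 units when issued one at a time, which shows how the wait for each sub-status accumulates over a long DMA list.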
Storage systems, which store the data processed by information processing systems, now play a central role in those systems. There are many types of storage systems, from small scale configurations to large scale configurations.
For example, the storage system with the configuration shown in FIG. 40 is disclosed in U.S. Pat. No. 6,385,681. This storage system is comprised of a plurality of channel interface (hereafter “IF”) units 5011 for executing data transfer with a computer (hereafter “server”) 5003, a plurality of disk IF units 5016 for executing data transfer with hard drives 5002, a cache memory unit 5014 for temporarily storing data to be stored in the hard drives 5002, a control information memory unit 5015 for storing control information on the storage system (e.g., information on the data transfer control in the storage system 5008, and data management information to be stored on the hard drives 5002), and hard drives 5002. The channel IF unit 5011, disk IF unit 5016 and cache memory unit 5014 are connected by the interconnection 5041, and the channel IF unit 5011, disk IF unit 5016 and control information memory unit 5015 are connected by the interconnection 5042. The interconnection 5041 and the interconnection 5042 are comprised of common buses and switches.
According to the storage system disclosed in U.S. Pat. No. 6,385,681, in the above configuration of one storage system 5008, the cache memory unit 5014 and the control information memory unit 5015 can be accessed from all the channel IF units 5011 and disk IF units 5016.
In the prior art disclosed in U.S. Pat. No. 6,542,961, a plurality of disk array systems 5004 are connected to a plurality of servers 5003 via the disk array switches 5005, as FIG. 41 shows, and the plurality of disk array systems 5004 are managed as one storage system 5009 by the means for system configuration management 5060, which is connected to the disk array switches 5005 and each disk array system 5004.