The present invention relates to a control device for disk system devices storing data in a plurality of magnetic disk devices.
A high degree of reliability is required in electronic commerce transactions between companies and in the financial system. Core storage systems, which are central to these transactions, need to have an extremely high degree of availability. A disk control device widely used in these core storage systems is equipped with an automatic failure recovery function in which redundancy is used internally to provide high availability. If a failure takes place, the malfunctioning section is automatically separated and operations are continued with a functioning redundant section.
For example, FIG. 9 shows a well-known conventional disk control device equipped with: a plurality of host interface modules 1X performing data transfers with a host computer 60; a plurality of disk interface modules 2X performing data transfers with a magnetic disk device 70; cache memory modules 3X temporarily storing data for the magnetic disk device 70; and resource management modules 5X storing control information relating to the disk control device 104 (e.g., information relating to data transfer control between the host interface modules 1X and the disk interface modules 2X and the cache memory modules 3X, management information for data stored in the magnetic disk device 70).
The host interface modules 1X, the disk interface modules 2X, and the cache memory modules 3X are connected by a data interface signal 6. In some cases, a switch 4X may be used in the connection between the host interface modules 1X and the cache memory modules 3X and between the disk interface modules 2X and the cache memory modules 3X. The host interface modules 1X, the disk interface modules 2X, and the resource management modules 5X are connected by a management interface signal 7. The use of a switch in the connection between the resource management modules 5X, the host interface modules 1X, and the disk interface modules 2X is optional.
As a result, the resource management modules 5X and the cache memory modules 3X can be accessed from all the host interface modules 1X and the disk interface modules 2X.
As shown in FIG. 12, the host interface module 1X includes: a channel protocol processing module 90 processing input/output involving the host interface signal 1; an internal protocol processing module 8X processing input/output operations involving the data interface signal 6; a processor interface 17 processing input/output operations involving a management interface signal 7; a processor 14 controlling input/output operations involving the host computer 60; and a local memory 15.
The disk interface modules 2X are formed with a structure similar to that of the host interface modules except that: a disk interface signal 2 is connected to the channel protocol processing module 90 instead of the host interface signal 1; and in addition to control operations involving the host interface modules, the processor 14 also executes RAID functions.
The host interface module 1X and the disk interface module 2X communicate with the cache memory module 3X through packet transfers, using packets in which the destination address is added to the start of the data.
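The packet format described above can be sketched as follows. This is a minimal illustration only: the 4-byte big-endian address field and the names build_packet and parse_packet are assumptions made for the example, since the text does not specify the width of the address or any function names.

```python
import struct
from typing import Tuple

# Assumed header layout: one 4-byte big-endian destination address.
# The actual field width is not specified in the text.
HEADER = struct.Struct(">I")


def build_packet(dest_addr: int, data: bytes) -> bytes:
    """Add the destination address to the start of the data."""
    return HEADER.pack(dest_addr) + data


def parse_packet(packet: bytes) -> Tuple[int, bytes]:
    """Split a packet into (destination address, data)."""
    (dest_addr,) = HEADER.unpack_from(packet, 0)
    return dest_addr, packet[HEADER.size:]
```

Because the address sits at a fixed offset at the start of the packet, a receiver can read the destination without examining the data that follows it.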
A packet generated through control operations performed by the processor 14 in the host interface module 1X or the disk interface module 2X is sent to the switch 4X by way of the data interface signal 6. As shown in FIG. 10, the switch 4X is equipped with: multiple path interfaces 41X connected to the data interface signal 6; packet buffers 43; and address latches 44. The path interface 41X contains a header analyzing module 42X that extracts the address information from packets. The packet address analyzed and extracted in this manner is captured by the address latch 44. The sent packet is stored in the packet buffer 43 by way of the path interface 41X. A selector control signal 47 based on the packet destination is generated from the address latch 44 and the destination of the stored packet is selected by the selector 48.
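The flow through the switch 4X described above can be sketched as a small model. It is illustrative only: the class name Switch, the one-byte address at the start of each packet, and the routes table mapping addresses to output names are all assumptions, not details of the actual device.

```python
from collections import deque


class Switch:
    """Toy model of switch 4X: latch the address, buffer the packet, select an output."""

    def __init__(self, routes):
        # routes: destination address -> output name (hypothetical table)
        self.routes = dict(routes)
        self.packet_buffer = deque()   # stands in for packet buffer 43
        self.address_latch = deque()   # stands in for address latch 44

    def receive(self, packet):
        """Path interface: header analysis extracts the address, then the packet is buffered."""
        dest_addr = packet[0]          # assumed: the first byte is the destination address
        self.address_latch.append(dest_addr)
        self.packet_buffer.append(packet)

    def forward(self):
        """Selector: the latched address chooses the output for the oldest buffered packet."""
        dest_addr = self.address_latch.popleft()
        packet = self.packet_buffer.popleft()
        return self.routes[dest_addr], packet
```

In this sketch the latched address plays the role of the selector control signal 47: it alone decides which output the buffered packet is sent to.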
From the switch 4X, the packets are transferred to the destination cache memory module 3X, again by way of the data interface signal 6. As shown in FIG. 11, the cache memory module 3X is equipped with: multiple data path interfaces 31X connected to the data interface signal 6; packet buffers 33; arbitration circuits 39; and a selector 38. The data path interface 31X includes a header analysis module 32X for extracting address information from packets. The packet address analyzed and extracted in this manner is captured by the arbitration circuit 39. The sent packet is stored in the packet buffer 33 by way of the data path interface 31X. The arbitration circuit 39 selects one of the multiple data path interfaces 31X and generates a selector control signal based on the selection result. By switching the selector 38 with this selector control signal, the contents of the desired packet buffer 33 can be written to the cache memory 37 by way of the memory control circuit 35. If the packet stored in the packet buffer 33 is a memory read request, the process described above is performed in reverse to send back the contents of the specified region of the cache memory 37 to the host interface module 1X or the disk interface module 2X.
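The arbitration and selection performed in the cache memory module 3X can be sketched as follows. The text does not state which arbitration policy the arbitration circuit 39 uses, so round-robin is assumed here purely for illustration; the class name CacheMemoryModule and the request tuples are likewise hypothetical.

```python
class CacheMemoryModule:
    """Toy model of cache memory module 3X with round-robin arbitration (assumed policy)."""

    def __init__(self, num_interfaces):
        self.buffers = [None] * num_interfaces  # one packet buffer 33 per data path interface
        self.memory = {}                        # stands in for cache memory 37
        self.last_granted = -1

    def receive(self, interface, request):
        # request: ("write", address, data) or ("read", address)
        self.buffers[interface] = request

    def arbitrate(self):
        """Pick the next interface with a pending request, or None if all are idle."""
        n = len(self.buffers)
        for offset in range(1, n + 1):
            candidate = (self.last_granted + offset) % n
            if self.buffers[candidate] is not None:
                self.last_granted = candidate
                return candidate
        return None

    def step(self):
        """Serve one pending request; return (interface, read data or None)."""
        selected = self.arbitrate()
        if selected is None:
            return None
        request, self.buffers[selected] = self.buffers[selected], None
        if request[0] == "write":
            self.memory[request[1]] = request[2]
            return selected, None
        # Read request: send the stored contents back toward the requester.
        return selected, self.memory.get(request[1])
```

Each call to step corresponds to one switching of the selector 38: one packet buffer is chosen, and its request is applied to the memory.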
When communicating with the resource management module 5X, the host interface module 1X and the disk interface module 2X perform packet transfer operations similar to those performed with regard to the cache memory module except for the use of the management interface signal 7 instead of the data interface signal 6. The resource management module 5X is formed with a structure similar to what is shown in FIG. 11 except for the cache memory module and the interface signal.
The cache memory module 3X and the resource management module 5X are resources shared by the system and accessed by the multiple host interface modules 1X and the disk interface modules 2X, and their accessibility is a major factor in system reliability. For this reason, a redundant architecture equipped with multiple elements having the same functions is provided. With this type of design, if there is a failure in one of the elements, the remaining operational elements can be used to continue operations. More specifically, if one of the processors 14 in the host interface module 1X or the disk interface module 2X detects a failure in one of the multiple cache memory modules 3X or the resource management modules 5X, the processor that detects the failure isolates the failed section, has the remaining cache memory modules 3X or resource management modules 5X take over its operations, and notifies all the other processors 14 of the failure. The processors receiving the failure notification update system architecture/communication routes based on the failure. This allows failed sections to be isolated in any of the host interface modules 1X and the disk interface modules 2X.
In the conventional disk control device 104 of FIG. 9, the updating of system architecture/communication routes in response to failures in shared resources, e.g., a cache memory module 3X or a resource management module 5X, is performed in a distributed manner by the processors in the multiple host interface modules 1X and the disk interface modules 2X. As a result, the handling of failures in shared resources requires complex processing, including broadcast communications to processors arranged in a distributed manner.
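The distributed failure handling described above can be sketched as follows. This is a simplified illustration under stated assumptions: the class Processor, its methods, the connect helper, and the rule for choosing a surviving module are hypothetical and stand in for the control programs of the processors 14.

```python
class Processor:
    """Toy model of a processor 14; each one keeps its own view of usable modules."""

    def __init__(self, name, modules):
        self.name = name
        self.active = set(modules)   # shared modules this processor may route to
        self.peers = []              # all other processors in the device

    def on_failure_detected(self, failed):
        """Isolate the failed module locally, then broadcast to every peer."""
        self.active.discard(failed)
        for peer in self.peers:
            peer.on_failure_notified(failed)

    def on_failure_notified(self, failed):
        """Update this processor's own routes on receiving the notification."""
        self.active.discard(failed)

    def route(self, preferred):
        """Use the preferred module if still active, else a surviving one."""
        if preferred in self.active:
            return preferred
        return min(self.active)      # assumed rule: deterministic choice among survivors


def connect(processors):
    """Give every processor a list of all the others (hypothetical helper)."""
    for p in processors:
        p.peers = [q for q in processors if q is not p]
```

The sketch makes the cost visible: a single failure triggers one route update per processor, and every processor must carry the logic for both sending and receiving the notification.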
In another conventional technology to improve reliability in disk control devices, a failure processing mechanism provides high-availability network communication between shared system resources and system resource clients (see, e.g., Japanese laid-open patent publication number 2002-41348; hereinafter, patent document 1). As in the conventional technology described above, this conventional technology involves updating routing tables for each of multiple processors.
Another proposed conventional technology for increasing availability of disk control devices (see, e.g., Japanese laid-open patent publication number 2000-242434; hereinafter, patent document 2) is a storage device system interposed between a host computer and a disk array subset and equipped with a switch performing address conversions between the two elements. In this conventional technology, a failure in one of multiple disk array subsets is handled within the switch, which interprets packets and modifies requests addressed to the failed section so that their destinations are changed to redundant sections having equivalent functions.
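The destination rewriting performed by such a switch can be sketched as follows. The sketch is illustrative only and is not the implementation of the cited publication: the class RedirectingSwitch, the mirrors mapping, and the unit names are assumptions introduced for the example.

```python
class RedirectingSwitch:
    """Toy model of a switch that rewrites the destinations of failed units."""

    def __init__(self, mirrors):
        # mirrors: unit -> redundant unit with an equivalent function (hypothetical mapping)
        self.mirrors = dict(mirrors)
        self.failed = set()

    def mark_failed(self, unit):
        self.failed.add(unit)

    def forward(self, dest, payload):
        # The destination is interpreted for every packet, even when nothing has failed.
        if dest in self.failed:
            dest = self.mirrors[dest]
        return dest, payload
```

Here the senders need no routing changes, but the destination check in forward runs for every packet, including during normal operation.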
Failures in shared resources, e.g., cache memory modules or resource management modules, can lead to malfunctions in applications executed by the host computer and must therefore be accompanied by quick recovery operations. However, the conventional technologies shown in FIG. 9, FIG. 10, FIG. 11, and FIG. 12 all require routing changes for the host interface modules 1X and the disk interface modules 2X. This makes failure handling time-consuming, prevents continuation of read/write tasks from the host computer, and can lead to performance degradation in the storage system and malfunctions in application programs. Also, this failure processing requires high-performance processors and complex control programs in the host interface modules 1X and the disk interface modules 2X, leading to increased production costs and decreased reliability. Similar problems are involved in the case of the conventional technology described in patent document 1, since it requires changes to be made in routing tables for multiple processors.
In the conventional technology disclosed in patent document 2, a switch with a function for changing packet destinations can be used so that processing within the switch can handle failures, e.g., by having multiple disk array subsets take over functions from each other. However, this involves interpreting the destination of each packet, which requires time-consuming processing during normal operations as well as when a failure takes place. This leads to degraded performance in the storage system.