1. Field of the Invention
The present invention relates to a cluster system including a plurality of information processing units (nodes) and to a load distribution method, and more particularly to a cluster system having a switch for changing the connection between each of the plurality of nodes and an I/O slot, as well as to a load distribution method employed in the cluster system.
2. Description of the Related Art
In recent years, there has been an increasing demand for cluster systems that include many information processing units (nodes), so that processing can continue without stopping even when an error occurs and processing performance is improved. In such a cluster system, the load distribution method, that is, how jobs and tasks are distributed among the respective nodes, becomes important.
In conventional cluster systems, the mainstream approach has been to distribute the load based on processor resources. To realize flexible load distribution, the nodes have been required to have the same hardware configuration. Specifically, if there is a storage that can be accessed by only one node, no other node can access that storage when a load is concentrated on that node and the storage, so the other nodes cannot take over the load. To avoid this problem, every node has to be capable of accessing all storages. In conventional systems, however, the connection between each I/O adapter (e.g., PCI slot) and each bridge (e.g., PCI bridge) is fixed in the I/O device configuration of each node. Consequently, the same number of adapters has had to be provided for all the nodes in order to distribute the load properly. As a result, the I/O device configuration has become highly redundant and costly.
On the other hand, a cluster system has been developed in which a switch is provided between the PCI bridge of each node and the PCI slots, so that the connection between a PCI bridge and each PCI slot can be changed freely and a more flexible I/O device configuration can be implemented. In this case, because the connection between each node and each PCI slot can be changed by controlling the switch, there is no need to prepare the same number of adapters as the number of nodes. Thus the adapters can be used efficiently. In such a cluster system, it is also expected that fewer adapters can be used efficiently in accordance with load changes of each node.
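For illustration only, the switched connection between nodes and PCI slots can be modeled as a reassignable mapping. The following is a minimal sketch, not an implementation of any actual system; the class name `SlotSwitch`, its methods, and the node and slot names are all hypothetical.

```python
from typing import Dict, List, Optional


class SlotSwitch:
    """Hypothetical model of a switch placed between node PCI bridges and PCI slots."""

    def __init__(self, slots: List[str]) -> None:
        # Each slot is connected to at most one node; None means unconnected.
        self._owner: Dict[str, Optional[str]] = {slot: None for slot in slots}

    def connect(self, slot: str, node: str) -> None:
        """Connect a slot to a node, replacing any previous connection of that slot."""
        if slot not in self._owner:
            raise KeyError(f"unknown slot: {slot}")
        self._owner[slot] = node

    def slots_of(self, node: str) -> List[str]:
        """Return the slots currently connected to the given node, sorted by name."""
        return sorted(s for s, n in self._owner.items() if n == node)


# Three adapters (slots) shared by two nodes (assumed example configuration).
switch = SlotSwitch(["slot0", "slot1", "slot2"])
switch.connect("slot0", "nodeA")
switch.connect("slot1", "nodeA")
switch.connect("slot2", "nodeB")
before = switch.slots_of("nodeB")

# When the load on nodeB grows, one slot is moved from nodeA to nodeB
# by controlling the switch, without adding another adapter.
switch.connect("slot1", "nodeB")
after = switch.slots_of("nodeB")
```

In this sketch the number of adapters stays fixed at three while their assignment follows the load, which is the flexibility the switched configuration is intended to provide.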
In all of the load distribution methods according to the conventional technology, the load to be distributed is determined according to processor resources. Consequently, if a problem arises in the transfer path of input/output data from an I/O device to a processor, the load is not always distributed properly. For example, if a load is assigned to a processor that has spare capacity while the data transfer path leading to the I/O slot of that node is congested, the processor might not be able to process the load. This is why the system performance is not necessarily improved even if load distribution is made according to processor resources.
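The shortcoming described above can be illustrated with a small sketch. It assumes hypothetical utilization figures for two nodes and compares a selection based only on processor resources with a selection that also takes the I/O transfer path into account. The second selection rule is shown only for contrast and is not presented as the method of any cited system; all names and metrics are assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    name: str
    cpu_utilization: float      # 0.0 to 1.0 (assumed metric)
    io_path_utilization: float  # 0.0 to 1.0 (assumed metric for the path to the I/O slot)


def pick_by_cpu(nodes: List[Node]) -> Node:
    """Conventional style: choose the node with the lowest processor utilization."""
    return min(nodes, key=lambda n: n.cpu_utilization)


def pick_by_bottleneck(nodes: List[Node]) -> Node:
    """Contrast case: choose the node whose busiest resource (CPU or I/O path) is least loaded."""
    return min(nodes, key=lambda n: max(n.cpu_utilization, n.io_path_utilization))


# nodeA has an idle processor but a congested I/O path (hypothetical values).
nodes = [
    Node("nodeA", cpu_utilization=0.20, io_path_utilization=0.95),
    Node("nodeB", cpu_utilization=0.50, io_path_utilization=0.30),
]

cpu_choice = pick_by_cpu(nodes).name                # selects nodeA
bottleneck_choice = pick_by_bottleneck(nodes).name  # selects nodeB
```

With these assumed figures, the processor-only rule sends the load to nodeA even though its I/O path is nearly saturated, which is the situation in which the processor might not be able to process the load.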
Furthermore, if adapter cards are provided so that every node can access every I/O device in order to realize flexible load distribution as described above, the utilization efficiency of the cards falls and the cost rises. Conversely, if only the necessary number of adapters is provided to lower the cost, it is difficult to appropriately process a load that changes from time to time.
Other conventional techniques are disclosed in the following patent documents. JP 2002-163241A discloses a client-server system that dynamically reconfigures resources on the service provider side according to changes in demand. JP 1993-089064A discloses a computer system having a load management unit that communicates with a host computer through a plurality of device control units, thereby monitoring the load state of each of those device control units. This load management unit changes the device control unit or device that communicates with the host computer according to the load state of the device control unit. JP 1995-250085A discloses a load distribution method for buses in a data communication apparatus. This data communication apparatus includes a plurality of modules, a plurality of buses, and a controller for selecting a bus to be connected to a module according to the traffic volume of each module. JP 1997-016534A discloses a distributed processing method employed for a plurality of distributed, network-connected computers. According to this method, jobs are distributed to and executed in server processes according to information related to hardware resources, such as the static performance of each computer and changes in the dynamic load state, as well as information related to the hardware environment in the computer environment. Furthermore, JP 1999-065727A discloses a computer that executes load distribution among I/O buses by changing the connection of an I/O slot to a given I/O bus.