With the recent trend in computer systems toward multi-processor and multi-core technologies, it has become common to use a multi-partition computer system, in which a single system is divided into a plurality of subsystems that are independent of each other. An important challenge for a multi-partition computer system is to prevent a failure that has occurred in one of its partitions from propagating to the other, normal partitions.
Examples of related art that prevent a failure in one partition of a multi-partition computer system from propagating to the other, normal partitions include the three methods described below.
The first example is the method disclosed in Patent Literature 1 (Japanese Patent Laying-Open No. 2006-260325). In this method, a faulty partition in which an error has been detected notifies the resources that belong to its own partition (e.g. a processor, memory and I/O) of the error and causes them to stop operating, thereby preventing an incorrect packet from flowing out of the faulty partition. As means of notifying the error, the method proposes several measures: one that uses a dedicated line, one that uses a packet dedicated to the error notification function, and one that notifies via a service processor.
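The stop-on-notification idea of the first method can be illustrated with a minimal sketch. This is not the implementation of Patent Literature 1; the names `Resource`, `on_error_notification` and `notify_partition` are hypothetical, and the notification path (dedicated line, dedicated packet or service processor) is abstracted into a plain function call.

```python
class Resource:
    """A processor, memory or I/O unit that belongs to one partition (hypothetical model)."""

    def __init__(self, name, partition):
        self.name = name
        self.partition = partition
        self.running = True

    def send(self, payload):
        # A stopped resource emits nothing, so no incorrect packet leaves it.
        return (self.name, payload) if self.running else None

    def on_error_notification(self):
        # Assumption: receiving the notification simply halts the resource.
        self.running = False


def notify_partition(resources, faulty_partition):
    """Notify every running resource of the faulty partition and return the names stopped."""
    stopped = []
    for resource in resources:
        if resource.partition == faulty_partition and resource.running:
            resource.on_error_notification()
            stopped.append(resource.name)
    return stopped
```

In this sketch a resource keeps sending until `on_error_notification` reaches it, so any packet emitted between error detection and notification is not blocked.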
The second example is the method disclosed in Patent Literature 2 (Japanese Patent Laying-Open No. 2005-122229). This method provides, within a common part shared by the partitions (for example, a crossbar), a management table for managing the resources contained in the partitions. The method monitors packets, looks up the management table based on the destination address, source address or the like of each packet, and thereby prevents an incorrect packet from flowing out from the faulty partition to the normal partitions.
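A table-based check of this kind might look like the following minimal sketch. It is an illustration under stated assumptions, not the mechanism of Patent Literature 2: the `Packet` and `Crossbar` names, the integer addresses, and the rule "forward only when source and destination map to the same partition" are all assumed for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    source: int       # address of the sending resource
    destination: int  # address of the receiving resource


class Crossbar:
    """Shared component holding one management table for all partitions (hypothetical)."""

    def __init__(self, resource_to_partition):
        # Management table: resource address -> partition number.
        self.table = dict(resource_to_partition)

    def allows(self, packet):
        src = self.table.get(packet.source)
        dst = self.table.get(packet.destination)
        # Assumed policy: unknown addresses and cross-partition packets are rejected.
        return src is not None and src == dst

    def forward(self, packet):
        # Return the packet if it may pass, otherwise drop it.
        return packet if self.allows(packet) else None


def demo():
    xbar = Crossbar({0x10: 0, 0x11: 0, 0x20: 1, 0x21: 1})
    return [
        xbar.allows(Packet(0x10, 0x11)),  # same partition
        xbar.allows(Packet(0x10, 0x20)),  # crosses a partition border
        xbar.allows(Packet(0x10, 0x99)),  # destination not in the table
    ]
```

Note that the table needs one entry per resource, so it grows with the number of resources in the system, and every entry must be set correctly for the check to be reliable.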
The third example is the method disclosed in Patent Literature 3 (Japanese Patent Laying-Open No. 2000-235558). This method provides a management table for managing the resources contained in the partitions on the side of each resource contained in the partitions. Using a firewall, it monitors packets, looks up the management table based on the destination address, source address or the like of each packet, and thereby prevents an incorrect packet from flowing out from the faulty partition to the normal partitions.
Patent Literature 1: Japanese Patent Laying-Open No. 2006-260325
Patent Literature 2: Japanese Patent Laying-Open No. 2005-122229
Patent Literature 3: Japanese Patent Laying-Open No. 2000-235558
The first method, disclosed in Patent Literature 1, is problematic in that a long time elapses between the occurrence of a failure and detection of the error, on the one hand, and notification of the error and stoppage of operations, on the other. During this period, incorrect packets may propagate to normal partitions and adversely affect them. A system based on this method is also costly to build, because the method requires a complex mechanism to realize the error notification and operation stoppage functions.
A problem common to the second and third methods, disclosed in Patent Literatures 2 and 3 respectively, is attributable to the use of a management table to manage the resources contained in the partitions (e.g. processor, memory and I/O). As the system grows in size, the management table becomes larger, making these methods more costly to realize. Another problem is that, as the system grows in size, the management table becomes more complex to set up, which increases the likelihood of errors in its settings. If the management table is set incorrectly, incorrect packets may propagate to normal partitions and adversely affect them. Furthermore, when partition borders are changed dynamically, updating the management table can be extremely cumbersome.