1. Field of the Invention
The present invention relates to a system control apparatus and method for implementing snooping in a large-scale information processing system.
2. Description of the Related Art
Consider a multiprocessor system, that is, a large-scale information processing system incorporating a large number of CPUs and IO devices. In the following, a conventional server will be explained as an example of such a multiprocessor system. FIG. 5 is a block diagram showing an example of a conventional server architecture. This server has a plurality of SBs (system boards) 101a and 101b; here, a server having two SBs is explained. The SB 101a has CPUs (central processing units) 2a and 2b, IO (input/output) devices 3a and 3b, MEMs (cache memories) 4a, 4b, 4c and 4d, and an SC (system controller) 105a. Similarly, the SB 101b has CPUs 2c and 2d, IO devices 3c and 3d, MEMs 4e, 4f, 4g and 4h, and an SC 105b. Thus, each SB contains one SC and a plurality of CPUs, IO devices and cache memories that the SC is in charge of. The SCs 105a and 105b control memory access from their associated CPUs and IO devices, as well as communications between the SCs.
FIG. 6 is a block diagram showing an example of a conventional SC architecture. The following describes the SC 105a; the SC 105b has the same architecture. The SC 105a has a plurality of LPTs (local ports) 111, a broadcast output section 12, a broadcast input section 21, a plurality of GPTs (global ports) 122, and a snoop control section 123. The SC 105a performs snooping in order to check the status of cache memory and the status of resources related to data transfer. The flow of an operation relating to snooping is as follows.
FIG. 7 is a timing chart showing an example of an operation relating to a fetch request in the conventional server. The horizontal axes represent time; from top to bottom, they show the respective operations of the CPU 2a, the CPU 2b, the SC 105a and the SC 105b. In this example, snooping starts in response to issuance of a fetch request from the CPU 2a and terminates normally.
First of all, the SC 105a receives a memory access request from a CPU or IO device that it is in charge of. Then, the SC 105a sets the memory access request to a local port 111 therein. In the example shown in FIG. 7, a fetch request is issued from the CPU 2a, and the fetch request is set to a local port 111 of the SC 105a. Further, in order to check the status of cache memories belonging to all SCs with respect to data as a target of the memory access, the broadcast output section 12 broadcasts the memory access request set at the local port 111 to all the other SCs as a broadcast request (BC request). When broadcast, the memory access request at the local port 111 is reset.
The broadcast memory access request is received by the broadcast input section 21 in each SC and set to a global port 122. In the example shown in FIG. 7, the BC request broadcast from the broadcast output section 12 in the SC 105a is received by the broadcast input section 21 in the SC 105b. In all the SCs, the identical memory access request is selected from a plurality of global ports 122, and the snoop control sections 123 in all the SCs perform snooping simultaneously. Thereafter, the SCs communicate the respective check results to each other as CST (cache status information), and the snoop control section 123 comprehensively judges the CST from all the SCs and decides the final operation for the memory access request. Snooping between the SCs is performed synchronously in all the SCs, and the CST from all the SCs is received at a fixed timing, thereby facilitating control.
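The flow described above (setting a request at a local port, broadcasting it, snooping in every SC, and judging the collected CST) can be sketched in software as follows. This is only a minimal illustrative model, not the hardware of FIG. 6: the class and function names, the "hit"/"miss" status values, the delivery of the broadcast to the source SC's own global port, and the decision rule in `decide` are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    source: str    # issuing CPU or IO device, e.g. "CPU 2a"
    kind: str      # e.g. "fetch"
    address: int   # target address of the memory access


class SystemController:
    """Toy model of one SC; all names here are illustrative assumptions."""

    def __init__(self, name, cached_addresses):
        self.name = name
        self.cached = set(cached_addresses)  # addresses held in caches this SC manages
        self.local_port = None               # request waiting to be broadcast
        self.global_ports = []               # requests received by broadcast

    def broadcast(self, all_scs):
        """Send the local-port request to every SC, then reset the local port."""
        request = self.local_port
        for sc in all_scs:
            sc.global_ports.append(request)
        self.local_port = None
        return request

    def snoop(self):
        """Check the oldest global-port request against this SC's caches."""
        request = self.global_ports[0]
        return "hit" if request.address in self.cached else "miss"


def decide(csts):
    """Judge the CST from all SCs together (the rule itself is an assumption)."""
    hits = [name for name, status in csts.items() if status == "hit"]
    return ("copy_from_cache", hits[0]) if hits else ("read_memory", None)


sc_a = SystemController("SC 105a", cached_addresses=[0x1000])
sc_b = SystemController("SC 105b", cached_addresses=[])
scs = [sc_a, sc_b]

# A fetch request from the CPU 2a is set at a local port of the SC 105a.
sc_a.local_port = Request("CPU 2a", "fetch", 0x1000)
sc_a.broadcast(scs)

# Every SC snoops the same request; the results are exchanged as CST.
csts = {sc.name: sc.snoop() for sc in scs}
print(decide(csts))
```

In this sketch every SC evaluates the same request in the same step, which mirrors the synchronous snooping described above; real hardware would do this with fixed timing rather than a loop.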
Further, information concerning the memory access request being snooped is added to the CST. This prevents the snoop control section 123 from making an operation decision based on erroneous CST when an erroneous memory access request is selected from the global port 122 in a certain SC. The snoop control section 123 compares the memory access request information from all the SCs, thereby detecting a synchronization error in snooping between the SCs.
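The comparison of request information attached to the CST can be illustrated with a short sketch. The `TaggedCST` structure and its `request_id` field are hypothetical; the text does not specify what request information is attached or how it is encoded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedCST:
    sc_name: str      # SC that produced this status
    request_id: int   # hypothetical tag identifying the snooped request
    status: str       # e.g. "hit" or "miss"


def has_sync_error(csts):
    """Return True when the SCs did not all snoop the same request."""
    return len({cst.request_id for cst in csts}) > 1


in_sync = [TaggedCST("SC 105a", 7, "hit"), TaggedCST("SC 105b", 7, "miss")]
out_of_sync = [TaggedCST("SC 105a", 7, "hit"), TaggedCST("SC 105b", 9, "miss")]

print(has_sync_error(in_sync))
print(has_sync_error(out_of_sync))
```

The check only reports that the SCs disagree; it does not by itself identify which SC selected the wrong request.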
If the decision by the snoop control section 123 shows that the requested operation is unprocessable, owing to exclusive access control on the target address of the memory access request or to contention for resources, snooping is retried from the global port 122.
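As a rough illustration of this ordinary retry, the sketch below keeps retrying a request until the exclusive lock on its target address is released. The `locked_until` table, the notion of a numbered cycle, and the `max_cycles` bound are assumptions introduced for the example only.

```python
def snoop_with_retry(address, locked_until, max_cycles=16):
    """Retry snooping each cycle until the target address is no longer locked.

    locked_until maps an address to the first cycle at which its exclusive
    lock is released (a hypothetical stand-in for exclusive access control).
    Returns the cycle on which snooping succeeds, or None if the bound is hit.
    """
    for cycle in range(max_cycles):
        if cycle >= locked_until.get(address, 0):
            return cycle  # processable: the global port would be reset here
        # Unprocessable: the request stays in the global port and is retried.
    return None


print(snoop_with_retry(0x1000, {0x1000: 3}))  # address locked for three cycles
print(snoop_with_retry(0x2000, {0x1000: 3}))  # address not locked at all
```

The retry succeeds here because the request held in the global port is itself correct and only the surrounding conditions change between attempts.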
In the example shown in FIG. 7, the fetch request is judged to be processable by the snoop control section 123 in the SC 105a. The snoop control section 123 outputs a reset instruction to the global port 122 where the request has been set. Thereafter, the snoop control section 123 executes memory access processing according to the memory access request. In the example shown in FIG. 7, a copy request is sent to the CPU 2b, whose cache memory holds (hits) the requested data. The CPU 2b reads the cache memory and then sends a response to the SC 105a. The SC 105a sends a fetch response to the CPU 2a.
It should be noted that Japanese Patent Application Unexamined Publication (KOKAI) No. 2003-150573 (pp. 5-12, FIG. 1) is known, for example, as prior art related to the present invention.
With the conventional technique, however, error recovery is impossible when the snoop control section of the broadcast source detects a synchronization error between SCs, either because an error occurred during broadcast processing or because some error inside a global port that received the broadcast caused an erroneous memory access request to be selected and output to the snoop control section. In such a case, even if a retry is made from the global port as in ordinary retry processing, it is impossible to recover from the synchronization error because the cause of the error resides in the global port.
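The reason a retry from the global port cannot help can be seen in a small sketch. It assumes, purely for illustration, that a retry re-reads whatever request is held in the global port and that requests are identified by a hypothetical integer id.

```python
def retry_from_global_port(expected_id, stored_id, attempts=3):
    """Re-run the synchronization check using the request held in the global port.

    expected_id: request the other SCs are snooping.
    stored_id:   request actually held in this SC's global port.
    Returns one boolean per attempt (True = synchronization error detected).
    """
    results = []
    for _ in range(attempts):
        selected = stored_id  # a retry re-reads the same port contents
        results.append(selected != expected_id)
    return results


print(retry_from_global_port(expected_id=7, stored_id=9))  # erroneous port
print(retry_from_global_port(expected_id=7, stored_id=7))  # healthy port
```

Because each attempt starts from the same erroneous port contents, the mismatch is reproduced on every retry, unlike the ordinary case where the obstacle clears over time.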