Conventionally, partition technology is known as a method for configuring a server. With partition technology, resources (CPUs; chip sets including a memory controller, an I/O controller, etc.; memories; I/Os; etc.) of a server are divided into a plurality of partitions, in each of which an OS (operating system) and an application on the OS can be operated independently.
Partitions exist in three forms: a physical partition, a logic partition, and a resource partition. Among these forms, the physical partition is a form for electrically dividing the entire server into units of system boards (hereinafter referred to as SBs).
In the server, a plurality of physical partitions can be operated. The minimum configuration unit of physical partitions is an SB. Each SB can be operated as an independent server. Each physical partition is completely electrically divided. Accordingly, physical partitions have an advantage in that a hardware fault of one partition does not affect other partitions. Each SB is a board on which a CPU, a memory, a chip set, etc. are mounted, and can be inserted into or extracted from the housing (rack, etc.) of partitions.
In the meantime, a plurality of logic partitions can be operated in the server. Each logic partition includes a logic block that can independently operate an OS. Each logic block includes a CPU; a chip set including a memory controller, an I/O controller, etc.; a memory; and the like. If the CPU is a multi-core CPU such as a CMP (Chip Multi-Processor), etc., a logic partition can be configured in units of CPU cores as a minimum configuration unit of a logic block.
FIG. 1 illustrates an example of a system configuration of a server taking a physical partition form.
The server 10 illustrated in FIG. 1 includes an MMB (Management Board) 11 that is a kind of service processor (SVP) serving as a system management unit of an information processing apparatus, eight SBs 12 (SB#0 to SB#7), eight I/O boards 13 (IOU#0 to IOU#7), a cross-bar switch 14, an SMBus (System Management Bus) 15, and the like. The MMB 11, the SBs 12, and the I/O boards 13 are interconnected by the SMBus 15 and the like. The SMBus 15 and the like are also connected to the cross-bar switch 14, which is connected to all the SBs 12 and the I/O boards 13 within the system.
In the server 10 configured as described above, the eight SBs 12 and the eight I/O boards 13 configure one physical partition (hereinafter denoted as a partition).
The MMB 11 has the configuration information of the partition, and sets a partition ID (PID) in each of the SBs 12 and each of the I/O boards 13 before the SBs 12 and the I/O boards 13 are activated. This setting can be made for only one SB 12 or one I/O board 13 at a time.
The SBs 12 and the I/O boards 13 can exchange data via the cross-bar switch 14. This data exchange is made with packets. When transmitting a packet, an SB 12 or an I/O board 13 assigns its partition ID to the packet. The SBs 12 and the I/O boards 13 receive packets transmitted from the other SBs 12 and I/O boards 13 via the cross-bar switch 14, and ignore a packet if the partition ID assigned to it is not the same as the partition ID of the local SB 12 or I/O board 13.
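The partition ID filtering described above can be sketched as follows. This is a minimal illustration only, not the actual hardware behavior: the `Packet` and `Board` classes, the `broadcast` helper, and the payload string are hypothetical names introduced for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    partition_id: int  # PID assigned by the sending board
    payload: str


class Board:
    """Hypothetical stand-in for an SB or an I/O board."""

    def __init__(self, name, partition_id):
        self.name = name
        self.partition_id = partition_id  # assumed to be set by the MMB before activation
        self.accepted = []

    def send(self, payload):
        # The sender stamps its own partition ID on every packet.
        return Packet(self.partition_id, payload)

    def receive(self, packet):
        # A packet whose PID differs from the local PID is ignored.
        if packet.partition_id != self.partition_id:
            return False
        self.accepted.append(packet.payload)
        return True


def broadcast(packet, boards):
    """Deliver the packet to every board and return the names of those that accepted it."""
    return [b.name for b in boards if b.receive(packet)]


sb0 = Board("SB#0", partition_id=0)
sb1 = Board("SB#1", partition_id=0)
sb2 = Board("SB#2", partition_id=1)

# SB#0 sends; the cross-bar switch is modeled as delivery to the other boards.
accepted_by = broadcast(sb0.send("hello"), [sb1, sb2])
```

In this sketch only SB#1, which shares partition ID 0 with the sender, keeps the packet; SB#2 drops it.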
A function to dynamically reconfigure the system by adding (inserting), replacing, or removing (extracting) an SB 12 or an I/O board 13 during partition operations in the server configured as described above is called dynamic reconfiguration (DR). In dynamic reconfiguration, the registers of chip sets of the SBs 12 or the I/O boards 13 are required to match in order to maintain the coherency of the system.
In the meantime, a function to reconfigure the system by adding (inserting), replacing, or removing (extracting) an SB 12 or an I/O board 13 while the system or the partition is suspended with the power halted is called static reconfiguration (SR).
“Reconfiguration” is assumed to include both the dynamic reconfiguration (DR) and the static reconfiguration (SR).
Each SB 12 includes a register whose value varies according to the data flowing during system operations. For example, if an SB 12 is newly added to a partition by DR, the value of the register (of the chip set) within the SB 12 to be added does not match that of the register (of the chip set) within the currently operating SB 12 in the partition. Accordingly, to implement DR, the value of the register within the SB 12 to be newly added to the partition and that of the register within the currently operating SB 12 in the partition are required to be made to match at the same timing.
To make the values match, one conceivable method is for the MMB 11 to rewrite the values of the registers simultaneously. As described above, however, the MMB 11 can make a setting for only one SB 12 or one I/O board 13 at a time, and therefore cannot simultaneously rewrite the values of the plurality of registers.
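The consequence of rewriting one register at a time can be illustrated with a small sketch. It assumes, purely for illustration, that the MMB writes a common value into each register in turn; the function name `sequential_rewrite` and the register values are hypothetical.

```python
def sequential_rewrite(registers, new_value):
    """Rewrite the registers one at a time, recording after each write
    whether all registers currently hold the same value."""
    coherent_after_each_write = []
    for name in registers:
        registers[name] = new_value  # only one register can be written per step
        all_match = len(set(registers.values())) == 1
        coherent_after_each_write.append(all_match)
    return coherent_after_each_write


# Two operating SBs hold the same value; the SB to be added holds a different one.
registers = {"SB#0": 0xA5, "SB#1": 0xA5, "SB#n": 0x00}
history = sequential_rewrite(registers, 0x3C)
```

Here `history` is `[False, False, True]`: the registers agree only after the last write, so there is a window during which the values within the partition mismatch.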
FIG. 2 illustrates an example of a hardware configuration of a server taking a conventional physical partition form.
The server illustrated in FIG. 2 includes two partitions 100 (Partition #0) and 200 (Partition #1), an MMB 400, a switch 500, and a cross-bar switch 600. The partition 100 accommodates three SBs 110, 120 and 130. The partition 200 accommodates one SB 210. All the SBs 110, 120 and 130 within the partition 100 have the same configuration. Accordingly, the configuration of the SB 110 is described here. In the reference numerals assigned to the components of the SBs 120 and 130, the sub number hyphenated to the main number (for example, the “01” following the “113” in the partition ID holding circuit 113-01) is changed so that the components of the individual SBs can be distinguished, as illustrated in FIG. 2.
The SB 110 includes a register 111R, the partition ID holding circuit 113-01, a decoder 114-01, a packet issue timing circuit 115-01, a packet issue circuit 116-01, a packet arbiter 117-01, a decoder 118-01, and a to-different-circuit 119-01 (hereinafter denoted as a different circuit 119-01). In FIG. 2, a register 121R of the SB 120, a register 131R of the SB 130, and a register 211R of the SB 210 are denoted with different reference numerals in terms of their relationship with the descriptions of FIGS. 3 to 5 to be described later. However, these registers have the same configuration from a hardware viewpoint.
The register 111R is a register within the chip set. It is required to be initialized when the system is reconfigured (whether dynamically or statically) by newly inserting an SB into, or extracting an SB from, the partition to which the SB including the chip set belongs. The register 111R is cleared by an externally input reset signal (system reset signal). The partition ID holding circuit 113-01 holds a partition ID that is assigned to each partition by the MMB 400 before the SB 110 is activated. The partition ID holding circuit 113-01 is, for example, a register. The packet issue timing circuit 115-01 instructs the packet issue circuit 116-01 which packet to issue. The packet issue circuit 116-01 generates the packet corresponding to the instruction, and outputs the generated packet to the packet arbiter 117-01. The packet arbiter 117-01, to which packets from the packet issue circuit 116-01 and a different circuit (not illustrated) are input, arbitrates the packets according to their priorities, etc. Then, the packet arbiter 117-01 transmits the packets to an arbiter 601 provided within the cross-bar switch 600 according to arbitration results.
The arbiter 601 receives the packets from packet arbiters 117 of the SBs within the system, and arbitrates the packets according to their priorities, etc. Then, the arbiter 601 transmits the packets to the SBs within the system according to arbitration results. The transmission of the packets is made, for example, by broadcasting.
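Priority-based arbitration of this kind can be sketched as follows. The text does not specify the arbitration policy, so this sketch assumes that a smaller number means a higher priority and that ties are broken by arrival order; the `Arbiter` class and the packet labels are hypothetical.

```python
import heapq
from itertools import count


class Arbiter:
    """Hypothetical arbiter: grants packets in priority order,
    breaking ties by arrival order (both rules are assumptions)."""

    def __init__(self):
        self._queue = []
        self._arrival = count()  # monotonically increasing arrival number

    def submit(self, priority, packet):
        # Smaller priority value is granted first (assumed convention).
        heapq.heappush(self._queue, (priority, next(self._arrival), packet))

    def drain(self):
        # Yield packets in the order in which they win arbitration.
        while self._queue:
            _, _, packet = heapq.heappop(self._queue)
            yield packet


arbiter = Arbiter()
arbiter.submit(2, "pkt-from-SB#1")
arbiter.submit(1, "pkt-from-SB#0")
arbiter.submit(2, "pkt-from-SB#2")

granted = list(arbiter.drain())
```

In this sketch the packet from SB#0 is granted first because of its priority, and the two remaining packets follow in arrival order. Each granted packet would then be broadcast to the SBs within the system.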
The decoder 114-01 of the SB 110 receives a packet transmitted from the arbiter 601, and determines whether or not the packet is addressed to the local SB. This determination is made by comparing the partition ID assigned to the received packet with the partition ID held in the partition ID holding circuit 113-01. If the two IDs match, the decoder 114-01 determines that the received packet is addressed to the local SB and transmits the packet to the different circuit 119-01. If the received packet is not addressed to the local SB, the decoder 114-01 discards the packet. The decoder 118-01 receives an instruction transmitted from the MMB 400 via the switch 500. Then, the decoder 118-01 decodes the instruction to generate a control signal, and outputs the control signal to the different circuit 119-01. The different circuit 119-01 executes the process corresponding to the control signal.
The MMB 400 is a unit for managing the system, and manages information (system configuration information) about the configuration of the system, such as configuration information of each partition within the system, and the like. The MMB 400 sets a partition ID in each SB or I/O board (not illustrated) before the SB and the I/O board are activated. This setting is made via the switch 500. Namely, the MMB 400 outputs, to the switch 500, an instruction to set a partition ID in each SB and each I/O board within the system. This instruction is sequentially issued to the individual SBs and I/O boards, and transmitted by the switch 500 to the SBs and the I/O boards within the system. Moreover, the MMB 400 sets or updates the value of the register of each SB and each I/O board within the system. The setting or updating of the value of the register is also made by individually transmitting the instruction to the SBs and the I/O boards via the switch 500.
The switch 500 transmits the instruction issued from the MMB 400 to the SBs within the partitions via the SMBus, etc. (not illustrated). The cross-bar switch 600 is a communication path for exchanging a message between SBs and between an SB and an I/O board. The cross-bar switch 600 includes the arbiter 601. The arbiter 601, to which packets transmitted from the SBs within the system are input, transmits the packets to the SBs while arbitrating them. In the SBs, the packets are input to the decoder 114, which then decodes the packets.
FIGS. 3 to 5 illustrate a DR method of the server taking the conventional physical partition form, and the problem with it. In FIGS. 3 to 5, the same components as those illustrated in FIG. 2 are denoted with the same reference numerals. In the descriptions of FIGS. 3 to 5 to be provided later, the same components as those of the SBs are denoted only with main numbers for the sake of convenience.
(I) Before an SB is Embedded
Assume that the SB 130 (SB#n) is newly embedded (added) into the partition 100 of the server that includes the partitions 100 (Partition#0) and 200 (Partition#1), as illustrated in FIG. 3. Each SB of each of the partitions includes two CPUs and one chip set. In this example, the CPU 112 within the SB 110 is a dual core CPU including two CPU cores (the spheres in FIG. 3). Also, the other SBs include a CPU having a similar configuration. Moreover, the server includes the cross-bar switch (Xbar) 600.
FIG. 3 illustrates the state before the SB 130 is added to the partition 100. As illustrated in FIG. 3, all the values of the registers of the chip sets in the SBs within the partition 100 match before the SB 130 is embedded into the partition 100. Namely, the value of the register 111R within the chip set 111 of the SB 110 (SB#0) and that of the register 121R of the chip set 121 within the SB 120 (SB#1) match. In contrast, the value of the register 211R within the chip set 211 of the SB 210 of the partition 200 and those of the registers 111R and 121R within the partition 100 mismatch. However, since the partitions are different, this mismatch is not a problem from a system viewpoint. Additionally, the CPU 132 within the SB 130 is put in a suspended state.
(II) During Procedures for Embedding the SB
In the state illustrated in FIG. 3, the SB 130 is embedded (added) into the partition 100 as illustrated in FIG. 4. The CPU 132 within the SB 130 is held in a suspended state when the SB 130 is embedded. In the initial state where the SB 130 is embedded into the partition 100, the values of the registers 111R and 121R within the chip sets 111 and 121 of the SBs 110 and 120 and that of the register 131R within the chip set 131 of the SB 130 do not match in the partition 100. However, since the CPU 132 within the SB 130 is being suspended, this is not a problem from a system viewpoint.
(III) Completion of Embedding the SB
Then, the operations of the CPU 132 within the SB 130 are started to complete the embedding of the SB 130 into the partition 100 as illustrated in FIG. 5. At this time, the value of the register 111R within the chip set 111 of the SB 110 and that of the register 121R within the chip set 121 of the SB 120 match. However, the value of the register 131R within the chip set 131 of the SB 130 does not match those of the above described registers. Accordingly, the server might enter a suspended state during system operations.
As described above, DR of the server taking the conventional physical partition form has the problem wherein the server might enter a suspended state during system operations if an SB is newly embedded (added) into a partition.
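The three stages (I) to (III) can be traced with a short sketch. It models only the condition stated above, namely that a register mismatch matters when the SBs belong to the same partition and their CPUs are operating. The `SystemBoard` class, the `has_hazard` function, the register values, and the use of -1 for an SB not yet assigned to a partition are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SystemBoard:
    name: str
    partition_id: int  # -1 is a hypothetical marker for "not yet in a partition"
    register: int      # chip set register value
    cpu_running: bool


def has_hazard(boards, partition_id):
    """True if operating SBs in the given partition hold different register values."""
    values = {
        b.register
        for b in boards
        if b.partition_id == partition_id and b.cpu_running
    }
    return len(values) > 1


sb110 = SystemBoard("SB 110", 0, 0xA5, True)
sb120 = SystemBoard("SB 120", 0, 0xA5, True)
sb130 = SystemBoard("SB 130", -1, 0x00, False)
sb210 = SystemBoard("SB 210", 1, 0x77, True)
boards = [sb110, sb120, sb130, sb210]

stage1 = has_hazard(boards, 0)  # (I) before the SB 130 is embedded

sb130.partition_id = 0
stage2 = has_hazard(boards, 0)  # (II) embedded, CPU 132 still suspended

sb130.cpu_running = True
stage3 = has_hazard(boards, 0)  # (III) CPU 132 started with a mismatched register
```

In this sketch the hazard appears only at stage (III). The differing value in the SB 210 never counts, because that SB belongs to the partition 200.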
In the meantime, the following techniques are known as techniques similar to the present invention.
The first known technique is the invention related to the connection verification method used at the time of dynamic reconfiguration of a computer system (see Patent Document 1).
The second known technique is the invention related to the technique for dynamically configuring an interconnection within a computer system. According to this invention, a predetermined condition of a trigger for reconfiguring a computer system is detected, and the mode of a signal path affected by the condition is dynamically reconfigured according to the detected condition (see Patent Document 2).
The third known technique is the invention related to the dynamic reconfiguration of a user interface of a functional module of a control platform (see Patent Document 3).
Patent Document 1: Japanese Laid-open Patent Publication No. H08-095820
Patent Document 2: Japanese Laid-open Patent Publication No. 2003-178044
Patent Document 3: Japanese Laid-open Patent Publication No. 2006-172483