Unix is a registered trademark of The Open Group. SCO and UnixWare are registered trademarks of The Santa Cruz Operation, Inc. Microsoft, Windows, Windows NT and/or other Microsoft products referenced herein are either trademarks or registered trademarks of Microsoft Corporation. Intel, Pentium, Pentium II Xeon, Pentium III Xeon, Merced and/or other Intel products referenced herein are either trademarks or registered trademarks of Intel Corporation.
This invention relates to multiprocessing data processing systems, and more particularly to symmetrical multiprocessor data processing systems that have a clustered processor architecture. More specifically, the present invention relates to a method and apparatus for booting such clustered multiprocessor systems, and for initiating execution of selected application processors within such systems.
Systems having multiple but coordinated processors were first developed and used in the context of mainframe computer systems. More recently, however, interest in multiprocessor systems has increased because of the relatively low cost and high performance of microprocessors, with the objective of replicating mainframe performance through the parallel use of multiple microprocessors.
A variety of architectures have been developed, including a symmetrical multiprocessing (“SMP”) architecture, which is used in many of today's workstation and server markets. In SMP systems, the processors have symmetrical access to all system resources such as memory, mass storage and I/O.
The operating system typically handles the assignment and coordination of tasks between the processors. Preferably the operating system distributes the workload relatively evenly among all available processors. Accordingly, the performance of many SMP systems may increase, at least theoretically, as more processor units are added. This highly sought-after design goal is called scalability.
One of the most significant design challenges in many multiprocessor systems is the routing and processing of interrupts. An interrupt may generally be described as an event that indicates that a certain condition exists somewhere in the system that requires the attention of at least one processor. The action taken by a processor in response to an interrupt is commonly referred to as the “servicing” or “handling” of the interrupt. Each interrupt typically has an identity that distinguishes it from the others. This identity is often referred to as the “vector” of the interrupt. The vector allows the servicing processor or processors to find the appropriate handler for the interrupt. When a processor accepts an interrupt, it uses the vector to locate the entry point of the handler in a pre-stored interrupt table.
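The vector lookup described above can be illustrated with a minimal sketch. The vector numbers and handler names below are hypothetical and chosen only for illustration; a real processor indexes a hardware-defined table rather than a Python dictionary.

```python
# Hypothetical handlers; each one "services" a single kind of interrupt.
def timer_handler():
    return "timer serviced"

def keyboard_handler():
    return "keyboard serviced"

# Pre-stored interrupt table: vector -> handler entry point.
# The vector values 0x20 and 0x21 are assumptions made for this sketch.
interrupt_table = {
    0x20: timer_handler,
    0x21: keyboard_handler,
}

def service_interrupt(vector):
    """Use the vector to locate the handler, then run it."""
    handler = interrupt_table[vector]
    return handler()
```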
In some multiprocessor systems, a central interrupt controller is provided to help route the interrupts from an interrupt source to an interrupt destination. In other systems, the interrupt control function is distributed throughout the system. In a distributed interrupt control architecture, one or more global interrupt controllers assume global, or system-level, functions such as, for example, I/O interrupt routing. A number of local interrupt controllers, each of which is associated with a corresponding processing unit, control local functions such as, for example, inter-processor interrupts. Both classes of interrupt controllers typically communicate over a common interrupt bus, and are collectively responsible for delivering interrupts from an interrupt source to an interrupt destination within the system.
Intel Corporation has published a Multiprocessor (MP) Specification (version 1.4) outlining the basic architecture of a standard multiprocessor system that uses Intel brand processors. Complying with the Intel Multiprocessor (MP) Specification may be desirable, particularly when Intel brand processors are used. According to the Intel Multiprocessor (MP) Specification (version 1.4), interrupts are routed using one or more Intel Advanced Programmable Interrupt Controllers (APICs). The APICs are configured into a distributed interrupt control architecture, as described above, where the interrupt control function is distributed between a number of local APIC and I/O APIC units. The local and I/O APIC units communicate over a common bus called an Interrupt Controller Communications (ICC) bus. There is one local APIC per processor and, depending on the total number of interrupt lines in an Intel MP compliant system, one or more I/O APICs. The APICs may be discrete components separate from the processors, or integrated with the processors.
The destination of an interrupt can be one, all, or a subset of the processors in the Intel MP compliant system. The sender specifies the destination of an interrupt in one of two destination modes: physical destination mode or logical destination mode. In physical destination mode, the destination processor is identified by a local APIC ID. The local APIC ID is then compared to the local APIC's actual physical ID, which is stored in a local APIC ID register within the local APIC. A bit-wise definition of the local APIC ID register is shown in FIG. 1. The local APIC ID register is loaded at power up by sampling configuration data that is driven onto pins of the processor. For the Intel P6 family processors, pins A11# and A12# and pins BR0# through BR3# are sampled. Up to 15 local APICs can be individually addressed in the physical destination mode.
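As a rough illustration of physical destination matching, the sketch below compares a destination ID against a local APIC ID. It assumes a 4-bit ID field in which the all-ones value is reserved as a broadcast address, which would account for the limit of 15 individually addressable local APICs; the function and constant names are hypothetical.

```python
ID_MASK = 0xF        # assumed 4-bit local APIC ID field
BROADCAST_ID = 0xF   # assumed: the all-ones ID addresses every local APIC

def accepts_physical(dest_id, local_apic_id):
    """Return True if the local APIC accepts a physically addressed message."""
    dest_id &= ID_MASK
    return dest_id == BROADCAST_ID or dest_id == (local_apic_id & ID_MASK)

# IDs that identify exactly one local APIC (everything except broadcast).
individual_ids = [i for i in range(ID_MASK + 1) if i != BROADCAST_ID]
```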
The logical destination mode can be used to increase the number of APICs, and thus, processors, that can be individually addressed by the system. In the logical destination mode, message destinations are identified using an 8-bit message destination address (MDA). The MDA is compared against the 8-bit logical APIC ID field of the APIC logical destination register (LDR). A bit-wise definition of the logical destination register is shown in FIG. 2.
A Destination Format Register (DFR) is used to define the interpretation of the logical destination information. A bit-wise definition of the destination format register is shown in FIG. 3. The DFR can be programmed for a flat model or a cluster model interrupt delivery mode. In the flat model delivery mode, bits 28 through 31 of the DFR are programmed to 1111. The MDA is then interpreted as a decoded address. This delivery mode allows the specification of arbitrary groups of local APICs by simply setting each APIC's corresponding bit to 1 in the corresponding LDR. Broadcast to all APICs is achieved by setting all 8 bits of the MDA to one. As can be seen, the flat model only allows up to 8 local APICs to coexist in the system. FIG. 4 is a block diagram of an illustrative multiprocessor system connected in accordance with the flat model delivery mode described in the Intel Multiprocessor (MP) Specification (version 1.4).
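The flat model comparison can be sketched as a bitwise AND between the 8-bit MDA and the 8-bit logical APIC ID. The function name and the one-bit-per-APIC assignment below are assumptions made for illustration only.

```python
def accepts_flat(mda, logical_id):
    """Flat model: accept when the MDA and the logical APIC ID share a set bit."""
    return (mda & logical_id & 0xFF) != 0

# Assumed setup: eight local APICs, each owning one bit of the 8-bit ID.
logical_ids = [1 << n for n in range(8)]

def targets(mda):
    """Indices of the local APICs that would accept a message with this MDA."""
    return [n for n, lid in enumerate(logical_ids) if accepts_flat(mda, lid)]
```

With this setup, an MDA of 0b00000101 selects the group made up of APICs 0 and 2, and an MDA of 0xFF reaches all eight.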
For the cluster model delivery mode, the DFR bits 28 through 31 are programmed to 0000. In this delivery mode, there are two basic connection approaches: a flat cluster approach and a hierarchical cluster approach. In the flat cluster approach, it is assumed that all clusters are connected to a single APIC bus (e.g., ICC bus). FIG. 5 is a block diagram of an illustrative multiprocessor system connected in accordance with the flat cluster model delivery mode described in the Intel Multiprocessor (MP) Specification (version 1.4). In this mode, bits 28 through 31 of the MDA contain the encoded address of the destination cluster. These bits are compared with bits 28 through 31 of the LDR (see FIG. 2) to determine if the local APIC is part of the cluster. Bits 24 through 27 of the MDA are compared with bits 24 through 27 of the LDR to identify the individual local APIC unit within the selected cluster.
Arbitrary sets of processors within a cluster can be specified by writing the target cluster address in bits 28 through 31 of the MDA and setting selected bits in bits 24 through 27 of the MDA, corresponding to the chosen members of the cluster. In this mode, 15 clusters (with cluster addresses of 0 through 14), each having 4 processors, can be specified in a message. The APIC arbitration ID, however, only supports 15 agents, and hence the total number of processors supported in the flat cluster mode is limited to 15.
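A minimal sketch of the cluster model comparison follows. It treats the MDA and the logical APIC ID as 8-bit values whose upper four bits (register bits 28 through 31) hold the encoded cluster address and whose lower four bits (register bits 24 through 27) hold a one-bit-per-member mask. The helper names are hypothetical, and any broadcast cluster address is deliberately not modeled.

```python
def make_id(cluster, member_mask):
    """Pack a 4-bit cluster address and a 4-bit member mask into 8 bits."""
    return ((cluster & 0xF) << 4) | (member_mask & 0xF)

def accepts_cluster(mda, logical_id):
    """Cluster model: the cluster address must match exactly, and at least
    one member bit must be set in both the MDA and the logical APIC ID."""
    same_cluster = ((mda >> 4) & 0xF) == ((logical_id >> 4) & 0xF)
    member_hit = (mda & logical_id & 0xF) != 0
    return same_cluster and member_hit

# Example: the local APIC that is member 2 of cluster 2.
apic = make_id(cluster=2, member_mask=0b0100)
```

A message addressed to members 1 and 2 of cluster 2 is accepted by this APIC, while a message to the same member bit in cluster 3 is not.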
The hierarchical cluster approach allows for an arbitrary hierarchical cluster network to be created by connecting different flat clusters via independent APIC buses. FIG. 6 is a block diagram of an illustrative multiprocessor system connected in accordance with the hierarchical cluster model delivery mode described in the Intel Multiprocessor (MP) specification (version 1.4). According to the MP specification, this mode requires a special cluster manager device within each cluster to handle the messages that are passed between clusters. The required special cluster manager devices are not part of the local or I/O APIC units. Instead, they are separately provided. In the hierarchical cluster mode, one cluster may contain up to 4 agents. Thus, when using 15 special cluster managers connected via a single APIC bus (e.g., ICC bus), each having 4 agents, a network of up to 60 APIC agents can be formed.
A problem that may occur when using the hierarchical cluster mode is that the state of the DFR, shown in FIG. 3, returns to all ones after a power-up reset, or after the execution of an INIT inter-processor interrupt (INIT IPI) instruction. As indicated above, when the DFR is set to all ones, the logical destination register, shown in FIG. 2, is interpreted according to the flat model delivery mode, which has a maximum configuration of 8 local APICs. This may present a problem when booting the system, and/or when initiating execution of application processors by the operating system from a halted state, as more fully described below.
In an Intel MP compliant system, one of the processors is designated as the bootstrap processor (BSP) at system initialization by the system hardware or by the system hardware in conjunction with the BIOS. The remaining processors are designated as application processors (APs). The BSP is responsible for booting the operating system and initiating execution of the APs.
According to the Intel MP Specification, the APs are in a halted state with interrupts disabled when the first instruction of the operating system is executed by the BSP. Thus, each of the local APICs of the APs passively monitors the APIC bus (ICC bus), and reacts only to INIT or STARTUP inter-processor interrupt (IPI) messages.
An INIT IPI is an inter-processor interrupt which causes the local APIC addressed by the INIT IPI message to initialize or reset its corresponding processor. This causes the processor to reset its state and begin executing at a fixed location, which is the reset vector location.
STARTUP IPIs are used with systems based on Intel processors with local APIC versions of 1.x or higher, which can recognize the STARTUP IPI message. The STARTUP IPI message causes the target processor to start executing in Real Mode from address 000VV000h, where VV is an 8-bit vector that is part of the STARTUP IPI message. Startup vectors are limited to a 4-kilobyte page boundary in the first megabyte of the address space. STARTUP IPIs do not cause any change of state in the target processor (except for the change to the instruction pointer), and can be issued only one time after RESET or after an INIT IPI reception or pin assertion.
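The address 000VV000h is simply the 8-bit vector shifted left by 12 bits, which is why every startup address falls on a 4-kilobyte boundary within the first megabyte. A short sketch, using a hypothetical function name, makes the arithmetic explicit:

```python
PAGE_SIZE = 4096          # 4-kilobyte page
ONE_MEGABYTE = 1 << 20    # upper bound of the real-mode address space

def startup_address(vector):
    """Map the 8-bit STARTUP IPI vector VV to the address 000VV000h."""
    if not 0 <= vector <= 0xFF:
        raise ValueError("STARTUP IPI vector must fit in 8 bits")
    return vector << 12

# Every possible vector yields a page-aligned address below 1 MB.
all_addresses = [startup_address(v) for v in range(256)]
```

For example, a vector of 9Fh corresponds to address 0009F000h.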
According to the Intel MP Specification, the operating system typically causes the APs to start executing their initial tasks in the operating system code using the following algorithm.
BSP sends AP an INIT IPI
BSP DELAYs (10 mSec)
If (APIC_VERSION is not an 82489DX)
{
    BSP sends AP a STARTUP IPI
    BSP DELAYs (200 µSEC)
    BSP sends AP a STARTUP IPI
    BSP DELAYs (200 µSEC)
}
BSP verifies synchronization with executing AP
The INIT IPI must be issued before the STARTUP IPI message to get the target AP out of the halt state. This is shown in the pseudo code above. As indicated above, however, the INIT IPI message causes the DFR to return to all ones, so that the logical destination register, as shown in FIG. 2, is interpreted according to the flat model delivery mode, which has a maximum configuration of 8 local APICs. For those systems that are constructed in accordance with the cluster model delivery mode, this can disrupt the addressing used to identify the local APICs. A similar problem may occur after power up reset.
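The disruption can be illustrated with a small simulation. This is a sketch only: the class and constant names are hypothetical, the low 28 bits of the DFR are assumed to read as ones, and the logical APIC ID is assumed to be left unchanged by the INIT IPI so that the effect of the DFR alone is visible.

```python
DFR_RESET = 0xFFFFFFFF    # DFR contents after power-up reset or an INIT IPI
DFR_CLUSTER = 0x0FFFFFFF  # bits 28-31 = 0000 selects the cluster model (assumed low bits)

class LocalApic:
    """Toy model of the logical destination logic of one local APIC."""

    def __init__(self, logical_id):
        self.logical_id = logical_id & 0xFF
        self.dfr = DFR_RESET

    def receive_init_ipi(self):
        # Per the text above, the DFR returns to all ones.
        self.dfr = DFR_RESET

    def accepts(self, mda):
        mda &= 0xFF
        if (self.dfr >> 28) & 0xF == 0xF:
            # Flat model: any shared bit is a match.
            return (mda & self.logical_id) != 0
        # Cluster model: exact cluster address plus a shared member bit.
        same_cluster = (mda >> 4) == (self.logical_id >> 4)
        return same_cluster and (mda & self.logical_id & 0xF) != 0

apic = LocalApic(0x21)                 # cluster 2, member bit 0
apic.dfr = DFR_CLUSTER
before_init = apic.accepts(0x11)       # message for cluster 1, member bit 0
apic.receive_init_ipi()
after_init = apic.accepts(0x11)        # same message, now decoded as flat
```

In cluster mode the message for cluster 1 is correctly ignored; after the INIT IPI the same message is accepted because the member bit alone now matches under the flat interpretation.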
What would be desirable, therefore, is a method and apparatus for initiating execution of selected application processors in a clustered multiprocessor system without disrupting the addressing of the local APICs. What would also be desirable is a method and apparatus for booting such clustered multiprocessor systems. Finally, what would be desirable is an application processor that does not switch addressing modes when an INIT IPI or power up reset is executed.
The present invention overcomes many of the disadvantages of the prior art by providing a method and apparatus for initiating execution of selected application processors in a clustered multiprocessor system without disrupting the addressing mode of the local APICs. In one embodiment, this is accomplished by first initializing the processors, including the AP processors, during a boot routine. This can be done using the INIT IPI message, causing the logical destination registers of the APICs to switch to a flat model. Then, for each cluster, the BSP may broadcast a STARTUP IPI message, which redirects the program flow of each AP to a common initialization procedure. The initialization procedure may assign a specific APIC logical ID to each of the processors, and switch the processors from physical addressing mode to logical addressing mode. The initialization procedure may also leave each of the processors of the cluster in an active mode spinning on a predetermined safe memory space (e.g., startup address location). The predetermined safe memory space preferably has a processing module ID section and a startup address code section.
During use, each of the APs compares its own pre-assigned processing module ID with the processing module ID stored in the processing module ID section of the predetermined safe memory location. If the pre-assigned processing module ID of an AP matches the stored processing module ID, that AP jumps to the startup address specified in the startup address code section of the predetermined safe memory location to begin execution. Accordingly, the operating system may initiate execution of any one of the APs at a selected startup address by writing a matching processing module ID into the processing module ID section and a desired startup address into the startup address code section of the predetermined safe memory location.
It is contemplated that the predetermined safe memory location (e.g., Startup Address Location) may further have a valid flag section. The operating system may write a valid flag (e.g., a one) into the valid flag section when initiating execution of one of the APs. The AP with the matching processing module ID preferably resets the valid flag section after jumping to the desired startup address.
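The handshake through the predetermined safe memory location can be sketched as follows. This is a single-threaded illustration under stated assumptions: the names `StartupMailbox`, `os_start_ap` and `ap_poll` are hypothetical, the returned address stands in for the jump, and a real implementation would need the memory-ordering guarantees of the target hardware.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class StartupMailbox:
    """Model of the safe memory location and its three sections."""
    module_id: int = 0          # processing module ID section
    startup_address: int = 0    # startup address code section
    valid: bool = False         # valid flag section

def os_start_ap(mailbox, module_id, startup_address):
    """Operating system side: select an AP and give it a startup address."""
    mailbox.module_id = module_id
    mailbox.startup_address = startup_address
    mailbox.valid = True        # written last so the other fields are ready

def ap_poll(mailbox, my_module_id) -> Optional[int]:
    """One iteration of an AP's spin loop. Returns the address to jump to,
    or None if this AP should keep spinning."""
    if mailbox.valid and mailbox.module_id == my_module_id:
        target = mailbox.startup_address
        mailbox.valid = False   # matching AP resets the valid flag
        return target
    return None
```

An AP whose ID does not match simply keeps polling, so only the selected AP leaves the spin loop, and clearing the valid flag frees the location for the next request.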
Finally, rather than attempting to avoid disruption of the addressing mode when switching out of a halt state, the present invention also contemplates providing a processor that does not change addressing modes when switched from the halt state. This may eliminate the need for the processor to spin on a predetermined safe memory location, as described above. Accordingly, the overall design and operation of a multiprocessor system may be simplified.