1. Field
The disclosure relates generally to data processing systems and to diagnostic methods and systems therefor, and more specifically to multinode server systems and to diagnostic methods and systems for selecting a primary server and dropping failed servers from such a multinode system at system reset.
2. Description of the Related Art
In a multinode data processing system, a plurality of processor nodes are coupled together in a desired architecture to perform desired data processing functions during normal system operation under control of a multinode operating system. For example, such a multinode system may be implemented so as to distribute task processing in a desired manner across multiple processor nodes in the multinode system, thereby to implement parallel processing or some other desired processing scheme that takes advantage of the multinode environment.
Each server node in a multinode system may include one or more processors with associated hardware, firmware, and software to provide for necessary intra-node functionality. Such functionality includes a process for booting or starting up each server node from reset. This process may be used, for example, when power first is applied to the server nodes of the multinode system. As part of this process, a node processor begins to read and execute start-up code from a designated memory location. The first memory location the processor tries to execute is known as the reset vector. The reset vector typically is located in a startup memory region shadowed from a read only memory (ROM) device. The startup memory is coupled to the processor via one or more devices and connections that form a code fetch chain. For example, in a multinode data processing system, each node may include a central processing unit (CPU) that is coupled to a startup flash memory device via the code fetch chain. The start-up code includes basic input/output system (BIOS) code that is stored in the startup flash memory and retrieved by the CPU via the code fetch chain at system reset.
At some point during, or following, the reset procedure implemented in each node, the multinode environment itself is configured. This process may include confirming one of the server nodes preselected to perform the designated functions of a primary server node, dropping server nodes from the system that fail to boot properly, and reconfiguring the multinode system as necessary in response to the dropping of failed nodes, if any. This latter procedure may include selecting a new primary node from among the available secondary nodes, if the originally designated primary node fails to boot properly. It is, of course, desirable that the system reset process, from power on through system configuration to a fully configured and operable multinode system, be implemented efficiently, with a minimum of manual user intervention required.
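The configuration step described above can be illustrated with a minimal sketch. It is not taken from any particular implementation: the names `Node` and `configure` are hypothetical, and the rule for choosing a replacement primary (the lowest-numbered surviving node) is an assumption made only for illustration, since no selection policy is specified here.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Node:
    node_id: int   # hypothetical identifier for a server node
    booted: bool   # True if the node completed its reset procedure


def configure(nodes: List[Node],
              designated_primary: int) -> Tuple[Optional[int], List[int], List[int]]:
    """Return (primary, surviving node ids, dropped node ids)."""
    survivors = [n.node_id for n in nodes if n.booted]
    dropped = [n.node_id for n in nodes if not n.booted]

    if designated_primary in survivors:
        # The preselected primary booted properly, so it is confirmed.
        primary = designated_primary
    elif survivors:
        # Assumed policy: promote the lowest-numbered surviving secondary.
        primary = min(survivors)
    else:
        # No node booted, so no system can be configured.
        primary = None
    return primary, survivors, dropped
```

For example, calling `configure([Node(0, False), Node(1, True), Node(2, True)], 0)` drops node 0 and, under the assumed policy, promotes node 1 to primary without user intervention.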
In a conventional multinode boot flow from reset, at power up each server node independently begins fetching startup code from its startup memory, as discussed above. All of the server nodes boot up to a certain point in the start-up process. For example, all of the server nodes may boot up to a designated point in a pre-boot sequence power-on self-test (POST). Startup code on one of the server nodes, designated in advance by a user as the primary node, then merges all of the nodes to look like a single system from there on and into a multinode operating system boot. If, during this reset process, a server node fails to boot properly to the required phase of the start-up, the failed server node would not be merged into the multinode system. The designated primary node simply would time out waiting on the failed node to boot. If the node that fails to boot properly is the designated primary node, a user typically will have to work with a partition user interface and manually dedicate a new server as the primary node server.
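The conventional merge behavior described above may be modeled by the following sketch. It is a simulation under stated assumptions rather than actual startup code: the function name `merge_nodes`, the `ready_at` mapping of node identifiers to the time at which each node reaches the merge point, and the single shared timeout are all hypothetical. A failed primary is modeled as an error because, in the conventional flow, manual repartitioning would be needed.

```python
from typing import Dict, List, Optional, Tuple


def merge_nodes(primary: int,
                ready_at: Dict[int, Optional[float]],
                timeout_s: float) -> Tuple[List[int], List[int]]:
    """Simulate the primary merging secondaries that reach the merge point.

    ready_at maps a node id to the number of seconds after reset at which
    the node reaches the designated POST point, or None if it never does.
    Returns (merged node ids, node ids the primary timed out waiting on).
    """
    primary_ready = ready_at.get(primary)
    if primary_ready is None or primary_ready > timeout_s:
        # Conventional flow: a user must manually select a new primary.
        raise RuntimeError("designated primary failed to boot")

    merged = [primary]
    timed_out = []
    for node, t in sorted(ready_at.items()):
        if node == primary:
            continue
        if t is not None and t <= timeout_s:
            merged.append(node)      # node reached the merge point in time
        else:
            timed_out.append(node)   # primary gives up waiting on this node
    return merged, timed_out
```

With `ready_at = {0: 5.0, 1: 8.0, 2: None, 3: 40.0}` and a 30 second timeout, node 0 as primary merges node 1 and times out on nodes 2 and 3.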
In a more recently developed boot flow process, for new high end multinode systems, only one server node in the multinode system begins to fetch startup code at reset, and that node will be the primary node. This designated primary node will execute code and will configure all of the other nodes and present the multinode system as a single system to the multinode operating system. This new approach to the reset process in a multinode system presents several unique challenges, in addition to the known challenges associated with reset of a multinode data processing system in general.
It is desirable to detect server node failures as early in the reset process as possible. It is also desirable to drop off a failed primary node, and other failed nodes, from the multinode system as soon as possible. Furthermore, if a failed primary node is dropped, it is desirable as soon as possible to make a different server node in the multinode system, one that will boot properly, into the primary node. However, since, in the new reset approach described above, only one server node is executing startup code at reset, detecting node failure by detecting a failure to boot properly in the normal manner cannot be used as a diagnostic method for detecting and dropping off nodes from the multinode system other than the designated primary node. Furthermore, since the primary node is the only server node that is to be executing startup code, secondary node boot processes normally will have to be inhibited. For example, under such a reset scheme, baseboard management controllers (BMCs) in the code fetch chains of the secondary nodes may have to be instructed not to start automatic BIOS recovery (ABR) at reset. Also, it is desirable that any necessary repartitioning of the multinode system to select a new primary server node be accomplished with minimal or no manual user intervention.