The present invention relates generally to the field of computer systems and more particularly to a method to improve a server system's ability to execute an initial program load process when an initial program load fails.
Server systems such as enterprise servers play a crucial role in an enterprise information technology (IT) infrastructure. The configuration of these servers ranges from a single multicore processor accessing a few gigabits of memory to hundreds of multicore processors with terabits of memory. As the size of the server system increases, the time to configure and initialize the server increases proportionally.
In a typical server system initial program load (IPL), also known as booting up or “IPLing” a system, board management controllers or service processors initially check the condition of the server components before power is applied to the main or host server. In cases where the full diagnosis mode is enabled, completing an IPL, particularly in large server systems, is time consuming because each server component sequentially checks every hardware element before giving control to the next server component for hardware verification. In order to reduce downtime when an IPL fails, many server systems, and especially most high-end servers, employ redundant hardware elements. Redundant hardware elements are typically incorporated within a server system that includes IPL-critical hardware elements executing the instructions to initialize processors, busses such as memory busses, and memory devices that store hostboot firmware, kernel, and file systems enabling the IPL process. Recovery methods for a failing IPL include powering down the server system and restarting the IPL using redundant hardware in place of the failed hardware elements.
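By way of illustration only, the conventional recovery method described above may be sketched as follows. This is a minimal Python model under stated assumptions, not an implementation of any particular server's firmware: the names HardwareElement, RedundantPair, attempt_ipl, and ipl_with_recovery are hypothetical, and the hardware check is reduced to a single boolean health flag.

```python
from dataclasses import dataclass


@dataclass
class HardwareElement:
    """Hypothetical IPL-critical element; 'healthy' stands in for a real diagnostic."""
    name: str
    healthy: bool = True


@dataclass
class RedundantPair:
    """A primary element and its redundant counterpart (assumed one backup each)."""
    primary: HardwareElement
    backup: HardwareElement
    use_backup: bool = False

    @property
    def active(self) -> HardwareElement:
        return self.backup if self.use_backup else self.primary


def attempt_ipl(pairs):
    """Check elements sequentially; return the first pair whose active element fails."""
    for pair in pairs:
        if not pair.active.healthy:
            return pair
    return None


def ipl_with_recovery(pairs):
    """Restart the IPL from the beginning after each failure, swapping in the backup."""
    log = []
    while True:
        failed = attempt_ipl(pairs)
        if failed is None:
            log.append("IPL complete")
            return True, log
        if failed.use_backup:
            # The redundant element also failed, so no further recovery is possible.
            log.append(f"{failed.active.name} failed; no redundant element left")
            return False, log
        log.append(
            f"{failed.primary.name} failed; power down and restart IPL "
            f"with {failed.backup.name}"
        )
        failed.use_backup = True


if __name__ == "__main__":
    pairs = [
        RedundantPair(HardwareElement("processor0"), HardwareElement("processor1")),
        RedundantPair(HardwareElement("flash0", healthy=False), HardwareElement("flash1")),
    ]
    ok, log = ipl_with_recovery(pairs)
    print(ok)
    for entry in log:
        print(entry)
```

In this sketch every failure forces the sequential check to start over from the first element, which reflects why a failed IPL is costly in a large server system.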