The present application relates generally to an improved data processing apparatus and method and more specifically to mechanisms for performing multi-chip initialization using a parallel firmware boot process.
Booting a multi-chip system with tightly coupled processors is a substantial challenge in server system design. Because many of the hardware elements of the multi-chip system, e.g., memory or inter-processor buses, are not yet initialized, the mechanisms that operating systems use to create parallelism cannot be used by the initializing firmware immediately after power-on of the multi-chip system. Thus, it is not uncommon today to perform the initialization of a multi-chip system utilizing only a single task that runs either on a service processor or on one of the processors of the multi-chip system. This serializes the boot process, making it relatively slow, and the approach does not scale as the number of processor chips involved increases.
For example, multi-chip initialization in a symmetric multiprocessor (SMP) architecture is traditionally performed in one of two ways. In one methodology, the whole process is performed by an external service processor in an out-of-band fashion. In these SMP architecture designs, a central resource that is external to the processors configures all of the chip components so that each has a correct, unique address on the inter-processor bus. The initialization can be parallelized and sped up only to the degree that the external service processor is able to provide. This model is predominant in many server architecture designs.
In other design solutions, one of the plurality of processors is assigned the role of a master and performs the majority of the boot process alone via the inter-processor bus, which is configured to support early low-speed operation right from power-on. Here again, however, the possibility for parallelism in the boot process is very limited because the initialization operations of the boot process are primarily performed by a single processor.
There are also known architectures where portions of the system are set up to operate in parallel at initialization time. However, these implementations suffer from certain disadvantages. In order to parallelize the boot process, each processor needs a version of the basic input/output system (BIOS) code that is customized to account for the position of its chip in the multi-chip configuration, especially with regard to system addresses. This need can be addressed either by time-consuming address relocation of the firmware images during runtime, which slows down the boot process, or by storing multiple firmware images in the boot memory device(s), which requires a larger amount of storage space. These implementations also need one or more synchronization points between the parallel boot phases, which slows down the boot process as well. Another significant disadvantage of such approaches is increased complexity for firmware updates, as well as more complex maintenance of the code releases by development teams.
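By way of illustration only, the runtime address relocation mentioned above may be sketched as follows. The sketch is not taken from any particular known implementation; the per-chip address stride, the relocation table of byte offsets, the 64-bit little-endian address format, and the names `CHIP_ADDRESS_STRIDE` and `relocate_image` are all assumptions made for the example.

```python
import struct
from typing import Sequence

# Assumption for illustration: each chip position owns a 1 TiB slice of the
# system address space. Real designs may assign addresses differently.
CHIP_ADDRESS_STRIDE = 1 << 40


def relocate_image(image: bytes, reloc_offsets: Sequence[int], chip_position: int) -> bytes:
    """Return a copy of `image` with each listed 64-bit address rebased for one chip.

    `reloc_offsets` is a hypothetical relocation table: the byte offsets within
    the image at which absolute little-endian 64-bit addresses are stored.
    """
    base = chip_position * CHIP_ADDRESS_STRIDE
    patched = bytearray(image)
    for offset in reloc_offsets:
        # Read the address as linked for chip position 0, then rebase it.
        (address,) = struct.unpack_from("<Q", patched, offset)
        struct.pack_into("<Q", patched, offset, address + base)
    return bytes(patched)


# A toy 16-byte image holding two absolute addresses linked for chip position 0.
generic_image = struct.pack("<QQ", 0x1000, 0x2000)
chip2_image = relocate_image(generic_image, [0, 8], chip_position=2)
```

In such a scheme, every chip walks the whole relocation table before it can execute its copy of the firmware, so the cost grows with the number of address references in the image. Storing one pre-relocated image per chip position avoids that work at the price of additional boot memory.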