The present invention relates to the power-on initialization steps in data processing systems, particularly in computer systems.
The exponential growth of computing requirements has resulted in the creation of large, complex computer systems. Power-on and initialization of these large computer systems, up to the point at which normal operating systems are fully available, often rely on embedded controllers that facilitate the initialization of a computer system at power-on. An example of such a complex control structure of a computer system is described in F. Baitinger et al., “System control structure of the IBM eServer z900”, IBM J. Res. & Dev., Vol. 46, No. 4/5, 2002, pp. 523-535.
In such a computer system, the hardware is packaged in multiple hardware “cage enclosures”, the so-called cages. The embedded controllers are divided into two categories: support elements and cage controllers. The support elements are optionally redundant, with a primary support element and a secondary support element. There is one primary support element in the computer system. The cage controllers are redundant, with a master cage controller and a slave cage controller. Each cage controller is associated with an entire hardware cage of the computer system and interacts with the primary support element via a private service control network. The service control network may also be redundant.
The primary support element controls the initialization of the computer system (system control scope), whereas the cage controllers perform the actual system control and monitoring tasks for their associated cages (intra-cage control scope). The cage controller behaves as a proxy between the support element and the actual hardware. The support element maintains and accesses engineering data that describes the initial values for all hardware registers. The power-on initialization steps mainly consist of an IML (Initial Machine Load). Further details of the IML steps and their relation to support elements and cage controllers can be found in K.-D. Schubert et al., “Accelerating system integration by enhancing hardware, firmware, and co-simulation”, IBM J. Res. & Dev., Vol. 48, No. 3/4, 2004, pp. 569-581 (with so-called Flexible Support Processors serving as cage controllers).
This system control structure implies an inherent parallelism in the operation of the cage controllers. It is implemented as a master-slave operating model, with the (primary) support element operating as the master and the (master) cage controllers operating as the slaves. However, there are dependencies between the operations of the cage controllers that must be reflected in the system control tasks performed on the support element. These dependencies arise because the hardware in the various cages does not operate independently.
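The interplay between parallel cage controller operation and inter-cage dependencies can be illustrated with a minimal sketch. It is not taken from the cited system control firmware. The task names and the dependency table are hypothetical, and the grouping into "waves" is only one possible way for a master to decide which slave operations may run concurrently.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def parallel_waves(dependencies):
    """Group tasks into waves.

    All tasks in one wave have their dependencies satisfied by earlier
    waves, so a master could dispatch them to the slaves in parallel.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted for a stable result
        waves.append(ready)
        sorter.done(*ready)
    return waves


# Hypothetical tasks: each key maps to the set of tasks it depends on.
dependencies = {
    "cage_a_power": set(),
    "cage_b_power": set(),
    "cage_a_clocks": {"cage_a_power"},
    "cage_b_clocks": {"cage_b_power"},
    # A link between the cages needs both cages to be clocked first.
    "cage_b_io_link": {"cage_a_clocks", "cage_b_clocks"},
}

waves = parallel_waves(dependencies)
for number, wave in enumerate(waves, start=1):
    print(number, wave)
```

In this sketch the two cages proceed in parallel through the first two waves, and only the inter-cage link forces a synchronization point.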
These dependencies are managed by the hardware object model (HOM) described in A. Bieswanger et al., “Hardware configuration framework for the IBM eServer z900”, IBM J. Res. & Dev., Vol. 46, No. 4/5, 2002, pp. 537-550. The HOM is used as part of the system control firmware executed on the support elements. Because the packaging of functionality provided by hardware components changes with every new computer system platform, the hardware components and their functionality are separated in the design of the HOM. The HOM can be controlled at startup via configuration rules that are stored in a rule database and are specific to a computer system. The HOM allows central control of the various IC (Integrated Circuit) chips in the computer system. Since the computer system supports hot-plugging of hardware, and since hardware subsystems can fail and then need to be isolated in the system configuration, the HOM needs to be changed dynamically in order to reflect the current state of the computer system.
Another example of an HOM is described in A. Kreissig/J. Armstrong, “A Common Multi-Platform Hardware Object Model”, ACM OOPSLA 2002 Practitioners Report, ISBN 1-58113-471-1. This HOM inherits some of the design patterns of the eServer z900 HOM. Further details of the actual implementation of the computer system specific HOM configuration are provided in the patent application US 2005/0086637 A1.
With the growing number of chips in complex computer systems (e.g., systems with Multi-Chip Modules), especially in high-end server computer systems, the task of controlling these chips can become time-intensive. Even within such chips, multiple subsystems can be operated independently to some extent (e.g., processors supporting Simultaneous Multi-Threading, SMT). The problem may worsen for hardware designs that do not support broadcasting of operations to multiple chips at once.
For example, the IBM eServer z900 has a central clock chip that is connected to all the chips of the functional system structure (as distinct from the system control structure). This is shown, for example, in L. C. Alves et al., “RAS Design for the IBM eServer z900”, IBM J. Res. & Dev., Vol. 46, No. 4/5, 2002, pp. 503-521 (especially FIG. 2). Besides feeding the connected chips with clock signals, the clock chip also controls their status. Via this clock chip control mechanism, a cage controller can address multiple chips at once in order to perform chip control operations or to monitor chip status. A detailed implementation of such a method is described in the patent application US 2006/0106556 A1. Other server computer systems, e.g. IBM System i/System p machines, might not have such a central clock chip. Instead, every chip is provided with its own clock hardware logic in order to save the cost of the additional clock chip hardware.
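The difference between addressing chips through a central clock chip and addressing each chip individually can be sketched as follows. This is a simplified illustration only. The class and method names are hypothetical and do not reflect the mechanism of US 2006/0106556 A1. The sketch merely counts how many commands a cage controller has to issue in each case.

```python
class Chip:
    """A chip whose status can be changed by a control operation."""

    def __init__(self, name):
        self.name = name
        self.state = "stopped"


class ClockChip:
    """Hypothetical central clock chip connected to a set of chips."""

    def __init__(self, chips):
        self.chips = list(chips)

    def broadcast(self, state):
        # One incoming command is fanned out to every connected chip.
        for chip in self.chips:
            chip.state = state


class CageController:
    """Counts the commands it issues so both styles can be compared."""

    def __init__(self):
        self.commands_sent = 0

    def set_state_individually(self, chips, state):
        # Without a broadcast mechanism, every chip needs its own command.
        for chip in chips:
            self.commands_sent += 1
            chip.state = state

    def set_state_via_clock_chip(self, clock_chip, state):
        # With a central clock chip, a single command reaches all chips.
        self.commands_sent += 1
        clock_chip.broadcast(state)


chips_without_clock_chip = [Chip(f"chip{i}") for i in range(8)]
per_chip = CageController()
per_chip.set_state_individually(chips_without_clock_chip, "running")

chips_with_clock_chip = [Chip(f"chip{i}") for i in range(8)]
broadcast = CageController()
broadcast.set_state_via_clock_chip(ClockChip(chips_with_clock_chip), "running")

print(per_chip.commands_sent, broadcast.commands_sent)
```

Under these assumptions the number of commands grows with the number of chips when no broadcast mechanism is available, which is why parallel operation matters more for such designs.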
In system control structures similar to that of the IBM eServer z900, the cage controllers alone may provide only limited parallelization options. This results from the fact that much of the data relevant to the computer system, e.g. the engineering data, is available to the support elements only. Consequently, the support element needs to be involved in many parallel operations.
Therefore, parallel operations for controlling chips as part of computer system control operations are highly desirable. Typically, parallelization may be achieved by changing the design of an HOM, and therefore its actual implementation, or by adapting the HOM configuration rules. For example, it is possible to initialize central processing units (CPUs) before initializing their associated memory chips. However, such a strict ordering does not fully exploit the potential for parallel operation, because both initializations can overlap to some extent: at some point a CPU will access its associated memory chips, but it needs to be configured accordingly in order to do so, e.g. via certain register settings.
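The potential gain from overlapping the two initializations can be illustrated with a small calculation. The step names, durations, and dependencies below are hypothetical and chosen only for illustration. The sketch compares a strictly sequential ordering with an ordering in which only the first memory access by the CPU has to wait for both the CPU register setup and the memory configuration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def finish_times(durations, dependencies):
    """Earliest finish time of each step when independent steps overlap."""
    finish = {}
    for step in TopologicalSorter(dependencies).static_order():
        # A step starts once all of its predecessors have finished.
        start = max((finish[d] for d in dependencies[step]), default=0)
        finish[step] = start + durations[step]
    return finish


# Hypothetical durations in arbitrary time units.
durations = {
    "cpu_reset": 2,
    "cpu_registers": 3,
    "mem_reset": 2,
    "mem_config": 4,
    "cpu_first_mem_access": 1,
}

# Only the first memory access couples the CPU and memory initializations.
dependencies = {
    "cpu_reset": set(),
    "cpu_registers": {"cpu_reset"},
    "mem_reset": set(),
    "mem_config": {"mem_reset"},
    "cpu_first_mem_access": {"cpu_registers", "mem_config"},
}

sequential = sum(durations.values())
overlapped = max(finish_times(durations, dependencies).values())
print(sequential, overlapped)
```

With these assumed values the sequential ordering takes 12 time units, whereas the overlapped ordering takes 7, because the CPU and memory steps run side by side until the point at which the CPU first accesses its memory.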