1. Technical Field
This invention relates generally to service processors as may be found within nodes of a system, and more particularly to the self-clustering of such service processors, so that, for instance, a single image of the service processors appears to the operating system of the system and/or a management console for the system.
2. Description of the Prior Art
As computer systems, such as server systems, become more complex, they have been divided into different nodes that operate as separate units. Each node may have its own processors, memory, and input/output (I/O) modules. Functionality performed by a system may be divided among its various nodes, such that each node is responsible for one or more different functions. Each node has a service processor (SP) that functions independently of the service processors of the other nodes, but which allows external access to the hardware of its node.
A complication of dividing a system into different nodes is that the operating system (OS) running collectively on the system, and the management consoles used to manage the system externally, have traditionally had to be aware of the specific aspects of this division into nodes. The operating system, for instance, has to know which service processor is responsible for which hardware and functionality of the system, so that messages can be routed to the appropriate service processor. Similarly, management consoles have to know the mapping of the service processors to the system's hardware and functionality.
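The mapping burden described above can be illustrated with a minimal sketch. The resource names, service processor names, and the `route_message` helper are hypothetical and are not taken from any actual system; the sketch only shows that the operating system or console must itself hold and consult a per-resource table.

```python
from typing import Dict, Tuple

# Hypothetical table: which service processor owns which resource.
# In the traditional arrangement, the OS or console must be configured
# with this mapping and must keep it up to date.
SP_FOR_RESOURCE: Dict[str, str] = {
    "node0/memory": "sp0",
    "node0/io": "sp0",
    "node1/memory": "sp1",
    "node1/io": "sp1",
}


def route_message(resource: str, message: str) -> Tuple[str, str]:
    """Return (service processor, message) for the given resource."""
    try:
        sp = SP_FOR_RESOURCE[resource]
    except KeyError:
        raise LookupError(f"no service processor configured for {resource!r}")
    return sp, message
```

Any change to the division of hardware over nodes requires a matching change to this table in every operating system and console that holds a copy.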
This division adds complexity to the operating system and the management consoles. Significant configuration may be required to ensure that the operating system and the consoles are properly aware of the different service processors and the functions that have been assigned to them. Furthermore, like all system components, service processors sometimes fail. To ensure that the system itself does not fail, another service processor may have to temporarily act as the failover service processor for the down service processor. The operating system and the consoles must be aware of such failover procedures, too. Load balancing and other inter-service processor procedures also require knowledge of the distribution of functionality over the service processors.
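As a rough sketch of the failover awareness described above, the following models the state a management console would have to track. The `ConsoleView` class, its method names, and the one-backup-per-service-processor scheme are assumptions made for illustration only.

```python
from typing import Dict, Set


class ConsoleView:
    """Hypothetical state a console keeps in the traditional arrangement."""

    def __init__(self, primary: Dict[str, str], backup: Dict[str, str]) -> None:
        self.primary = dict(primary)  # resource -> usual service processor
        self.backup = dict(backup)    # service processor -> failover peer
        self.down: Set[str] = set()   # service processors known to have failed

    def mark_down(self, sp: str) -> None:
        """Record that a service processor has failed."""
        self.down.add(sp)

    def target_for(self, resource: str) -> str:
        """Pick the service processor that should receive a message."""
        sp = self.primary[resource]
        if sp in self.down:
            sp = self.backup[sp]  # fall back to the assumed failover peer
        if sp in self.down:
            raise RuntimeError(f"no live service processor for {resource!r}")
        return sp
```

For example, after `mark_down("sp0")`, a console whose backup table maps `sp0` to `sp1` would direct messages for `sp0`'s resources to `sp1`.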
In addition, communication between an operating system and the service processors of the system traditionally occurs through the firmware of the system. Firmware is software that is stored in hardware, such that the software is retained even when no power is applied to the hardware. The use of conventional firmware, however, degrades performance significantly. For instance, conventional firmware is not re-entrant. That is, only one service processor can execute the firmware at a time. This means that the firmware may present a bottleneck to the efficient running of the system.
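The bottleneck can be pictured with a small simulation, assuming non-re-entrant firmware behaves like a routine guarded by a single lock. The `NonReentrantFirmware` class and `run_callers` helper are hypothetical stand-ins, and threads stand in for concurrent callers.

```python
import threading
import time
from typing import List, Optional


class NonReentrantFirmware:
    """Models firmware that admits only one caller at a time (an assumption)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._inside = 0     # callers currently executing the firmware
        self.max_inside = 0  # highest concurrency ever observed

    def call(self, request: str) -> str:
        with self._lock:  # every caller waits here for its turn
            self._inside += 1
            self.max_inside = max(self.max_inside, self._inside)
            time.sleep(0.01)  # stand-in for the firmware doing work
            self._inside -= 1
            return f"handled {request}"


def run_callers(firmware: NonReentrantFirmware, n: int) -> List[Optional[str]]:
    """Issue n concurrent requests and collect the replies in request order."""
    results: List[Optional[str]] = [None] * n

    def worker(i: int) -> None:
        results[i] = firmware.call(f"req{i}")

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

However many callers are started, `max_inside` never exceeds one, so total time grows with the number of requests rather than staying flat.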
In other contexts, the management of multiple resources is accomplished on a simplistic basis. For example, in the context of storage devices, such as hard disk drives, a redundant array of independent disks (RAID) provides for limited interaction among resources. A RAID may be configured so that each hard drive redundantly stores the same information, so that data is striped across the hard drives of the array for increased storage and performance, or for additional or other purposes. However, the drives themselves do not actively participate in their aggregation. Rather, a master controller is responsible for managing the drives, such that the drives themselves are not aware of one another.
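A minimal sketch of this controller-driven arrangement follows, assuming block-level writes and simple round-robin striping. The `Drive` and `MasterController` classes are illustrative only and do not describe any particular RAID implementation.

```python
from typing import List


class Drive:
    """A passive drive: it stores blocks and knows nothing of other drives."""

    def __init__(self) -> None:
        self.blocks: List[bytes] = []

    def append(self, block: bytes) -> None:
        self.blocks.append(block)


class MasterController:
    """Holds all knowledge of the array; the drives hold none."""

    def __init__(self, drives: List[Drive]) -> None:
        self.drives = drives

    def write_mirrored(self, blocks: List[bytes]) -> None:
        """Store every block on every drive (redundant copies)."""
        for block in blocks:
            for drive in self.drives:
                drive.append(block)

    def write_striped(self, blocks: List[bytes]) -> None:
        """Spread blocks over the drives in round-robin order."""
        for i, block in enumerate(blocks):
            self.drives[i % len(self.drives)].append(block)
```

In both modes the placement logic lives entirely in the controller, which is the point the passage makes: the drives never coordinate among themselves.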
Therefore, such solutions are not particularly apt for the scenario that has been described, in which the functionality and hardware of a system are divided over multiple service processors. For example, having a master controller in this scenario just shifts the burden of knowing the functionality and hardware division from the management consoles and the operating systems to the controller. This does not reduce complexity, and likely does not prevent reductions in system performance.
Other seemingly analogous resource management approaches have similar pitfalls. Network adapters that can be aggregated to provide greater bandwidth, for instance, are typically aggregated not among themselves, but by a host operating system and/or device driver. This host operating system and/or device driver thus still takes on the complex management duties that result when multiple resources are managed as a single resource. In other words, complexity is still not reduced, and potential performance degradation is still not prevented.
For these described reasons, as well as other reasons, there is a need for the present invention.