The present invention generally relates to platform management of computer systems. More specifically, the present invention relates to the platform management of high-availability computer systems.
Modern integrated computer systems provide multiple services, such as voice and data transmission, system management, security, wireless communication, video conferencing, and web services. These computer systems are assembled from various hardware and software components, known in the industry as commercial-off-the-shelf (COTS) components, which are sourced from multiple vendors. These computer systems provide continuous service to the users even if hardware or software faults occur or while the COTS components are being upgraded. The Service Availability Forum (SAF), an industry consortium of telecommunication and computer equipment manufacturers and users, publishes open specifications for high-availability computer systems, including the Hardware Platform Interface (HPI) specification for platform management of computer systems.
Management software can enable the use of the COTS components to construct high-availability systems that provide uninterrupted services to the users. The management software allows users to set and retrieve configuration and operational data related to the COTS components. The management software can also control the operation of the COTS components, for example by starting up, shutting down, and testing the COTS components. The management software typically accomplishes these functions by modeling a computer system, reflecting the current state of the computer system in that model, and providing an interface through which user application programs can inquire about the current state. Additionally, the management software can update the hardware platform of the computer system when the user application programs, through the interface provided by the software, update the state of the model.
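For illustration only, the model-based approach described above can be sketched as follows. This is a minimal sketch, not an implementation of any particular management software; the names `FakePlatform`, `PlatformModel`, `refresh`, `get`, and `set` are hypothetical, and the hardware platform is replaced by an in-memory stand-in.

```python
class FakePlatform:
    """Hypothetical stand-in for the hardware platform (in-memory only)."""

    def __init__(self):
        self.registers = {"power": "on"}

    def read(self, key):
        return self.registers[key]

    def write(self, key, value):
        self.registers[key] = value


class PlatformModel:
    """Hypothetical model that mirrors the platform state."""

    def __init__(self, platform):
        self._platform = platform
        self._state = {}

    def refresh(self):
        # Reflect the current state of the platform in the model.
        self._state = dict(self._platform.registers)

    def get(self, key):
        # Interface through which an application inquires about the state.
        return self._state[key]

    def set(self, key, value):
        # An update to the model is pushed through to the platform.
        self._platform.write(key, value)
        self._state[key] = value


hw = FakePlatform()
model = PlatformModel(hw)
model.refresh()
before = model.get("power")   # state as reflected in the model
model.set("power", "off")     # application updates the model
after = hw.read("power")      # platform has been updated as well
print(before, after)
```

In this sketch the application program never touches the platform directly; it reads and writes only through the model.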
The HPI specification provides structures for modeling the computer system in the form of sets of resources and domains. A resource is an abstract representation of one or more parts of the computer system. The representation includes a set of management instruments and a set of management capabilities, which are used for reflecting and changing the current state of the hardware platform of the computer system. A domain is an abstract collection of resources. Each resource can be a member of one or more domains. Several resources can be used together to model the management capabilities of the hardware platform of the computer system.
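For illustration only, the relationship between resources and domains can be sketched as a simple data structure. The class and field names below (`Resource`, `Domain`, `instruments`, `capabilities`, `domains_of`) are hypothetical and are not taken from the HPI specification; the sketch only shows that one resource may belong to more than one domain.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """Abstract representation of one or more parts of the system."""
    name: str
    instruments: list = field(default_factory=list)   # e.g. sensors, controls
    capabilities: set = field(default_factory=set)    # e.g. "power", "reset"


@dataclass
class Domain:
    """Abstract collection of resources."""
    name: str
    resources: list = field(default_factory=list)

    def add(self, resource):
        self.resources.append(resource)


def domains_of(resource, domains):
    """Return the names of every domain that contains the given resource."""
    return [d.name for d in domains if any(r is resource for r in d.resources)]


blade = Resource("blade-1", ["temperature sensor", "fan control"], {"power", "reset"})
chassis = Domain("chassis")
cooling = Domain("cooling")
chassis.add(blade)
cooling.add(blade)   # the same resource is a member of two domains

print(domains_of(blade, [chassis, cooling]))
```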
Each resource can include one or more management instruments and one or more management capabilities. Examples of the management instruments are sensors, controls, inventory data repositories, watchdog timers, and annunciators. Examples of the management capabilities are power control, reset control, configuration parameter control, hot swap management, event generation, and event log maintenance. For example, by accessing a management instrument recognized as a ‘control’ through an application program interface (API), a user application program can change the configuration or operational parameters of the hardware platform of the computer system. However, the ‘control’ can also be set to an ‘automatic’ mode, as defined in the HPI specification. The control is said to be in the automatic mode when the user application program cedes operation of the ‘control’ to a set of built-in autonomous functions in the hardware platform. While the HPI specification makes provisions for autonomous functions in the hardware platform, these functions are not defined in the specification. The HPI specification describes only the API for enabling the platform management of the computer systems.
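For illustration only, the distinction between a control operated by the application program and a control ceded to the platform can be sketched as follows. This is a minimal sketch under stated assumptions: the names `Mode`, `Control`, `platform_tick`, and `fan_policy` are hypothetical, the fan policy is an invented example of an autonomous function (the HPI specification does not define such functions), and the choice to ignore a supplied state in automatic mode is an assumption of this sketch.

```python
from enum import Enum


class Mode(Enum):
    MANUAL = "manual"   # the application program sets the state
    AUTO = "auto"       # the platform's autonomous function sets the state


class Control:
    def __init__(self, name, state, auto_policy):
        self.name = name
        self.state = state
        self.mode = Mode.MANUAL
        self._auto_policy = auto_policy   # hypothetical built-in function

    def set(self, mode, state=None):
        """Entry point used by the application program."""
        if mode is Mode.MANUAL and state is None:
            raise ValueError("manual mode requires a state")
        self.mode = mode
        if mode is Mode.MANUAL:
            self.state = state
        # In AUTO mode any supplied state is ignored (assumption of this sketch).

    def platform_tick(self, reading):
        """Called by the platform; acts only when the control has been ceded."""
        if self.mode is Mode.AUTO:
            self.state = self._auto_policy(reading)


def fan_policy(temp_c):
    # Invented policy: run the fan at full speed at or above 70 degrees C.
    return 100 if temp_c >= 70 else 40


fan = Control("fan speed (%)", state=50, auto_policy=fan_policy)
fan.set(Mode.MANUAL, 60)
fan.platform_tick(85)       # no effect: the application still owns the control
manual_state = fan.state
fan.set(Mode.AUTO)          # the application cedes the control
fan.platform_tick(85)       # the autonomous function now sets the state
auto_state = fan.state
print(manual_state, auto_state)
```

The sketch keeps the application-facing entry point and the autonomous function on the same object, so both paths act on a single, shared control state.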
Conventional mechanisms for platform management of high-availability computer systems may provide only the API that allows user application programs to administer the hardware platform of the computer system. High-availability computer systems, however, also require autonomous functions to detect and react to system fault conditions. These autonomous functions may lack a standardized way of implementation. Further, the autonomous functions are not coupled or coordinated with the API.
In view of the foregoing discussion, there is a need for software for platform management of high-availability computer systems. The software should enable the hardware platform of the computer systems to be administered by means of the autonomous functions. Further, there is a need for a standardized way of implementing the autonomous functions that can operate on the computer system. Furthermore, there is a need for software for platform management of high-availability computer systems that can integrate both the API and the autonomous functions to administer the hardware platform.