1. Technical Field
The present invention relates in general to the field of computers, and in particular to multiple blade servers housed in a server chassis. Still more particularly, the present invention relates to a method and system in which a remote management module takes over operation of a server chassis if the server chassis' internal management modules fail.
2. Description of the Related Art
Server blade computers offer high-density server boards (blades) in a single server blade chassis (blade center chassis). A typical server blade computer is illustrated in FIG. 1, identified as server blade chassis 102. Server blade chassis 102 includes multiple hot-swappable server blades 104a-n. There are typically fourteen server blades 104 in server blade chassis 102. The operations of server blades 104 are coordinated by logic identified as management modules 108, each of which typically includes a processor for controlling input/output (I/O) functions, interfacing with a network 106 (such as the Internet or a Local Area Network), and allocating jobs and data to the different server blades 104. Typically, a first management module 108a is designated as the primary management module, and a second management module 108b is a back-up to be used if the primary management module 108a should fail.
Another function of management module 108 is to control a power module 110 and cooling fans 112. The coordinated management of power module 110 and cooling fans 112 permits the server blades 104 to operate within their designed temperature ranges. That is, additional power drawn from power module 110 generally translates into additional heat, and therefore into additional cooling demand on cooling fans 112, which must then be turned on or operated at a higher speed. Failure to properly control the cooling of server blade chassis 102 and the server blades 104 contained therein can be catastrophic, resulting in permanent damage to the server blades 104 and to circuitry associated with server blade chassis 102. Thus, if both management modules 108 fail, temperature control of server blade chassis 102 is lost.
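As a rough illustration of the coupling between power draw and cooling described above, the following sketch maps a chassis power reading to a fan duty cycle. The linear policy, the function name fan_speed_percent, and the 30 to 100 percent speed range are assumptions made for the example; they are not taken from any actual management module firmware.

```python
def fan_speed_percent(power_draw_watts: float,
                      max_power_watts: float,
                      min_speed: int = 30,
                      max_speed: int = 100) -> int:
    """Map chassis power draw to a fan duty cycle in percent.

    Hypothetical linear policy: the fans idle at min_speed and ramp
    to max_speed as power draw approaches max_power_watts.
    """
    if max_power_watts <= 0:
        raise ValueError("max_power_watts must be positive")
    # Clamp the load fraction to [0, 1] so out-of-range sensor
    # readings cannot push the fans outside their allowed range.
    load = min(max(power_draw_watts / max_power_watts, 0.0), 1.0)
    return round(min_speed + load * (max_speed - min_speed))
```

If no management module is running a loop of this kind, nothing raises the fan speed as power draw rises, which is the loss of temperature control noted above.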
What is needed, therefore, is a method and system for providing a remote failover management module. Preferably, the failover management module would be activated upon detection of a failure of the server blade chassis' internal management module(s), after which external management module resources would be reallocated to the server blade chassis.
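The order of preference implied above (an internal primary management module, then its internal back-up, then a remote failover module) can be sketched as follows. This is a minimal illustration only: the ManagementModule type, its healthy flag, and the select_active_module function are hypothetical names introduced for the example, and how a failure is actually detected is left outside the sketch.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class ManagementModule:
    """Hypothetical record for a management module and its health."""
    name: str
    healthy: bool
    remote: bool = False


def select_active_module(
    internal: Sequence[ManagementModule],
    remote: Optional[ManagementModule],
) -> Optional[ManagementModule]:
    """Pick the module that should manage the chassis.

    Internal modules are tried in order (primary first, then back-up).
    The remote module is used only if every internal module has failed.
    Returns None when no healthy module is available at all.
    """
    for module in internal:
        if module.healthy:
            return module
    if remote is not None and remote.healthy:
        return remote
    return None
```

For example, with a failed primary and a failed back-up, the function returns the remote module, provided that module is itself healthy.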