1. Field of the Invention
This invention relates to management of power in blade computing systems.
2. Description of the Related Art
In the past, information handling systems (e.g., workstations and servers) were essentially self-contained systems within an appropriate housing. For example, a desktop PC would consist of user interface elements (keyboard, mouse, and display) and a tower or desktop housing containing the CPU, power supply, communications components and the like. However, as demands on server systems and PC systems have increased, and with the increasing spread of networks and the services available through networks, alternative technologies have been proposed and implemented.
Blade computing is one such technology. A blade server provides functionality comparable to or beyond that previously available in a “free standing” or self-contained server, by housing a plurality of information handling systems in a compact space and a common housing. Each server system is configured to be present in a compact package known as a blade, which can be inserted in a chassis along with a number of other blades. At least some services for the blades, typically including power supply, are consolidated so that the services can be shared among the blades housed in common. As blade technology has advanced, blade architecture has been developed whereby servers are packaged as single boards and designed to be housed in chassis that provide access to all shared services. In other words, blade servers today are single board units that slide into a slot in a housing in which other like boards are also housed.
Similar to blade servers, desktop blades involve configuring the major components of a PC onto a single card and then housing many such cards in a single chassis. As with server blades, the use of desktop blades allows centralized management and maintenance of the power shared among the various blades.
In an IBM BladeCenter® and other blade/chassis systems, there are advantages to allowing the maximum possible density of blades within the chassis. Other than the size of the chassis itself, the only limitation on blade density is the amount of power consumed by the blades in the chassis. In a typical blade center system there are two power domains, each supported by two power supplies running in a shared, fully redundant mode. This two-supply system is considered fully redundant because if one of the supplies (the “non-redundant” or primary supply) fails, the other supply (the “redundant” or secondary supply) is of a size that allows it to provide sufficient power to fulfill the power demands of the entire domain. In other words, the “nominal power” of a single supply is sufficient to provide power for the entire domain. In practice, the power allocation is typically shared between the multiple supplies when all are functioning properly. When one of the power supplies fails, the portion of the power that it was providing is automatically shifted to the remaining supply.
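The load sharing and failover behavior described above can be illustrated with a minimal sketch. The function names, the even split between supplies, and the wattage figures are all assumptions made for illustration; the text states only that the load is typically shared and that it shifts to the remaining supply on a failure.

```python
def share_load(domain_demand_w, working_supplies):
    """Split a domain's power demand across its working supplies.

    An even split is assumed here for illustration only.
    """
    if not working_supplies:
        raise ValueError("domain has no working power supply")
    per_supply = domain_demand_w / len(working_supplies)
    return {name: per_supply for name in working_supplies}


def is_fully_redundant(domain_demand_w, nominal_w):
    """A domain is fully redundant if one supply alone can carry the whole demand."""
    return domain_demand_w <= nominal_w


# Hypothetical figures: each supply has a nominal rating of 2000 W.
NOMINAL_W = 2000.0
demand_w = 1800.0

# Both supplies healthy: the demand is shared between them.
normal = share_load(demand_w, ["primary", "secondary"])

# The primary supply fails: its share shifts to the remaining supply.
after_failure = share_load(demand_w, ["secondary"])
```

In this sketch the remaining supply carries the full 1800 W after the failure, which is within its assumed 2000 W nominal rating, so the domain stays up.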
As CPUs and other devices have increased in speed, their power demands have also increased. The aggregate demand of blades constructed with newer, more powerful and power-demanding CPUs may exceed the capacity of a single (non-redundant) power supply, i.e., it may exceed the nominal power that a single power supply can provide; meanwhile, the nominal power available from existing power supplies has not increased correspondingly. While larger-capacity power supplies could be utilized, space limitations within the chassis can be prohibitive. Thus, either the number of blades that can be used is limited, or other power management strategies must be applied.
One solution has been to “oversubscribe” the number of blades available within the chassis and to utilize some of the spare capacity of the shared redundant power supplies for normal operations. Oversubscribing is the term used to describe the situation where the aggregate power demand is greater than the non-redundant supply capacity (e.g., greater than the nominal value of a single power supply in a “1+1” redundant system, i.e., a system having one supply that can handle the complete load, plus one additional supply also capable of handling the complete load). In an oversubscription situation, the power needed to supply the subscribed blades exceeds the capacity of the non-redundant power supply, and thus the power system is no longer fully redundant. This can threaten the overall operation of the system: if a power supply fails, the remaining supply may be overloaded, and an entire domain of blades may be unable to remain operational.
Besides implementing a fully redundant policy, multiple levels of oversubscription can be defined. Recoverable-oversubscription is where the limit of power with redundant power supplies (recoverable-oversubscription limit) is greater than the power supply nominal value, but where recoverable action (e.g., throttling of blades) can be taken when a redundant supply is lost, such that the remaining power supply will not shut down. Non-recoverable-oversubscription is where the limit of power with redundant power supplies (non-recoverable-oversubscription limit) is greater than the power supply nominal value, but where sufficient recoverable action (e.g., throttling of blades) cannot be taken in a manner that will assure that the remaining power supply will not shut down.
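The three regimes defined above (fully redundant, recoverable-oversubscription, and non-recoverable-oversubscription) can be illustrated with a minimal sketch. It assumes, for illustration only, that the recoverable action is summarized by a single hypothetical figure, `max_throttle_savings_w`, the total power that throttling could shed once a supply is lost. The names and numbers are not taken from the text.

```python
from enum import Enum


class PowerPolicy(Enum):
    FULLY_REDUNDANT = "fully redundant"
    RECOVERABLE = "recoverable-oversubscription"
    NON_RECOVERABLE = "non-recoverable-oversubscription"


def classify(demand_w, nominal_w, max_throttle_savings_w):
    """Classify a power domain by what happens if one supply is lost.

    demand_w: aggregate power demand of the blades in the domain.
    nominal_w: nominal capacity of the single remaining supply.
    max_throttle_savings_w: hypothetical total power that throttling can shed.
    """
    if demand_w <= nominal_w:
        # One supply alone can carry the whole load.
        return PowerPolicy.FULLY_REDUNDANT
    if demand_w - max_throttle_savings_w <= nominal_w:
        # Over nominal, but throttling can bring demand back within it.
        return PowerPolicy.RECOVERABLE
    # Over nominal, and throttling alone cannot close the gap.
    return PowerPolicy.NON_RECOVERABLE


# Hypothetical domain: 2000 W nominal supply, 2400 W aggregate demand.
with_ample_throttling = classify(2400.0, 2000.0, 500.0)
with_limited_throttling = classify(2400.0, 2000.0, 300.0)
```

With 500 W of available throttling the assumed domain is recoverable, since demand can fall to 1900 W; with only 300 W it is not, since demand would remain at 2100 W.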
When the system is in oversubscription mode and a redundant power supply is lost, action must be taken very quickly to reduce the power demand or the power system will fail. One possible action is to power off one or more blades to thereby reduce the power demand. However, some blades are designed with programmable throttling such that their power consumption can be reduced, albeit with some loss of performance. It would be desirable to use this programmable throttling function for power reduction in the above-described situation, when a redundant power supply is lost while operating in oversubscription. However, the chassis management entity in the prior art is not configured with sufficient information to enable power reduction via the programmable throttling. Different blades can have different mechanisms with different power reduction characteristics. Moreover, new blades may be released with new mechanisms and characteristics, and the chassis power management functions would have to be updated before the chassis management entity could utilize them when effecting power reduction.
Accordingly, it would be desirable to have a blade power management system whereby the amount of power reduction that a blade can withstand while still functioning is determined by the blade itself and is utilized when determining which blades to reduce in power and by how much. Additionally, it would be desirable to provide a mechanism whereby the power is reduced within a very short window, during which the remaining power supply (or remaining power supplies) provides the excess power needed for only this short period of time.