Many modern data centers use environmental maintenance systems, including heating, ventilation, and air conditioning (HVAC) units, to control indoor temperature, humidity, and other variables. It is common to have many HVAC units deployed throughout a data center. They are often floor-standing units, but may be wall-mounted, rack-mounted, or ceiling-mounted. The HVAC units often provide cooled air to a raised floor plenum, to a network of air ducts, or to the open air of the data center. The data center itself, or a large section of a large data center, typically has an open-plan construction (i.e., no permanent partitions separating air in one part of the data center from air in another part). Thus, in many cases, these data centers have a common space that is temperature-controlled and humidity-controlled by multiple HVAC units.
HVAC units for data centers are typically operated with decentralized, stand-alone controls. It is common for each unit to operate so as to control the temperature and humidity of the air entering the unit from the data center. For example, an HVAC unit may include a sensor that measures the temperature and humidity of the air entering the unit, and the unit may adjust its operation so that those measurements align with the set points for that unit.
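The stand-alone control described above can be illustrated with a minimal Python sketch. It assumes a simple proportional response to return-air temperature only; the class name `StandaloneUnit`, the unit names, the gain value, and the readings are all hypothetical, and actual HVAC controllers may use different control laws and also regulate humidity.

```python
from dataclasses import dataclass


@dataclass
class StandaloneUnit:
    """One HVAC unit with its own return-air set point (hypothetical model)."""
    name: str
    setpoint_c: float   # return-air temperature set point, in degrees C
    gain: float = 0.25  # assumed proportional gain: output fraction per degree C

    def cooling_output(self, return_air_c: float) -> float:
        """Return cooling output as a fraction of capacity, clamped to [0, 1]."""
        error = return_air_c - self.setpoint_c
        return min(1.0, max(0.0, self.gain * error))


# Two units sharing one room, each acting only on its own sensor reading.
units = [StandaloneUnit("unit-1", 22.0), StandaloneUnit("unit-2", 24.0)]
readings = {"unit-1": 25.0, "unit-2": 23.0}  # illustrative return-air temperatures

outputs = {u.name: u.cooling_output(readings[u.name]) for u in units}
print(outputs)
```

In this sketch, neither unit has any knowledge of what the other is doing; each responds only to the air arriving at its own inlet, which is the sense in which the controls are decentralized.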
For reliability, most data centers are designed with an excess number and capacity of HVAC units. Since the open-plan construction allows free flow of air throughout the data center, the operation of one unit can be coupled to the operation of another unit. The excess units and capacity, together with the fact that the units deliver air to substantially overlapping areas, provide redundancy, which ensures that if a single unit fails, the data center equipment (servers, routers, etc.) will still have adequate cooling.
However, the level of redundancy is rarely uniformly distributed across a data center. For example, some areas of a data center may have a higher heat load because there are more servers in those areas, because the servers generate more heat (e.g., because they are often run at high utilization), or some combination thereof. In addition, some areas of a data center may have less effective cooling (e.g., because there are fewer or lower-capacity HVAC units nearby). If the reliability of a data center is treated atomically (i.e., as a single figure for the whole space), this may lead to overstating risk, which increases cooling equipment and energy costs, or to understating risk, which introduces the possibility of a catastrophic failure.
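The effect of treating redundancy atomically can be shown with a toy calculation in Python. This is only an illustration under strong assumptions, not a method described in this document: the zone names, loads, and capacities are invented, the function `reserve_after_worst_failure` is hypothetical, and each unit is assumed to cool only its own zone, whereas in an open-plan space the airflow from different units overlaps.

```python
def reserve_after_worst_failure(load_kw, unit_capacities_kw):
    """Spare cooling capacity (kW) remaining after the largest unit fails."""
    remaining_kw = sum(unit_capacities_kw) - max(unit_capacities_kw)
    return remaining_kw - load_kw


# Invented example data: two zones with identical cooling but unequal heat load.
zones = {
    "zone_a": {"load_kw": 60.0, "units_kw": [50.0, 50.0, 50.0]},
    "zone_b": {"load_kw": 120.0, "units_kw": [50.0, 50.0, 50.0]},
}

# Atomic view: pool every load and every unit into one room-wide figure.
total_load_kw = sum(z["load_kw"] for z in zones.values())
all_units_kw = [c for z in zones.values() for c in z["units_kw"]]
whole_room_kw = reserve_after_worst_failure(total_load_kw, all_units_kw)

# Local view: the same calculation carried out zone by zone.
per_zone_kw = {
    name: reserve_after_worst_failure(z["load_kw"], z["units_kw"])
    for name, z in zones.items()
}

print(whole_room_kw)
print(per_zone_kw)
```

Under these assumptions the room-wide figure shows a comfortable positive reserve, while the more heavily loaded zone has a negative reserve after a single local failure. The pooled number therefore understates the risk in one area and overstates it in the other.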
Therefore, it is desirable to provide methods and systems that can quantitatively represent the level of reserve and risk at various locations in a data center or another environmentally controlled space.