A data center typically contains electronic equipment such as servers, telecom equipment, networking equipment, switches and other electronic equipment arranged on racks or frames. The heat generated by such electronic components is removed with the help of cooling units. Typically, the cooling units are computer room air conditioners (CRAC) or computer room air handlers (CRAH), which supply cold air for cooling. More advanced cooling units such as in-row coolers, rear door coolers, liquid cooled cabinets and chip cooling techniques have now come into practice.
Data centers are considered energy guzzlers. With the drastic increase in energy cost, their huge energy consumption is one of the major concerns of data center managers. Power consumed by cooling equipment contributes a major portion of the total data center power consumption. The main challenge is to ensure the safety of electronic equipment by maintaining appropriate temperatures in the data center while at the same time ensuring optimum cooling efficiency. Due to poor design of the data center, data center managers may face problems such as hot spots, low tile flow rates and so on. General measures which can be taken to handle such problems include decreasing the supply temperature of cooling units, increasing cooling capacity near the problem area and so on. These measures may decrease the cooling efficiency. The cooling capacity of a data center is designed for, and typically run at, maximum heat load conditions. In practice, data centers rarely operate at maximum conditions. Typically, cooling units are controlled according to heat loads in a very elementary manner. Workload placement decisions are taken without considering cooling related issues such as cooling availability, inlet temperatures and so on. All these practices lead to poor cooling efficiency. In addition, due to the urge to increase space utilization of the data center, consolidation and virtualization exercises are being carried out. The existing cooling infrastructure of a data center may not be sufficient to handle the concentrated heat loads resulting from consolidation and virtualization. Data center managers face a number of such major challenges in the thermal management of data centers.
Various attempts are being made to minimize the cooling costs of the data center. Some of these attempts include transformation of old data centers, efficient design of new data centers, dynamic control of cooling units, consolidation and temperature aware workload scheduling. Efficient design of a data center includes proper arrangement of racks, tiles, CRACs etc., adequate plenum depth, and use of airflow management techniques such as aisle containment, CRAC operation moderation etc. Problems faced in old data centers, such as hot spots and low cooling efficiency, have been addressed by carrying out design and operational changes. Efficient control schemes have been developed to control parameters of the CRAC such as supply temperature, supply flow rate or tile flow rates in accordance with changes in heat generation, temperature, pressure and airflows in the data center. These control schemes make the CRAC run at optimum efficiency while maintaining satisfactory temperatures in the data center. Different algorithms for workload placement have been developed which take cooling related parameters such as recirculation and CRAC capacities into account while executing placement requests. Numerical models such as CFD models and data based models using neural networks are being used to facilitate these attempts.
All these attempts demand a complete understanding of the cooling characteristics of the data center. Determination of these cooling characteristics may include quantification of hot air recirculation, cold air short-circuiting, loading of each CRAC, influence region of each CRAC etc. Knowledge of the causal relationships between various components in the data center is also essential. For example, design optimization of a data center is often carried out to minimize the wastage of cold air due to short-circuiting. Each CRAC receives hot air as well as short-circuited cold air. In such a scenario, it would be of value to know, for example, the individual contributions of the various CRAC units to the short-circuited air reaching a particular CRAC. In another case, it may be of use to know the contribution of a particular CRAC to the overall flow at a cold tile. Hence, it is necessary to quantify the various phenomena occurring in the data center and to establish causal relationships between its components with respect to the influence of one component on another.
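The kind of quantification described above, namely the share of a target's airflow that originates from each source, can be illustrated with a minimal sketch. The function name `contribution_fractions`, the CRAC labels and the flow values are hypothetical and are not taken from any cited reference; the per-source flows are assumed to be available already, for example from a CFD tracer study or from measurements.

```python
def contribution_fractions(flow_from_source):
    """Return each source's share of the total airflow reaching one target.

    flow_from_source maps a source name (e.g. a CRAC) to the airflow, in any
    consistent unit, that reaches the target (e.g. a cold tile) from that source.
    """
    total = sum(flow_from_source.values())
    if total <= 0:
        raise ValueError("total airflow at the target must be positive")
    return {name: flow / total for name, flow in flow_from_source.items()}


if __name__ == "__main__":
    # Hypothetical airflow (m^3/s) arriving at one cold tile from three CRACs.
    tile_flow = {"CRAC-1": 3.0, "CRAC-2": 1.5, "CRAC-3": 0.5}
    for name, share in contribution_fractions(tile_flow).items():
        print(f"{name}: {share:.0%}")
```

With these illustrative numbers the three units account for 60%, 30% and 10% of the tile flow respectively. The same normalization could be applied to the short-circuited cold air arriving at a particular CRAC.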
To minimize the cooling cost, there is a need for a complete understanding of the cooling characteristics of the data center. Various performance metrics have been proposed to analyze the cooling characteristics of data centers, but they are not sufficient to characterize the cooling profile of a data center. Some of the systems and methodologies which form the prior art are given below:
U.S. Pat. No. 7,051,946B2 to Bash, et al. discloses an air re-circulation index. The patent discusses an index of air re-circulation in a data center having one or more racks. The index is used to determine the level of hot air re-circulation into cold air streams delivered to one or more racks. The utility of the index has been shown in design optimization of the data center. The same air re-circulation index has been discussed by Sharma, et al. in “Dimensionless parameters for evaluation of thermal design and performance of large-scale data centers” and by Schmidt, et al. in “Challenges of data center thermal management”. The air re-circulation index has a very limited scope. A data center may have various components such as CRACs, tiles, sensors etc., and various phenomena other than hot air recirculation associated with them, such as cold air short-circuiting. Because the air re-circulation index gives only limited information about cooling characteristics, design optimization carried out on the basis of this index alone may not provide the best design.
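As an illustration of how a recirculation-type index reduces a set of temperature readings to a single number, the following is a minimal sketch. It assumes the simplified, temperature-only form of the supply heat index (SHI) from the dimensionless-parameters literature, which holds when all racks draw equal airflow; whether this matches the exact index claimed in the cited patent is an assumption. The function name and the sample temperatures are hypothetical.

```python
def supply_heat_index(inlet_temps, outlet_temps, supply_temp):
    """Simplified SHI: heat picked up by cold air before it enters the racks,
    divided by the total heat carried by the rack exhaust air.

    Assumes equal airflow through every rack, so only temperatures are needed.
    All temperatures must share one unit (e.g. degrees Celsius).
    """
    if len(inlet_temps) != len(outlet_temps):
        raise ValueError("need one inlet and one outlet reading per rack")
    rise_before_inlet = sum(t - supply_temp for t in inlet_temps)
    rise_at_outlet = sum(t - supply_temp for t in outlet_temps)
    if rise_at_outlet <= 0:
        raise ValueError("outlet air must be warmer than supply air overall")
    return rise_before_inlet / rise_at_outlet


if __name__ == "__main__":
    # Hypothetical readings for three racks; CRAC supply air at 15 degC.
    shi = supply_heat_index([18.0, 20.0, 22.0], [30.0, 32.0, 34.0], 15.0)
    print(f"SHI = {shi:.3f}")
```

For these sample readings the index is 15/51, about 0.294; a value of 0 would mean that no hot air mixes into the cold supply before it reaches the rack inlets. The sketch also makes the limitation noted above concrete: the index condenses recirculation into one scalar and says nothing about which component caused it or about cold air short-circuiting.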
US20080174954A1 to VanGilder, et al. provides a system and method for evaluating equipment rack cooling performance. The problem addressed particularly relates to defining a capture index (CI), an airflow based index which is based upon airflow patterns associated with the supply of cold air to, or the removal of hot air from, a rack. The CI also has a very limited scope. A data center may have various components such as CRACs, tiles, sensors etc., and various phenomena other than hot air recirculation associated with them, such as cold air short-circuiting. The use of the CI typically demands division of the data center into clusters and typically considers only local cooling devices. The CI is defined in terms of flow rates only and does not consider temperature or heat. The CI has typically been shown to be applicable to design optimization of a data center. It can also be used for identifying the best places for addition of new heat load and for predicting temperature at some locations, such as return air temperature at coolers. These applications are based on the limited information about cooling characteristics given by the CI and hence may not provide the best design.
US2010076608A to Nakajima, et al. provides a system and method for controlling air conditioning facilities, and a system and method for power management of a computer room. The problem addressed particularly relates to controlling cooling in a data center in order to minimize cooling power. Further, it is concerned with characterizing cooling in the data center, for which certain temperature sensitivity coefficients are defined. These temperature sensitivity coefficients also have limited scope. They typically quantify the correlation between CRACs and racks and try to quantify the cooling characteristics of the data center. Further, their utility is limited to the control of cooling provisioning in the data center.
Tang et al. in “Sensor-based fast thermal evaluation model for energy efficient high-performance datacenters” disclose cross-interference coefficients which quantify hot air exchange between various servers. The coefficients are used for fast temperature prediction at the inlets of the servers for different power consumption profiles of the servers. These coefficients cannot be used for predicting temperatures at other locations in the data center. Moreover, these coefficients cannot be used to predict temperatures at the inlets of servers for different operating parameters of the CRAC, such as CRAC supply temperature, CRAC flow rate etc.
U.S. Pat. No. 7,620,613B to Moore, et al. discloses thermal management of data centers. The problem addressed particularly relates to the estimation of temperatures at the inlets of equipment from temperatures detected inside the equipment and heat generated by the equipment. Further, it generates models for predicting temperatures at the inlets of racks from the heat generated and from a temperature sensor placed inside a server. The prediction therefore relies on measurements taken inside the servers and does not involve detailed flow and thermal computation for the whole data center. Hence, this methodology cannot quantify temperatures at other locations in the data center, or temperatures for different operating parameters of the CRAC such as CRAC supply temperature, CRAC flow rate etc.
US20090150123A1 to Archibald, et al. provides a method of designing a data center using a plurality of thermal simulators. The problem addressed particularly relates to simulating conditions in the data center by physically laying out thermal simulators, measuring actual temperatures in this simulated environment and checking whether the proposed layout is suitable. Further, it is concerned with defining some gross and preliminary indices such as average rack temperature difference etc. This method determines the cooling characteristics of the data center by physically laying out simulators, which is a tedious and time consuming task to implement for a large scale data center. Moreover, some important cooling characteristics of the data center cannot be determined using these gross and preliminary indices, which are based on temperature measurements alone.
The above mentioned prior art fails to determine all the cooling characteristics of a data center due to its limited scope. It discloses sets of indices that are used in attempts to minimize cooling cost, such as design optimization, dynamic control of cooling infrastructure, dynamic workload placement and fast temperature prediction. However, these attempts are based on inadequate information regarding cooling characteristics and hence cannot be considered useful methods.
The influence indices disclosed in the present invention are functions of both airflow and temperature. They therefore provide a comprehensive picture of the cooling performance of a data center. They quantify the various phenomena occurring in the data center and quantify the amount of influence each component has on all the other components. This complete determination of cooling characteristics can be used to determine the reasons behind problems such as hot spots and cooling inefficiencies. Hence, optimization carried out using influence indices will result in a configuration of the data center which is optimized in all aspects. The causal relationships set up by the influence indices can also be used to pinpoint the locations best suited for increasing heat load from a cooling perspective. Moreover, they can be used for fast prediction of temperatures at any point for various operating parameters of the CRAC and power consumption of racks in the data center. This fast prediction of temperatures completes calculations within seconds, compared to the hours required for CFD calculations.
Thus, in the light of the above mentioned background art, it is evident that there is a need for a solution that can provide a method for complete determination of the cooling characteristics of a data center by calculating thermal influence indices which are based on information related to the configuration of the data center, air flow, temperature and heat pertaining to the source and target components in the data center.
Hence, due to the drawbacks of the conventional approaches, there remains a need for a new solution that can provide a method and system for complete determination of the cooling characteristics of a data center.