A High Performance Computing (HPC) system performs parallel computing by simultaneously using multiple nodes to execute a computational assignment referred to as a job. Each node typically includes processors, memory, an operating system, and input-output (I/O) components. The nodes communicate with each other through a high-speed network fabric and may use shared file systems or storage. The job is divided into thousands of parallel tasks distributed over thousands of nodes. These tasks synchronize with each other hundreds of times per second. An HPC system can consume megawatts of power.
Growing usage of HPC systems in recent years has made power management a concern in the industry, especially in terms of energy efficiency. Future systems are expected to deliver higher performance while operating under a power allocation lower than the one they are designed for. In response to this demand, future HPC systems will be judged largely on how well they perform while operating with a limited power allocation. However, there is no standard measurement method or benchmarking technique for the energy efficiency of HPC systems operating with and without power constraints.
Currently, operating a system without exceeding its allocated power budget requires technologies that monitor power, distribute the power budget among the jobs running on the system, and maximize performance while running jobs on one or more nodes under a power constraint. However, conventional benchmarking techniques are not designed to measure the impact of applying these technologies.
The most commonly used benchmarking techniques include SPECpower, which measures energy proportionality between the power consumed by a server and its utilization, and Green500, which measures the energy efficiency of an HPC system while running LINPACK without any power constraints. SPECpower is typically used in the server industry and measures energy proportionality when the servers are not 100% utilized. As such, a SPECpower benchmark reading can be skewed favorably by using low-power servers that operate for long periods of time. Green500, on the other hand, measures the energy efficiency of an HPC system when the system is achieving its peak performance without any power constraints.
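As a rough illustration of the two quantities discussed above, the following sketch computes a performance-per-watt figure (Green500 ranks systems by floating-point performance per watt) and a simple gap between measured power and an ideal, utilization-proportional power draw. The function names and the proportionality formula are assumptions made for illustration only; they are not the official SPECpower or Green500 measurement procedures.

```python
def perf_per_watt(gflops: float, avg_power_watts: float) -> float:
    """Energy efficiency: sustained performance divided by average power."""
    if avg_power_watts <= 0:
        raise ValueError("average power must be positive")
    return gflops / avg_power_watts


def proportionality_gap(utilization: float,
                        power_watts: float,
                        peak_power_watts: float) -> float:
    """Illustrative (assumed) proportionality measure.

    Returns the fraction of peak power actually drawn minus the utilization.
    0.0 means power scales exactly with utilization; a positive value means
    the server draws more power than an ideally proportional one would.
    """
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    if peak_power_watts <= 0:
        raise ValueError("peak power must be positive")
    return power_watts / peak_power_watts - utilization
```

Under these assumptions, a server drawing 300 W of a 400 W peak at 50% utilization has a gap of 0.25, that is, it consumes a quarter of its peak power beyond what ideal proportionality would imply.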
Typically, users of an HPC system expect the system to run at its optimal energy efficiency level while operating under a dedicated power constraint that limits the performance of the system. Furthermore, the performance of HPC systems is expected to grow exponentially, and the demand for energy and power is expected to grow significantly as well. As such, future systems may lack the power allocation required to run efficiently, especially since the power allocation will be dynamic and many times lower than the peak demand. Therefore, conventional measurements of energy efficiency can be misleading, because they fail to account for varying levels of power constraints.
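To make this point concrete, the sketch below reports efficiency at several power caps rather than at a single unconstrained operating point. The helper name `efficiency_by_cap` and the sample values are hypothetical placeholders chosen for illustration; they are not measurements of any real system.

```python
def efficiency_by_cap(samples):
    """Map each power cap (watts) to performance per watt under that cap.

    samples: iterable of (power_cap_watts, measured_power_watts, gflops).
    A sample whose measured power exceeds its cap is rejected, since it
    would not represent operation within the power constraint.
    """
    result = {}
    for cap, measured_power, gflops in samples:
        if measured_power <= 0:
            raise ValueError("measured power must be positive")
        if measured_power > cap:
            raise ValueError(f"measured power exceeds the {cap} W cap")
        result[cap] = gflops / measured_power
    return result


# Hypothetical samples: (power cap, measured power, performance in GFLOPS).
HYPOTHETICAL_SAMPLES = [
    (1000.0, 1000.0, 2000.0),  # unconstrained, peak performance
    (800.0, 800.0, 1800.0),
    (600.0, 600.0, 1500.0),
]

if __name__ == "__main__":
    for cap, eff in efficiency_by_cap(HYPOTHETICAL_SAMPLES).items():
        print(f"cap={cap:.0f} W  efficiency={eff:.2f} GFLOPS/W")
```

With these placeholder values, the efficiency at the unconstrained point differs from the efficiency under each cap, which is why a single peak-performance figure says little about behavior under a limited power allocation.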