Generally, a High Performance Computing (HPC) system performs parallel computing by simultaneously using multiple nodes to execute a computational assignment referred to as a job. Each node typically includes processors, memory, an operating system, and input-output (I/O) components. The nodes communicate with each other through a high-speed network fabric and may use shared file systems or storage. A job is divided into thousands of parallel tasks distributed over thousands of nodes, and these tasks synchronize with each other hundreds of times a second. An HPC system usually consumes megawatts of power.
Typically, HPC jobs run on a large number of compute nodes, I/O nodes, and operating system (OS) nodes, and multiple HPC jobs run in a single HPC cluster or HPC cloud. The jobs may share the same node at the same time; for example, they may use the same non-volatile storage attached to the same I/O node to save their private data. There is also a tendency for a single compute node to serve more than one HPC job at a time.
Currently there is no technique to obtain a per-job power breakdown for a node, that is, an indication of which portion of the node's power belongs to which job. Traditionally, it is assumed that compute nodes are used exclusively by HPC jobs, meaning a single compute node serves only one HPC job at a time until that job is suspended or completed.
Conventional power monitoring techniques cannot be accurate because they do not provide a per-job power breakdown on the nodes. For example, for traditional in-house, cluster-based storage- or network-intensive HPC jobs, power monitoring inaccuracy can be as high as about 25%. If compute nodes are shared, the inaccuracy can add up to about 50%. For cloud-based HPC or big data jobs, where substantially every node is shared and job scheduling is highly dynamic, conventional power monitoring results can be totally misleading.
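To illustrate why exclusive-use accounting misattributes power on a shared node, the following is a minimal sketch of proportional attribution. It is not the conventional technique described above, nor any specific system's method; the function name and the choice of CPU time as the utilization metric are assumptions made purely for illustration.

```python
# Hypothetical sketch: split one shared node's measured power across the
# jobs resident on it, in proportion to each job's share of a utilization
# metric (CPU seconds here). All names in this example are illustrative.

def attribute_node_power(node_power_watts, cpu_seconds_by_job):
    """Return a per-job power estimate for a single shared node."""
    total = sum(cpu_seconds_by_job.values())
    if total == 0:
        # No recorded activity: fall back to an even split among jobs.
        n = len(cpu_seconds_by_job)
        return {job: node_power_watts / n for job in cpu_seconds_by_job}
    return {job: node_power_watts * secs / total
            for job, secs in cpu_seconds_by_job.items()}

# A node drawing 400 W is shared by two jobs over a sampling interval.
breakdown = attribute_node_power(400.0, {"jobA": 30.0, "jobB": 10.0})
print(breakdown)  # {'jobA': 300.0, 'jobB': 100.0}
```

Under the traditional exclusive-node assumption, the full 400 W would be charged to a single job, overstating that job's consumption by 100 W in this example; the gap grows with the number of co-resident jobs, which is consistent with the larger errors cited above for shared and cloud-based nodes.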