Distributed computer systems, such as high-end symmetric multiprocessing (SMP) server systems, need to measure operating parameters throughout the computer system to monitor operation and obtain data for control operations. One example of an operating parameter that is measured is power, the measurement of which has become increasingly important both in reducing power usage and in controlling power flow throughout the computer system. The range of use for the measured power data extends from the simple passing of the data to an external program to the complex changing of the power performance state of the computer system based on the data. In low-end computer systems, hardware is added to measure power usage before the power distribution system branches out, providing total power measurement but requiring extra hardware at additional expense. In high-end computer systems, total power is not measured directly but is estimated, resulting in large errors. These errors require that additional margin be built into control calculations and that power systems be made larger to account for the error. Both the added margin and the larger power systems increase the cost of computer systems, initially and during operation.
High-end SMP server systems typically include a number of distributed points providing power measurements. These distributed points measure power consumed by server entities such as processor cores, memory units, and/or input/output devices. The SMP server system makes system level decisions based on the power measurements at the distributed points. Unfortunately, present SMP server systems take the power measurements at different times, making it impossible to accurately determine the power at a particular point at a particular time or the total power at a particular time.
FIGS. 1A and 1B illustrate the problem with taking power measurements at different times. FIG. 1A is a graph of actual power versus time for a distributed computer system. The actual powers (labeled as Slave 1, Slave 2, Slave 3) at several slave points within the distributed computer system are summed to provide the actual total system power (labeled as System). FIG. 1B is a graph of indicated power versus time for a distributed computer system. The indicated powers (labeled as Slave 1, Slave 2, Slave 3) at several slave points within the distributed computer system are summed to provide the indicated total system power (labeled as System), which would be seen by a master device. In this illustration, the indicated power is shifted from the actual power for the slaves to reflect differences in slave unit latency between the master device and the slaves. Slave 3 is unshifted, Slave 2 is shifted two time units later, and Slave 1 is shifted four time units later. Data that occurred before real time zero, which does not appear on FIG. 1A, fills the first two time units of Slave 2 and the first four time units of Slave 1 on FIG. 1B. The earlier data decreases the magnitude of the indicated total system power of FIG. 1B relative to the actual total system power of FIG. 1A between zero and four time units in FIG. 1B.
The effect of the shifting is apparent from comparing the actual total system power of FIG. 1A to the indicated total system power of FIG. 1B. For example, the actual total system power increases steeply from zero to four time units, and then levels off. The indicated total system power shows a more gradual increase between one and eight time units, peaks at eight time units, then immediately declines. Any attempt to control system power based on the indicated total system power results in operational errors since the indicated total system power does not reflect the actual total system power.
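The distortion described above can be reproduced with a minimal sketch. The power traces, the pre-zero fill value, and the function names below are hypothetical and chosen only for illustration; they are not read from FIGS. 1A and 1B. Only the latencies (four, two, and zero time units for Slave 1, Slave 2, and Slave 3) follow the text.

```python
# Hypothetical actual power per slave (arbitrary units), one sample per time unit.
# These values are illustrative assumptions, not data taken from FIG. 1A.
ACTUAL = {
    "Slave 1": [2, 4, 6, 8, 8, 8, 8, 8, 8, 8],
    "Slave 2": [1, 2, 3, 4, 4, 4, 4, 4, 4, 4],
    "Slave 3": [3, 6, 9, 12, 12, 12, 12, 12, 12, 12],
}

# Latency in time units between each slave and the master, as in the text.
LATENCY = {"Slave 1": 4, "Slave 2": 2, "Slave 3": 0}

# Assumed power level before real time zero (the text only says such data exists).
PRE_ZERO_POWER = 0


def indicated_trace(trace, delay, fill):
    """Return the trace as the master sees it: delayed by `delay` time units,
    with pre-zero data (`fill`) occupying the first `delay` samples."""
    return [fill] * delay + trace[: len(trace) - delay]


def total_power(traces):
    """Sum the per-slave samples at each time unit."""
    return [sum(samples) for samples in zip(*traces.values())]


actual_total = total_power(ACTUAL)
indicated = {
    name: indicated_trace(trace, LATENCY[name], PRE_ZERO_POWER)
    for name, trace in ACTUAL.items()
}
indicated_total = total_power(indicated)

# Difference between what the master sees and what is actually consumed.
error = [seen - real for seen, real in zip(indicated_total, actual_total)]

if __name__ == "__main__":
    for t, (real, seen, err) in enumerate(zip(actual_total, indicated_total, error)):
        print(f"t={t}: actual={real:3d} indicated={seen:3d} error={err:4d}")
```

With these assumed traces, the actual total reaches its plateau at time unit three, while the indicated total does not reach the same level until time unit seven, so a controller acting on the indicated total would respond to a system state that no longer exists.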
It would be desirable to have a system and method of measurement for a distributed computer system that would overcome the above disadvantages.