1. Field of the Present Invention
The present invention generally relates to the field of microprocessor-based computers and more particularly to a method of synchronizing timestamp registers and performance counters in a multiprocessor system.
2. History of Related Art
A number of high performance superscalar microprocessors such as the PowerPC® 604 processor from IBM Corporation, the Pentium® family of processors from Intel Corporation, and the Sparc® family of processors from Sun Microsystems, Inc., facilitate system performance monitoring by incorporating user accessible timestamp and performance counters (collectively referred to herein as performance monitors). The timestamp register is typically implemented as a dedicated counter that is set to zero following a hardware reset and possibly at other times through user control. After reset, the timestamp register is incremented every processor clock cycle, even if the processor is halted, to provide a facility for calculating the number of clock cycles used to execute a task. Performance counters permit processor performance parameters to be monitored and measured. The information obtained from these counters can then be used for tuning system and compiler performance. Typically, the performance counters support counting of a variety of processor events such as, for example, the number of cache hits, the number of cache misses, the number of instructions issued, and the number of instructions completed. Those familiar with the operation of speculative, out-of-order, superscalar microprocessors will appreciate that the ability to monitor such processor performance characteristics is highly beneficial in evaluating the efficiency with which the processor is executing.
Increasingly, data processing systems are implemented with multiprocessor architectures in which two or more processors are interconnected to increase the performance capability of the system. In a multiprocessor system, an application program or thread may execute certain instructions on a first processor, other instructions on a second processor, and so forth. Monitoring the performance of multiprocessor systems through the use of the timestamp and the performance counter facilities is difficult unless the performance monitors of each processor in the system are synchronized (i.e., simultaneously reset or set to a known value). Synchronization ensures that the timestamp register and performance counters of each processor are measuring events that occur in the same period of time. Unfortunately, conventional multiprocessor systems designed with commercially distributed microprocessors typically lack dedicated hardware facilities for synchronizing the performance counters of the various processors that comprise the system. It is, therefore, highly desirable to implement a solution that synchronizes the performance monitors of each processor in a multiprocessor data processing system without adding significant cost or complexity to the system.
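The measurement problem described above can be illustrated with a minimal sketch. It assumes two timestamp registers that increment at the same rate but were reset at different times; the function names, the reset points, and all cycle counts are hypothetical values chosen for illustration and do not come from any particular processor.

```python
def read_timestamp(global_cycle, reset_cycle):
    """Model a timestamp register: cycles elapsed since this CPU's last reset.

    Both arguments are hypothetical; a real register is read with a
    processor-specific instruction.
    """
    return global_cycle - reset_cycle


def measured_cycles(start_cycle, end_cycle, start_cpu_reset, end_cpu_reset):
    """Elapsed cycles for a task that starts on one CPU and ends on another."""
    start = read_timestamp(start_cycle, start_cpu_reset)
    end = read_timestamp(end_cycle, end_cpu_reset)
    return end - start


# The task truly runs for 500 cycles (global cycles 1000 through 1500).
# Unsynchronized: the second CPU was reset 300 cycles after the first.
unsynchronized = measured_cycles(1000, 1500, start_cpu_reset=0, end_cpu_reset=300)

# Synchronized: both registers were reset at the same moment.
synchronized = measured_cycles(1000, 1500, start_cpu_reset=0, end_cpu_reset=0)

print(unsynchronized, synchronized)
```

In this model the unsynchronized measurement is short by exactly the difference between the two reset points, which is why the registers must share a common starting point before their readings can be compared.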
The problems identified above are in large part addressed by a method, system, and computer readable medium for synchronizing performance monitors in a multiprocessor system as disclosed herein. The system includes a lead processor and at least one slave processor. The method includes informing the slave processor that a synchronization signal is forthcoming and waiting for an acknowledgment indicating that the slave processor is ready to receive the synchronization signal. In response to the slave processor's acknowledgment, the method includes sending the synchronization signal to the slave processor. The lead processor's performance monitors are set when the synchronization signal is sent and the slave processor's performance monitors are set when the synchronization signal is received by the slave processor. In one embodiment, informing the slave processor that a synchronization signal is forthcoming is achieved by issuing a first inter-processor interrupt. In one embodiment, the sending of the synchronization signal is achieved by issuing a second inter-processor interrupt. In one configuration, waiting for the acknowledgment is accomplished by executing a spin loop on the lead processor. In one embodiment, the values set in the lead processor performance monitors are offset from the value set in the slave processor performance monitors, preferably by an offset that is indicative of the delay required for the synchronization signal to propagate from the lead processor to the slave processor.
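The handshake summarized above can be sketched as a small software simulation. This is not the disclosed implementation: threads stand in for processors, `threading.Event` objects stand in for the two inter-processor interrupts and the acknowledgment, and the `Processor` class, the `synchronize` function, and the `propagation_delay` parameter are hypothetical names. The sign of the offset (the slave value being larger than the lead value by the delay) is likewise an assumption, since the text states only that the two values differ by an offset.

```python
import threading
import time


class Processor:
    """Hypothetical model of a CPU with a single timestamp register."""

    def __init__(self, name):
        self.name = name
        self.timestamp = None
        self.prepare = threading.Event()  # stands in for the first IPI
        self.ready = threading.Event()    # stands in for the acknowledgment
        self.sync = threading.Event()     # stands in for the second IPI


def slave_main(cpu, value):
    cpu.prepare.wait()     # told that a synchronization signal is forthcoming
    cpu.ready.set()        # acknowledge readiness to the lead processor
    cpu.sync.wait()        # wait for the synchronization signal itself
    cpu.timestamp = value  # set the monitor when the signal is received


def synchronize(lead, slaves, base=0, propagation_delay=0):
    # Assumption: the slave starts `propagation_delay` cycles ahead of the
    # lead's value to compensate for the signal's travel time.
    threads = [
        threading.Thread(target=slave_main, args=(s, base + propagation_delay))
        for s in slaves
    ]
    for t in threads:
        t.start()
    for s in slaves:
        s.prepare.set()               # first IPI: signal is forthcoming
    for s in slaves:
        while not s.ready.is_set():   # spin loop on the lead processor
            time.sleep(0)             # yield so the simulated slaves can run
    for s in slaves:
        s.sync.set()                  # second IPI: the synchronization signal
    lead.timestamp = base             # lead sets its monitor when sending
    for t in threads:
        t.join()                      # simulation only: wait for the slaves


lead = Processor("cpu0")
slaves = [Processor("cpu1"), Processor("cpu2")]
synchronize(lead, slaves, base=0, propagation_delay=4)
print(lead.timestamp, [s.timestamp for s in slaves])
```

The acknowledgment step matters because it ensures each slave is already waiting when the second signal is sent, so the remaining skew is limited to the signal's propagation delay, which the offset is meant to absorb.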