1. Technical Field
The present invention relates generally to an improved data processing system and, in particular, to a method and system for performance monitoring within a processor in a data processing system.
2. Description of Related Art
In typical computer systems utilizing processors, system developers desire optimization of software execution for more effective system design. Usually, studies are performed to determine system efficiency in a program""s access patterns to memory and interaction with a system""s memory hierarchy. Understanding the memory hierarchy behavior helps in developing algorithms that schedule and/or partition tasks, as well as distribute and structure data for optimizing the system.
Within state-of-the-art processors, facilities are often provided which enable the processor to count occurrences of software-selectable events and to time the execution of processes within an associated data processing system. These facilities are known as the performance monitor of the processor. Performance monitoring is often used to optimize the use of software in a system. A performance monitor is generally regarded as a facility incorporated into a processor to monitor selected characteristics to assist in the debugging and analyzing of systems by determining a machine""s state at a particular point in time. Often, the performance monitor produces information relating to the utilization of a processor""s instruction execution and storage control. For example, the performance monitor can be utilized to provide information regarding the amount of time that has passed between events in a processing system. As another example, software engineers may utilize timing data from the performance monitor to optimize programs by relocating branch instructions and memory accesses. In addition, the performance monitor may be utilized to gather data about the access times to the data processing system""s L1 cache, L2 cache, and main memory. Utilizing this data, system designers may identify performance bottlenecks specific to particular software or hardware environments. The information produced usually guides system designers toward ways of enhancing performance of a given system or of developing improvements in the design of a new system.
Events within the data processing system are counted by one or more counters within the performance monitor. The operation of such counters is managed by control registers, which are comprised of a plurality of bit fields. In general, both control registers and the counters are readable and writable by software. Thus, by writing values to the control register, a user may select the events within the data processing system to be monitored and specify the conditions under which the counters are enabled.
To assist in the analysis of the performance of a microprocessor, it would be helpful to understand the utilization of its internal resources. In particular, it would be helpful to understand the level of usage of internal data queues within the processor. These internal queues, such as the load request queue and the global completion table, i.e. the instruction re-order buffer, can cause a processor to stall when they are full. The most straightforward method for monitoring the utilization of internal queues would be to count the number of entries in the queue on every cycle and send this information to the processor""s performance monitor unit for counting. Such a solution is not easy to implement since it may be difficult to count the number of entries and forward the count in a single cycle.
Therefore, it would be advantageous to have a method and apparatus for unobtrusively monitoring the utilization of internal resources within a processor, particularly the utilization of the processor""s internal data queues.
A method and apparatus for monitoring an internal queue within a processor, such as an instruction completion table or instruction re-order buffer, is presented. The performance monitoring unit of the processor contains multiple counters, and each counter counts occurrences of specified events. An internal queue of the processor may be specified to be monitored. A count of event signals indicating a successful allocation request for an entry in the internal queue is divided by a count of event signals indicating a passage of units of time to obtain the average rate for allocation requests for queue entries in the specified internal queue. A count of event signals indicating an occupation of a specific entry in the internal queue during a unit of time is divided by a count of event signals indicating an allocation of a specific entry in the internal queue to obtain the average time spent in the internal queue. An average number of entries in the internal queue is computed as a product of the average rate for allocation requests for queue entries and the average time spent in the internal queue. An event signal that indicates failure of an allocation request for an entry in the internal queue may be monitored.