1. Technical Field
The present invention relates generally to an improved data processing system and, in particular, to a method and system for monitoring instruction execution within a processor in a data processing system.
2. Description of Related Art
In typical computer systems utilizing processors, system developers desire optimization of software execution for more effective system design. Usually, studies are performed to determine system efficiency in a program""s access patterns to memory and interaction with a system""s memory hierarchy. Understanding the memory hierarchy behavior helps in developing algorithms that schedule and/or partition tasks, as well as distribute and structure data for optimizing the system.
Within state-of-the-art processors, facilities are often provided which enable the processor to count occurrences of software-selectable events and to time the execution of processes within an associated data processing system. These facilities are known as the performance monitor of the processor. Performance monitoring is often used to optimize the use of software in a system. A performance monitor is generally regarded as a facility incorporated into a processor to monitor selected characteristics to assist in the debugging and analyzing of systems by determining a machine""s state at a particular point in time. Often, the performance monitor produces information relating to the utilization of a processor""s instruction execution and storage control. For example, the performance monitor can be utilized to provide information regarding the amount of time that has passed between events in a processing system. As another example, software engineers may utilize timing data from the performance monitor to optimize programs by relocating branch instructions and memory accesses. In addition, the performance monitor may be utilized to gather data about the access times to the data processing system""s L1 cache, L2 cache, and main memory. Utilizing this data, system designers may identify performance bottlenecks specific to particular software or hardware environments. The information produced usually guides system designers toward ways of enhancing performance of a given system or of developing improvements in the design of a new system.
Events within the data processing system are counted by one or more counters within the performance monitor. The operation of such counters is managed by control registers, which are comprised of a plurality of bit fields. In general, both control registers and the counters are readable and writable by software. Thus, by writing values to the control register, a user may select the events within the data processing system to be monitored and specify the conditions under which the counters are enabled.
As one method of monitoring the execution of instructions in a processor, either for monitoring purposes or for debug purposes, a method called instructions sampling has been used. One or more instructions are selected, i.e. sampled, and detailed information about the sampled instruction is collected as the instructions execute. Existing instruction sampling techniques sample an instruction based on the instruction""s location in an internal queue, which lacks the granularity or control necessary for robust monitoring of instruction execution.
Therefore, it would be advantageous to have a method and apparatus for accurately monitoring the execution of instructions within a processor. It would be further advantageous to have a method and apparatus for sampling particular types of instructions for monitoring.
A method and apparatus for selecting an instruction to be monitored within a pipelined processor is presented. One or more pairs of match values stored in control registers are allocated for use in instruction sampling or instruction matching. These pairs, referred to as V0 and V1, are used together to filter instructions for sampling or for instruction matching. During the fetch or decode stage, the instruction word is compared bit by bit to the V0 and V1 pair(s). For each bit in the instruction word, the corresponding bit in V0 and V1 are used to determine if a match exists. If every bit position in the instruction word results in a match, the instruction is eligible for sampling. If any bit position does not match, the instruction is not eligible. In response to a determination that the instruction is eligible for sampling, the execution of the instruction may be monitored.