1. Field of the Invention
The present application relates generally to data processing systems and, in particular, to a method and apparatus for performance monitoring. More particularly, the present application is directed to a computer implemented method, apparatus, and computer usable program code for identifying stall cycles attributable to a given instruction in a group of instructions executing in an instruction pipeline.
2. Description of the Related Art
Performance monitoring of microprocessors includes calculating the average cycles per instruction (CPI), that is, the average number of processor cycles required to complete execution of an instruction. Typically, a reduced instruction set computer (RISC) microprocessor is capable of completing the execution of one or more instructions during a single processor clock cycle.
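By way of illustration only, the average CPI described above can be computed from two counter values. The following is a minimal sketch; the function name `average_cpi` and the sample counter values are assumptions made for this example and do not correspond to any particular performance monitoring unit.

```python
def average_cpi(total_cycles: int, instructions_completed: int) -> float:
    """Return average cycles per instruction over a measurement interval.

    Both arguments are hypothetical counter readings: the number of
    processor cycles elapsed and the number of instructions completed.
    """
    if instructions_completed <= 0:
        raise ValueError("at least one instruction must have completed")
    return total_cycles / instructions_completed


# Hypothetical interval: 1200 cycles elapsed, 800 instructions completed.
cpi = average_cpi(1200, 800)
print(cpi)
```

A CPI below 1.0 would indicate that, on average, more than one instruction completes per cycle.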
An instruction is generally executed in stages or components. The components for completing execution of an instruction typically include fetching the instruction, decoding the instruction, performing the operation, and writing the result of the operation to memory and/or a register. When the result is written to memory and/or a register, the result of performing the operation becomes visible or available to other instructions and processes.
Processor performance can be analyzed by breaking the cycles per instruction into components of execution to determine which parts of the instruction execution are consuming the most processor cycles. In processors that execute instructions out of order or speculatively, it is more convenient or accurate to study the performance of the components of execution after the instruction completes.
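As a hedged illustration of such a breakdown, the sketch below divides per-component cycle counts by the number of completed instructions so that the component values sum to the overall CPI. The component names and counts are hypothetical and chosen only for this example.

```python
from typing import Dict


def cpi_breakdown(cycles_by_component: Dict[str, int],
                  instructions_completed: int) -> Dict[str, float]:
    """Split overall CPI into one CPI contribution per component."""
    if instructions_completed <= 0:
        raise ValueError("at least one instruction must have completed")
    return {name: cycles / instructions_completed
            for name, cycles in cycles_by_component.items()}


# Hypothetical cycle counts for three illustrative components.
cycles = {"completion": 500, "memory_stall": 750, "other_stall": 250}
breakdown = cpi_breakdown(cycles, 1000)

# The component that consumes the most cycles per instruction.
dominant = max(breakdown, key=breakdown.get)
print(breakdown, dominant)
```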
Processor cycles consumed during execution of an instruction or group of instructions without an instruction completing are referred to as stall cycles. Stall accounting is the process of monitoring stall cycles, identifying which instruction is responsible for the stall, and determining a reason for the stall.
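The definition of a stall cycle given above can be sketched as a count over a per-cycle trace. The trace format used here (one integer per cycle giving the number of instructions that completed in that cycle) is an assumption made for illustration.

```python
from typing import Iterable


def count_stall_cycles(completions_per_cycle: Iterable[int]) -> int:
    """Count cycles in which no instruction completed."""
    return sum(1 for completed in completions_per_cycle if completed == 0)


# Hypothetical six-cycle trace: cycles 2, 3, and 5 complete nothing.
trace = [1, 0, 0, 2, 0, 1]
print(count_stall_cycles(trace))
```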
If a user knows which instruction is stalling and a reason for the stall, the user may be able to correct the problem to avoid or reduce the number of stall cycles. For example, if a load instruction is causing excessive stall cycles due to memory access for a needed data value, the number of stall cycles can be reduced by caching the needed data value.
In processors that complete one instruction at a time, stall accounting is fairly straightforward. Any stall that occurs is attributable to the one instruction whose completion it delayed. However, processors that complete groups of instructions in an instruction pipeline, such as the IBM® POWER5®, are more difficult to analyze.
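For the one-instruction-at-a-time case, stall accounting might be sketched as follows. The trace format (one entry per cycle, either `None` for a cycle with no completion or the label of the instruction that completed) and the instruction labels are assumptions for this example only.

```python
from collections import defaultdict
from typing import Dict, List, Optional


def attribute_stalls(completion_trace: List[Optional[str]]) -> Dict[str, int]:
    """Charge each run of stall cycles to the next instruction to complete.

    Stall cycles at the end of the trace, with no later completion,
    are left unattributed in this sketch.
    """
    stalls: Dict[str, int] = defaultdict(int)
    pending = 0  # stall cycles seen since the last completion
    for completed in completion_trace:
        if completed is None:
            pending += 1
        else:
            stalls[completed] += pending
            pending = 0
    return dict(stalls)


# Hypothetical trace: the load waits two cycles, the store waits one.
trace = ["add", None, None, "load", "sub", None, "store"]
print(attribute_stalls(trace))
```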
In an instruction pipeline, multiple instructions in various stages of component execution are being handled in an assembly line fashion by the processor. While the operation of one instruction is being executed by the arithmetic and logic unit (ALU), a next instruction can be loaded to cache and a result of executing another instruction can be written to a register. A group of two or more instructions can be handled at various stages of completion at the same time. Execution of the group of instructions is not complete until every instruction in the group is complete. If completion of the group stalls, the stall cycles could be due to a stall occurring in any one or more of the instructions in the group.
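The rule that a group is not complete until every instruction in it is complete can be illustrated with a short sketch. The instruction labels and finish cycles below are hypothetical.

```python
from typing import Dict, List


def group_completion_cycle(finish_cycles: Dict[str, int]) -> int:
    """A group completes when its slowest instruction finishes."""
    return max(finish_cycles.values())


def last_to_finish(finish_cycles: Dict[str, int]) -> List[str]:
    """Return every instruction that finishes in the group's final cycle."""
    final = group_completion_cycle(finish_cycles)
    return [name for name, cycle in finish_cycles.items() if cycle == final]


# Hypothetical group of three instructions and the cycle each finishes in.
finish = {"i0": 3, "i1": 7, "i2": 4}
print(group_completion_cycle(finish), last_to_finish(finish))
```

More than one instruction may finish in the final cycle, which is one reason a group stall cannot always be traced to a single instruction.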
There may not be a single reason that completion of the group of instructions stalled, because each instruction can have its own reason for stalling. However, among the individual instruction stalls, there may be one that blocks the entire group. For example, a stall occurring in the last instruction to complete stalls completion of the entire group.
Currently, performance monitoring identifies the source of the completion delay for the last instruction to complete in a group of instructions and attributes this source as the reason for the entire group stalling. This method is useful for analysis but may not accurately describe the completion delays encountered by the group of instructions. Moreover, current methods that attribute the entire completion delay encountered by a group of instructions to the last known delay can result in misleading stall accounting if one or more instructions in the group are dependent on completion of another instruction in the group.
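The following sketch illustrates, under stated assumptions, how attributing a group stall to the last instruction to complete can mislead when that instruction depends on another instruction in the group. The `Instr` record, its field names, and the stall reasons are hypothetical and are not taken from any actual performance monitor.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Instr:
    name: str
    finish_cycle: int
    stall_reason: str
    depends_on: Optional[str] = None  # name of an instruction in the same group


def last_finisher(group: List[Instr]) -> Instr:
    """Return the instruction that finishes last in the group."""
    return max(group, key=lambda instr: instr.finish_cycle)


def attribute_to_last(group: List[Instr]) -> Tuple[str, str]:
    """Charge the whole group stall to the last finisher's own reason."""
    last = last_finisher(group)
    return last.name, last.stall_reason


# Hypothetical group: the add finishes last only because it waits on the load.
group = [
    Instr("load", 9, "data cache miss"),
    Instr("add", 10, "waiting for operand", depends_on="load"),
    Instr("store", 4, "none"),
]
print(attribute_to_last(group))
print(last_finisher(group).depends_on)
```

In this example the stall is charged to the add and its operand wait, even though the add was itself waiting on the load, whose cache miss goes unreported.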