1. Technical Field
This invention relates in general to data processing system performance and, more particularly, for a next-to-complete instruction group during a pipeline stall in an out-of-order processor, to identifying the finish of an oldest, previously unfinished instruction in the next-to-complete instruction group, aligning finish reports that provide completion reasons from functional units with oldest instruction finish indicators, and determining each stall reason for each oldest instruction from the aligned finish reports.
2. Description of the Related Art
An instruction is generally executed in stages or components within a processor or processors. The components for completing execution of an instruction may perform functions including fetching the instruction, decoding the instruction, dispatching the instruction, issuing the instruction to an appropriate functional unit, executing the instruction, and writing the result of the operation to memory and registers. When the result is written to memory and registers, the result of performing the operation becomes visible or available to other instructions and processes.
Data processing systems, and in particular processors within data processing systems, frequently experience stalls, which include any events that delay the completion of one or more instructions by the components by a clock cycle or more. Stalls may occur for multiple reasons including, but not limited to, branch mispredictions, delays in accessing data due to cache misses, and high latency instructions, such as floating point operations.
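As an illustrative sketch only, the following Python fragment counts stall cycles per reason from a per-cycle trace. The trace format, the rule that a cycle with zero completed instructions counts as a stall cycle, and the names `count_stall_cycles`, `cache_miss`, and `branch_mispredict` are assumptions made for this example and are not taken from any particular processor.

```python
from collections import Counter

def count_stall_cycles(trace):
    """Count stall cycles per reason.

    `trace` is a hypothetical list of (completed_count, reason) tuples,
    one per clock cycle. A cycle is treated as a stall cycle when no
    instruction completed in it (an assumption of this sketch).
    """
    stalls = Counter()
    for completed, reason in trace:
        if completed == 0:
            # Cycles with no recorded reason are grouped under "unknown".
            stalls[reason or "unknown"] += 1
    return stalls

# Hypothetical six-cycle trace.
trace = [
    (2, None),                 # two instructions completed, no stall
    (0, "cache_miss"),         # stalled on a cache miss
    (0, "cache_miss"),
    (1, None),
    (0, "branch_mispredict"),  # stalled on a branch misprediction
    (0, None),                 # stalled, reason not recorded
]
stalls = count_stall_cycles(trace)
```

In this sketch, a stall is only attributed correctly when the reason recorded for a cycle is reliable, which is the difficulty the following paragraphs describe.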
Stalls reduce the overall performance of processors, and thus the overall performance of data processing systems. A significant number of stalls may seriously degrade processor performance. In processors that execute instructions out of order or speculatively, it is typically more convenient and accurate to study the performance of components after the instruction completes. By studying the performance of the components and identifying reasons for stalls, a user may attempt to make adjustments to correct a problem or reduce the number of stall cycles in a particular processor.
Determining the exact cause of instruction completion stalls after the instruction completes, however, increases in difficulty as the number of types of causes of instruction stalls increases, as the number of processors simultaneously handling instructions increases, and when processors execute instructions in an out-of-order manner.
In addition, determining the exact cause of instruction completion stalls after the instruction completes increases in difficulty when groups of instructions are processed in an out-of-order manner, but completed together, such as in a processor that completes groups of instructions in an instruction pipeline. Execution of a group of instructions in an instruction pipeline is not complete until every instruction in the group is complete. If completion of the group stalls, the stall cycles could be due to a delay occurring in any one or more of the instructions in the group, each in various stages of component execution. In addition, due to dependencies between instructions within the group, a stall in one instruction may block completion of dependent instructions, where a dependent instruction does not cause the stall but its completion is delayed because of the stall.
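The group completion behavior described above can be illustrated with a minimal Python sketch. It assumes a simplified model in which each instruction has a fixed latency, may start only after the instructions it depends on have finished, and is otherwise free to execute out of order. The `Instr` class, the latencies, and the function names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Instr:
    name: str
    latency: int                               # cycles needed once it can start
    deps: list = field(default_factory=list)   # names of instructions it waits on

def finish_cycles(group):
    """Return {name: cycle in which the instruction finishes}.

    Assumes `group` is listed in program order and that dependencies
    refer only to earlier instructions in the same group.
    """
    finish = {}
    for ins in group:
        # An instruction starts when its last dependency has finished.
        start = max((finish[d] for d in ins.deps), default=0)
        finish[ins.name] = start + ins.latency
    return finish

def group_completion_cycle(group):
    # The group completes only when every instruction in it has finished.
    return max(finish_cycles(group).values())

group = [
    Instr("load", latency=12),               # e.g. delayed by a cache miss
    Instr("add", latency=1, deps=["load"]),  # dependent on the load
    Instr("mul", latency=3),                 # independent, finishes early
]
finish = finish_cycles(group)
last_to_finish = max(finish, key=finish.get)
```

In this example the `add` is the last instruction to finish, yet the delay originates in the `load` on which it depends. Observing only which instruction finished last would therefore misattribute the cause of the stall.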