1. Technical Field
The present invention relates generally to an improved data processing system and, in particular, to a method and system for improving performance of the processor in a data processing system. Still more particularly, the present invention relates to a method, apparatus, and computer instructions for local code reorganization using branch count per instruction hardware.
2. Description of Related Art
In a computer system, branch prediction is a technique used to guess whether a conditional branch will be taken. If a conditional branch is predicted to be taken, the processor prefetches code for the branch instruction from the appropriate location. Speculative execution takes advantage of branch prediction by executing instructions before the processor is certain that they are in the correct execution path. For example, if a branch is taken more than 90 percent of the time, it is predicted to be taken, and the processor prefetches the code before reaching the branch instruction.
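As an illustration only, the 90 percent example above can be sketched in Python as a threshold test on observed branch outcomes. The names `BranchStats` and `predict_taken`, and the default threshold value, are hypothetical and are introduced here solely for explanation; they do not describe any particular processor or prior system.

```python
from dataclasses import dataclass


@dataclass
class BranchStats:
    """Observed outcomes for one conditional branch (hypothetical structure)."""
    taken: int = 0
    not_taken: int = 0

    def record(self, was_taken: bool) -> None:
        # Count one execution of the branch.
        if was_taken:
            self.taken += 1
        else:
            self.not_taken += 1

    def taken_ratio(self) -> float:
        # Fraction of executions in which the branch was taken;
        # 0.0 when the branch has not been observed yet.
        total = self.taken + self.not_taken
        return self.taken / total if total else 0.0


def predict_taken(stats: BranchStats, threshold: float = 0.90) -> bool:
    """Predict 'taken' when the taken ratio exceeds the assumed threshold."""
    return stats.taken_ratio() > threshold
```

Under these assumptions, a branch observed as taken 95 times out of 100 would be predicted taken, while a branch with no recorded history would default to not taken.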
A branch instruction may be conditional or unconditional. A conditional branch instruction causes execution to branch or jump to another location in the code if a specified condition is satisfied. If the condition is not satisfied, the next instruction in sequential order is fetched and executed.
A special fetch/decode unit in a processor uses a branch prediction algorithm to predict the direction and outcome of the instructions being executed through multiple levels of branches, calls, and returns. Branch prediction enables the processor to keep the instruction pipeline full while running at high speed. In conventional computer systems, branch prediction is based on branch prediction software that uses branch statistics and other data to minimize stalls caused by delays in fetching instructions from nonsequential memory locations.
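The text above does not specify which branch prediction algorithm is used. For illustration only, the following Python sketch shows one widely taught scheme, the two-bit saturating counter, in which a prediction changes only after two consecutive mispredictions from a strong state. The class name `TwoBitPredictor` and the initial state are assumptions made for this sketch.

```python
class TwoBitPredictor:
    """Two-bit saturating counter: states 0-1 predict not taken, 2-3 predict taken."""

    def __init__(self, state: int = 1) -> None:
        # Assumed initial state: 1 ("weakly not taken").
        self.state = state

    def predict(self) -> bool:
        # True means the branch is predicted to be taken.
        return self.state >= 2

    def update(self, was_taken: bool) -> None:
        # Move toward 3 on a taken branch and toward 0 otherwise,
        # saturating at both ends.
        if was_taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)
```

In use, `predict()` would be consulted when the branch is fetched and `update()` called once the actual outcome is known.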
In some cases, the code of a program can be locally reorganized to improve performance. Whether such local code reorganization is advantageous is typically determined from software-generated statistics. However, generating these statistics in software consumes resources that may in some cases be better allocated to other tasks, while hardware resources that may be present go unused, resulting in an inefficient use of overall resources.
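By way of a hedged example, one simple form of statistics-driven local reorganization is to identify conditional branches that are taken more often than not, so that the frequently executed path could be placed on the sequential (fall-through) side. The Python sketch below assumes that per-branch taken and not-taken counts are already available; the function name `branches_to_invert` and the sample branch labels are hypothetical.

```python
from typing import Dict, List, Tuple


def branches_to_invert(counts: Dict[str, Tuple[int, int]]) -> List[str]:
    """Return branches whose taken count exceeds their not-taken count.

    `counts` maps a branch label to (taken, not_taken). For each branch
    returned, inverting the condition and swapping the two successor
    blocks would make the more frequent path the fall-through path.
    """
    return [
        label
        for label, (taken, not_taken) in counts.items()
        if taken > not_taken
    ]


# Hypothetical sample data: (taken, not_taken) per branch.
sample_counts = {
    "loop_exit": (5, 995),
    "error_check": (900, 100),
}
candidates = branches_to_invert(sample_counts)
```

With this sample data, only `error_check` is selected, because its taken path dominates; `loop_exit` already falls through on its common path and is left unchanged.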
Therefore, it would be advantageous to have an improved method, apparatus, and computer instructions for providing branch count per instruction statistics that allow a program to autonomically perform local code reorganization, so that processor performance may be optimized.