1. Field of the Invention
The invention pertains generally to computer processing. In particular, it pertains to making predictions of processing steps using global predictors.
2. Description of the Related Art
One of the techniques used to increase overall procesor speed is the well-known use of instruction execution pipelines. Since the execution of each instruction is actually a series of sequentially executed sub-steps, an assembly line process can be used. Instructions are fed into the input of an execution pipeline with a series of stages. Each stage performs one of the sub-steps and then passes the instruction to the next stage for performing the next sub-step. A simplified example would be: first stagexe2x80x94decode the instruction type, second stagexe2x80x94retrieve relevant data, third stagexe2x80x94operate on the data, fourth stagexe2x80x94store the result. Most actual pipelines have many more stages than this, for example twenty stages, with each stage performing a simpler task than those just recited. Multiple instructions can be in the pipeline at the same time, each undergoing a different sub-step in a different stage at the same time, so overall throughput is much higher than if each instruction had to be completed before the next could be started.
If an instruction is a conditional branch instruction, the selection of instructions hat should immediately follow it in the pipeline depends on whether or not the branch is taken. The fact that an instruction is a branch instruction is determined early in the pipeline, but whether the branch will be taken is not determined until the later stages of the pipeline, after several of the subsequent instructions have already been loaded into the pipeline. Because of this, microprocessor architectures incorporating instruction execution pipelines usually include some methodology for predicting the outcome of branch instructions so that the program instructions that are predicted to follow the branch instruction can be loaded into the pipeline. If the prediction is wrong, the instructions behind the branch instruction must be flushed from the pipeline and the correct instructions then loaded, thereby causing a delay. Thus, the accuracy of the predictive method is important to the performance of the system.
One conventional predictive approach is to look at a predetermined number (N) of branches immediately preceding the current branch, and then use the pattern of branching within that group as an index into a predictive logic array that actually predicts the branch status (branch or don""t branch) of the branch instruction. For example, a global history array may be used to keep an ongoing track of the behavior of an N number of branches preceding each branch to be predicted. The contents of the history array can then be used as an index into a prediction array, with each location in the prediction array containing prediction logic for the specific branch being indexed. To make each index unique to a specific branch and also history-dependent, each branch""s history index can be operated on by a unique parameter such as the branch""s instruction pointer, to produce the index for the next branch. Thus, the history index for a given branch instruction is based partly on the branching path leading up to that branch instruction. If the branch history generation logic misses a particular branch, all the following branch prediction indexes will be corrupted, and the global history generation process must be restarted.
The difficulty of using this prediction method with an instruction execution pipeline is shown in FIG. 1. FIG. 1 shows an instruction execution pipeline 10 with ten stages labeled S1-S10. At stage S2, the type of instruction (such as a conditional branch instruction) is identified. If the instruction is a branch instruction, at stage S3 the parameter for predicting the branch status is extracted. At stage S4, the prediction of branch status is actually made. At stage S8, the actual branch status of the branch instruction can be identified, and the possibility of a pipeline flush becomes evident.
Ten instructions I1-I10 have been sequentially loaded into the pipeline at input 8, and are progressing through the pipeline from left to right. In the example shown, instructions I3, I5, I8 and I9 are branch instructions labeled BR1, BR2, BR3 and BR4, respectively. As soon as BR4 enters the pipeline at stage 1, its branching status needs to be predicted so the instructions following BR4 can be determined and loaded into the pipeline immediately after it. If an accurate prediction is to be made for BR4, two conventional approaches are typically used: 1) Wait until BR4 reaches stage 4 to make a prediction. This reduces the effectiveness of making a prediction, since stages S1-S3 will be initially filled without the benefit of the prediction, and a pipeline flush of these stages is likely. 2) Maintain information on every branch needed to identify the branch and create its global history, so that the global histories of BR2 and BR3 can be generated and used to generate a global history for BR4 in pipeline stage S1. This requires extensive circuitry for the prediction logic, since data must be kept for every branch instruction, and if information for BR2 and BR3 is missing due to cache capacity issues or for any other reason, BR4 gets a corrupted global history that negatively impacts performance.
The invention includes a method of providing at least three elements, including first and last elements, with each element having an associated parameter, and further providing an index for the first element. For the first sequential execution of the elements, a first operation is performed on the first index and at least one of the parameters to produce a transform. The transform is then saved. For the second sequential execution of the elements, a second operation is performed on the transform to produce a last index, which is associated with the last element.