1. Technical Field
The invention disclosed broadly relates to digital computer processing systems and more particularly to pipelined data processing systems including branch prediction.
2. Background Art
Data processing systems generally include a central processor, associated storage systems and peripheral devices and interfaces. Typically the main memory consists of relatively low cost, high capacity, digital storage devices. The peripheral devices may be, for example, nonvolatile, semi-permanent storage media such as magnetic disks and magnetic tape drives. In order to carry out tasks, the central processor of such a system executes a succession of instructions which operate on the data. The succession of instructions and the data those instructions reference are referred to as a program.
In the operation of such systems, programs are initially brought to an intermediate storage area, usually in the main memory. The central processor may then interface directly to the main memory to execute the stored program. However, this procedure places limitations on performance, due principally to the relatively long times required to access the main memory. To overcome these limitations, a high speed storage system, in some cases called a cache, is used to hold currently used portions of the program within the central processor itself. The cache interfaces with the main memory through memory control hardware which handles program transfers between the central processor, the main memory and the peripheral device interfaces.
One form of computer has been developed in the prior art to concurrently process a succession of instructions in a so-called pipeline manner. In such pipeline processors, each instruction is executed in part at each of a succession of stages. After the instruction has been processed at each of the stages, the execution is complete. With this configuration, an instruction is passed from one stage to the next, and is replaced in the stage it leaves by the next instruction in the program. Thus, the stages together form a pipeline which, at any given time, is executing in part a succession of instructions. Such instruction pipelines, processing a plurality of instructions in parallel, are found in several digital computing systems. These processors consist of a single pipeline of varying length and employ hardwired logic for all data manipulation. The large quantity of control logic such machines require to handle, for example, conditional branch instructions makes them extremely fast, but also very expensive.
The present invention relates to branch prediction mechanisms for handling conditional branch instructions in a computer system. When a branch instruction is encountered, it is wasteful of computer resources to wait for resolution of the instruction before proceeding with the next programming step. Therefore, it is a known advantage to provide a prediction mechanism to predict in advance the instruction to be taken as a result of a conditional branch. If the prediction is correct, the computer system can proceed without a delay in processing time. If the prediction is incorrect, there is a time penalty. Therefore, an object of the present invention is to provide an improved branch prediction mechanism with a high prediction accuracy to minimize the time loss caused by incorrect predictions.
In most pipeline processors, conditional branch instructions are resolved in the execution unit. Hence, there are several cycles of delay between the decoding of a conditional branch instruction and its execution. In an attempt to overcome the potential loss of these cycles, the decoder guesses as to which instructions to decode next. Many pipeline processors classify branches according to an instruction field. When a branch is decoded, the outcome of the branch is predicted, based on its class.
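By way of illustration only, class-based prediction of the kind described above may be sketched as a fixed lookup from a branch class to a guess. The class names and the taken/not-taken choices below are hypothetical and are not drawn from any particular instruction set or prior art processor.

```python
# Hypothetical branch classes and their fixed guesses. Both the class
# names and the guesses are illustrative assumptions.
STATIC_GUESS = {
    "loop_close": True,       # branch closing a loop: guess taken
    "overflow_check": False,  # rarely true error test: guess not taken
    "general_cond": False,    # everything else: guess not taken
}


def predict_by_class(branch_class: str) -> bool:
    """Return the fixed guess for a branch class (True means taken)."""
    return STATIC_GUESS[branch_class]


def class_accuracy(trace) -> float:
    """Fraction of (branch_class, actually_taken) pairs guessed correctly."""
    hits = sum(predict_by_class(cls) == taken for cls, taken in trace)
    return hits / len(trace)
```

Because the guess depends only on the class, every branch of a given class receives the same prediction regardless of how that particular branch has behaved in the past.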
An example of a prior art branch prediction scheme is disclosed in U.S. Pat. No. 4,477,872 to Losq, et al., which patent is assigned to the assignee of the present invention. The method disclosed predicts the outcome of a conditional branch instruction based on the previous performance of the branch, rather than on the instruction fields. The prediction of the outcome of a conditional branch is performed utilizing a table which records a history of the outcome of the branch at a given memory location. The disclosed method predicts only the branch outcomes and not the address targets for prefetching an instruction. The present invention is related to patent application Ser. No. 07/783,060 entitled "Synchronizing a Prediction RAM," assigned to the assignee of the present invention and filed Oct. 25, 1991, the teachings of which are herein incorporated by reference. Disclosed therein is a high speed, pipelined CPU which breaks large execution flows into stages to allow a dramatic improvement in the system latency between registers. The multitude of stages allows better observability for testing and debugging of the overall system.
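By way of illustration only, a history table that predicts outcomes from the previous performance of a branch may be sketched as follows. This sketch is not the mechanism of the Losq patent or of the present invention. The two-bit saturating counter, the table size, the modulo indexing and all names are assumptions chosen for brevity. Consistent with the limitation noted above, the table yields only a taken/not-taken outcome and no target address.

```python
class BranchHistoryTable:
    """Outcome-only predictor indexed by branch address (illustrative)."""

    def __init__(self, entries: int = 1024, initial: int = 1):
        # Each entry is a counter in 0..3; values 2 and 3 predict "taken".
        # The counter width and initial value are assumptions.
        self.entries = entries
        self.counters = [initial] * entries

    def _index(self, address: int) -> int:
        # Assumed mapping: low-order part of the branch address.
        # Distinct branches may alias to the same entry.
        return address % self.entries

    def predict(self, address: int) -> bool:
        """Predict the outcome of the branch at this address."""
        return self.counters[self._index(address)] >= 2

    def update(self, address: int, taken: bool) -> None:
        """Record the resolved outcome, saturating at 0 and 3."""
        i = self._index(address)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)
```

In use, the decoder would call predict when a conditional branch is decoded and update once the execution unit resolves it. With the assumed two-bit counter, a single atypical outcome does not immediately reverse a well established prediction.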
The performance enhancement of the pipeline processor is dependent on the degree to which each stage of the pipeline is kept busy processing its instructions and passing the results on to the next stage. In an ideal environment, each instruction would pass through a new stage every clock cycle. With this assumption, one instruction would complete every clock cycle once the start-up latency has filled the pipeline, so that the effective instruction execution time would be equal to the clock cycle time. A serious degradation of this performance improvement can result when branch instructions cause the pipeline to be flushed and restarted with a new instruction stream. It is desirable to know the result of a conditional branch instruction when instructions are being fetched. Unfortunately, this is not always possible, because conditional branches are often dependent on the instruction immediately preceding them in the pipeline.
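By way of illustration only, the effect of pipeline flushes on execution time may be estimated with a simple cycle count. The model below is a sketch under stated assumptions: one instruction enters the pipeline per cycle, each mispredicted branch costs a fixed number of lost cycles, and no other stalls occur. The function and parameter names are hypothetical.

```python
def pipeline_cycles(n_instructions: int, n_stages: int,
                    n_branches: int = 0, accuracy: float = 1.0,
                    flush_penalty: int = 0) -> float:
    """Estimated cycles to run a program on a simple pipeline.

    Assumed model: (n_stages - 1) cycles of start-up latency, then one
    instruction completes per cycle, plus flush_penalty lost cycles for
    each mispredicted branch (expected count, so the result is a float).
    """
    fill = n_stages - 1
    mispredicted = n_branches * (1.0 - accuracy)
    return fill + n_instructions + mispredicted * flush_penalty


# Ideal case: 100 instructions on a 5-stage pipeline take 4 + 100 cycles.
ideal = pipeline_cycles(100, 5)

# 20 branches, half mispredicted, 3 cycles lost per flush: 104 + 10 * 3.
degraded = pipeline_cycles(100, 5, n_branches=20, accuracy=0.5,
                           flush_penalty=3)
```

Under these assumptions the ideal case takes 104 cycles and the degraded case takes 134 cycles, which shows how the time lost scales with both the misprediction rate and the flush penalty.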