1. Field of the Invention
The present invention generally relates to general purpose digital data processing systems and more particularly relates to such systems which employ pipelined execution of program instructions.
2. Description of the Prior Art
In most general purpose, stored program, digital computers, software is developed under the assumption that program instructions are executed in their entirety in a sequential fashion. This frees the software developer from the need to account for potential problems associated with initial processing of an instruction before completion of the preceding instruction. However, most large scale modern machines are designed to take advantage of the overlapping of various functions. In its simplest form, such overlapping permits instruction processing of the N+1st instruction to be performed during operand processing of the Nth instruction. U.S. Pat. No. 4,890,225 issued to Ellis, Jr. et al. shows a rudimentary overlapped machine. To free the software developer from concerns about non-sequentiality, Ellis Jr. et al. store the machine state during the complete execution of the Nth instruction. U.S. Pat. No. 4,924,376 issued to Ooi provides a technique for resource allocation in an overlapped environment.
A more general form of overlapping is termed a pipelined environment. In implementing such a machine, the designer dedicates certain hardware resources to the various repetitive tasks. The performance advantage in this dedication comes from employing these dedicated hardware elements simultaneously. Typically, this means that instruction decode, operand fetch, and arithmetic operations each have separate and dedicated hardware resources. Even though the Nth instruction is processed by each of these hardware resources sequentially, each separate hardware resource is deployed on a different instruction simultaneously. The N+1st instruction may be processed by the instruction fetch and decode hardware, while the Nth instruction is being processed by the operand fetch hardware and while the N-1st instruction is being processed by the arithmetic hardware. U.S. Pat. No. 4,855,904 issued to Daberkow, et al. describes a pipelined architecture.
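The simultaneous use of dedicated stages described above can be illustrated with a minimal sketch. It assumes an idealized three-stage pipeline in which every stage takes exactly one clock cycle and no stalls occur; the stage names and the function `pipeline_schedule` are hypothetical and are not taken from any of the cited patents.

```python
# Idealized three-stage pipeline: each stage takes exactly one clock cycle (assumption).
STAGES = ["fetch/decode", "operand fetch", "arithmetic"]


def pipeline_schedule(num_instructions, stages=STAGES):
    """Return one dict per clock cycle mapping each stage to the index of the
    instruction occupying it, or None when the stage is idle."""
    total_cycles = num_instructions + len(stages) - 1
    schedule = []
    for cycle in range(total_cycles):
        row = {}
        for depth, stage in enumerate(stages):
            # An instruction enters stage `depth` that many cycles after it is fetched.
            instr = cycle - depth
            row[stage] = instr if 0 <= instr < num_instructions else None
        schedule.append(row)
    return schedule


if __name__ == "__main__":
    for cycle, row in enumerate(pipeline_schedule(4)):
        print(cycle, row)
```

Under these assumptions, four instructions complete in six cycles rather than the twelve that strictly sequential use of the three stages would require, and in the third cycle all three stages are busy with different instructions.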
The most common problem encountered in an overlapped or pipelined machine occurs with branching instructions. If the system assumes that the instruction to be executed immediately after the Nth instruction is located at the next sequential address, all branches pose a potential problem, because the instruction already being prefetched may not be the one that is actually executed next. U.S. Pat. No. 4,604,691 issued to Akagi discusses a system in which instructions are prefetched to provide the performance advantage of overlapped operation. The system of Akagi attempts to predict whether a particular branch will be taken to prevent the performance degradation associated with prefetching of an incorrect instruction.
It becomes even more beneficial to properly accommodate branching instructions within a pipelined environment as indicated in U.S. Pat. No. 4,860,199 issued to Langendorf et al. In U.S. Pat. No. 4,916,602 issued to Itoh, the address of the next sequential microcode instruction is computed. If the previous instruction is then determined to indicate a branch, the branch target address is substituted for the next sequential address. U.S. Pat. No. 4,390,946 issued to Lane describes a pipelined system in which it is assumed that a branch will not be taken. If the branch actually occurs, the system is "depiped" for one clock cycle, thereby degrading performance. A similar performance degradation occurs for a conditional skip instruction as discussed in U.S. Pat. No. 4,926,312 issued to Nukiyama. In this system, if the conditional skip is taken, the next sequential instruction is invalidated after having been fetched and a no-op is executed instead. This corresponds to depiping the system for the period of time corresponding to the execution of the no-op instruction. A portion of this problem is mitigated by duplicating some of the hardware used for addressing as suggested by U.S. Pat. No. 4,827,402 issued to Wada.
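The cost of assuming that a branch will not be taken can be estimated with a short sketch. It assumes an idealized pipeline in which each stage takes one clock cycle and in which every taken branch costs a fixed number of lost ("depiped") cycles; the function name, the default pipeline depth of three, and the default penalty of one cycle are illustrative assumptions rather than figures from the cited patents.

```python
def cycles_predict_not_taken(branch_outcomes, num_instructions,
                             pipeline_depth=3, penalty=1):
    """Estimate total clock cycles when every branch is assumed not taken.

    branch_outcomes: one bool per executed branch, True if the branch was taken.
    penalty: cycles lost each time the assumption is wrong (hypothetical value).
    """
    # Cycles for an ideal, stall-free pipeline to drain all instructions.
    base = num_instructions + pipeline_depth - 1
    # Each taken branch invalidates the sequentially fetched instruction.
    mispredictions = sum(1 for taken in branch_outcomes if taken)
    return base + penalty * mispredictions


if __name__ == "__main__":
    print(cycles_predict_not_taken([True, False, True], num_instructions=10))
```

With ten instructions, of which two are taken branches, this model yields two cycles more than the stall-free case; the penalty grows linearly with the number of taken branches, which is the degradation the prediction techniques discussed here attempt to avoid.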
One manner of preventing the performance degradation caused by branching is to force the instruction processor not to branch. U.S. Pat. No. 4,870,573 issued to Kawata et al. uses this method for test purposes. U.S. Pat. No. 4,831,517 issued to Crouse et al. describes the unique problem associated with branches which return on address (i.e. return jumps). Optimization of branch conditions in loop processing is discussed in U.S. Pat. No. 4,910,664 issued to Arizono.
Both the jump target address and corresponding target instruction are stored within the instruction cache of the system described in U.S. Pat. No. 4,847,753 issued to Matsuo et al. Target instructions are also saved in the instruction processor of U.S. Pat. No. 4,926,323 issued to Baror et al. U.S. Pat. No. 4,942,520 issued to Langendorf shows a technique for accessing target branch instructions using an index to an instruction cache memory. Through the use of an associative memory, U.S. Pat. No. 4,912,635 issued to Nishimukai et al. simplifies the target instruction addressing. The system of U.S. Pat. No. 4,894,772 issued to Langendorf also uses an associative memory.
A branch history table is used along with an associative memory in U.S. Pat. No. 4,764,861 issued to Shibuya; U.S. Pat. No. 4,984,154 issued to Hanatani et al.; and U.S. Pat. No. 4,853,840 issued to Shibuya. U.S. Pat. No. 4,477,872 issued to Losq et al. employs a decode time history table for predicting whether a conditional branch will or will not be taken. Another type of branch prediction circuit is shown in U.S. Pat. No. 4,370,711 issued to Smith. The branch history table is expanded to include both active and passive sections in U.S. Pat. No. 4,679,141 issued to Pomerene et al.
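The general idea of a branch history table can be sketched as a small associative store keyed by branch instruction address. The sketch below is a generic illustration only: the class name, the last-outcome prediction policy, the first-in-first-out replacement, and the default capacity are all assumptions and do not reproduce the design of any patent cited above.

```python
class BranchHistoryTable:
    """Toy branch history table: remembers the last outcome of each branch."""

    def __init__(self, capacity=4):
        self.capacity = capacity  # hypothetical table size
        self.entries = {}         # branch address -> (taken, target)

    def predict(self, branch_address):
        """Return (predicted_taken, predicted_target).

        A branch with no recorded history is predicted not taken."""
        return self.entries.get(branch_address, (False, None))

    def update(self, branch_address, taken, target):
        """Record the actual outcome once the branch has been resolved."""
        if branch_address not in self.entries and len(self.entries) >= self.capacity:
            # Evict the oldest entry; dicts preserve insertion order.
            oldest = next(iter(self.entries))
            del self.entries[oldest]
        self.entries[branch_address] = (taken, target if taken else None)


if __name__ == "__main__":
    bht = BranchHistoryTable()
    bht.update(0x100, True, 0x200)
    print(bht.predict(0x100))
```

Keying the table by branch address mimics an associative lookup: the fetch logic can ask for a prediction using only the address of the branch, before the branch condition has been evaluated.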
There are a number of techniques employed in the attempt to improve the accuracy of prediction for conditional branches. U.S. Pat. No. 4,763,245 issued to Emma et al. examines the operand to improve prediction accuracy. Operands are similarly used for branch prediction in U.S. Pat. No. 4,914,579 issued to Putrino et al. In U.S. Pat. No. 4,755,966 issued to Lee et al., a larger displacement of the potential branch, as determined from the instruction operand, is assumed to represent less likelihood of taking a conditional branch. U.S. Pat. No. 4,777,594 issued to Jones et al. monitors instruction flow to assist in conditional branch prediction. The branch prediction table is accessed using information from the instruction prior to the conditional branch instruction in U.S. Pat. No. 4,858,104 issued to Matsuo et al.
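The displacement-based approach mentioned above can be illustrated with a simple static rule. The sketch assumes only what the text states, namely that a larger displacement is treated as making a conditional branch less likely to be taken; the function name and the threshold value are hypothetical and are not drawn from the cited patent.

```python
def predict_taken_by_displacement(displacement, threshold=64):
    """Static prediction: short branches are predicted taken, long ones not taken.

    displacement: signed distance from the branch to its target.
    threshold: hypothetical cutoff; a real design would choose its own value.
    """
    return abs(displacement) <= threshold


if __name__ == "__main__":
    for d in (-8, 40, 512):
        print(d, predict_taken_by_displacement(d))
```

Because the rule depends only on the instruction operand, it needs no stored history and can be applied the first time a branch is encountered.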