A. Field of the Invention
The present invention relates to pipeline processing, and more particularly to a microprocessor with a pipeline circuit that is capable of usurping a waited pipeline bus request.
B. Description of the Prior Art
Pipeline processing is a way of processing information. A pipeline consists of different units that perform tasks on information. Information is worked on by each unit in the pipeline. After a first unit has completed its work on the information, the first unit passes the information to another unit. The work done on the information is not completed until it has passed through all the units in the pipeline.
The advantage of pipelining is that it increases the amount of processing per unit time. This results in instructions being handled in fewer cycles.
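By way of illustration only, the cycle savings can be estimated with the following Python sketch. It assumes, purely for the example, that every unit takes exactly one clock cycle and that the pipeline never stalls; the function names and the workload of 10 instructions through 5 units are hypothetical and do not appear in the figures.

```python
def sequential_cycles(num_instructions, num_units):
    """Cycles needed if each instruction passes through every unit
    before the next instruction is allowed to start."""
    return num_instructions * num_units


def pipelined_cycles(num_instructions, num_units):
    """Cycles needed if a new instruction enters the first unit on every
    cycle (assumes one cycle per unit and no stalls)."""
    if num_instructions == 0:
        return 0
    # The first instruction needs num_units cycles to drain through the
    # pipeline; each later instruction completes one cycle after the last.
    return num_units + (num_instructions - 1)


if __name__ == "__main__":
    # Hypothetical workload: 10 instructions, 5 units.
    print(sequential_cycles(10, 5))  # 50
    print(pipelined_cycles(10, 5))   # 14
```

Under these assumptions the pipelined arrangement completes the same work in 14 cycles rather than 50.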
Although pipelining increases the speed at which instructions are processed, it has problems handling vector or branch instructions. A branch or vector instruction requires a microprocessor to request a sequence of instructions that differs from the instructions that have already been requested. This results in instructions in the pipeline that are no longer needed.
In FIG. 1, an exemplary diagram of a prior art microprocessor 100 using pipeline processing is shown. The Fetch Unit 110 is communicatively connected to the Decode Unit 115, the Vector Request signal 165, the Branch Request signal 170 and the Bus Interface Unit (“BIU”) 135. The Decode Unit 115 is communicatively connected to the Execute Unit 120. The Execute Unit 120 is communicatively connected to the Data Memory Access Unit 125 and the Fetch Unit 110. The Data Memory Access Unit 125 is communicatively connected to the Register Write-Back Unit 130 and a Memory 160. The Register File 105 is communicatively connected to the Fetch Unit 110, Decode Unit 115, Execute Unit 120, and Register Write-Back Unit 130.
The BIU 135 utilizes a Memory Request 140, also referenced as a Fetch Request 140, Address_Size_Control lines 145, an Instruction bus 150 and a Wait line 155 to communicate with the Fetch Unit 110.
The BIU 135 is memory storage used to obtain and hold prefetched instructions. The Fetch Unit 110 requests and fetches instructions from the BIU 135. The Decode Unit 115 decodes the fetched instructions. The Execute Unit 120 executes the decoded instructions. The Data Memory Access Unit 125 accesses Memory 160. The Register Write-Back Unit 130 writes results received from the Data Memory Access Unit 125 into the Register File 105. The Vector Request signal 165 indicates when a vector has occurred. The Branch Request signal 170 indicates when a branch has occurred.
The Microprocessor 100 typically receives instructions (n to n+9, shown in FIG. 2) as inputs. The Fetch Unit 110 requests and grabs instructions from the BIU 135. As described previously, the BIU 135 obtains and stores instructions. The BIU 135 serves to reduce the amount of time the Fetch Unit 110 takes to obtain an instruction. By having the instructions available at the BIU 135, the Fetch Unit 110 does not have to spend additional cycles searching for an instruction.
When the Fetch Unit 110 grabs an instruction, it also requests another instruction. Requesting an instruction before it is needed is known as prefetching. By requesting that the BIU 135 prefetch an instruction, the Fetch Unit 110 can further reduce the amount of time it has to wait to receive an instruction. After the Fetch Unit 110 has requested an instruction, the BIU 135 will either provide the valid instruction or wait the Fetch Unit 110 during subsequent clock periods.
Whenever a requested instruction is not immediately available to the Fetch Unit 110, the BIU 135 waits the Fetch Unit 110 by driving the Wait signal 155 active. This indicates to the Fetch Unit 110 that it needs to wait to receive the requested instruction and to wait before making any additional prefetch requests. However, the Fetch Unit 110 will have made a second request before receiving the Wait signal 155. Therefore, two requests will be made before the Wait signal 155 is sampled as being active by the Fetch Unit 110.
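The reason two requests are outstanding before the Fetch Unit 110 stops requesting may be sketched as follows. This Python model is a simplification under stated assumptions: the Fetch Unit is assumed to sample the Wait signal at the end of each clock cycle, so that the sampled value only affects the following cycle, and the Wait levels supplied to the function are hypothetical. The function name `issue_requests` is likewise an illustrative choice.

```python
def issue_requests(wait_driven):
    """Return (cycle, instruction) pairs for each request issued.

    wait_driven[i] is the level of the Wait signal during cycle i.
    Assumption: the Fetch Unit samples Wait at the end of a cycle, so an
    active Wait only suppresses the request of the *next* cycle.
    """
    issued = []
    sampled_wait = False  # nothing has been sampled before the first cycle
    offset = 0            # offset of the next instruction relative to n
    for cycle, wait in enumerate(wait_driven):
        if not sampled_wait:
            label = "n" if offset == 0 else f"n+{offset}"
            issued.append((cycle, label))
            offset += 1
        sampled_wait = wait  # value seen by the Fetch Unit in the next cycle
    return issued


if __name__ == "__main__":
    # Hypothetical Wait levels: inactive in cycle 0, active in cycles 1-2.
    print(issue_requests([False, True, True, False, False]))
    # [(0, 'n'), (1, 'n+1'), (4, 'n+2')]
```

In this sketch the request for n+1 is issued in the same cycle in which Wait first becomes active, because the Fetch Unit has not yet sampled it; no further request is issued until Wait has been sampled inactive.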
When the Fetch Unit 110 receives instruction n from the BIU 135, the Fetch Unit 110 next requests instruction n+1. At the next clock cycle, if the Wait signal 155 has not been driven active by the BIU 135, n+2 is requested by the Fetch Unit 110. The Fetch Unit 110 receives n+1 and the Decode Unit 115 receives n. This process will continue throughout the Microprocessor 100 until n has passed through each unit and a result is written to the Register File 105.
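The flow described above can be sketched, for illustration only, as a simple shift of instructions from unit to unit on each clock cycle. The Python model below assumes that every unit takes one cycle and that the Wait signal 155 is never driven active; the names `UNITS` and `run_pipeline` are hypothetical.

```python
# Units in pipeline order, named after the units of FIG. 1.
UNITS = ["Fetch", "Decode", "Execute", "Data Memory Access", "Register Write-Back"]


def run_pipeline(instructions):
    """Advance instructions one unit per clock cycle.

    Returns (history, completed): history holds one tuple per cycle showing
    which instruction occupies each unit, and completed lists instructions
    in the order they reach the Register Write-Back unit.
    """
    instructions = list(instructions)
    pending = list(instructions)
    slots = [None] * len(UNITS)
    history = []
    completed = []
    while len(completed) < len(instructions):
        incoming = pending.pop(0) if pending else None
        # Every instruction moves to the next unit; a new one enters Fetch.
        slots = [incoming] + slots[:-1]
        history.append(tuple(slots))
        if slots[-1] is not None:
            completed.append(slots[-1])
    return history, completed


if __name__ == "__main__":
    history, completed = run_pipeline(["n", "n+1", "n+2"])
    for cycle, slots in enumerate(history, start=1):
        print(cycle, dict(zip(UNITS, slots)))
```

In the second cycle of this sketch the Fetch slot holds n+1 while the Decode slot holds n, which corresponds to the hand-off described above.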
If the Wait signal 155 is driven active from the BIU 135 during this process, it will force the Fetch Unit 110 to wait before it receives the requested instruction. This momentarily stops the flow of instructions through all the units.
As described earlier, instructions proceed through the units in the Microprocessor 100. Sometimes an instruction that arrives at the Execute Unit 120 is a branch or vector instruction. As discussed previously, a branch or vector instruction requires the Microprocessor 100 to request a different sequence of instructions. Therefore, any instruction in the pipeline that had been prefetched by the pipeline before the vector or branch instruction occurs is now unneeded.
A problem with pipeline processing is that there is no way to prevent unneeded prefetched instructions from proceeding through the pipeline. These unneeded instructions slow down the processor since they still have to be processed, even though they are unneeded.
FIG. 2 is a timing chart illustrating the processing that occurs in the Microprocessor 100 in the absence of a vector or branch instruction. The clock 205 shows the clock cycles, while the Address, Size, Control signals 145 indicate the associated instruction request information signals. Fetch Request 140 identifies which instruction has been requested by the Fetch Unit 110. Wait 155 indicates when the BIU 135 needs additional time to obtain the instruction. Instruction bus 150 indicates when the valid instruction has been fetched by the Fetch Unit 110.
As can be seen in FIG. 2, each instruction requested by a Fetch Request 140 is fetched on the next clock cycle, except instruction n+4. At n+4, the Wait signal 155 is driven active while the BIU 135 looks for n+4 and n+5. Therefore, although n+4 is requested on clock cycle five, the instruction is not completely received by the Fetch Unit 110 until clock cycle seven. Since instruction n+4 received an active Wait signal 155, Fetch Request 140 for n+5 is also delayed an additional clock period before n+5 can be obtained by the Fetch Unit 110.
In FIG. 3, a timing chart illustrates how an unneeded prefetched instruction is typically handled by prior art Microprocessor 100. Vector Indicated 305 identifies on which clock signal a vector occurred. As was discussed previously, when a vector or branch instruction occurs, new instructions are required from the BIU 135. A previously fetched instruction is no longer needed, since the vector or branch instruction now requires new instructions.
In FIG. 3, instruction n has been requested by the Fetch Unit 110. While the BIU 135 is working on obtaining n, the Fetch Unit 110 requests n+1. The BIU 135 sends a Wait signal 155 to the Fetch Unit 110 to indicate that it is working on n. Therefore, no additional instructions beyond n+1 may be requested. At clock cycle five, n has been fetched and n+1 is still being worked on. However, at clock cycle three, a vector occurred and new instructions will have to be requested from the BIU 135. At clock cycle six, the new instruction V has been requested. Since the vector occurred, instruction n+1 is no longer needed. The next instruction that is needed is V. However, the Microprocessor 100 in FIG. 1 will fetch the unneeded instruction n+1 before it fetches the instruction V. The instruction n+1 slows down the Microprocessor 100 since it has to be processed, even though it is unneeded.
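The cost of the unneeded fetch can be sketched in Python as follows, for illustration only. The model assumes that requests are serviced strictly in the order in which they were made and that a pending request cannot be withdrawn. The per-instruction latencies are hypothetical values chosen for the example and are not taken from FIG. 3, and the function name `deliver_in_order` is an illustrative choice.

```python
def deliver_in_order(requests, latencies, needed):
    """Service requests strictly in order.

    requests  -- instruction labels in the order they were requested
    latencies -- cycles taken to obtain each instruction (hypothetical)
    needed    -- labels of instructions still needed after the vector

    Returns (log, wasted): log is a list of (cycle, instruction) delivery
    times, and wasted is the number of cycles spent obtaining instructions
    that are no longer needed.
    """
    cycle = 0
    log = []
    wasted = 0
    for instr in requests:
        cycle += latencies[instr]
        log.append((cycle, instr))
        if instr not in needed:
            wasted += latencies[instr]
    return log, wasted


if __name__ == "__main__":
    # n and n+1 were requested before the vector; V is requested after it.
    log, wasted = deliver_in_order(
        requests=["n", "n+1", "V"],
        latencies={"n": 3, "n+1": 2, "V": 1},  # assumed values
        needed={"n", "V"},
    )
    print(log)     # [(3, 'n'), (5, 'n+1'), (6, 'V')]
    print(wasted)  # 2
```

With these assumed latencies, V is delivered two cycles later than it would be if n+1 did not have to be obtained first.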
One solution that has been developed to address this problem is speeding up the Execute Unit of a pipeline by obtaining both possible next instructions, one instruction in case there is a branch and one instruction in case there is no branch. This solution, however, requires that both instructions be obtained simultaneously.
Embodiments consistent with the present invention are directed at overcoming one or more of the aforementioned problems.