1. Field of the Invention
The present invention generally relates to computer processing techniques and, in particular, to a superscalar processing system and method that detect write-after-write data hazards while executing instructions and efficiently prevent errors from these hazards by canceling some of the instructions associated with the hazards.
2. Related Art
Parallel processing, sometimes known as superscalar processing, has been developed to reduce the amount of time required to process instructions of a computer program. In parallel processing, at least two pipelines are defined that simultaneously execute instructions. One type of parallel processing is out-of-order processing, in which each pipeline of a processor simultaneously executes different instructions independently of the other pipeline(s).
In out-of-order processing, the instructions are not necessarily input into the pipelines in the same order that they were received by the processor. In addition, it typically takes different amounts of time for different instructions to execute, and it is, therefore, possible for an instruction to be fully executed before another instruction, even though the other instruction was input into its respective pipeline first. Accordingly, instructions are not necessarily executed in the same order that they are received by the pipelines within the processor, and as a result, the complexity required to avoid errors from read-after-write data hazards and write-after-write data hazards, which will be described further below, is relatively large for out-of-order processing.
A “read-after-write data dependency” exists when one instruction to be executed by a processor utilizes, during execution, data retrieved or produced from the execution of another instruction. If the one instruction executes before the other instruction executes, then an error may occur since the one instruction may utilize incorrect data during execution. As a result, to prevent errors, steps should be taken to ensure that the instruction utilizing data retrieved or produced from the execution of another instruction does not execute until the necessary data from execution of the other instruction is available. If a read-after-write data dependency exists and if such steps are not taken, then a “read-after-write data hazard” exists, since the read-after-write data dependency may result in the utilization of incorrect data.
A “write-after-write data hazard” exists when an older instruction, during execution, may write data to the same register or other memory location written to by a younger instruction and incorrectly overwrite valid data written by the younger instruction. An instruction is “younger” than another instruction when it is received by a processor after the other instruction. Conversely, an instruction is “older” than another instruction when it is received by a processor before the other instruction.
As an example of a write-after-write data hazard, assume that a first instruction is a load instruction that retrieves data and writes the retrieved data to a particular register. If the data to be retrieved is not locally available, it may take a relatively long time for the data to be retrieved. Therefore, it is possible for a second instruction (i.e., an instruction younger than the first instruction) to write data to the same register after the first instruction has executed but before the data retrieved by the first instruction is written to the register. In such a case, the data written to the register by the second instruction may be overwritten by the data retrieved by the first instruction. As a result, the register may contain incorrect data, and an error may result when a later instruction uses the data in the register. Therefore, a write-after-write data hazard exists when data produced from the execution of an older instruction may overwrite data produced from the execution of a younger instruction.
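The load example above can be illustrated with a minimal sketch. The tuple layout, the function names, and the time values below are hypothetical and chosen only for illustration; the sketch compares the register contents that result when writes land in completion order against the contents that program order would require.

```python
# Hypothetical model: each write is (issue_order, completion_time, register, value).
def apply_writes(writes, key_index):
    """Apply writes sorted by the chosen field and return the final register contents."""
    regs = {}
    for write in sorted(writes, key=lambda w: w[key_index]):
        _, _, reg, value = write
        regs[reg] = value
    return regs

# The older load (issued first) misses locally and completes at time 10.
# The younger instruction (issued second) completes at time 3.
writes = [
    (0, 10, "r1", "older_load_data"),
    (1, 3, "r1", "younger_result"),
]

actual = apply_writes(writes, key_index=1)    # order in which the writes really land
expected = apply_writes(writes, key_index=0)  # order required by the program

# With no hazard protection, the older load overwrites the younger result.
print(actual["r1"], expected["r1"])
```

In this sketch the register ends up holding the older load's data even though the program requires the younger instruction's result, which is the error a write-after-write data hazard can produce.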
To prevent errors from read-after-write data hazards and from write-after-write data hazards, most out-of-order parallel processors employ a control mechanism. In this regard, during the execution of each instruction, the control mechanism determines whether an instruction being processed (referred to hereafter as the “pending instruction”) requires data produced by the execution of an older instruction. If so, the control mechanism then determines whether the older instruction has been processed, at least to the point where the needed data is available. If this data is not yet available, the control mechanism stalls (i.e., temporarily stops) processing of the pending instruction until the necessary data becomes available, thereby preventing errors from read-after-write data hazards.
In addition, the control mechanism also determines whether data from (i.e., generated or retrieved by) an older instruction is to be written to the same register or memory location as the data from a pending instruction. If so, the control mechanism stalls the pending instruction until the data from the older instruction has been written to the register or memory address, thereby preventing errors from write-after-write data hazards. Consequently, the control mechanism may stall the pending instruction in order to prevent errors from either read-after-write data hazards or from write-after-write data hazards.
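The two stall checks described above can be sketched as follows. This is a simplified illustration, not the circuitry of any particular processor: the `Instr` record, its `result_written` flag, and the `stall_reason` function are hypothetical names, and the sketch assumes that the data of an older instruction becomes available at the moment it is written to its destination.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

@dataclass(frozen=True)
class Instr:
    name: str
    dest: Optional[str]            # register written by the instruction, if any
    srcs: Tuple[str, ...] = ()     # registers read by the instruction
    result_written: bool = False   # True once the destination has been written

def stall_reason(pending: Instr, older: Sequence[Instr]) -> Optional[str]:
    """Return "RAW", "WAW", or None for the pending instruction."""
    for old in older:
        if old.dest is None or old.result_written:
            continue  # nothing outstanding from this older instruction
        if old.dest in pending.srcs:
            return "RAW"  # pending instruction needs data not yet available
        if old.dest == pending.dest:
            return "WAW"  # older write could land after the pending write
    return None

load = Instr("load", dest="r1", srcs=("r2",))
add = Instr("add", dest="r3", srcs=("r1",))
mov = Instr("mov", dest="r1", srcs=("r4",))

print(stall_reason(add, [load]))  # RAW
print(stall_reason(mov, [load]))  # WAW
```

Once the older load has written its destination (modeled here by `result_written=True`), neither check fires and the pending instruction may proceed.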
Stalling of the pending instruction is usually accomplished by asserting a stall signal transmitted to the pipeline executing the pending instruction. In response to the asserted stall signal, the pipeline is designed to stop execution of the pending instruction until the stall signal is deasserted by the control mechanism. Once the read-after-write data hazard or the write-after-write data hazard no longer exists, the control mechanism deasserts the stall signal, and in response, the pipeline resumes processing of the pending instruction. The control mechanism required to detect and prevent potential errors from read-after-write data hazards and from write-after-write data hazards is relatively complex in out-of-order processors, and as the number of pipelines is increased, the complexity of the control mechanism increases dramatically.
Consequently, many conventional parallel processors, particularly processors having a large number of pipelines, employ an in-order type of processing in lieu of the out-of-order type of processing described above. In in-order processing, the instructions being processed by the different pipelines are stepped through the stages of the pipelines on certain edges of a system clock signal. In this regard, the processing of instructions in a pipeline is usually divided into stages, and each stage of the pipeline simultaneously processes a different instruction.
As an example, the processing performed by each pipeline may be divided into a register stage, an execution stage, a detect exceptions stage, and a write stage. During the register stage, any operands necessary for the execution of an instruction are obtained. Once the operands have been obtained, the processing of the instruction enters into the execution stage in which the instruction is executed. After the instruction has been executed, the processing of the instruction enters into a detect exceptions stage in which conditions, such as overruns during execution, for example, that may indicate data unreliability are checked. After the detect exceptions stage is completed, a write stage is entered in which the results of the execution stage are written to a register.
A key feature of in-order processing is that each instruction of an issue group steps through each stage at the same time. An “issue group,” as defined herein, is a set of instructions simultaneously (i.e., during the same clock cycle) processed by the same stage of different pipelines within a single processor. As an example, assume that each stage of each pipeline processes one instruction at a time, as is typically done in the art. The instructions in the detect exceptions stage of the pipelines form a first issue group, and the instructions in the execution stage of the pipelines form a second issue group. Furthermore, the instructions in the register stage of the pipelines form a third issue group. Each of the issue groups advances into the next respective stage in response to an active edge of the system clock signal. In other words, the first issue group steps into the write stage, the second issue group steps into the detect exceptions stage, and the third issue group steps into the execution stage in response to an active edge of the system clock signal.
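The lockstep advance of issue groups can be sketched as a single step function. The stage names follow the four stages described above; the dictionary representation, the `step` function, and the instruction labels are hypothetical and serve only to illustrate that every issue group moves exactly one stage per active edge.

```python
STAGES = ["register", "execution", "detect_exceptions", "write"]

def step(state, next_group):
    """Advance every issue group one stage in response to an active edge.

    state maps each stage name to the issue group (a list of instruction
    labels, one per pipeline) currently in that stage, or None if empty.
    Returns the new state and the group that left the write stage.
    """
    leaving = state[STAGES[-1]]
    new_state = {}
    # Walk from the last stage backward so each group moves exactly one stage.
    for i in range(len(STAGES) - 1, 0, -1):
        new_state[STAGES[i]] = state[STAGES[i - 1]]
    new_state[STAGES[0]] = next_group
    return new_state, leaving

state = {
    "register": ["i5", "i6"],           # third issue group
    "execution": ["i3", "i4"],          # second issue group
    "detect_exceptions": ["i1", "i2"],  # first issue group
    "write": None,
}
new_state, leaving = step(state, ["i7", "i8"])
print(new_state)
```

After one active edge, the first issue group occupies the write stage, the second occupies the detect exceptions stage, and the third occupies the execution stage, so no group passes another.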
As used herein, an “active edge” is any edge of the system clock signal, the occurrence of which induces each unstalled instruction in a pipeline to advance to the next stage of processing in the pipeline. For example, assume that a processor is designed to step each unstalled instruction into the next stage of processing every three clock cycles. In this example, the active edges could be defined as every third rising edge of the clock signal. It should be noted that which edges of the clock signal are designated as “active edges” is based on design parameters and may vary from processor to processor.
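The three-cycle example can be expressed as a short sketch. The function name and parameters are hypothetical, and counting rising edges from 1 is an assumption made only for illustration.

```python
def active_edge_indices(num_rising_edges, cycles_per_step=3):
    """Return the 1-based indices of the rising edges treated as active edges."""
    return [n for n in range(1, num_rising_edges + 1) if n % cycles_per_step == 0]

print(active_edge_indices(10))  # [3, 6, 9]
```

A processor that steps instructions on every rising edge would correspond to `cycles_per_step=1` in this sketch.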
During in-order processing, any instruction in one issue group preferably does not pass another instruction in another issue group. In other words, instructions of one issue group input into the pipelines after the instructions of another issue group are prevented from entering into the same stage processing the instructions of the other issue group. Therefore, at any point in time, each stage of the pipelines is respectively processing instructions from only one issue group. Since instructions from different issue groups are prevented from passing each other, the control mechanism for controlling the pipelines and for preventing errors from read-after-write data hazards and from write-after-write data hazards is greatly simplified and is, therefore, often preferable over out-of-order processing.
In both out-of-order processing and in-order processing, certain inefficiencies exist with respect to write-after-write data hazards. As set forth hereinabove, a write-after-write data hazard exists when a pending instruction and an older instruction write to the same register or other memory location. Usually, the older instruction is writing to the register so that another instruction (hereafter referred to as the “intervening instruction”) between the older instruction and the pending instruction can read the data produced or retrieved by the older instruction. However, due to branches in the program at run time or other reasons, there may be no intervening instruction (i.e., no instruction between the older instruction and the pending instruction) that actually utilizes the data produced or retrieved by the older instruction. Consequently, the data retrieved by the older instruction is not useful. However, to prevent errors from write-after-write data hazards, the processor stalls the pending instruction until the data retrieved or produced by the older instruction is written to the register. This results in undesirable inefficiencies, since the processor must wait on data that will not be used by the processor to execute the program.
Thus, a heretofore unaddressed need exists in the industry for providing a system and method of increasing the efficiency of parallel processors in preventing write-after-write data hazards.
The present invention overcomes the inadequacies and deficiencies of the prior art as discussed hereinbefore. Generally, the present invention provides a system and method for efficiently preventing errors caused by write-after-write data hazards.
In architecture, the processing system of the present invention utilizes a plurality of pipelines and a control mechanism. The plurality of pipelines receives and processes instructions of a computer program that includes a first instruction and a second instruction. The control mechanism is designed to detect a write-after-write data hazard associated with the first instruction and the second instruction, when the first and second instructions are configured to cause data to be written to the same location. After detecting the write-after-write data hazard, the control mechanism determines whether there is an intervening instruction (i.e., an instruction between the first and second instructions) that is dependent on the data produced or retrieved by execution of the first instruction. If there is no such intervening instruction, the control mechanism cancels the first instruction by transmitting a cancellation request.
In accordance with another feature of the present invention, a memory interface receives the cancellation request and, in response to the cancellation request, either stops searching for the data requested by execution of the first instruction or refrains from transmitting the data to the aforementioned location.
In accordance with another feature of the present invention, the control mechanism stalls the second instruction in response to the write-after-write data hazard and removes the stall of the second instruction when the first instruction is canceled.
The present invention can also be viewed as providing a processing method for efficiently processing instructions of computer programs. The method can be broadly conceptualized by the following steps: receiving a plurality of instructions from a computer program, the instructions including a first instruction and a second instruction; detecting a write-after-write data hazard associated with the first instruction and the second instruction; stalling the second instruction in response to the write-after-write data hazard; determining whether another instruction within the plurality of instructions is dependent on the first instruction; detecting, in the determining step, an absence of instructions within the plurality of instructions that are dependent on the first instruction; canceling the first instruction in response to the detecting an absence step; and processing the second instruction in response to the canceling step.
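The method steps above can be sketched in software as follows. This is an illustrative model under stated assumptions, not the claimed hardware: the `Instr` record, the `resolve_waw` function, and the returned strings are hypothetical names. The sketch also assumes that the list passed in is the sequence of instructions actually received by the processor, and that an instruction is dependent on the first instruction if it reads the first instruction's destination. The second instruction's own source registers are included in the check, which is a conservative choice made for this sketch.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class Instr:
    name: str
    dest: Optional[str]
    srcs: Tuple[str, ...] = ()
    stalled: bool = False
    canceled: bool = False

def resolve_waw(program: List[Instr], first_idx: int, second_idx: int) -> str:
    """Detect a write-after-write hazard and cancel the first instruction if unused."""
    first, second = program[first_idx], program[second_idx]

    # Detecting step: both instructions write to the same location.
    if first.dest is None or first.dest != second.dest:
        return "no_hazard"

    # Stalling step: hold the second instruction while the hazard is examined.
    second.stalled = True

    # Determining step: does any later instruction, up to and including the
    # second, read the data the first instruction would write?
    window = program[first_idx + 1 : second_idx + 1]
    if any(first.dest in instr.srcs for instr in window):
        return "stall_until_first_writes"

    # Canceling step, then processing step: release the second instruction.
    first.canceled = True
    second.stalled = False
    return "first_canceled"

program = [
    Instr("load", dest="r1", srcs=("r2",)),
    Instr("add", dest="r3", srcs=("r4",)),   # does not read r1
    Instr("mov", dest="r1", srcs=("r5",)),
]
print(resolve_waw(program, 0, 2))
```

If the intervening `add` instead read `r1`, the sketch would leave the second instruction stalled and would not cancel the load, matching the conventional stall behavior.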
Other features and advantages of the present invention will become apparent to one skilled in the art upon examination of the following detailed description, when read in conjunction with the accompanying drawings. It is intended that all such features and advantages be included herein within the scope of the present invention and protected by the claims.