A read-after-write conflict occurs in a pipelined processor when an instruction writes to a register or a status bit, and a succeeding instruction attempts to read from the register or status bit before the register or the status bit has actually been updated. Consider the following examples based on a pipeline with 10 stages and the StarCore assembly language. A typical read-after-write conflict occurs where an instruction I2 attempts to read from a register R1 before an instruction I1 writes to the register R1:
I1: MOVE.L D1,R1; Write R1 at stage S5
I2: MOVE.L (R1),R2; Read R1 at stage S3
When the instruction I2 reaches stage S3 and is ready to read the register R1, the instruction I1 is only at stage S4. The register R1 does not become available to read until one cycle after the instruction I1 reaches stage S5 and writes the data D1 to the register R1. Therefore, an interlock mechanism of the processor stalls the instruction I2 for two cycles, until the data D1 is in the register R1 and available to read.
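The two-cycle figure can be checked by stepping both instructions through the pipeline one cycle at a time. The following Python sketch is illustrative only: the function name `count_stall_cycles` and the convention that the writing instruction enters stage S1 in cycle 1 are assumptions made for the example, and the model captures only the timing rule stated above (the register becomes readable one cycle after the write stage).

```python
def count_stall_cycles(write_stage, read_stage, distance):
    """Step a writer and a dependent reader through the pipeline.

    write_stage: stage in which the writer updates the register (5 for I1).
    read_stage:  stage in which the reader needs the register (3 for I2).
    distance:    number of instructions by which the reader trails the
                 writer (1 when the two instructions are adjacent).
    Assumption: the writer enters stage S1 in cycle 1 and never stalls.
    """
    cycle = 1
    reader_stage = 1 - distance  # values below 1 mean "not yet in the pipeline"
    stalls = 0
    while True:
        writer_stage = cycle  # the writer advances one stage per cycle
        if reader_stage == read_stage:
            # The register is readable only after the cycle in which it is written.
            if writer_stage > write_stage:
                return stalls
            stalls += 1  # hold the reader in its read stage for this cycle
        else:
            reader_stage += 1
        cycle += 1


# I1 writes R1 at S5, I2 reads R1 at S3, and I2 directly follows I1.
print(count_stall_cycles(write_stage=5, read_stage=3, distance=1))  # 2
```

In this trace the instruction I2 first reaches stage S3 while the instruction I1 is at stage S4, is held for that cycle and the next, and reads the register R1 in the cycle after the instruction I1 passes stage S5.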
Referring to FIG. 1, a diagram illustrating a conventional single cycle stall is shown. In the example, the instructions in conflict are a write instruction I1 (i.e., MOVE.L D1,R1) and a read instruction I3 (i.e., MOVE.L (R1),R2). An additional instruction I2 (i.e., ADD D4,D5) exists between the instructions I1 and I3. The instructions I1 to I3 progress through the early stages of the pipeline as normal in cycles 1 through 4. In cycle 5, the instruction I3 reaches stage S3 and is ready to read the register R1 and the instruction I1 writes to the register R1 from stage S5. Since the register R1 will not have the data D1 calculated by the instruction I1 stored within until a cycle after the write by the instruction I1, execution of the instruction I3 is delayed until cycle 6 before reading from the register R1. Therefore, the interlock mechanism generates a single cycle stall of the instruction I3. The number of cycles for which a read instruction is stalled depends on what stage the register is being read from, at what stage the register is being written to, and the distance between the two instructions that cause the conflict.
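The dependence on the read stage, the write stage, and the instruction distance can be written as a single expression. The following Python sketch is one way to state it, not a description of any particular interlock circuit; the function name `stall_cycles` and its parameter names are hypothetical, and the formula assumes, as in the examples above, that a written value becomes readable one cycle after the write stage and that the writing instruction itself is never stalled.

```python
def stall_cycles(write_stage, read_stage, distance):
    """Stall length for a reader that trails a writer by `distance` instructions.

    Cycles are counted relative to the writer entering stage S1 in cycle 1
    (an assumption of this sketch).
    """
    # First cycle in which the written value can be read.
    ready_cycle = write_stage + 1
    # Cycle in which the reader would reach its read stage if never stalled.
    wanted_cycle = read_stage + distance
    # A reader that arrives on time or late needs no stall.
    return max(0, ready_cycle - wanted_cycle)


# FIG. 1: write at S5, read at S3, one instruction in between (distance 2).
print(stall_cycles(write_stage=5, read_stage=3, distance=2))  # 1
# Adjacent instructions with the same stages (distance 1).
print(stall_cycles(write_stage=5, read_stage=3, distance=1))  # 2
```

Increasing the distance by one instruction shortens the stall by one cycle, until the reader arrives late enough that no stall is needed.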
Referring to FIG. 2, a diagram illustrating a conventional three cycle stall is shown. In the example, the instructions in conflict are a write instruction I1 (i.e., CMPEQ D0,D1) to a bit T and a read instruction I4 (i.e., IFT ADDA R6,R7) of the bit T. Another write instruction I3 (i.e., CMPEQA R0,R1) that writes to the bit T resides between the instructions I1 and I4. An instruction I2 (i.e., IFT ADD D6,D7,D8) that reads the bit T is located between the instructions I1 and I3.
In cycle 4, the instruction I4 reaches stage S4 and the instruction I1 reaches stage S7. Since the instruction I1 does not write until stage S9 in cycle 6, a three-cycle stall is generated by the interlock mechanism. The three-cycle stall delays the instruction I4 from reading the bit T until cycle 7. The interlock mechanism does not account for the instruction I3 writing to the bit T from stage S3 in cycle 2. The interlock mechanism generates the three-cycle stall regardless of the presence or absence of the write instruction I3.
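The behavior in FIG. 2 can be imitated with a small model in which the interlock checks the reader against every earlier writer of the bit T and applies the longest resulting stall. This worst-case rule is an assumption chosen because it reproduces the stated outcome; the text says only that the stall is three cycles with or without the instruction I3. The class `Writer` and the functions `required_stall` and `conservative_stall` are hypothetical names for the sketch, and the timing rule (a value is readable one cycle after its write stage) is carried over from the earlier examples.

```python
from dataclasses import dataclass


@dataclass
class Writer:
    name: str
    write_stage: int  # stage in which the instruction writes the bit T
    distance: int     # number of instructions by which the reader trails it


def required_stall(writer, read_stage):
    # Stall needed for the reader to see this writer's value (never negative).
    return max(0, (writer.write_stage + 1) - (read_stage + writer.distance))


def conservative_stall(writers, read_stage):
    # Assumed conventional rule: honor the worst case over all earlier writers.
    return max((required_stall(w, read_stage) for w in writers), default=0)


I1 = Writer("CMPEQ D0,D1", write_stage=9, distance=3)
I3 = Writer("CMPEQA R0,R1", write_stage=3, distance=1)
READ_STAGE_I4 = 4  # IFT ADDA R6,R7 reads the bit T at stage S4

with_i3 = conservative_stall([I1, I3], READ_STAGE_I4)
without_i3 = conservative_stall([I1], READ_STAGE_I4)
print(with_i3, without_i3)  # 3 3
```

Under these assumptions the instruction I3 on its own would require no stall of the instruction I4, yet the computed stall remains three cycles because the instruction I1 dominates the result.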