1. Field of the Invention
This invention is related to the field of processors and, more particularly, to handling of delay slots following control transfer instructions in processors.
2. Description of the Related Art
Some processor instruction set architectures (ISAs) define delayed control transfer instructions (DCTIs). A control transfer instruction may transfer program execution flow (either conditionally or unconditionally) to a target address. A DCTI transfers execution after the next instruction in the program flow (subsequent to the DCTI). The subsequent instruction is said to be in the delay slot of the DCTI, and is referred to herein as the “delay slot instruction”. The delay slot instruction may be the next sequential instruction (stored adjacent to the DCTI in memory). In some ISAs (e.g. the SPARC® ISA), a DCTI may itself be the delay slot instruction of a previous DCTI in the program execution flow. If the previous DCTI is taken, the delay slot instruction of the DCTI in the delay slot of the previous DCTI is at the target of the previous DCTI. The order of instructions, if instructions were executed one at a time, is referred to as the program order of the instructions.
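The program order described above can be illustrated with a minimal sketch. It assumes a program counter paired with a next program counter (the `pc`/`npc` pair below), which is one common way to model delayed transfers. The function name `program_order`, the dictionary encoding of the program, and the use of instruction indexes as addresses are hypothetical choices for illustration. Every DCTI is treated as taken and non-annulling for simplicity.

```python
def program_order(program, start=0, max_steps=20):
    """Return the names of instructions in the order they would execute.

    program maps an address to (name, target). A target of None marks an
    ordinary instruction; any other value marks a DCTI that is assumed to be
    taken. Addresses are instruction indexes (an assumption of this sketch).
    """
    pc, npc = start, start + 1
    trace = []
    for _ in range(max_steps):
        if pc not in program:
            break  # ran off the end of the modeled program
        name, target = program[pc]
        trace.append(name)
        if target is None:
            # Ordinary instruction: advance sequentially.
            pc, npc = npc, npc + 1
        else:
            # Taken DCTI: the instruction at npc (the delay slot) executes
            # next, and only then does execution reach the target.
            pc, npc = npc, target
    return trace
```

With a program holding `dcti_A` (target 10) at address 0 and `dcti_B` (target 20) at address 1, the sketch executes `dcti_A`, then `dcti_B`, then the instruction at address 10, then the instruction at address 20. The delay slot instruction of `dcti_B` is thus the instruction at the target of `dcti_A`, as described above.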
DCTIs and their delay slot instructions complicate processor design. For example, if a DCTI is taken (that is, it transfers program execution flow to the target address) and instructions from the not-taken (usually sequential) execution path have been fetched, those instructions need to be flushed from the processor and the instructions from the taken execution path need to be fetched. However, the delay slot instruction must not be flushed. Thus, the delay slot instruction must be located and preserved when a taken DCTI is executed. The delay slot instruction may generally be in any of many places (e.g. its fetch may not have been started yet, the fetch may be in progress and the delay slot instruction may be in the process of being returned from memory, the delay slot instruction may be on its way out of the instruction cache, or it may be elsewhere in the pipeline of the processor). Accordingly, locating the delay slot instruction may generally be complex.
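The flush requirement can be sketched for the simplest case. The sketch assumes a single thread, a taken DCTI that does not annul its delay slot and is not itself in the delay slot of a taken DCTI, and a list of the instructions fetched after the DCTI from the sequential path. The function name `flush_on_taken_dcti` and the list representation are hypothetical, and the sketch does not represent any particular processor design.

```python
def flush_on_taken_dcti(younger):
    """Split the instructions fetched after a taken DCTI.

    younger lists the sequentially fetched instructions that follow the DCTI,
    oldest first. Returns (kept, flushed, delay_slot_pending), where
    delay_slot_pending is True if the delay slot instruction has not been
    fetched yet and must still be preserved when it arrives.
    """
    if not younger:
        # Nothing fetched yet: the delay slot instruction is still outstanding.
        return [], [], True
    # The oldest younger instruction is the delay slot; the rest belong to the
    # not-taken path and are flushed.
    return younger[:1], younger[1:], False
```

Even in this reduced form, the caller must track the case where the delay slot instruction has not yet arrived, which hints at why locating it in a real pipeline is complex.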
In some ISAs, such as the SPARC® V9 ISA, DCTIs may optionally annul their delay slot instructions. An annulled delay slot instruction is not executed. For example, in the SPARC® V9 ISA, conditional DCTIs that have an annul bit set in the instruction annul the delay slot instruction if the DCTI is not taken. Unconditional DCTIs that have an annul bit set in the instruction always annul the delay slot instruction. Again, the delay slot instruction must be located so that the annul may occur.
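The annul rules stated above reduce to a small decision function. The following is a minimal sketch of those rules only. The function name `delay_slot_executes` and its boolean parameters are hypothetical and are not taken from any ISA definition.

```python
def delay_slot_executes(conditional, taken, annul_bit):
    """Return True if the delay slot instruction executes, False if annulled.

    conditional: True for a conditional DCTI, False for an unconditional one.
    taken:       True if the DCTI transfers control to its target.
    annul_bit:   True if the annul bit is set in the DCTI.
    """
    if not annul_bit:
        return True   # no annul requested: the delay slot always executes
    if conditional:
        return taken  # annulled only when the conditional DCTI is not taken
    return False      # unconditional DCTI with annul bit set: always annulled
```

For example, `delay_slot_executes(conditional=True, taken=False, annul_bit=True)` returns `False`, since a not-taken conditional DCTI with the annul bit set annuls its delay slot instruction.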
In fine grain multithreaded processors, each instruction in the pipeline may be from a different thread than adjacent instructions in the pipeline. That is, for a given instruction in a given pipeline stage, instructions in the pipeline stages immediately before and after the given pipeline stage may be from different threads. Having instructions from multiple threads in the pipeline may further complicate locating the delay slot instruction.
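The effect of thread interleaving can be illustrated with a minimal sketch. It assumes the pipeline contents are available as a list of (thread, name) pairs ordered from oldest to youngest. The function name `find_delay_slot` and this representation are hypothetical, and the sketch is not a description of any particular hardware mechanism.

```python
def find_delay_slot(pipeline, dcti_pos):
    """Return the position of the delay slot instruction, or None.

    pipeline is a list of (thread, name) pairs, oldest first. The delay slot
    instruction of the DCTI at dcti_pos is taken to be the next younger
    instruction from the same thread; instructions from other threads that
    sit in between are skipped. None means it is not in the pipeline yet.
    """
    thread, _ = pipeline[dcti_pos]
    for pos in range(dcti_pos + 1, len(pipeline)):
        if pipeline[pos][0] == thread:
            return pos
    return None
```

Given `[(0, "dcti"), (1, "add"), (2, "ld"), (0, "delay"), (1, "sub")]`, the delay slot instruction of the DCTI at position 0 is found at position 3, with instructions from two other threads in between.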