1. Field
The described embodiments relate to computer systems. More specifically, the described embodiments relate to techniques for handling store instructions with unknown destination addresses during speculative execution.
2. Related Art
Some modern microprocessors support speculative execution of program code. This generally involves executing instructions speculatively while preserving a pre-speculation architectural state of the processor. The processor can discard speculative results and return to the pre-speculation architectural state if certain conditions occur during speculative execution (e.g., encountering an error/trap, a coherence violation, unavailability of processor hardware resources, executing certain types of instructions, etc.). If the speculative execution completes without encountering one of the conditions, the processor can commit the speculative results to the architectural state and continue with normal, non-speculative execution.
Some of these processors support an execute-ahead mode and a deferred-execution mode for speculatively executing instructions. In these processors, upon encountering an instruction with an unresolved data dependency while executing instructions in a non-speculative normal-execution mode, the processor defers the instruction by placing the instruction into a deferred queue and marking the destination register of the instruction “not there” to indicate that the register is awaiting a result from a deferred instruction. The processor then transitions to the execute-ahead mode to speculatively execute subsequent instructions. During the execute-ahead mode, instructions with unresolved dependencies are deferred, but instructions without unresolved data dependencies are executed in program order. When a data dependency is eventually resolved (e.g., data returns from a cache), the processor can transition to the deferred-execution mode, during which instructions in the deferred queue are issued in program order for execution. In these processors, unless one of the above-described conditions is encountered, upon executing all deferred instructions, the processor can join the speculative results to an architectural state of the processor and resume execution in a normal-execution mode.
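For illustration only, the deferral behavior described above can be modeled with a short Python sketch. The function name `execute_ahead`, the tuple representation of instructions, and the rule that an executed instruction clears the not-there mark on its destination register are assumptions made for this example; they are not taken from any particular processor.

```python
def execute_ahead(instructions, not_there):
    """Toy model of execute-ahead mode (hypothetical, for illustration).

    instructions: list of (dest_register, source_registers) in program order.
    not_there:    registers already awaiting a result from a deferred instruction.
    Returns (executed, deferred, not_there) after one pass in program order.
    """
    not_there = set(not_there)
    executed = []
    deferred = []  # stands in for the deferred queue, kept in program order
    for dest, sources in instructions:
        if any(src in not_there for src in sources):
            # Unresolved data dependency: defer and mark the destination not there.
            deferred.append((dest, sources))
            not_there.add(dest)
        else:
            # No unresolved dependency: execute in program order.
            executed.append((dest, sources))
            # Assumption: writing a fresh value clears the not-there mark.
            not_there.discard(dest)
    return executed, deferred, not_there
```

In this sketch, an instruction that reads a not-there register is deferred and in turn marks its own destination register not there, so the dependency propagates to later instructions.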
Some of these processors also support a scout mode. In these processors, upon encountering one of the above-described conditions while speculatively executing program code, the processor can transition to the scout mode and can execute program code in the scout mode until any data dependencies are resolved. The processor then restores the pre-speculation architectural state and resumes operation in the normal-execution mode. During the scout mode, the processor executes memory operations to prefetch data for the subsequent re-execution of the program code in normal-execution mode, but does not commit any speculative results to the architectural state of the processor. Although scout mode enables these processors to perform useful work during a stall condition, the processors are forced to re-execute the program code upon restoring the pre-speculation architectural state and resuming operation in normal-execution mode.
One of the conditions that can cause some of these processors to transition from execute-ahead mode to scout mode occurs when the processor encounters a load following a store with an unknown address. For example, the processor can execute a first load instruction to load a register with a value that is then to be used to compute the address for a store instruction. Assuming the first load instruction misses in the L1 cache, the processor defers the first load instruction and sets a bit in the destination register for the first load instruction to mark the destination register as “not there.” The processor can then encounter a subsequent store instruction that uses the value in the not-there register as an input for computing a destination address for the store. However, because the register is not there, the processor cannot resolve the address for the store. Thus, the store instruction is deferred and an entry is made in the store buffer that indicates that a store operation with an unknown destination address is outstanding (i.e., has been deferred). When the processor then encounters a subsequent load instruction during speculative execution, the processor cannot determine whether the load reads from the location that the deferred store will write, and therefore immediately transitions to scout mode. As described above, the transition to scout mode means that the processor will eventually restore the checkpoint (i.e., the pre-speculation architectural state) and re-execute all of the speculatively executed instructions, which can mean that the processor duplicates a significant amount of computational work.
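The scenario in the preceding paragraph can be sketched in Python as follows. This is a simplified model under stated assumptions: the function name `mode_after`, the string labels for the modes, and the representation of each operation as a (kind, address register) pair are hypothetical and chosen only for illustration.

```python
def mode_after(ops, not_there):
    """Toy model of the load-after-unknown-store condition (hypothetical).

    ops:       list of (kind, address_register) pairs in program order,
               where kind is "store", "load", or any other string.
    not_there: registers whose values are not yet available.
    Returns (mode, index), where index is the position of the load that
    forces the transition to scout mode, or None if no transition occurs.
    """
    unknown_store_outstanding = False
    for index, (kind, addr_reg) in enumerate(ops):
        if kind == "store" and addr_reg in not_there:
            # The store address cannot be resolved: the store is deferred and
            # the store buffer records an outstanding unknown-address store.
            unknown_store_outstanding = True
        elif kind == "load" and unknown_store_outstanding:
            # A load after an unknown-address store triggers scout mode.
            return "scout", index
    return "execute-ahead", None
```

If the address register of the store is available, the flag is never set and the model remains in execute-ahead mode for the whole sequence.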
In order to avoid restoring the checkpoint and hence duplicating the computational work, some processors simply defer all load instructions following a store with an unknown address and continue to operate in execute-ahead mode. However, because load instructions are common in program code, this can lead to a large number of instructions being deferred (i.e., both the load instructions and the instructions that depend on the deferred load instructions). Because the deferred queue in these processors is typically limited in size, forcing the deferral of the load instructions and the instructions that depend on them can cause the deferred queue to overflow, which in turn leads the processor to immediately restore the pre-speculation architectural state and resume operation in the normal-execution mode (thereby discarding all speculatively performed computational work).
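The overflow problem with the defer-all-loads approach can be illustrated with a small Python sketch. The function name `defer_all_loads`, the `capacity` parameter, and the operation format are assumptions made for this example; in this toy model the source registers listed for a store are only its address registers.

```python
def defer_all_loads(ops, not_there, capacity):
    """Toy model of deferring every load after an unknown-address store.

    ops:       list of (kind, dest_register_or_None, source_registers).
    not_there: registers whose values are not yet available.
    capacity:  hypothetical size limit of the deferred queue.
    Returns ("overflow", deferred) if the queue fills, else
    ("execute-ahead", deferred).
    """
    not_there = set(not_there)
    deferred = []
    unknown_store_outstanding = False
    for kind, dest, sources in ops:
        blocked = any(src in not_there for src in sources)
        if kind == "store" and blocked:
            unknown_store_outstanding = True
        if blocked or (kind == "load" and unknown_store_outstanding):
            if len(deferred) == capacity:
                # No room left: speculation must be abandoned.
                return "overflow", deferred
            deferred.append((kind, dest, sources))
            if dest is not None:
                # Dependents of a deferred instruction are deferred as well.
                not_there.add(dest)
    return "execute-ahead", deferred
```

With a small `capacity`, even a short sequence containing one unknown-address store, two loads, and one dependent instruction fills the queue, which mirrors the overflow described above.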