1. Field of the Invention
This invention relates to processors and, more particularly, to predicting floating-point exceptions within processors.
2. Description of the Related Art
In some pipelined processor embodiments implementing a given instruction set architecture (ISA), different instructions may be configured to execute with different latencies. For example, certain integer instructions may be configured to execute in an integer pipeline of a particular depth (i.e., a particular number of pipeline stages), while certain floating-point instructions may be configured to execute in a floating-point pipeline that is deeper than the integer pipeline. In some such embodiments, an exception condition for a longer-latency instruction may not be detected until late in the pipeline. This may create an exception hazard, in that a shorter-latency instruction issued after the longer-latency instruction may be able to complete execution and to modify architectural state before the exception caused by the longer-latency instruction is detected. Such an exception hazard may not be consistent with a precise exception model, under which no instruction following the excepting instruction in program order is permitted to modify architectural state before the exception is taken.
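The timing relationship behind this hazard can be illustrated with a minimal sketch. The stage numbers, the `Instr` record, and the helper names below are hypothetical and chosen only for illustration; the sketch also assumes that an exception detected in the same cycle as a younger instruction's writeback is early enough to cancel that writeback.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    name: str
    issue_cycle: int       # cycle in which the instruction issues
    writeback_stage: int   # stages after issue at which architectural state is modified
    detect_stage: int = 0  # stages after issue at which an exception is detected (0 = cannot fault)


def writeback_cycle(instr: Instr) -> int:
    return instr.issue_cycle + instr.writeback_stage


def detect_cycle(instr: Instr) -> int:
    return instr.issue_cycle + instr.detect_stage


def has_exception_hazard(older: Instr, younger: Instr) -> bool:
    """True if the younger instruction writes architectural state strictly
    before the older instruction's exception condition would be detected."""
    return older.detect_stage > 0 and writeback_cycle(younger) < detect_cycle(older)


# Hypothetical depths: exception detection at stage 8 of the floating-point
# pipeline, writeback at stage 4 of the integer pipeline.
fp_op = Instr("fp_op", issue_cycle=0, writeback_stage=9, detect_stage=8)
int_op = Instr("int_op", issue_cycle=1, writeback_stage=4)
hazard = has_exception_hazard(fp_op, int_op)
```

With these assumed depths, the integer instruction writes back in cycle 5 while the floating-point exception is not detected until cycle 8, so `hazard` evaluates to `True`.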
In some embodiments, this exception hazard may be avoided by preventing the issue of shorter-latency instructions for a number of execution cycles following the issue of a longer-latency instruction, where the number of cycles of delay is sufficient to ensure that the shorter-latency instruction will not modify architectural state until the exception condition for the longer-latency instruction has been detected. However, this solution penalizes every shorter-latency instruction that closely follows a longer-latency instruction, regardless of whether an exception is actually generated for that longer-latency instruction, and may unacceptably degrade processor performance.
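The cost of such an issue delay can be sketched for a simple in-order, single-issue model. The function name, the trace encoding ("fp" for a longer-latency instruction, "int" for a shorter-latency one), and the stage numbers are assumptions made for illustration, not details of any particular processor.

```python
def issue_cycles(trace, fp_detect_stage, int_writeback_stage, stall=True):
    """Return the issue cycle of each instruction in an in-order, single-issue trace.

    When stall is True, an "int" instruction may not issue until its writeback
    would occur no earlier than the exception-detection cycle of the most
    recently issued "fp" instruction.
    """
    # Minimum distance, in cycles, between an fp issue and a following int issue.
    gap = max(1, fp_detect_stage - int_writeback_stage)
    cycles = []
    next_free = 0
    last_fp_issue = None
    for kind in trace:
        cycle = next_free
        if stall and kind == "int" and last_fp_issue is not None:
            cycle = max(cycle, last_fp_issue + gap)
        if kind == "fp":
            last_fp_issue = cycle
        cycles.append(cycle)
        next_free = cycle + 1
    return cycles


trace = ["fp", "int", "int", "fp", "int"]
stalled = issue_cycles(trace, fp_detect_stage=8, int_writeback_stage=4)
unstalled = issue_cycles(trace, fp_detect_stage=8, int_writeback_stage=4, stall=False)
penalty = stalled[-1] - unstalled[-1]
```

Note that the model takes no input describing whether any floating-point instruction actually faults: under the assumed depths, the six-cycle `penalty` is incurred on every execution of the trace, which is the drawback described above.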
Alternatively, the exception hazard may be avoided in some embodiments by equalizing the depth of the shorter-latency and longer-latency execution pipelines. However, this solution may substantially increase the design complexity of the processor. For example, to avoid stalling instruction issue due to dependencies on previously-issued instructions, result bypassing from each relevant pipeline stage prior to the result writeback stage may be employed. As the number of pipeline stages prior to writeback increases, the number of bypass sources also increases, which in turn requires additional die area to route and multiplex the bypass sources.
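The growth in bypass sources can be approximated with a simple counting model. The model below is a hypothetical simplification: it assumes one bypass source per pipeline stage from the stage at which a result first becomes available up to, but not including, the writeback stage, plus one register-file input per operand multiplexer. The stage numbers are illustrative only.

```python
def bypass_sources(first_result_stage, writeback_stage):
    """Number of stages holding a not-yet-written-back result in one pipeline."""
    return max(0, writeback_stage - first_result_stage)


def operand_mux_inputs(pipes):
    """Inputs to one operand multiplexer: the register file plus every bypass source.

    pipes is a list of (first_result_stage, writeback_stage) pairs.
    """
    return 1 + sum(bypass_sources(first, wb) for first, wb in pipes)


# Hypothetical depths: integer results available at stage 1, floating-point
# results available at stage 6, floating-point writeback at stage 8.
unequal = operand_mux_inputs([(1, 4), (6, 8)])    # integer writeback at stage 4
equalized = operand_mux_inputs([(1, 8), (6, 8)])  # integer writeback moved to stage 8
```

Under these assumptions, equalizing the pipelines raises the number of inputs to each operand multiplexer from 6 to 10, since the integer result must now be bypassed from seven stages rather than three.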