Processing speed, also known as processing horsepower, is a primary concern in the design and commercial success of a processor. The personal computer is a prime example of how processing speed has become a critical feature in the eyes of the consumer. Consumers expect advertised processor speeds, often measured in terms of the processor clock rate, to increase on an annual, or even semi-annual, basis. Moreover, today's applications require processors with much greater horsepower than those of just a few years ago. For example, computer games and applications, such as word processors and databases, designed for today's computers often cannot execute in a useful way on the slower processors of only a few years ago. Additionally, as software developers continue to add more features to existing applications, processor horsepower needs to increase accordingly so that the user experience remains constant. As a result, identifying techniques to increase processor speed is an ever-present goal of the processor designer and manufacturer.
To create feature-rich operating systems and applications that will be successful in the marketplace, most of today's computer software is written for 32-bit processors, i.e., processors whose address space is indexed using 32 bits. Processors architected for 32-bit addressing have numerous advantages over their 16-bit predecessors, including the ability to support larger program memory requirements and the ability to support more complex instructions that can perform multiple functions in a single clock cycle. However, because consumers expect to be able to use existing, or legacy, applications on a newly purchased computer, typical 32-bit processors are designed to support both 32-bit addressing and the legacy 16-bit addressing. The need to support the legacy 16-bit addressing places an additional burden on the processor designer who is attempting to increase the speed of the 32-bit processor. This is especially true with regard to address generation: the address generator is a key component affecting processor speed, and the additional logic needed to support 16-bit addressing increases the critical-path delay of the address generation circuit. This increase in critical-path delay, in turn, reduces processor speed during the execution of 32-bit software.
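The extra work that dual-mode addressing imposes on the address generator can be illustrated with a minimal software sketch. It assumes, purely for illustration, that an address is formed as base + index * scale + displacement and that 16-bit mode wraps the result modulo 2^16. The function name `effective_address` and its parameters are hypothetical and do not describe any particular processor.

```python
def effective_address(base, index, scale, displacement, address_bits=32):
    """Return base + index*scale + displacement, truncated to address_bits.

    Hypothetical model: the final mask stands in for the mode-selection
    logic that a dual-mode address generator needs after its adder.
    """
    if address_bits not in (16, 32):
        raise ValueError("address_bits must be 16 or 32")
    mask = (1 << address_bits) - 1
    # One adder computes the full sum; the mode decides how many bits survive.
    return (base + index * scale + displacement) & mask


# The same components give different addresses in the two modes:
# the 16-bit result wraps past 0xFFFF, the 32-bit result does not.
addr16 = effective_address(0xFFF0, 0x0008, 4, 0x0010, address_bits=16)
addr32 = effective_address(0xFFF0, 0x0008, 4, 0x0010, address_bits=32)
```

In hardware, the equivalent of the mode-dependent mask sits in series with the adder, which is one way to picture how legacy support can lengthen the critical path even when only 32-bit software is running.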
Fortunately, as operating systems and applications have been migrating from 16 bits to 32 bits, the number of legacy 16-bit programs in active use has dwindled considerably. Additionally, the speed of today's 32-bit processors has improved considerably as compared to the state-of-the-art 16-bit processors of several years ago. Thus, the 16-bit address generation logic need not be implemented as efficiently as in the past to still achieve substantially equivalent program execution performance.
Furthermore, processors are beginning to incorporate mechanisms to support aggressive, out-of-order instruction execution with data speculation. Such processors are typically capable of executing multiple program threads in parallel. Software compilers for such processors may speculate as to how to organize the code to execute in these parallel threads so that execution is as efficient as possible. However, the speculation may not always be correct, as it is often difficult, if not impossible, to determine the complete program execution flow a priori. For example, conditional execution programming constructs (e.g., an if-then-else statement) may determine which of several possible code segments is executed at run time. Moreover, two or more threads executing in parallel may need to access the same data variable, resulting in a data dependency. If one or more of these threads accesses the data variable out of sequence with respect to the overall program execution flow, a data dependency violation may occur. Thus, a processor supporting out-of-order instruction execution needs to have a mechanism for recovering from incorrect instruction execution, e.g., due to a misspeculation based on conditional execution of an unexpected code segment, a data dependency violation, etc. This recovery mechanism typically includes address correction logic that allows the processor to recompute and/or replace one or more address components of a previously executed instruction prior to rescheduling the instruction for re-execution.
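A data dependency violation and its recovery can be pictured with a small software model. The sketch below assumes a single load that is speculatively executed ahead of an older store, with memory represented as a dictionary. The name `run_speculatively` and its arguments are hypothetical, and the sketch models only detection and re-execution, not the address correction logic itself.

```python
def run_speculatively(memory, store, load_address):
    """Model a load that executes before an older store in program order.

    memory: dict mapping address -> value (unwritten addresses read as 0)
    store:  (address, value) tuple for the older store
    Returns (value, replayed); replayed is True if the load was re-executed.
    """
    store_address, store_value = store

    # Speculative step: the younger load runs first and may read stale data.
    speculative_value = memory.get(load_address, 0)

    # The older store completes afterwards.
    memory[store_address] = store_value

    # Violation check: the load read a location the older store wrote.
    if store_address == load_address:
        # Recovery: discard the stale value and re-execute the load.
        return memory[load_address], True
    return speculative_value, False


# Same address: the speculative load is wrong and must be replayed.
violating = run_speculatively({0x100: 1}, (0x100, 7), 0x100)
# Different addresses: the speculation was harmless and its result stands.
harmless = run_speculatively({0x100: 1}, (0x200, 7), 0x100)
```

The cost of a misspeculation in this model is the second read. In a real processor the cost also includes rescheduling the instruction, which is why the recovery path matters.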