1. Field
This disclosure relates generally to microprocessor architecture, and more specifically, to load/store execution unit configurations for a microprocessor operating in single-thread and multi-thread modes.
2. Related Art
Processor designers have attempted to increase on-chip parallelism through superscalar techniques, which are directed to increasing instruction-level parallelism (ILP), and multi-threading techniques, which are directed to exploiting thread-level parallelism (TLP). A superscalar architecture attempts to execute more than one instruction at a time by fetching multiple instructions and dispatching them simultaneously to multiple (sometimes identical) functional units of the processor. Superscalar processors differ from multi-core processors in that the functional units in a superscalar processor are not usually entire processors. A typical multi-threading operating system (OS) allows multiple processes, and the threads of those processes, to use a processor one at a time, usually giving a particular thread exclusive ownership of the processor for a time slice. In many cases, a process executing on a processor may stall for a number of cycles while waiting for an external resource (for example, a load from random access memory (RAM)), which lowers the efficiency of the processor. Simultaneous multi-threading (SMT) allows multiple threads to execute different instructions in the same clock cycle, using functional units that another executing thread or threads would otherwise leave unused.
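By way of illustration only, the difference between time-sliced multi-threading and SMT described above can be sketched with a toy issue-slot model. The sketch is not part of any disclosed embodiment. The names `Thread`, `time_sliced_cycles`, and `smt_cycles` are hypothetical, and the model assumes that each thread can issue at most a fixed number of instructions per cycle (its `ilp`), that every issue slot can accept any instruction, and that stalls are ignored.

```python
from dataclasses import dataclass


@dataclass
class Thread:
    remaining: int  # instructions left to issue
    ilp: int        # most instructions this thread can issue per cycle (assumed >= 1)


def time_sliced_cycles(threads, width):
    """Cycles needed when one thread owns all issue slots in each cycle."""
    work = [Thread(t.remaining, t.ilp) for t in threads]  # private copy
    cycles = 0
    turn = 0
    while any(t.remaining > 0 for t in work):
        # Round-robin to the next thread that still has work.
        while work[turn].remaining == 0:
            turn = (turn + 1) % len(work)
        t = work[turn]
        t.remaining -= min(t.ilp, width, t.remaining)
        turn = (turn + 1) % len(work)
        cycles += 1
    return cycles


def smt_cycles(threads, width):
    """Cycles needed when threads share the issue slots within each cycle."""
    work = [Thread(t.remaining, t.ilp) for t in threads]  # private copy
    cycles = 0
    while any(t.remaining > 0 for t in work):
        free = width  # issue slots still unused in this cycle
        for t in work:
            # A later thread fills slots that earlier threads left unused.
            n = min(t.ilp, free, t.remaining)
            t.remaining -= n
            free -= n
        cycles += 1
    return cycles


if __name__ == "__main__":
    threads = [Thread(remaining=8, ilp=2), Thread(remaining=8, ilp=2)]
    print(time_sliced_cycles(threads, width=4))
    print(smt_cycles(threads, width=4))
```

Under these assumptions, two threads of eight instructions each, both limited to two instructions per cycle on a four-wide machine, take eight cycles when time-sliced and four cycles under SMT, because each thread fills the two slots the other leaves unused.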
In multi-threading processors, it is desirable to increase the number of instructions executed per cycle not only when executing multiple threads, but also when executing a single thread.