1. Field of the Invention
The present invention generally relates to complex processor pipelines. More specifically, the present invention relates to microcode implementation of run-time program translation for emulating said pipelines.
2. Description of the Related Art
A processor pipeline divides a whole processing task or workload into smaller sub-tasks. Through the use of processor pipelining, instruction throughput (i.e., the number of instructions that can be executed in a unit of time) can be increased. Each sub-step of the overall task operates on its own data at the same time as the other sub-steps, and each sub-step is connected to a subsequent sub-step, effectively creating links in a pipe.
In an elementary form, the processing of a computer instruction is split into a series of independent steps with a storage operation at the conclusion of each step. This allows control circuitry of a computing device to issue instructions at the processing rate of the slowest step. Even at the rate of the slowest step, the overall processing is still faster than performing all of the steps constituting a whole instruction before the next instruction is issued. Pipelining in this manner allows multiple tasks to be executed in parallel. As a result, central processing units (CPUs) and/or other logic units are kept as busy as possible as often as possible.
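The throughput argument above can be illustrated with a short calculation. The following is a minimal sketch under stated assumptions, not a description of any particular processor: the stage latencies, the instruction count, and the function names are hypothetical, and the model assumes one instruction is issued per cycle with no stalls.

```python
# Hypothetical model: compare total execution time with and without pipelining.
# Assumes one instruction issues per cycle and the pipeline never stalls.

def unpipelined_time(stage_latencies, n_instructions):
    """Each instruction runs every stage to completion before the next starts."""
    return n_instructions * sum(stage_latencies)


def pipelined_time(stage_latencies, n_instructions):
    """The clock period is set by the slowest stage.

    The first instruction needs one cycle per stage to pass through the pipe;
    every later instruction completes one cycle after its predecessor.
    """
    cycle = max(stage_latencies)
    stages = len(stage_latencies)
    return (stages + n_instructions - 1) * cycle


# Illustrative stage latencies in nanoseconds (assumed values).
latencies_ns = [1.0, 2.0, 1.0, 1.5, 1.0]
n = 1000

sequential_ns = unpipelined_time(latencies_ns, n)
overlapped_ns = pipelined_time(latencies_ns, n)
speedup = sequential_ns / overlapped_ns

print(f"unpipelined: {sequential_ns} ns, pipelined: {overlapped_ns} ns")
print(f"speedup: {speedup:.2f}x")
```

Under these assumed numbers the pipelined case is limited by the 2.0 ns stage, yet it still finishes well ahead of the unpipelined case because the five stages overlap across instructions.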
In this context, an ideal pipeline could be conceived with (for example) 50 stages and a 50 GHz clock rate that would allow tasks to be processed 50 billion times per second. Reality would dictate otherwise with respect to pipeline depth, however, as the code running in a processor must be programmed without margins for error or guesswork. The near constant calling of sub-routines or functions runs the risk of guessing a wrong branch, thereby invalidating the incorrectly guessed workload, which would require the pipeline to refill completely and thereby reduce performance. The possibility for such an incorrect guess increases with the number of pipeline stages.
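The cost of refilling a deep pipeline can be sketched with a simple first-order model. This is an illustration under stated assumptions only: the branch fraction, the misprediction rate, the function names, and the assumption that a wrong guess costs one cycle per remaining pipeline stage are all hypothetical and are not drawn from any specific processor.

```python
# Hypothetical first-order model of branch misprediction cost.
# Assumption: a wrong guess flushes the pipe and costs (depth - 1) refill cycles.

def effective_cpi(depth, branch_fraction, mispredict_rate):
    """Average cycles per instruction, starting from an ideal value of 1."""
    flush_penalty = depth - 1  # assumed refill cost in cycles
    return 1.0 + branch_fraction * mispredict_rate * flush_penalty


def instructions_per_second(clock_hz, depth, branch_fraction, mispredict_rate):
    """Sustained instruction rate once misprediction stalls are included."""
    return clock_hz / effective_cpi(depth, branch_fraction, mispredict_rate)


# Assumed workload: 20% of instructions are branches, 5% of those are mispredicted.
BRANCH_FRACTION = 0.20
MISPREDICT_RATE = 0.05

for depth in (5, 20, 50):
    cpi = effective_cpi(depth, BRANCH_FRACTION, MISPREDICT_RATE)
    rate = instructions_per_second(50e9, depth, BRANCH_FRACTION, MISPREDICT_RATE)
    print(f"depth={depth:2d}  CPI={cpi:.2f}  rate={rate / 1e9:.1f} G instr/s")
```

In this model the 50-stage, 50 GHz pipeline falls short of 50 billion instructions per second as soon as any branch is mispredicted, and the shortfall grows with pipeline depth.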
It is, therefore, the nature of a complex pipelined processor that code execution is affected by current pipeline state. The pipeline state is dynamic and affected by previously executed code. In translating code for a complex pipelined processor, the rules of the pipeline must be followed to produce a correct translation.
The prior art has generally relied on one of two options to address the aforementioned constraints of complex pipelined processors, neither of which has resulted in significant success. The first option is to completely emulate the processor pipeline at all times. The second option is to use what is commonly referred to as a global analysis approach for an entire program to evaluate the dynamics of the program.
While the first option is relatively simple, it generally results in reduced performance. The second option has the potential to increase performance of translated code but does so at the expense of high implementation complexity and high translation cost. The global analysis method, moreover, may not be able to handle all cases, and full pipeline emulation may be required as a fallback.
There is, therefore, a need in the art to simplify the microcode implementation of run-time program translation for emulating complex processor pipelines.