Field of the Invention
The present invention relates generally to processors, and in particular to methods and mechanisms for packing multiple iterations of a loop in a loop buffer.
Description of the Related Art
Modern day processor systems tend to be structured in multiple stages in a pipelined fashion. Typical pipelines often include separate units for fetching instructions, decoding instructions, mapping instructions, executing instructions, and then writing results to another unit, such as a register. An instruction fetch unit of a microprocessor is responsible for providing a constant stream of instructions to the next stage of the processor pipeline. Typically, fetch units utilize an instruction cache in order to keep the rest of the pipeline continuously supplied with instructions. The fetch unit and instruction cache tend to consume a significant amount of power while performing their required functions. It is a goal of modern microprocessors to reduce power consumption as much as possible, especially for microprocessors that are utilized in battery-powered mobile devices.
In many software applications, the same software steps may be repeated many times to perform a specific function or task. In these situations, the fetch unit will continue to fetch instructions and consume power even though the same loop of instructions are continuously being executed. If the loop could be detected and cached in a loop buffer, then the fetch unit could be shutdown to reduce power consumption while the loop executes. However, it may be challenging to maximize instruction throughput in the processor pipeline while the loop buffer is being used. This may result in the processor operating at less than full efficiency.