A typical application spends a significant amount of its time in loops, and many of these loops have relatively small bodies. Modern processors generally include logic to detect such loops; e.g., a Loop Streaming Detector (LSD) is hardware logic in the front end of a processor that detects these frequent small loops in a stream of micro-operations.
During normal execution, micro-operations are streamed from the fetch and decode units, which may include the instruction decoders (XLAT), the micro-sequencer ROM (MSROM), or the decoded streaming buffer (DSB), through an Instruction Decode Queue (IDQ) into the back end of the processor, where the micro-operations are executed. The LSD checks whether the decoded micro-operations in the IDQ contain a loop. If a loop is detected, the micro-operations in the loop body can be streamed directly out of the IDQ. That is, rather than repeatedly streaming each iteration of the loop body from the fetch and decode units, the iterations can be dispatched directly from the IDQ, allowing the fetch and decode units to be powered down. The IDQ is thus treated as a loop cache, which reduces power consumption in the front end. The IDQ continues to stream micro-operations into the processor back end until one of the loop branches redirects control outside of the cached loop body.
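The behavior described above can be illustrated with a small software model. This is a minimal sketch, not a description of any actual hardware: the names `Uop`, `make_trace`, `simulate`, and `idq_capacity` are hypothetical, and the detection rule used here (a taken backward branch whose target is still resident in the IDQ) is an assumption chosen for simplicity. The model counts how many micro-operations are supplied by the fetch and decode units and how many are replayed from the IDQ.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Uop:
    """Hypothetical micro-operation: an address plus an optional branch target."""
    pc: int
    target: Optional[int] = None  # set only for branch micro-operations


def make_trace(iterations):
    """Dynamic trace: 2 uops, then a 3-uop loop body run `iterations` times, then 2 uops.

    Each entry is (uop, taken); the loop branch at pc 4 jumps back to pc 2
    and is taken on every iteration except the last.
    """
    trace = [(Uop(0), False), (Uop(1), False)]
    for i in range(iterations):
        trace += [(Uop(2), False), (Uop(3), False),
                  (Uop(4, target=2), i < iterations - 1)]
    trace += [(Uop(5), False), (Uop(6), False)]
    return trace


def simulate(trace, idq_capacity=8):
    """Return (uops supplied by fetch/decode, uops replayed from the IDQ)."""
    idq = deque(maxlen=idq_capacity)  # most recently decoded uops
    loop_pcs = None                   # addresses of the cached loop body, if any
    from_front_end = from_idq = 0

    for uop, taken in trace:
        if loop_pcs is not None:
            if uop.pc in loop_pcs:
                from_idq += 1         # streamed out of the IDQ; front end idle
                continue
            loop_pcs = None           # control left the cached loop body

        # Normal mode: the uop comes from the fetch and decode units.
        from_front_end += 1
        idq.append(uop)

        # Assumed detection rule: a taken backward branch whose target
        # is still resident in the IDQ marks a loop that fits in the queue.
        if taken and uop.target is not None and uop.target <= uop.pc:
            resident = [u.pc for u in idq]
            if uop.target in resident:
                loop_pcs = set(resident[resident.index(uop.target):])

    return from_front_end, from_idq
```

With a ten-iteration loop and an eight-entry queue, `simulate(make_trace(10), idq_capacity=8)` returns `(7, 27)`: only the first iteration passes through the fetch and decode units, and the remaining nine are replayed from the IDQ. With `idq_capacity=2` the three-entry loop body no longer fits, so the result is `(34, 0)` and the front end stays active throughout.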