In some modern processors, instructions have variable lengths and form a complex instruction set capable of complex tasks that may involve multiple simpler tasks; hence the term complex instruction set computers (CISC). Micro-operations, also known as micro-ops or uops, are simpler internal instructions that can be produced by decoding the more complex instructions, also referred to as macroinstructions.
Execution pipelines are often used. Instructions are provided to the front end of the pipeline by various arrays, buffers, and caches, and micro-ops are prepared and queued for execution. Front-end arrays that contain instruction lines may also include self-modifying code (SMC) bits to detect which instruction lines may have been overwritten by self-modifying or cross-modifying code.
For high-performance processors that use these variable-length instructions, the decoding process can be costly in terms of circuitry, power consumption, and time. Some processors try to alleviate one or more of these costs by saving or caching the decoded micro-ops for reuse if execution of their corresponding macroinstructions is repeated.
One technique is called a micro-op cache or microcode cache, where micro-ops are stored in cache lines (or ways) and tags associated with instruction pointers are used to look up the micro-ops directly rather than decoding the corresponding macroinstruction each time. Some such micro-op caches are discussed, for example, in U.S. Pat. No. 6,950,903. Using a micro-op cache may be less costly and more power efficient than fetching and decoding macroinstructions.
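The tag-based lookup described above can be illustrated with a minimal sketch. It assumes a direct-mapped organization with one entry per set, and the names MicroOpCache, fetch, and decode_count are hypothetical, introduced only for illustration. The decoder is a stand-in that returns placeholder strings, not a model of any real decode logic.

```python
from dataclasses import dataclass, field


@dataclass
class MicroOpCache:
    """Toy direct-mapped micro-op cache keyed by instruction pointer.

    Hypothetical model for illustration only; real designs are typically
    set-associative and hold several micro-ops per way.
    """
    num_sets: int = 64
    lines: dict = field(default_factory=dict)  # set index -> (tag, micro-ops)
    decode_count: int = 0                      # how often the decoder ran

    def _decode(self, ip):
        # Stand-in for the costly variable-length macroinstruction decoder.
        self.decode_count += 1
        return [f"uop{i}@{ip:#x}" for i in range(2)]

    def fetch(self, ip):
        # Split the instruction pointer into a set index and a tag.
        index = ip % self.num_sets
        tag = ip // self.num_sets
        entry = self.lines.get(index)
        if entry is not None and entry[0] == tag:
            return entry[1]            # hit: reuse the cached micro-ops
        uops = self._decode(ip)        # miss: decode, then fill the line
        self.lines[index] = (tag, uops)
        return uops
```

In this sketch, fetching the same instruction pointer twice runs the decoder only once, while a second pointer that maps to the same set replaces the earlier entry and forces a later re-decode.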
It will be appreciated that, for correct functionality under considerations such as processor inclusion, any instruction line for which micro-ops have been delivered into the execution pipeline may later need to be re-delivered in an unmodified state. Therefore, deallocation or eviction of the line, in particular from an instruction cache, cannot take place until no instructions from that line are still being processed in the execution pipeline.
One technique to protect such instruction lines from being evicted is to employ a victim cache to hold evicted lines until it can be determined that no instructions from that line are being processed in the execution pipeline. One way to make such a determination is to insert a special micro-op into the pipeline when an entry is allocated into the victim cache. As long as new instruction fetches from the victim cache are not permitted, when that micro-op retires in sequential order, any instructions from the evicted line that were in front of the special micro-op will have been retired as well, and the corresponding entry can be deallocated from the victim cache.
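The marker-based deallocation described above can be sketched as a small simulation. It assumes strictly in-order retirement and, as in the description, provides no way to fetch new instructions from the victim cache. The names Pipeline, issue, evict_line, and retire_oldest are hypothetical and chosen only for illustration.

```python
from collections import deque


class Pipeline:
    """Toy in-order-retirement model of a victim cache with marker micro-ops.

    Hypothetical sketch: it deliberately offers no fetch path from the
    victim cache, matching the stated assumption.
    """

    def __init__(self):
        self.in_flight = deque()  # micro-ops in program order, oldest first
        self.victim_cache = {}    # line address -> evicted line contents

    def issue(self, line_addr, name):
        # Deliver a micro-op decoded from the given instruction line.
        self.in_flight.append(("uop", line_addr, name))

    def evict_line(self, line_addr, contents):
        # Hold the evicted line and insert the special marker micro-op
        # behind every micro-op already delivered from that line.
        self.victim_cache[line_addr] = contents
        self.in_flight.append(("marker", line_addr, None))

    def retire_oldest(self):
        kind, line_addr, name = self.in_flight.popleft()
        if kind == "marker":
            # All older micro-ops have already retired, so the held
            # line can no longer be needed for re-delivery.
            del self.victim_cache[line_addr]
        return kind, line_addr, name
```

In this sketch the evicted line remains in the victim cache while the two older micro-ops retire, and it is released only when the marker itself retires.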
Since the steps involved in decoding the variable-length macroinstructions may be avoided, micro-op caches can potentially increase processor performance. However, such considerations as processor inclusion, self-modifying or cross-modifying code, instruction restarts, and synchronization between sequences of decoded macroinstructions and cached micro-ops can be complicated and may degrade those performance increases. To date, the range of effective techniques for employing saved or cached micro-ops to improve processing of instructions and reduce costs in terms of circuit complexity and power consumption, while also handling the complicated issues of inclusion and instruction restarts in a processor, has not been fully explored.