In typical high-performance, superscalar microprocessors, one technique to improve performance is to reduce the number of micro-operations (“uops”) needed to perform various microprocessor instructions by combining two or more uops into a “fused” uop that can be executed as a single uop. The term “uop” is used throughout this disclosure to describe any sub-instruction or operation into which an instruction may be decoded in order for a processor to perform the operations prescribed by the instruction.
Prior art uop fusion techniques have typically been used to combine uops generated from a single instruction. Furthermore, some prior art uop fusion techniques may un-fuse the fused uops within the processor pipeline, or otherwise before the uops are retired and committed to processor state. Un-fusing fused uops before retirement of the corresponding instruction may forfeit some of the performance benefits of uop fusion.
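By way of illustration only, fusion restricted to the uops of a single instruction may be sketched as follows. This is a minimal software model, not a description of any particular processor: the names `Uop`, `FusedUop`, `FUSIBLE_PAIRS` and `fuse_within_instruction`, the uop names, and the contents of the fusible-pair table are all assumptions introduced for this example.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass(frozen=True)
class Uop:
    op: str        # hypothetical uop name, e.g. "load" or "add"
    insn_id: int   # index of the instruction this uop was decoded from


@dataclass(frozen=True)
class FusedUop:
    parts: Tuple[Uop, ...]  # constituent uops tracked as one pipeline entry


# Hypothetical table of adjacent uop pairs that are allowed to fuse.
FUSIBLE_PAIRS = {("load", "add"), ("store_addr", "store_data")}


def fuse_within_instruction(uops: List[Uop]) -> List[Union[Uop, FusedUop]]:
    """Fuse adjacent uops only when both come from the same instruction."""
    out: List[Union[Uop, FusedUop]] = []
    i = 0
    while i < len(uops):
        nxt = uops[i + 1] if i + 1 < len(uops) else None
        if (nxt is not None
                and uops[i].insn_id == nxt.insn_id
                and (uops[i].op, nxt.op) in FUSIBLE_PAIRS):
            out.append(FusedUop((uops[i], nxt)))  # two uops, one entry
            i += 2
        else:
            out.append(uops[i])                   # left unfused
            i += 1
    return out


# Instruction 0 decodes into a load and an add; instructions 1 and 2
# each decode into a single uop.
decoded = [Uop("load", 0), Uop("add", 0), Uop("cmp", 1), Uop("jcc", 2)]
fused = fuse_within_instruction(decoded)
```

In this sketch the load and add uops of instruction 0 occupy one entry, while the uops of instructions 1 and 2 remain separate because the model never considers pairs that span an instruction boundary.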
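In either case, whether fusion is limited to the uops of a single instruction or fused uops are un-fused before retirement, prior art uop fusion techniques may be inefficient in some circumstances, in terms of processor and/or computer system performance.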