Traditionally, instructions are dispatched through a pipeline that includes instruction cache fetch and instruction decode stages. Variable-width instructions, such as those used in x86 processors, require considerably more hardware complexity than fixed-length instructions to support high-bandwidth decoding of multiple instructions per cycle. This complexity in turn requires additional pipeline stages for parsing and decoding the instruction stream. These additional stages consume extra power and increase latency whenever the pipeline has to be restarted, such as on a taken or mispredicted branch that redirects instruction fetching. The added restart latency limits overall instruction bandwidth per cycle, impacting performance. It also leaves more pipeline stages idle, consuming power while doing no useful work, until instructions from the redirect propagate down the pipeline. A need exists to bypass these extra decode stages and to streamline the servicing of instructions in an operation (op) cache (OC).