Many products, such as cell phones, laptop computers, personal digital assistants (PDAs), desktop computers, and the like, incorporate one or more processors that execute programs supporting communication and multimedia applications. These processors must operate with high performance and efficiency to support the many computationally intensive functions of such products.
The processors operate by fetching instructions from a unified instruction fetch queue, which is generally coupled to an instruction cache. A sufficiently large in-order unified instruction fetch queue is often needed so that instructions can be evaluated for efficient dispatching. For example, in a system having two or more processors that share a unified instruction fetch queue, one of the processors may be a coprocessor. In such a system, it is often necessary to have a coprocessor instruction queue downstream from the unified instruction fetch queue. This downstream queue should be large enough to minimize backpressure on processor instructions in the instruction fetch queue, thereby reducing the effect of coprocessor instructions on the performance of the processor. Also, coprocessor instructions may require more processing stages to execute than main processor instructions. If some instructions require synchronization between the two processors, this disparity in execution times can create performance bottlenecks. In addition, large instruction queues sized to provide the support needed for coprocessor instructions may be cost prohibitive in terms of power use, implementation area, and impact on timing and performance.
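The backpressure effect described above can be illustrated with a small simulation. The following is a minimal sketch under simplifying assumptions, not a model of any particular processor: the function name `simulate`, the one-dispatch-per-cycle rule, the fixed coprocessor latency, and the "P"/"C" instruction encoding are all hypothetical and chosen only for illustration.

```python
from collections import deque


def simulate(program, cop_queue_depth, cop_latency):
    """Toy model of an in-order fetch queue feeding a coprocessor queue.

    program: string of "P" (processor) and "C" (coprocessor) instructions,
             in program order. Hypothetical encoding for this sketch.
    cop_queue_depth: capacity of the downstream coprocessor queue.
    cop_latency: cycles the coprocessor spends on each instruction.
    Returns (total_cycles_to_drain_fetch_queue, stall_cycles).
    """
    fetch_queue = deque(program)
    cop_queue = deque()
    busy = 0    # cycles left on the coprocessor instruction in execution
    stalls = 0  # cycles in which the head of the fetch queue could not move
    cycles = 0

    while fetch_queue:
        cycles += 1

        # Coprocessor side: finish current work, then start the next entry.
        if busy > 0:
            busy -= 1
        if busy == 0 and cop_queue:
            cop_queue.popleft()
            busy = cop_latency

        # Dispatch side: only the head of the in-order queue may leave.
        head = fetch_queue[0]
        if head == "P":
            fetch_queue.popleft()  # processor accepts one per cycle
        elif len(cop_queue) < cop_queue_depth:
            cop_queue.append(fetch_queue.popleft())
        else:
            # Coprocessor queue is full: everything behind the head,
            # including processor instructions, waits.
            stalls += 1

    return cycles, stalls


shallow = simulate("CCCCPPPP", cop_queue_depth=1, cop_latency=4)
deep = simulate("CCCCPPPP", cop_queue_depth=4, cop_latency=4)
print("depth 1:", shallow)  # (14, 6)
print("depth 4:", deep)     # (8, 0)
```

In this toy model, a one-entry coprocessor queue causes six stall cycles during which the trailing processor instructions cannot dispatch, while a four-entry queue absorbs the same burst with no stalls. The sketch shows only the performance side of the trade-off; it does not account for the power, area, and timing costs of the larger queue noted above.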