To achieve higher levels of system performance, microprocessor designs often employ dynamic scheduling as a technique to extract instruction-level parallelism (ILP) from serial instruction streams. Conventional dynamic scheduler designs house a window of candidate instructions from which ready instructions are sent to functional units in an out-of-order, dataflow fashion. The instruction window is implemented using large monolithic content addressable memories (CAMs) that track instructions and their input dependencies. While more ILP can be extracted with larger instruction windows (and accordingly larger CAM structures), this increase in parallelism comes at the expense of slower scheduler clock speed.
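For illustration only, the conventional window-based scheduling described above can be sketched in software as a wakeup-and-select loop. This is a simplified model under stated assumptions, not the hardware design of any particular processor: the names `Entry` and `schedule`, the single-cycle result latency, and the oldest-first selection policy are all hypothetical. The inner broadcast loop stands in for the CAM match, in which every result tag is compared against every entry still in the window.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # compare entries by identity, not by field values
class Entry:
    """One instruction-window entry (hypothetical model)."""
    name: str
    dest: str                                   # tag of the result it produces
    waiting: set = field(default_factory=set)   # source tags not yet produced


def schedule(window, issue_width=2):
    """Issue ready entries out of order; return the names issued in each cycle.

    Assumes every instruction produces its result one cycle after issue.
    Note: this mutates the `waiting` sets of the entries passed in.
    """
    window = list(window)
    trace = []
    while window:
        # Select: up to issue_width entries with no outstanding sources,
        # oldest (earliest in program order) first.
        ready = [e for e in window if not e.waiting][:issue_width]
        if not ready:
            raise ValueError("no entry is ready; a source tag is never produced")
        trace.append([e.name for e in ready])
        for e in ready:
            window.remove(e)
        # Wakeup: broadcast each result tag to every remaining entry,
        # which is the comparison a CAM performs in parallel.
        for e in ready:
            for w in window:
                w.waiting.discard(e.dest)
    return trace


program = [
    Entry("i1", "r1"),
    Entry("i2", "r2", {"r1"}),
    Entry("i3", "r3"),
    Entry("i4", "r4", {"r2", "r3"}),
]
print(schedule(program))  # [['i1', 'i3'], ['i2'], ['i4']]
```

In this sketch, `i3` issues ahead of the older `i2` because its inputs are already available, which is the out-of-order behavior described above. The number of tag comparisons per cycle grows with the number of entries in the window, which mirrors why a larger window implies a larger CAM.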
In addition to performance, power dissipation has become a growing concern in the design of high performance microprocessors. Increasing clock speeds and diminishing voltage margins have combined to produce designs that are increasingly difficult to cool. Additionally, embedded processors are particularly sensitive to energy usage, as these designs are often powered by batteries. The present invention was developed in light of these and other obstacles.