1. Technical Field
The present invention relates generally to information processing systems and, more specifically, to the maintaining and forwarding of ready state information to instruction scheduler logic.
2. Background Art
The performance of many microprocessors is a function of, among other things, core clock frequency and the amount of instruction level parallelism (ILP) that can be derived from application software executed by the processor. ILP is the number of instructions that may be executed in parallel within a processor microarchitecture. In order to achieve a high degree of ILP, microprocessors may use large scheduling windows, high scheduling bandwidth, and numerous execution units. Larger scheduling windows allow a processor to more easily reach around blocked instructions to find ILP in the code sequence. High instruction scheduling bandwidth can sustain the instruction issue rates required to support a large scheduling window, and more execution units can enable the execution of more instructions in parallel.
Although large scheduling windows may be effective at extracting ILP, the implementation of these larger windows at high frequency is challenging. A scheduling window includes a collection of unscheduled instructions that may be considered for scheduling in a given cycle, and also includes associated tracking logic. The tracking logic maintains ready information (based on dependencies) for each instruction in the window. An instruction in the scheduling window may be passed over in a given cycle if not all of its dependencies have yet been resolved.
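For illustration only, the behavior of a scheduling window and its tracking logic can be sketched in software. The following is a minimal model, not a description of any particular hardware implementation; the names `Entry`, `select`, and `wakeup`, the register-style tags, and the oldest-first selection policy are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """One unscheduled instruction in a hypothetical scheduling window."""
    name: str
    dest: str                                   # tag this instruction produces
    pending: set = field(default_factory=set)   # source tags not yet produced

    @property
    def ready(self):
        # Ready state: every dependency has been resolved.
        return not self.pending


def select(window, width):
    """Issue up to `width` ready entries, oldest first.

    Entries that are not ready are passed over, which lets the scheduler
    reach around blocked instructions.
    """
    picked = [e for e in window if e.ready][:width]
    for e in picked:
        window.remove(e)
    return picked


def wakeup(window, tag):
    """Record that `tag` has been produced by clearing it from every entry."""
    for e in window:
        e.pending.discard(tag)


window = [
    Entry("i0", "r1", {"r9"}),   # blocked on r9 (assumed outstanding result)
    Entry("i1", "r2", {"r1"}),   # depends on i0
    Entry("i2", "r3"),           # independent
    Entry("i3", "r4"),           # independent
]

issued_names = [e.name for e in select(window, width=2)]
remaining_names = [e.name for e in window]
```

In this sketch the two independent instructions issue in the first cycle while the blocked instruction and its dependent remain in the window until their source tags are produced.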
Large scheduling windows can imply relatively slow select and wakeup logic within an instruction scheduler (also referred to herein as “instruction scheduler logic”). For instance, a traditional large scheduling window includes logic to track incoming tag information and to record ready state information for unscheduled instructions. An example of a prior art scheduling model that uses a large scheduling window 110 is set forth in FIG. 1. Tracking incoming tag information and recording ready state information for all entries in a large scheduling window 110 can be inefficient in that, typically, only a small subset of all scheduling window entries require such processing at a given time. Another issue with typical large window 110 schemes is power consumption. For instance, many traditional large scheduling windows 110 are implemented with abundant content addressable memory (CAM) logic to track tag and ready state information for unscheduled instructions. CAM logic typically requires significantly more power than other types of memory, such as standard RAM (random access memory). The embodiments of an efficient lossy instruction scheduling scheme described herein address these and other concerns related to traditional implementations of large scheduling windows.
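The inefficiency noted above can be illustrated with a rough software analogy of tag tracking across a whole window. The sketch below is hypothetical: the function `broadcast`, the 64-entry window size, and the one-pending-tag-per-entry setup are assumptions chosen for the example, and the comparison count only approximates the work a CAM-based structure would perform in parallel.

```python
def broadcast(window, tag):
    """Match an incoming tag against every pending source tag in the window.

    Returns (comparisons performed, entries that became ready). Every entry
    is examined even though few entries are waiting on this particular tag.
    """
    comparisons = 0
    woken = 0
    for entry in window:
        before = len(entry["pending"])
        comparisons += before              # one compare per pending source tag
        entry["pending"].discard(tag)
        if before and not entry["pending"]:
            woken += 1                     # entry just transitioned to ready
    return comparisons, woken


# Assumed 64-entry window in which each entry waits on one distinct tag.
window = [{"pending": {f"t{i}"}} for i in range(64)]
comparisons, woken = broadcast(window, "t5")
```

Under these assumptions a single incoming tag triggers 64 comparisons but wakes only one entry, which mirrors the observation that only a small subset of window entries needs processing at a given time.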