Contemporary computer systems commonly include a microprocessor. A processor bus couples the microprocessor to the other components of the system, and the microprocessor communicates with those components over the processor bus, for example by transferring data.
Typically, the processor bus operates at one clock frequency, while the circuitry inside the microprocessor operates at a much higher clock frequency. The internal microprocessor clock frequency is commonly referred to as the core clock frequency. For example, the processor bus clock frequency may be 100 MHz, whereas the core clock frequency may be 1 GHz.
It is common for the core clock frequency to be a multiple of the bus clock frequency. In the example above, the multiple, or clock multiplier, is 10. It is also common for the clock multiplier to be fractional, such as 15/2. Regardless of the particular values, the core clock frequency is typically about an order of magnitude greater than the bus clock frequency. The clock multiplier may be programmed into the microprocessor during manufacture, or may be programmable.
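The relationship between the two frequencies can be sketched as simple arithmetic. The following is a minimal illustration only; the function name `core_clock_hz` and the constant `BUS_CLOCK_HZ` are hypothetical and chosen for this example, and the values are the ones used in the text above.

```python
from fractions import Fraction

# Hypothetical helper: core clock = bus clock x clock multiplier.
def core_clock_hz(bus_clock_hz: int, multiplier: Fraction) -> Fraction:
    return bus_clock_hz * multiplier

BUS_CLOCK_HZ = 100_000_000  # 100 MHz, as in the example above

# Integer multiplier of 10 gives 1 GHz.
integer_core = core_clock_hz(BUS_CLOCK_HZ, Fraction(10))

# Fractional multiplier of 15/2 gives 750 MHz.
fractional_core = core_clock_hz(BUS_CLOCK_HZ, Fraction(15, 2))
```

Using an exact fraction rather than a floating-point value keeps a multiplier such as 15/2 free of rounding error.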
Microprocessors typically include a cache memory. A cache memory is a relatively small memory inside the processor that stores a subset of the data in the system memory in order to reduce data access time, since accesses to the cache memory are much faster than accesses to the system memory. Caches store data in cache lines. A typical cache line size is 32 bytes, and cache lines are aligned on memory address boundaries that are multiples of the cache line size. When an instruction attempts to read or write data, the microprocessor first checks the cache to see whether the cache line implicated by the data address is present. If so, the instruction reads the data from or writes the data to the cache. Otherwise, the cache generates a bus request to read the data from or write the data to system memory on the processor bus.
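The lookup just described can be sketched as follows. This is a simplified model under stated assumptions, not a description of any particular cache design: the class `SimpleCache`, its `fill` method, and the helper `line_address` are hypothetical names, the cache is modeled as an unbounded set of line addresses, and a miss is modeled only as a recorded bus request.

```python
CACHE_LINE_SIZE = 32  # bytes, the typical size mentioned above

def line_address(addr: int) -> int:
    """Round an address down to its cache line boundary."""
    return addr & ~(CACHE_LINE_SIZE - 1)

class SimpleCache:
    """Hypothetical model: tracks which lines are present and which missed."""

    def __init__(self) -> None:
        self.lines: set[int] = set()        # line addresses present in the cache
        self.bus_requests: list[int] = []   # line addresses requested on a miss

    def access(self, addr: int) -> bool:
        """Return True on a hit; on a miss, record a bus request and return False."""
        line = line_address(addr)
        if line in self.lines:
            return True
        self.bus_requests.append(line)
        return False

    def fill(self, line: int) -> None:
        """Assumed step: the requested line arrives from system memory."""
        self.lines.add(line_address(line))
```

For example, an access to address 0x1234 implicates the line at 0x1220; once that line has been filled, an access to 0x123F hits, whereas an access to 0x1240 implicates the next line and misses.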
Although the microprocessor may internally generate one or more bus requests each core clock cycle, it can issue only one bus request on the external processor bus each bus clock cycle. Hence, during a single bus clock cycle the microprocessor may internally generate many requests, depending upon the instruction sequence and the clock multiplier value, yet only one of them can be issued on the processor bus in that cycle. The remaining bus requests must wait until the next bus clock cycle, at which time the microprocessor can issue another request.
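The resulting backlog can be illustrated with a small simulation. This is a sketch under assumptions: the function `simulate` and its input format (one list of newly generated request names per bus clock cycle) are hypothetical, and requests are issued here in arrival order purely to show the one-request-per-bus-cycle limit.

```python
from collections import deque

def simulate(generated_per_bus_cycle):
    """Issue at most one pending request per bus clock cycle.

    generated_per_bus_cycle: list of lists; entry i holds the requests
    generated internally during bus clock cycle i (hypothetical format).
    Returns (issued, still_pending), where issued is a list of
    (bus_cycle, request) pairs.
    """
    pending = deque()
    issued = []
    for cycle, new_requests in enumerate(generated_per_bus_cycle):
        pending.extend(new_requests)  # many requests may arrive in one bus cycle
        if pending:
            issued.append((cycle, pending.popleft()))  # only one may be issued
    return issued, list(pending)

# Three requests arrive in bus cycle 0, none in cycle 1, one in cycle 2.
issued, still_pending = simulate([["A", "B", "C"], [], ["D"]])
```

In this run, requests A, B, and C are issued in bus cycles 0, 1, and 2 respectively, and D is still waiting when the simulation ends.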
The conventional approach is to issue internally generated requests on the processor bus in program order, that is, in the order the program executing on the microprocessor generates the requests. However, the conventional approach fails to recognize that the order in which the program generates bus requests may be different from the order of urgency of the pending requests. That is, the data missing in the cache associated with one bus request may be more urgently needed than the data missing in the cache associated with a different bus request.
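The difference between the two orderings can be made concrete with a short sketch. It is illustrative only: the text above does not say how urgency would be measured, so the `urgency` field, the `BusRequest` class, and both ordering functions are hypothetical, with a larger `urgency` value assumed to mean a more urgently needed request.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class BusRequest:
    seq: int      # position in program order (lower was generated earlier)
    name: str
    urgency: int  # hypothetical measure: larger means more urgently needed

def program_order(pending):
    """Conventional approach: issue in the order the program generated them."""
    return sorted(pending, key=lambda r: r.seq)

def urgency_order(pending):
    """Alternative: most urgent first, with program order breaking ties."""
    return sorted(pending, key=lambda r: (-r.urgency, r.seq))

pending = [
    BusRequest(seq=0, name="X", urgency=1),
    BusRequest(seq=1, name="Y", urgency=5),
    BusRequest(seq=2, name="Z", urgency=1),
]

conventional = [r.name for r in program_order(pending)]
by_urgency = [r.name for r in urgency_order(pending)]
```

With these assumed values, program order issues X before Y even though Y is the more urgent request, whereas ordering by urgency issues Y first and then falls back to program order for X and Z.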
Therefore, what is needed is a microprocessor and method for exploiting the disparity between core clock and bus clock frequencies to issue more urgent bus requests before less urgent bus requests.