A basic problem in digital system design is how to increase throughput and reduce the delays involved in accessing data and instructions in memory. System performance depends on how quickly memory can be accessed, and it is reduced by any delay the processor incurs in accessing data or instructions.
One typical technique for reducing memory cycle time is to use a cache memory located adjacent to the processing unit. The adjacent cache memory generally has a fast data access cycle and holds the more frequently used data so that it is readily available to the processing unit.
In microprocessor design, a balance must be struck between the amount of on-chip and off-chip cache. The choice is between sacrificing real estate on the microprocessor for cache (which decreases microprocessor functionality) and accepting the performance hit due to the time taken to access off-chip memory when a "miss" occurs (i.e., when the information being looked for is not in the on-chip cache). Typically, going off-chip to memory requires as many as 100 clock cycles.
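As a rough illustration of this tradeoff, the average cost of a memory access can be estimated as the on-chip hit time plus the miss rate multiplied by the off-chip penalty. The sketch below is not part of the described apparatus. It assumes a hypothetical 1-cycle on-chip hit and illustrative miss rates, and the function name is chosen only for this example. The 100-cycle off-chip figure is the only value taken from the text above.

```python
def average_access_cycles(hit_cycles, miss_rate, miss_penalty_cycles=100):
    """Estimate average cycles per memory access.

    hit_cycles: cycles for an on-chip cache hit (assumed value, not from the text).
    miss_rate: fraction of accesses that miss the on-chip cache (assumed value).
    miss_penalty_cycles: extra cycles to go off-chip; 100 is the upper figure cited above.
    """
    if not 0.0 <= miss_rate <= 1.0:
        raise ValueError("miss_rate must be between 0 and 1")
    # Every access pays the hit time; only misses pay the off-chip penalty.
    return hit_cycles + miss_rate * miss_penalty_cycles


# Illustrative miss rates only; real values depend on cache size and workload.
for miss_rate in (0.01, 0.05, 0.10):
    cycles = average_access_cycles(1, miss_rate)
    print(f"miss rate {miss_rate:.0%}: {cycles:.1f} cycles per access")
```

Under these assumptions, even a small miss rate dominates the average access cost, which is why reducing the time of each off-chip access matters as much as reducing how often it happens.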
What is needed is a method and apparatus that decreases the amount of time required for off-chip cache access.