1. Field of the Present Invention
The present invention generally relates to cache memory systems and more particularly to a method and circuit for reducing latencies associated with copyback transactions in cache memory subsystems that employ multiple-byte cache lines.
2. History of Related Art
Microprocessor-based computer systems are typically implemented with a hierarchy of memory subsystems designed to provide an appropriate balance between the relatively high cost of fast memory subsystems and the relatively low speed of economical subsystems. Typically, the fastest memory subsystem associated with a computer system is also the smallest and most expensive. Because the hit rate of any given cache memory subsystem is a function of the size of the subsystem, the smallest and fastest memory subsystems typically have the highest miss rate. To improve performance, many computer systems implement a copyback policy in which data written by the system's microprocessor is initially stored only in the cache. The cache data is then typically written back to system memory at a later time by a memory control unit. In this manner, the number of time-consuming accesses to system memory that must be made by the processor is greatly reduced. The performance enhancement achieved by a copyback cache policy comes at the cost of the increased bus bandwidth required to maintain coherency between the cache and system memory. In addition, microprocessors are increasingly utilized in multi-tasking systems to carry out processor-intensive applications that result in unprecedented cache traffic and the generation of relatively frequent cache miss transactions. Thus, performance problems arising from multiple pending cache miss events are becoming increasingly common.
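The reduction in system memory accesses described above can be illustrated with a minimal sketch. It assumes a direct-mapped cache and counts write transactions only (not bytes transferred); the function name `memory_writes` and its parameters are hypothetical and are not taken from any particular implementation.

```python
def memory_writes(trace, line_size, num_lines, copyback):
    """Count write transactions to system memory for a trace of store addresses.

    Illustrative model only: direct-mapped cache, stores only, and each
    line copyback or each write-through store counts as one transaction.
    """
    lines = {}   # cache index -> (tag, dirty flag)
    writes = 0
    for addr in trace:
        line_addr = addr // line_size
        index = line_addr % num_lines
        tag = line_addr // num_lines
        entry = lines.get(index)
        if entry is None or entry[0] != tag:
            # Miss: under copyback, a dirty victim line must be written back first.
            if copyback and entry is not None and entry[1]:
                writes += 1
        if copyback:
            # The store is held in the cache; the line is marked dirty.
            lines[index] = (tag, True)
        else:
            # Write-through: every store is also sent to system memory.
            lines[index] = (tag, False)
            writes += 1
    return writes


# Sixteen stores to one 16-byte line, then one store that evicts that line.
trace = [0, 4, 8, 12] * 4 + [64]
write_through_count = memory_writes(trace, line_size=16, num_lines=4, copyback=False)
copyback_count = memory_writes(trace, line_size=16, num_lines=4, copyback=True)
```

Under these assumptions the write-through count grows with every store, whereas the copyback count grows only when a dirty line is evicted. Each copyback, however, transfers an entire line rather than a single word.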
A cache miss occurs when a bus master such as the microprocessor is required to read information from or write information to a location in system memory that is not presently reproduced in the cache memory subsystem. Cache miss transactions in copyback cache architectures can incur greater latency due to the system overhead required to transfer the contents of the cache line associated with the cache miss event to system memory prior to completing the pending transaction. This overhead can increase as the line size of the cache memory subsystem increases because more clock cycles are required to fully transfer the contents of a dirty or modified cache line to an appropriate storage location before filling the cache line with the data associated with the cache miss. Unfortunately, long cache lines are frequently employed to reduce the circuitry required to implement a cache tag RAM, to take advantage of memory reference locality, and to take advantage of special multiple-byte transfer cycles, such as burst write and burst read cycles, designed into many modern memory devices. Accordingly, it would be advantageous to provide a method and circuit to improve the efficiency with which multiple pending cache miss transactions are handled in a copyback cache architecture.
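The dependence of miss overhead on line size can be sketched as simple arithmetic. The example below assumes that one bus-width transfer ("beat") takes one unit of time, that the dirty victim line is fully copied back before the fill begins, and that address, arbitration, and memory access cycles are ignored. The function name `miss_penalty_beats` and its parameters are hypothetical.

```python
def miss_penalty_beats(line_size_bytes, bus_width_bytes, victim_dirty):
    """Return the number of data beats needed to service one cache miss.

    Illustrative model only: a dirty victim line is copied back in full
    before the new line is filled, and each beat moves bus_width_bytes.
    """
    # Beats needed to move one full cache line (ceiling division).
    beats_per_line = -(-line_size_bytes // bus_width_bytes)
    copyback_beats = beats_per_line if victim_dirty else 0
    fill_beats = beats_per_line
    return copyback_beats + fill_beats


clean_miss = miss_penalty_beats(32, 8, victim_dirty=False)
dirty_miss = miss_penalty_beats(32, 8, victim_dirty=True)
dirty_miss_long_line = miss_penalty_beats(64, 8, victim_dirty=True)
```

In this simplified model a miss to a dirty line costs twice the data beats of a miss to a clean line, and doubling the line size doubles both figures.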