The present invention relates generally to a memory controller, and more particularly to a method and apparatus for improving the performance of a memory system.
Cache memories are typically used in computer systems to decrease the memory access time of the processors thereof. A cache memory is a relatively small, high speed memory in which previously accessed information (program instructions and/or data) is stored within cache lines of the cache memory. A cache memory is typically faster than main memory by a factor of 5 to 10 and typically approaches the speed of its corresponding processor. By keeping the most frequently accessed data items in the high speed cache memories, the average memory access time of the processors will approach the access time of the cache memories.
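The relationship between hit rate and average memory access time described above can be sketched as a simple weighted sum. The function below is illustrative only and is not part of the claimed apparatus; the latencies chosen merely reflect the stated 5-to-10-fold speed difference between cache and main memory.

```python
def average_access_time(hit_rate, t_cache_ns, t_main_ns):
    """Average memory access time as a weighted mix of cache and
    main-memory latencies (a miss is modeled as a full main-memory
    access). All parameter names are illustrative."""
    return hit_rate * t_cache_ns + (1.0 - hit_rate) * t_main_ns

# With a 95% hit rate and a cache 10x faster than main memory,
# the average access time (~14.5 ns) approaches the cache's 10 ns.
amat = average_access_time(0.95, 10.0, 100.0)
```

As the hit rate approaches 1, the average access time approaches the cache access time, which is the effect the passage above describes.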
In computer systems having multiple processors and cache memories, the gain in performance from improved memory access time is offset somewhat by the added complexity of maintaining cache coherency. Cache coherency is the process of ensuring that all of the processors in a computer system, and their associated cache memories, are using the latest data. This coherency problem specifically arises when data stored in a cache memory is modified. There are two approaches for providing processors of the computer system with modified data stored in a cache memory. The first approach utilizes a write-through cache system. In a write-through cache system, modified data is written to and through a first cache memory to the main memory, and any corresponding copy of the modified data that may reside in other cache memories of the computer system is invalidated. Therefore, when processors subsequently attempt to read the modified data, the modified data is supplied from main memory to each processor whose associated cache memory contains no valid copy of the data.
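The write-through behavior described above can be sketched in a few lines. This is a minimal illustration using plain dictionaries as the memories, not a model of any particular apparatus; the names `main_memory`, `caches`, and the two-processor configuration are assumptions made for the sketch.

```python
main_memory = {}
caches = [dict(), dict()]   # one private cache per processor (illustrative)

def write_through(cpu_id, address, value):
    """Write to the issuing processor's cache AND through to main
    memory, then invalidate any other cache's copy of the address."""
    caches[cpu_id][address] = value
    main_memory[address] = value          # write "through" to main memory
    for i, cache in enumerate(caches):
        if i != cpu_id:
            cache.pop(address, None)      # invalidate stale copies

def read(cpu_id, address):
    """Serve a read from the local cache; on a miss, fill from main
    memory, which always holds the latest data under write-through."""
    if address not in caches[cpu_id]:
        caches[cpu_id][address] = main_memory[address]
    return caches[cpu_id][address]
```

Because every write reaches main memory immediately, a reader whose copy was invalidated always obtains the latest data from main memory, as the passage describes.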
The second approach utilizes a writeback cache system. In a writeback cache system, a processor and associated cache memory must first obtain ownership of a memory location before the processor may modify the memory location in its cache memory. In response to the processor and associated cache memory obtaining ownership of the memory location, the other cache memories invalidate any copy of the memory location which they may contain. After obtaining ownership of the memory location, the processor may write modified data to its associated cache memory without immediately writing the modified data to main memory. Therefore, data in a cache memory may differ from the corresponding data in main memory. In order to maintain coherency, the modified data is later written back to the main memory in response to various events. For example, a writeback cache may write the data back to main memory when: a) a first processor requests the use of modified data stored in the associated cache memory of a second processor, b) a cache line having modified data needs to be replaced with a different line from main memory, or c) a periodic flush of cache memory is performed in order to prevent accidental data loss.
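The ownership step and the deferred writeback described above can be sketched as follows. This is a hedged illustration, not the claimed apparatus; the `owner` table, the `dirty` flag, and the two-processor layout are assumptions made for the sketch.

```python
main_memory = {0x100: 0}
caches = {0: {}, 1: {}}    # per-processor caches: addr -> (value, dirty)
owner = {}                 # addr -> cpu_id currently holding ownership

def acquire_ownership(cpu_id, address):
    """Before a processor may modify a location, every other cached
    copy is invalidated; a dirty copy is written back first."""
    for other, cache in caches.items():
        if other != cpu_id and address in cache:
            value, dirty = cache.pop(address)
            if dirty:
                main_memory[address] = value   # forced writeback
    owner[address] = cpu_id

def write(cpu_id, address, value):
    """Modify only the local cache; main memory is updated later,
    so cached data may temporarily differ from main memory."""
    if owner.get(address) != cpu_id:
        acquire_ownership(cpu_id, address)
    caches[cpu_id][address] = (value, True)    # dirty, writeback deferred
```

Note that after the first write, main memory still holds the stale value; only when another processor claims ownership (event (a) above) is the modified data written back.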
When multiple processors in a writeback computer system are performing operations that require frequent data modifications, the processors of the computer system may spend a large part of their processing time waiting for the completion of writebacks. An example of when frequent writebacks are likely to occur is when two processors are updating counters located in different memory locations of the same line of memory. In response to a first processor attempting to update a first counter of a memory line, a second processor may be forced to write the memory line back to the main memory, and in response to the second processor attempting to update a second counter of the memory line, the first processor may be forced to write the memory line back to main memory. As a result, each successive update of a counter in the memory line may cause a cache memory to write the memory line back to main memory (i.e., to perform a writeback). These writebacks may occur in an alternating fashion, which would drastically decrease the performance of the system because a write to main memory takes longer than a write to cache memory. It would, therefore, be more efficient if the intermediate values of a memory location were not written back to main memory on every change. Instead, if a writeback to main memory is pending but not yet complete, successive writebacks could be collapsed, with the final value being the only value actually written back to main memory.
What is needed therefore is a method and apparatus for increasing the performance of a computer system by reducing the number of writebacks to a main memory.
In accordance with one embodiment of the present invention, there is provided a method of collapsing writebacks to a memory. For a memory having multiple memory lines and an associated memory controller, the method includes the steps of (a) storing a first address and a first modified copy of a first memory line in the memory controller; (b) issuing from a first device a read request that includes a second address; (c) storing the second address in the memory controller; and (d) transferring a second modified copy of a second memory line from a cache for a second device to the first device in response to the read request. The transferring step includes the step of replacing the first modified copy of the first memory line stored in the memory controller with the second modified copy of the second memory line if the first address and the second address both map to the same memory line in the memory.
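The steps (a) through (d) above can be sketched in sequence. This is an illustrative walk-through under stated assumptions, not the claimed method: the `MemoryController` class, the 64-byte `LINE_SIZE`, and the `same_line` helper are inventions of the sketch.

```python
LINE_SIZE = 64  # assumed line size for illustration

def same_line(addr_a, addr_b):
    """Two addresses map to the same memory line if they share a line index."""
    return addr_a // LINE_SIZE == addr_b // LINE_SIZE

class MemoryController:
    def __init__(self):
        self.wb_addr = None    # first address: buffered (pending) writeback
        self.wb_data = None    # first modified copy awaiting the memory write

    def store_writeback(self, addr, data):
        """Step (a): store an address and a modified copy of a line."""
        self.wb_addr, self.wb_data = addr, data

    def snoop_transfer(self, read_addr, snooped_data):
        """Steps (b)-(d): a read request carrying `read_addr` is satisfied
        by a cache-to-cache transfer; if the read maps to the buffered
        line, the buffered copy is replaced so the stale intermediate
        value is never written to memory."""
        if self.wb_addr is not None and same_line(self.wb_addr, read_addr):
            self.wb_data = snooped_data
        return snooped_data    # data forwarded to the requesting device
```

The replacement in `snoop_transfer` is the collapse: the controller's pending writeback is overwritten with the newer modified copy rather than both being written to memory.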
Pursuant to another embodiment of the present invention, there is provided a computer system for implementing a writeback collapse. The computer system includes a first processor, a first writeback cache coupled to the first processor, a second processor, a memory controller, a memory coupled to the memory controller, and a first bus that couples the first writeback cache, the second processor, and the memory controller together. The memory includes several memory lines. The first writeback cache is configured to receive via the first bus a read request that was issued from the second processor. The read request includes a first address that maps the read request to a first memory line of the memory. The first writeback cache is also configured to (a) generate a response if the first writeback cache contains a first modified copy of the first memory line, and (b) transfer the first modified copy of the first memory line to the second processor if said response is generated. Furthermore, the memory controller includes a writeback storage for storing a second modified copy of a second memory line that is to be written back to the memory. The memory controller is configured to (a) receive the read request via the first bus, (b) generate a hit signal if (1) the writeback storage contains the second modified copy of the second memory line and (2) the first address maps to the second memory line, and (c) replace the second modified copy of the second memory line with the first modified copy of the first memory line if the response and the hit signal are generated.
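The gating logic of this embodiment reduces to two predicates: the controller asserts a hit when a writeback is buffered for the line the read targets, and the replacement occurs only when both the cache's response and the controller's hit signal are asserted. The function names and the line-index arithmetic below are illustrative assumptions, not the claimed circuitry.

```python
LINE_SIZE = 64  # assumed line size for illustration

def hit(wb_valid, wb_addr, read_addr):
    """Hit signal: a writeback is buffered AND the read request's
    address maps to the buffered line."""
    return wb_valid and (wb_addr // LINE_SIZE) == (read_addr // LINE_SIZE)

def should_replace(cache_response, wb_valid, wb_addr, read_addr):
    """The buffered modified copy is replaced only when the snooping
    cache's response and the controller's hit signal are both asserted."""
    return cache_response and hit(wb_valid, wb_addr, read_addr)
```

If either condition is absent, for instance when no cache holds a modified copy, or when the read targets a different line, the buffered writeback proceeds to memory unchanged.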
It is an object of the present invention to provide a new and useful method and apparatus for processing memory requests.
It is a further object of the present invention to provide a method and apparatus which reduces traffic across a memory bus.
It is yet a further object of the present invention to provide a method and apparatus which increases the performance of a computer system by reducing the number of writes to main memory.
It is yet a further object of the present invention to provide a method and apparatus which reduces the number of writes to main memory while maintaining coherency amongst multiple cache memories and main memory.
Yet another object of the present invention is to provide a method and apparatus which collapses multiple cache memory writebacks into a single write to main memory.
It is yet a further object of the present invention to provide a method and apparatus which prevents a deadlock of memory requests.
The above and other objects, features, and advantages of the present invention will become apparent from the following description and the attached drawings.