Memory subsystems are typically built with dynamic random access memory (DRAM) devices, which retain data by storing electrical charges in capacitive storage cells. Writing to an address charges or discharges the cells depending on the data. Reading from an address depletes the charges, but on-chip circuitry automatically rewrites the data, restoring the cells to their pre-read values. In addition, the cells tend to discharge over time, which, left unchecked, would lead to loss of data.
To prevent this loss of data due to discharge, DRAMs must be periodically refreshed by reading data at some location and writing it back. DRAMs generally provide an atomic refresh operation for this purpose, which must be performed periodically on each row. In older DRAMs, this was done by providing a row address and asserting a refresh command. Newer DRAM devices provide an auto-refresh operation, generating the refresh address internally, so that only a refresh command need be applied externally. During a refresh cycle, the entire DRAM is unavailable.
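The auto-refresh behavior described above can be sketched in software. The following is a minimal illustrative model, not a vendor-specific design; the row count, retention interval, and class name are assumptions chosen for illustration.

```python
# Illustrative model of auto-refresh (parameters are assumptions, not
# taken from the source): the memory controller issues only a refresh
# command, and the device supplies the row address from an internal
# counter. Every row must be refreshed once per retention interval.

ROWS = 8192            # assumed rows per device
RETENTION_MS = 64.0    # assumed retention interval for the whole array

# Spacing between refresh commands so all rows fit in one interval.
refresh_spacing_us = RETENTION_MS * 1000.0 / ROWS

class AutoRefreshDram:
    """Hypothetical device with an internal refresh-row counter."""
    def __init__(self, rows):
        self.rows = rows
        self.counter = 0              # internal refresh address
        self.refreshed = [0] * rows   # per-row refresh count

    def refresh_command(self):
        # One external command refreshes one row; no address is supplied.
        self.refreshed[self.counter] += 1
        self.counter = (self.counter + 1) % self.rows

dram = AutoRefreshDram(ROWS)
for _ in range(ROWS):                 # one retention interval of commands
    dram.refresh_command()

# Every row has been refreshed exactly once.
assert all(count == 1 for count in dram.refreshed)
```

The point of the internal counter is that the external controller needs no per-row address bookkeeping, only a steady stream of refresh commands.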
Similarly, after a read or write, while the charges in a cell are being restored, a DRAM is not accessible. If a multiple device memory system is configured such that adjacent addresses are on the same device, sequential accesses to those adjacent addresses must be delayed while the device is recharging. To increase bandwidth, memory units using DRAMs are interleaved, meaning that the memory units are configured so that adjacent memory addresses are not in the same unit. Accessing a series of adjacent addresses therefore does not require a delay, because the unit that was previously accessed and is currently recharging is not the unit accessed next.
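The interleaving scheme described above can be expressed as a simple address mapping. This is a sketch of conventional low-order interleaving; the unit count is an illustrative parameter, not a value from the source.

```python
# Low-order interleaving sketch: adjacent addresses map to different
# memory units. NUM_UNITS is an assumed, illustrative parameter.

NUM_UNITS = 8

def unit_for_address(addr):
    # The low-order bits select the unit, so address N and N+1
    # always land in different units.
    return addr % NUM_UNITS

def offset_in_unit(addr):
    # The remaining bits select the location within that unit.
    return addr // NUM_UNITS

# Sequential addresses cycle through all units; the unit that is
# still recharging after an access is not the next one touched.
units = [unit_for_address(a) for a in range(NUM_UNITS)]
assert units == [0, 1, 2, 3, 4, 5, 6, 7]
assert unit_for_address(9) != unit_for_address(10)
```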
CPU/memory systems, whether single or multiprocessor, have traditionally relied on a single common data bus connecting all CPUs, memories and other directly addressed ports and peripherals. In these conventional systems, there are in general two ways to implement a memory refresh scheme. One is to have hardware associated with individual memory modules trigger the refresh operations. When a CPU attempts to access an address on a DRAM which is in a refresh cycle, a "stall" signal must stall the system until the refresh is complete. This locks up the bus on average for one half of a refresh cycle.
The other refresh scheme is to have a bus controller schedule refreshes. This method locks up the bus for the duration of the refresh cycle.
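The "one half of a refresh cycle" figure for the first scheme follows from the access arriving at a uniformly random point within the refresh cycle. A small simulation, with an arbitrary cycle length chosen for illustration, bears this out:

```python
# Expected-stall sketch for the hardware-triggered refresh scheme:
# if a CPU access arrives at a uniformly random instant inside a
# refresh cycle of length T, it waits the remainder of the cycle,
# which averages T/2. T and the trial count are arbitrary.

import random

T = 100.0                  # refresh cycle length, arbitrary units
random.seed(1)
trials = 100_000
# An access arriving at time t into the cycle waits T - t.
waits = [T - random.uniform(0.0, T) for _ in range(trials)]
avg_wait = sum(waits) / trials

assert abs(avg_wait - T / 2) < 1.0   # close to half a refresh cycle
```

By contrast, the controller-scheduled scheme occupies the bus for the full cycle length on every refresh, whether or not any CPU wanted the bus.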
In the traditional single-data-path system, only one CPU can access one memory unit at any given time. The other CPUs must wait their turn. As an alternative, a cross-bar switch can simultaneously provide multiple data paths between memory modules and CPUs and other ports, so that all CPUs may access different memory modules simultaneously. An arbiter configures the cross-bar switch according to the needs of the CPUs, and commands the memory modules by sending transaction codes, such as read, write and refresh, to the modules via a transaction bus.
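One cycle of such an arbiter can be sketched as a matching of requesting CPUs to free modules. This is a minimal illustration with a simple fixed-priority policy; the function names, transaction codes, and policy are assumptions, not the document's design.

```python
# Minimal cross-bar arbiter sketch (illustrative names and policy):
# each cycle, pair requesting CPUs with free memory modules and issue
# a transaction code to each granted module over the transaction bus.

def arbitrate(requests, busy):
    """requests: {cpu_id: (module_id, txn_code)}.
    busy: set of module ids currently unavailable.
    Returns this cycle's grants, at most one CPU per module."""
    grants = {}
    claimed = set(busy)
    for cpu in sorted(requests):       # fixed-priority, for simplicity
        module, txn = requests[cpu]
        if module not in claimed:      # module free: configure a path
            grants[cpu] = (module, txn)
            claimed.add(module)
    return grants

reqs = {0: (2, "READ"), 1: (2, "WRITE"), 2: (5, "READ")}
grants = arbitrate(reqs, busy={7})
# CPUs 0 and 2 proceed in parallel; CPU 1 waits for module 2.
assert grants == {0: (2, "READ"), 2: (5, "READ")}
```

A real arbiter would use a fairer policy (for example round-robin priority) so no CPU is starved, but the matching structure is the same.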
In this cross-bar switch system, where a plurality of CPUs may access a plurality of memory units at the same time, it is crucial that as few memory units as possible be unavailable due to access latency. A highly interleaved system provides a large number of interleaved units; because the memory units outnumber the CPUs, most of the units are available at any given time. This greatly increases the probability that an addressed memory unit will be available.
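The availability argument can be made concrete with a rough model. Assuming, purely for illustration, that requests target units uniformly at random and that some fixed number of units are busy recovering, the chance of hitting a free unit grows with the degree of interleaving:

```python
# Rough availability model (an assumption for illustration, not a
# claim from the source): with B of M units busy and uniformly random
# targeting, a request finds its unit available with probability 1 - B/M.

def availability(num_units, busy_units):
    return 1.0 - busy_units / num_units

# With the same number of busy units, a more highly interleaved
# system (more units) makes an addressed unit far more likely free.
assert availability(64, 4) > availability(8, 4)
assert availability(64, 4) == 1.0 - 4 / 64     # 0.9375
assert availability(8, 4) == 0.5
```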
All of these interleaved memory units need to be refreshed periodically. As with the traditional single data path system described above, each memory module can trigger its own refreshes, or a bus controller, in this case the arbiter, can trigger the refreshes.
In the preferred embodiment, a directory module keeps track of the current "owner" of a cache block as well as who has a copy of the cache block. The "owner" is the CPU or I/O device that has the most up-to-date copy, i.e., the last to modify the block in cache. If no modified copies exist, then the main memory is the owner. When a CPU requests a copy of a cache block, the directory directs the request to the owner of that cache block. The directory therefore maintains data coherency. The directory module does this by maintaining a "line" for every cache block. These lines are associated with the memory units in the system memory and are themselves made up of memory devices requiring periodic refresh.
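The ownership bookkeeping a directory line performs can be sketched as follows. This is an illustrative data structure only; the class and method names are assumptions, and a real directory line would be a hardware structure with invalidation traffic, not a Python object.

```python
# Sketch of one directory "line" per cache block (illustrative):
# it records the current owner and which CPUs hold copies.

MEMORY = "memory"   # main memory owns the block if no modified copy exists

class DirectoryLine:
    def __init__(self):
        self.owner = MEMORY
        self.sharers = set()

    def read(self, requester):
        # Direct the request to the current owner; the requester
        # then holds a copy of the block.
        source = self.owner
        self.sharers.add(requester)
        return source

    def write(self, requester):
        # The writer now has the most up-to-date copy and becomes
        # owner; other copies would be invalidated in a real system.
        self.owner = requester
        self.sharers = {requester}

line = DirectoryLine()
assert line.read("CPU0") == MEMORY    # no modified copy: memory is owner
line.write("CPU1")                    # CPU1 modifies the block
assert line.read("CPU2") == "CPU1"    # request is directed to the owner
```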