This invention relates to memory systems for computers, and more particularly to a method for buffering data for sequential read requests in a memory system.
As the speed of processors increases, the need for fast memory systems becomes more important. For example, a high speed RISC processor of the type disclosed in copending application Ser. No. 547,630, filed Jun. 29, 1990 now pending, assigned to Digital Equipment Corporation, may be constructed to operate at a CPU cycle time of 5-nsec or less, and execute an instruction during each cycle (due to the RISC concepts implemented). If the main memory (usually composed of DRAMs) has a cycle time of 300-nsec, for example, it can be calculated that the CPU could spend much of its time waiting for memory, even using a cache with typical cache hit rates. In efforts to bring the memory performance more in line with the CPU, the cache memory is made hierarchical, providing primary, secondary, and, in some cases, third level caches, and of course the speed of the cache memories is increased as much as is economical. In addition, the bandwidth of the memory bus is increased, as by using a wider data path. Nevertheless, efforts are still needed to reduce the amount of time the CPU spends waiting on memory, to achieve acceptable performance for these high-speed CPUs.
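The calculation alluded to above can be illustrated with a rough sketch. The 5-nsec CPU cycle and the 300-nsec memory cycle are taken from the example in the text; the 95% cache hit rate, the single memory reference per instruction, and the function name `stall_fraction` are assumptions chosen only for illustration, and the model ignores multi-level caches and overlapped accesses.

```python
def stall_fraction(cpu_cycle_ns, mem_cycle_ns, hit_rate, refs_per_instr=1.0):
    """Estimate the fraction of time a CPU waits on main memory.

    Simplified model (an assumption, not from the source): every cache miss
    costs one full main-memory cycle, and nothing overlaps with the stall.
    """
    miss_penalty_ns = mem_cycle_ns
    # Average stall time added to each instruction by cache misses.
    stall_ns = refs_per_instr * (1.0 - hit_rate) * miss_penalty_ns
    # One instruction nominally takes one CPU cycle plus its share of stalls.
    return stall_ns / (cpu_cycle_ns + stall_ns)


# 5-nsec CPU, 300-nsec DRAM cycle, assumed 95% hit rate.
frac = stall_fraction(5.0, 300.0, 0.95)
print(f"CPU time spent waiting on memory: {frac:.0%}")
```

Under these assumed figures, each instruction carries about 15 nsec of average stall time against a 5-nsec cycle, so the CPU would spend roughly three quarters of its time waiting on memory.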
When caching is employed, read accesses to main memory are most often for fetching an entire cache line, and it is preferable to make the memory data path equal to the width of a cache line or a submultiple of a cache line. The principle of locality suggests that cache lines will often be accessed in sequence, and, when two sequential cache lines are accessed, there is a reasonable probability that the sequence will be continued. One of the features of this invention is to take advantage of this observation in order to increase system performance.
In constructing main memory for typical computers, the most widely used device is the MOS DRAM or dynamic RAM. These devices have access times of perhaps 70-ns, but cycle times are much longer, perhaps 200-ns or more. However, most DRAMs now commercially available have a feature called "page mode" in which the column address can be changed after a row access to the DRAM array, producing a sequence of data outputs at a faster rate, so long as the new column addresses are in the same "page." To invoke page mode operation, the row address strobe or RAS signal applied to the DRAM is held in the asserted condition, and the column address strobe or CAS is toggled; a new column address is asserted each time CAS is reasserted. This mode of operation is about twice as fast as standard RAS-CAS reads, so if this mode can be advantageously employed, then the average access time can be reduced.
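The benefit of page mode for a run of reads can be sketched with a simple timing model. The 200-ns full cycle comes from the figures above; the 100-ns page-mode cycle merely reflects the "about twice as fast" statement, and the page size of 1024 locations and the function name `read_time_ns` are hypothetical values introduced for illustration.

```python
def read_time_ns(addresses, page_size=1024, full_cycle_ns=200,
                 page_cycle_ns=100, page_mode=True):
    """Total time to read the given addresses from one DRAM bank.

    A read in the currently open row (page) costs a page-mode cycle (RAS
    held asserted, CAS toggled); any other read costs a full RAS-CAS cycle.
    """
    total = 0
    open_row = None  # row currently selected by RAS, if any
    for addr in addresses:
        row = addr // page_size
        if page_mode and row == open_row:
            total += page_cycle_ns   # same page: only CAS is toggled
        else:
            total += full_cycle_ns   # new row: full RAS-CAS access
            open_row = row
    return total


sequential = list(range(8))          # eight reads within one page
with_page_mode = read_time_ns(sequential)
without_page_mode = read_time_ns(sequential, page_mode=False)
print(with_page_mode, without_page_mode)
```

In this model the eight sequential reads take 900 ns with page mode (one full cycle plus seven page-mode cycles) against 1600 ns without it, so the saving grows with the length of the sequential run.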
The advantages obtained by use of various features of the invention include providing faster access to sequential data located in memory modules installed on a multi-node memory bus. By taking advantage of the fast page mode capabilities of dynamic random access memory (DRAM) devices, the method of the invention allows for detection of sequential memory access, and, in response, prefetches memory data from the next sequential location in advance of the actual request for that data by the host computing system, placing the data in a high-speed memory device. As a result, when the host computing system requests the next piece of memory data (usually a cache line), the data can be delivered to the host computing system much faster than if the data had to be delivered directly from the DRAMs of the memory module.
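The detect-and-prefetch behavior described above can be sketched as follows. This is a minimal illustration, not the circuit of the invention: the class name `StreamBufferSketch`, the single-entry buffer, the rule that two consecutive line addresses constitute a detected sequence, and the use of a dictionary to stand in for the DRAMs are all assumptions made for the example.

```python
class StreamBufferSketch:
    """Toy model of a stream buffer sitting in front of a DRAM array."""

    def __init__(self, dram):
        self.dram = dram        # line address -> data (stands in for the DRAMs)
        self.last_line = None   # most recently requested line address
        self.buffer = {}        # prefetched line, if any: address -> data

    def read(self, line):
        """Return (data, source), where source is 'buffer' or 'dram'."""
        if line in self.buffer:
            data, source = self.buffer.pop(line), "buffer"
        else:
            data, source = self.dram[line], "dram"
        if self.last_line is not None and line == self.last_line + 1:
            # Sequential access detected: fetch the next line in advance.
            nxt = line + 1
            self.buffer = {nxt: self.dram[nxt]} if nxt in self.dram else {}
        else:
            # Sequence broken (or first access): discard any stale prefetch.
            self.buffer = {}
        self.last_line = line
        return data, source


dram = {i: f"line{i}" for i in range(8)}
sb = StreamBufferSketch(dram)
sources = [sb.read(line)[1] for line in (0, 1, 2, 3, 6)]
print(sources)
```

In this sketch the first two requests are served from the DRAMs, the sequence is detected on the second, and the following sequential requests are served from the buffer until the non-sequential request for line 6 breaks the stream.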
An important feature of one embodiment is the location of the stream buffer on the memory module itself, rather than upstream. By placing the stream buffer memory on the memory module, filling the stream buffers can be done without utilizing the system bus (shared with other resources), thereby conserving system memory interconnect bandwidth and throughput. Also, filling the stream buffers can be done using the fast page mode operation of the DRAM devices, a significant performance advantage. Finally, by placing the stream buffer memory within the logic domain covered by the memory module error detection and correction logic, the reliability, availability, and data integrity are enhanced.