The present invention relates generally to computer memory, and more specifically, to early data delivery prior to error detection completion in a memory system.
Contemporary high performance computing main memory systems are generally composed of one or more memory devices, which are connected to one or more memory controllers and/or processors via one or more memory interface elements such as buffers, hubs, bus-to-bus converters, etc. The memory devices are generally located on a memory subsystem such as a memory card or memory module and are often connected via a pluggable interconnection system (e.g., one or more connectors) to a system board (e.g., a PC motherboard).
Overall computer system performance is affected by each of the key elements of the computer structure, including the performance/structure of the processor(s), any memory cache(s), the input/output (I/O) subsystem(s), the efficiency of the memory control function(s), the performance of the main memory device(s) and any associated memory interface elements, and the type and structure of the memory interconnect interface(s).
Extensive research and development efforts are invested by the industry, on an ongoing basis, to create improved and/or innovative solutions for maximizing overall system performance and density by improving the memory system/subsystem design and/or structure. High-availability systems present further challenges related to overall system reliability, due to customer expectations that new computer systems will markedly surpass existing systems with regard to mean-time-between-failure (MTBF), in addition to offering additional functions, increased performance, increased storage, lower operating costs, etc. Other frequent customer requirements further exacerbate the memory system design challenges, and include such items as ease of upgrade and reduced system environmental impact (such as space, power and cooling). In addition, customers require the ability to access an increasing number of higher density memory devices (e.g., DDR3 and DDR4 SDRAMs) at ever faster access speeds.
Typically, requirements of high reliability and high performance place different and often conflicting constraints on memory controller design in memory systems. One task of a memory controller is to return a block of data from system memory to a cache subsystem. The data can be delivered from memory on multiple high-speed memory channels, each of which breaks up the block into sub-blocks, or frames. Delivery of the block can be optimized to proceed in an uninterrupted fashion to the cache subsystem. Two levels of data error detection and correction can exist. A first level is data error detection on a per-channel basis to detect when a transmitted frame has corrupted data. A second level is symbol error correction with a redundant memory channel of a redundant array of independent memory (RAIM) system to support recovery from failures of either DRAM chips or an entire channel. In RAIM, data blocks are striped across the channels along with check bit symbols and redundancy information. Examples of RAIM systems may be found, for instance, in U.S. Patent Publication Number 2011/0320918 titled “RAIM System Using Decoding of Virtual ECC”, filed on Jun. 24, 2010, the contents of which are hereby incorporated by reference in their entirety, and in U.S. Patent Publication Number 2011/0320914 titled “Error Correction and Detection in a Redundant Memory System”, filed on Jun. 24, 2010, the contents of which are hereby incorporated by reference in their entirety.
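By way of illustration only, the striping and per-channel detection described above can be sketched as follows. This is a simplified model under stated assumptions, not the scheme of the cited publications: it assumes four data channels plus one redundant channel, uses plain XOR parity as the redundancy information, and uses CRC-32 as the per-channel frame check. The names `stripe_block`, `make_frame`, and `frame_is_clean` are hypothetical.

```python
import zlib
from functools import reduce

NUM_DATA_CHANNELS = 4  # assumption: four data channels plus one redundant channel


def stripe_block(block: bytes, num_data_channels: int = NUM_DATA_CHANNELS) -> list:
    """Split a block into equal sub-blocks and append one XOR parity sub-block."""
    if len(block) % num_data_channels != 0:
        raise ValueError("block length must be a multiple of the channel count")
    size = len(block) // num_data_channels
    data = [block[i * size:(i + 1) * size] for i in range(num_data_channels)]
    # Byte-wise XOR across channels stands in for the redundancy information.
    parity = bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*data))
    return data + [parity]


def make_frame(payload: bytes) -> tuple:
    """Attach a CRC-32 to a sub-block (assumed per-channel detection code)."""
    return payload, zlib.crc32(payload)


def frame_is_clean(frame: tuple) -> bool:
    """First-level check: does the received payload still match its CRC?"""
    payload, crc = frame
    return zlib.crc32(payload) == crc
```

In this sketch the first-level check only flags a corrupted frame on one channel; repairing it is left to the second level, which relies on the redundant channel.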
If a data error is detected on a memory channel after data block transfers to the cache subsystem have begun, then the memory controller can use the redundant memory channel's data to correct data on-the-fly. This mechanism is typically employed where the associated logic is all contained in a single memory clock domain, and the memory channels are synchronized with one another with respect to the blocks of data being returned from memory. However, in systems that include multiple asynchronous clock domains with potentially varying frequency relationships between them, timing issues can arise where data are available to send to the cache subsystem before error checking is complete. Without fixed clock relationships, synchronous error detection is infeasible. Inexact timing of error detection between channels and block data transfers to the cache subsystem can adversely impact memory latency and system performance.
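For illustration only, correction using the redundant channel can be sketched as below. The sketch assumes that the redundancy is simple byte-wise XOR parity across channels and that per-channel detection has already identified which single channel failed; actual RAIM implementations use symbol-based error correction codes. The names `xor_bytes` and `recover_block` are hypothetical.

```python
from functools import reduce


def xor_bytes(chunks: list) -> bytes:
    """Byte-wise XOR of equally sized byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))


def recover_block(sub_blocks: list, failed_channel=None) -> bytes:
    """Reassemble a block from data sub-blocks followed by one parity sub-block.

    failed_channel is the index flagged by per-channel error detection,
    or None when every channel passed its check.
    """
    data = list(sub_blocks[:-1])
    if failed_channel is not None and failed_channel < len(data):
        # XOR of all surviving channels (including parity) rebuilds the lost one.
        survivors = [s for i, s in enumerate(sub_blocks) if i != failed_channel]
        data[failed_channel] = xor_bytes(survivors)
    return b"".join(data)
```

The sketch presumes that `failed_channel` is known before the block is reassembled. That ordering is straightforward within a single clock domain; with asynchronous clock domains, the detection result for a channel may arrive after its data are already available for delivery.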