1. Field of the Invention
Embodiments of the present invention generally relate to memory subsystems and, more specifically, to improvements to such memory subsystems.
2. Description of the Related Art
Memory circuit speeds remain relatively constant, but the required data transfer speeds and bandwidth of memory systems are increasing, currently doubling every three years. The result is that more commands must be scheduled, issued and pipelined in a memory system to increase bandwidth. However, command scheduling constraints that exist in the memory systems limit the command issue rates, and consequently, limit the increase in bandwidth.
In general, there are two classes of command scheduling constraints that limit command scheduling and command issue rates in memory systems: inter-device command scheduling constraints, and intra-device command scheduling constraints. These command scheduling constraints and other timing constraints and timing parameters are defined by manufacturers in their memory device data sheets and by standards organizations such as JEDEC.
Examples of inter-device (between devices) command scheduling constraints include rank-to-rank data bus turnaround times and on-die termination (ODT) control switching times. The inter-device command scheduling constraints typically arise because the devices share a resource (for example, a data bus) in the memory subsystem.
Examples of intra-device (inside devices) command-scheduling constraints include column-to-column delay time (tCCD), row-to-row activation delay time (tRRD), four-bank activation window time (tFAW), and write-to-read turn-around time (tWTR). The intra-device command-scheduling constraints typically arise because parts of the memory device (e.g. column, row, bank, etc.) share a resource inside the memory device.
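By way of illustration only, the effect of two of the intra-device constraints named above (tRRD and tFAW) on command issue rates may be sketched as follows. The function name `earliest_activate` and the cycle counts `T_RRD` and `T_FAW` are hypothetical and are not taken from any data sheet or standard; the sketch assumes that a new row activation must wait at least tRRD after the previous activation, and that no more than four activations may fall within any window of length tFAW.

```python
def earliest_activate(prev_activates, t_rrd, t_faw):
    """Return the earliest cycle at which another ACTIVATE may issue.

    prev_activates: issue cycles of earlier ACTIVATE commands, ascending.
    t_rrd: minimum spacing between consecutive activations (assumed, in cycles).
    t_faw: window in which at most four activations may occur (assumed, in cycles).
    """
    earliest = 0
    if prev_activates:
        # Row-to-row activation delay: wait t_rrd after the last ACTIVATE.
        earliest = max(earliest, prev_activates[-1] + t_rrd)
    if len(prev_activates) >= 4:
        # Four-bank activation window: the new ACTIVATE would be the fifth
        # within the window unless it waits t_faw after the fourth-most-recent.
        earliest = max(earliest, prev_activates[-4] + t_faw)
    return earliest


# Hypothetical timing values, chosen only to make the effect visible.
T_RRD = 4
T_FAW = 20

schedule = []
for _ in range(6):
    # Issue each ACTIVATE as early as the two constraints allow.
    schedule.append(earliest_activate(schedule, T_RRD, T_FAW))

print(schedule)
```

Under these assumed values, the first four activations are spaced by tRRD alone, whereas the fifth is delayed by tFAW. This illustrates how such constraints limit the command issue rate even when the command bus is otherwise idle.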
In implementations involving more than one memory device, some technique must be employed to assemble the various contributions from each memory device into a word, command, or protocol that can be processed by the memory controller. Various conventional implementations, in particular designs within the classification of Fully Buffered DIMMs (FBDIMMs, a type of industry-standard memory module), are designed to be capable of such assembly. However, there are several problems associated with such an approach. One problem is that the FBDIMM approach introduces significant latency (see description, below). Another problem is that the FBDIMM approach requires a specialized memory controller capable of processing the assembly.
As memory speed increases, the introduction of latency becomes more and more of a detriment to the operation of the memory system. Even modern FBDIMM-type memory systems introduce tens of nanoseconds of delay as the packet is assembled. As will be shown in the disclosure to follow, the latency introduced need not be so severe.
Moreover, the implementation of FBDIMM-type memory devices required corresponding changes in the behavior of the memory controller, and thus FBDIMMs are not backward compatible with industry-standard memory systems. As will be shown in the disclosure to follow, various embodiments of the present invention may be used with previously existing memory controllers, without modification to their logic or interfacing requirements.
In order to appreciate the extent of the latency introduced in an FBDIMM-type memory system, one needs to refer to FIG. 1. FIG. 1 shows an FBDIMM-type memory system 100 wherein multiple DRAMs (D0, D1, . . . D7, D8) are in communication via a daisy-chained interconnect. The buffer 105 is situated between two memory circuits (e.g. D1 and D2). In the READ path, the buffer 105 is capable of presenting to memory DN the data retrieved from DM (M>N). In a conventional FBDIMM-type system, the READ data from each successively higher memory DM must be merged with the data of memory DN, and this function is implemented via pass-through and merging logic 106. As can be seen, this operation occurs sequentially at each buffer 105, and latency is thus introduced cumulatively.
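The cumulative nature of this latency may be illustrated with a simplified model. The sketch below assumes, hypothetically, that READ data from memory DM passes through one buffer stage for each position between DM and D0, and that every stage adds the same pass-through and merging delay. The function name `added_read_latency_ns` and the value of `PER_BUFFER_DELAY_NS` are illustrative assumptions and do not describe any measured FBDIMM device.

```python
def added_read_latency_ns(device_index, per_buffer_delay_ns):
    """Latency added to READ data from device D<device_index>.

    Simplified model (assumption): one buffer stage per position between
    the device and D0, each adding the same pass-through/merge delay.
    """
    if device_index < 0:
        raise ValueError("device_index must be non-negative")
    return device_index * per_buffer_delay_ns


# Hypothetical per-stage delay in nanoseconds, for illustration only.
PER_BUFFER_DELAY_NS = 3

# Devices D0 through D8, as in the daisy chain of FIG. 1.
latencies = [added_read_latency_ns(i, PER_BUFFER_DELAY_NS) for i in range(9)]

for i, ns in enumerate(latencies):
    print(f"D{i}: {ns} ns added")
```

In this model the added delay grows linearly with the position of the memory in the chain, so that even a small per-stage delay accumulates to tens of nanoseconds for the farthest memory.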
As the foregoing illustrates, what is needed in the art is a memory subsystem and method that overcome the shortcomings of prior art systems.