A number of architectures are available for interconnecting processors with memory devices. As a simple example, a processor may be directly connected with memory devices over a conventional memory bus. In modern high-performance computer systems, memory may be accessed through a memory controller, and the memory devices may be mounted on sub-assemblies (Dual In-line Memory Modules, or DIMMs) which themselves include a memory buffer in addition to the individual memory devices.
One such architecture is described in detail in a proposed JEDEC (Joint Electron Device Engineering Council) Standard entitled FB-DIMM Draft Specification, jointly published in March 2005 by the JEDEC Solid State Technology Association and the EIA (Electronic Industries Alliance).
In this specification, a memory architecture is described which is based on very high speed serial links joining fully buffered DIMMs (FBDs) in a daisy chain arrangement to a host as illustrated in FIG. 1.
FIG. 1 shows a memory system 100, comprising a host 102 connected to a first FBD 104 over serial links 106. If the memory system contains more than one FBD (as shown in FIG. 1), the first FBD 104 is connected to a second FBD 108 over serial links 110. Additional FBDs may be chained with serial links 112 in a daisy chain fashion, until a last FBD 114 is reached. A clock buffer 116 distributes a reference clock signal to the host 102 and each of the FBDs (104, 108, . . . , 114), over clock reference links 118.
Each of the FBDs (104, 108, . . . , 114) may include one or more memory devices (DRAMs 120) and an advanced memory buffer (AMB) 122.
Each of the serial links (106, 110, . . . , 112) comprises multiple upstream channels 124 (carrying formatted data frames towards the host 102) and multiple downstream channels 126 (carrying formatted data frames and control information towards the last FBD 114). The “channels” are also referred to as “lanes” or “bit lanes”, indicating that each data frame is transmitted bit-serially in multiple time slots and striped across the lanes of a link, a technique commonly employed in a number of high speed transmission protocols.
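The striping of a frame across bit lanes can be illustrated with a minimal sketch. The round-robin bit-to-lane mapping, the function names `stripe` and `unstripe`, and the lane count and frame size used here are assumptions chosen for illustration; the actual mapping is defined by the FB-DIMM specification and is not reproduced here.

```python
def stripe(frame_bits, num_lanes):
    """Split a frame into per-lane bit sequences (hypothetical mapping).

    Bit i is placed on lane (i % num_lanes) in time slot (i // num_lanes),
    so every lane carries one bit of the frame in each time slot.
    """
    if num_lanes <= 0:
        raise ValueError("num_lanes must be positive")
    if len(frame_bits) % num_lanes != 0:
        raise ValueError("frame length must be a multiple of the lane count")
    return [frame_bits[lane::num_lanes] for lane in range(num_lanes)]


def unstripe(lanes):
    """Reassemble the frame at the receiver from the per-lane sequences."""
    num_slots = len(lanes[0])
    return [lanes[lane][slot]
            for slot in range(num_slots)
            for lane in range(len(lanes))]


# Illustrative frame of 12 bits sent over 4 lanes in 3 time slots.
frame = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0]
lanes = stripe(frame, 4)
restored = unstripe(lanes)
```

Reassembly in this sketch is only correct if all lanes are aligned to the same time slots, which is why skew between lanes has to be removed at the receiver.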
Writing of memory data is accomplished by transmitting the formatted frames over the downstream channels 126 of the serial links (106, 110, . . . , 112), from the host 102 through one or more AMBs 122 to the memory device (DRAM) 120 that is addressed. Reading of memory data is similarly accomplished by sending a read request from the host 102 through one or more AMBs 122 to the addressed memory device (DRAM) 120 over the downstream channels 126, and subsequently transmitting the memory data from the addressed memory device (DRAM) 120 through one or more AMBs 122 over the upstream channels 124 to the host 102.
It will be appreciated that the host 102 may communicate with a DRAM 120 on any FBD, including the last FBD 114, thus transmitting through a number of AMBs 122 in series.
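As an illustration of the series path, the following sketch lists the elements a read operation traverses when it targets a given FBD in the chain. The function name `read_path`, the zero-based indexing, and the labels `AMB0`, `AMB1`, and so on are hypothetical and are used only to make the hop sequence explicit.

```python
def read_path(target, chain_length):
    """Return the ordered hops of a read to the FBD at position `target`.

    Position 0 is the FBD closest to the host. The request travels
    downstream through every AMB up to and including the target, and
    the data returns upstream through the same AMBs in reverse order.
    """
    if not 0 <= target < chain_length:
        raise IndexError("no such FBD in the chain")
    downstream = [f"AMB{i}" for i in range(target + 1)]
    upstream = list(reversed(downstream))
    return ["host"] + downstream + ["DRAM"] + upstream + ["host"]


# A read to the second FBD of a three-FBD chain.
hops = read_path(1, 3)
```

The number of AMB traversals grows linearly with the position of the addressed FBD, so the last FBD in the chain sees the longest path.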
The required functions of the AMB 122 are described in the aforementioned JEDEC specification. They include:
retrieving and regenerating the serial downstream channels 126 to the next AMB 122 in the daisy chain;
retrieving and regenerating the serial bit streams upstream to the previous AMB 122 in the daisy chain, or to the host 102 as required;
converting received downstream data to parallel for interfacing to the DRAMs 120 located on the same FBD;
converting parallel data from the DRAMs 120 located on the same FBD, to serial format for transmitting upstream; and
merging the data from the DRAMs 120 located on the same FBD, with the serial data received on the upstream channels 124 from other FBDs (located further downstream), for transmission on the upstream channels 124 toward the host 102.
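The merging function may be pictured with the following sketch. The merge policy shown (forward every occupied upstream slot unchanged and place local frames only into idle slots), the `IDLE` marker, and the name `merge_upstream` are assumptions made for illustration; the policy actually required of an AMB is set out in the JEDEC specification.

```python
from collections import deque

IDLE = None  # hypothetical marker for an upstream slot that carries no frame


def merge_upstream(incoming_slots, local_frames):
    """Merge local DRAM frames into the upstream slot sequence.

    Occupied slots received from FBDs further downstream are forwarded
    unchanged. Each idle slot is filled with the next pending local
    frame, if any. Returns the merged slots and any local frames that
    could not be placed.
    """
    pending = deque(local_frames)
    merged = []
    for slot in incoming_slots:
        if slot is IDLE and pending:
            merged.append(pending.popleft())
        else:
            merged.append(slot)
    return merged, list(pending)


# Frames B0 and B1 arrive from a downstream FBD; A0 and A1 are local.
merged, leftover = merge_upstream(["B0", IDLE, "B1", IDLE, IDLE], ["A0", "A1"])
```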
Given the high speed nature of the serial links, which may be running at 8 Gbit/s, the physical constraints of signal transmission between devices, and the delays and variations within the devices themselves, one must expect skew between the bit lanes of each link and the reference clock signal carried on the clock reference links 118. In addition, jitter and wander occur. To combat these effects, the design of the AMB 122 must include high speed clock alignment circuitry (to align the data edges of each lane with the reference clock) and First-In-First-Out (FIFO) buffers to continuously absorb jitter and wander dynamically.
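How a FIFO buffer absorbs timing variation can be sketched in simplified form. In the model below, symbols arrive irregularly (zero, one, or two per reference clock tick) while the output side reads exactly one symbol per tick once the buffer has been pre-filled. The class `ElasticFifo`, the function `simulate`, and the `depth` and `prefill` values are hypothetical; actual AMB buffers operate on sampled lane data rather than Python objects.

```python
from collections import deque


class ElasticFifo:
    """Bounded FIFO that raises on overflow or underflow."""

    def __init__(self, depth):
        self.depth = depth
        self.buf = deque()

    @property
    def fill(self):
        return len(self.buf)

    def write(self, symbol):
        if len(self.buf) >= self.depth:
            raise OverflowError("FIFO overflow: arrivals outran the reader")
        self.buf.append(symbol)

    def read(self):
        if not self.buf:
            raise LookupError("FIFO underflow: reader outran the arrivals")
        return self.buf.popleft()


def simulate(arrivals_per_tick, depth, prefill):
    """Write jittered arrivals, read one symbol per tick after pre-fill."""
    fifo = ElasticFifo(depth)
    out = []
    next_symbol = 0
    started = False
    for arrivals in arrivals_per_tick:
        for _ in range(arrivals):
            fifo.write(next_symbol)
            next_symbol += 1
        if not started and fifo.fill >= prefill:
            started = True  # reading begins only once the buffer is pre-filled
        if started:
            out.append(fifo.read())
    return out


# Ten ticks of jittered arrivals averaging one symbol per tick.
output = simulate([1, 1, 2, 0, 1, 1, 0, 2, 1, 1], depth=4, prefill=2)
```

The output side sees a steady stream despite the irregular arrivals, at the cost of the pre-fill delay: a deeper pre-fill tolerates more jitter but holds each symbol in the buffer longer.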
To keep the overall memory access delay low, it is important to minimize the delay (latency) of host-to-memory communication. This architecture, which employs serial links (requiring serial/parallel conversions) and daisy chains the links through the AMBs 122 containing dejittering circuits with inherent delay, presents a significant challenge in meeting a low-latency objective. Even though the links run at very high speed, host performance in terms of memory access latency may be significantly affected by the round trip delay imposed on a read operation by the latency of the AMB circuitry.
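The accumulation of delay along the daisy chain can be made concrete with a simple model. It assumes, purely for illustration, that every intermediate AMB adds the same pass-through delay in each direction and that the addressed FBD contributes a fixed access delay covering its own AMB and DRAM. The function name and the nanosecond figures are hypothetical and are not taken from the specification.

```python
def read_round_trip_ns(target, passthrough_ns, access_ns):
    """Estimate the read round-trip delay to the FBD at position `target`.

    Position 0 is the FBD closest to the host. A read to position
    `target` crosses `target` intermediate AMBs on the way downstream
    and the same AMBs again on the way back upstream.
    """
    if target < 0:
        raise ValueError("target position must be non-negative")
    return 2 * target * passthrough_ns + access_ns


# Illustrative figures only: 5 ns per pass-through, 40 ns access delay.
nearest = read_round_trip_ns(0, 5, 40)   # first FBD in the chain
farthest = read_round_trip_ns(7, 5, 40)  # eighth FBD in the chain
```

Under these assumptions every nanosecond removed from the AMB pass-through path is recovered twice per intermediate AMB on each read, which is why the latency of the AMB circuitry dominates as the chain grows.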