The present invention relates, in general, to the field of integrated circuit (“IC”) dynamic random access memory (“DRAM”) devices. More particularly, the present invention relates to a packet-based DRAM memory device incorporating an on-chip row register cache which is functional to reduce overall data access latencies, especially with respect to “page misses”.
A new type of volatile random access memory device has recently been introduced which uses low pin count interfaces operating at high clock rates to multiplex memory control, address, and data in and out of the chip. These so-called “protocol-based” or “packet-based” memories have the benefit of delivering high potential bandwidth in a low pin count, single-chip IC package. This approach is particularly interesting for small systems containing just a single processor component and a single memory device.
The Rambus® DRAM (“RDRAM™”; trademarks of Rambus, Inc., Mountain View, Calif.) was the first of several proposed packet-based DRAM devices. The most current version of this product was developed in conjunction with Intel Corporation, Santa Clara, Calif. and is called the Direct Rambus DRAM (or “DRDRAM”). See, for example, Rambus® Technology Overview, Rambus, Inc., Aug. 23, 1999 and Direct RDRAM™ Advance Information 64/72-Mbit (256K×16/18×16d), Rambus, Inc., Aug. 3, 1998, the disclosures of which are specifically incorporated herein by this reference. The Direct RDRAM has been optimized to allow concurrent command, address, and data packets to be transferred to improve the efficiency of the bus interface.
Nevertheless, the DRDRAM presents several operational limitations which prevent its optimum performance and cost effectiveness. Firstly, the DRDRAM architecture imposes significantly larger chip sizes than are found in traditional DRAM components. This size increase results from the need to multiplex and demultiplex data and addresses at the bus interface. Specifically, the current DRDRAM embodiment has a relatively complex eight-way multiplexer and demultiplexer interface to the external data bus. This level of multiplexing is determined by the external data bus size and the pipelined data speed of the core DRAM memory banks. The 18 bit external data bus is specified at an 800 MHz data rate and the DRAM core must deliver a 1.6 GB/sec. bandwidth. Current DRAM cores can deliver a new data word every 10 ns, or a 100 MHz data rate. For this core, the internal DRAM bus must be eight times 18 bits (or 144 bits) wide to deliver the specified data rate.
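The multiplexing arithmetic above can be checked with a short sketch. The function name and parameters are illustrative only and do not come from the Rambus documents. The 1.6 GB/sec. figure is reproduced here on the assumption that 16 of the 18 external bits carry data, which is consistent with the ×16/18 organization cited earlier.

```python
def internal_bus_width(external_bits: int,
                       external_rate_mhz: float,
                       core_rate_mhz: float) -> tuple[int, int]:
    """Return (mux_ratio, internal_bus_bits) needed to sustain the external rate.

    Hypothetical helper: the core delivers one internal word per core cycle,
    so the internal bus must be mux_ratio times wider than the external bus.
    """
    ratio = external_rate_mhz / core_rate_mhz
    if ratio != int(ratio):
        raise ValueError("external rate must be a whole multiple of the core rate")
    mux_ratio = int(ratio)
    return mux_ratio, mux_ratio * external_bits


# Figures quoted in the text: 18 bit bus at 800 MHz, 100 MHz (10 ns) core.
mux_ratio, bus_bits = internal_bus_width(18, 800, 100)
print(mux_ratio, bus_bits)  # 8 144

# Assumption: 16 of the 18 bits are data bits (2 bytes per transfer).
DATA_BITS = 16
bandwidth_gb_per_s = (DATA_BITS / 8) * 800e6 / 1e9
print(bandwidth_gb_per_s)  # 1.6
```

The sketch makes the dependency explicit: for a fixed external bus, the internal bus width scales inversely with the core data rate.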
Secondly, multiplexing address and data buses increases random access latency compared to synchronous DRAM (“SDRAM”). At 800 MHz, address packet delays are 10 ns and data packet delays for a 64 bit equivalent word are 5 ns. Consequently, every SDRAM random access parameter is degraded by 15 ns in Direct RDRAM.
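As a rough check of these delays, the sketch below derives them from the 800 MHz transfer rate. Two values are assumptions not stated in the text: that an address packet occupies eight transfer periods, and that a 64 bit word is carried as four 16 bit transfers. Both are chosen because they reproduce the 10 ns and 5 ns figures quoted above.

```python
TRANSFER_RATE_MHZ = 800
TRANSFER_PERIOD_NS = 1000 / TRANSFER_RATE_MHZ  # 1.25 ns per transfer

# Assumption: an address packet spans eight transfer periods.
ADDRESS_PACKET_TRANSFERS = 8
# Assumption: a 64 bit word moves as four transfers over 16 data bits.
DATA_PACKET_TRANSFERS = 64 // 16


def packet_time_ns(transfers: int) -> float:
    """Time to move a packet of the given length over the multiplexed bus."""
    return transfers * TRANSFER_PERIOD_NS


address_delay_ns = packet_time_ns(ADDRESS_PACKET_TRANSFERS)
data_delay_ns = packet_time_ns(DATA_PACKET_TRANSFERS)
added_latency_ns = address_delay_ns + data_delay_ns

print(address_delay_ns, data_delay_ns, added_latency_ns)  # 10.0 5.0 15.0
```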
Thirdly, the standard DRAM core exhibits relatively long latency on same-bank “page misses”, which reduce bus efficiency. The standard DRAM core uses page mode operation, which means that data is held in the DRAM sense amplifiers during random access within a page. If a request for another page in the same bank occurs, the DRAM must precharge and then another row must be randomly accessed into the sense amplifiers. This “page miss” can take on the order of 70 ns in current DRAM technology. A “page miss” greatly reduces bus efficiency and delivered bandwidth. The maximum bandwidth for the device is equal to four data words (64 bit) at 5 ns/data word, which is 20 ns for 32 bytes, or 1600 MB/sec. On the other hand, in the worst case (a “page miss”, Read-to-Read), the transfer takes 77.5 ns (“page miss”) plus three data word (64 bit) times at 5 ns/data word, which equals 92.5 ns for 32 bytes, or 338 MB/sec. Thus, it can be seen that Direct RDRAM bus efficiency is reduced from 100% to 21% under continuous random “page misses”, while delivered bandwidth is reduced from 1600 MB/sec. to 338 MB/sec.
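The best-case and worst-case figures can be reproduced with the following sketch, which uses only the timings quoted above; the constant and function names are illustrative. Note that dividing 32 bytes by 92.5 ns directly gives roughly 346 MB/sec., slightly above the 338 MB/sec. quoted, while the efficiency of about 21% agrees with the text.

```python
BYTES_PER_WORD = 8      # one 64 bit data word
WORDS_PER_BURST = 4     # 32 bytes per access
WORD_TIME_NS = 5.0      # per data word on the external bus
PAGE_MISS_NS = 77.5     # Read-to-Read page miss time quoted in the text


def bandwidth_mb_per_s(num_bytes: int, time_ns: float) -> float:
    """Convert bytes per nanosecond into MB/sec (1 MB = 10**6 bytes)."""
    return num_bytes * 1000 / time_ns


burst_bytes = BYTES_PER_WORD * WORDS_PER_BURST

# Best case: four back-to-back data words with no page miss.
best_time_ns = WORDS_PER_BURST * WORD_TIME_NS
best_bw = bandwidth_mb_per_s(burst_bytes, best_time_ns)

# Worst case: the page miss time plus the remaining three data word times.
worst_time_ns = PAGE_MISS_NS + (WORDS_PER_BURST - 1) * WORD_TIME_NS
worst_bw = bandwidth_mb_per_s(burst_bytes, worst_time_ns)

efficiency = best_time_ns / worst_time_ns

print(best_time_ns, best_bw)            # 20.0 1600.0
print(worst_time_ns, round(worst_bw))   # 92.5 346
print(round(efficiency * 100, 1))       # 21.6
```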
Enhanced Memory Systems, Inc., a subsidiary of Ramtron International Corporation, Colorado Springs, Colo. and assignee of the present invention, has long been a pioneer in defining low latency, high efficiency DRAM core architectures based on its proprietary EDRAM® core technology (EDRAM® is a registered trademark of Enhanced Memory Systems, Inc., Colorado Springs, Colo.). See, for example, U.S. Pat. Nos. 5,699,317, 5,721,862, and 5,887,272, the disclosures of which are specifically incorporated herein by this reference, and which disclose certain implementations of the application of this technology to standard DRAM architectures.
Disclosed herein are extensions of this EDRAM technology implemented to enhance packet-based DRAM architectures, such as Direct RDRAM, to reduce the initial device latency, reduce “page miss” latency, and reduce chip layout overhead by reducing bus sizes and the level of required multiplexing and demultiplexing.
In accordance with an embodiment of the present invention disclosed herein, a row register (or “cache”) and a separate write path, or bus, are integrated into each DRAM bank. This enhanced DRAM architecture improves DRAM latency parameters and pipeline burst rate. The row register holds “read” data during burst reads to allow hidden precharge and same-bank activation to minimize “page miss” latency. The faster pipelined burst rate simplifies Rambus RDRAM multiplexer/demultiplexer logic and reduces internal data bus size by 50%.
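To illustrate how a faster burst rate could translate into a 50% smaller internal bus, the sketch below works backward from the 144 bit, 100 MHz baseline described earlier. The function name is illustrative, and the resulting 200 MHz core rate is an inference from that arithmetic, not a figure stated in the text.

```python
EXTERNAL_BITS = 18
EXTERNAL_RATE_MHZ = 800
BASELINE_BUS_BITS = 144  # eight times 18 bits, for a 100 MHz core


def core_rate_for_bus_width(external_bits: int,
                            external_rate_mhz: float,
                            internal_bus_bits: int) -> float:
    """Core data rate (MHz) needed so a bus of this width keeps the external bus fed."""
    mux_ratio = internal_bus_bits / external_bits
    return external_rate_mhz / mux_ratio


halved_bus_bits = BASELINE_BUS_BITS // 2
required_rate_mhz = core_rate_for_bus_width(
    EXTERNAL_BITS, EXTERNAL_RATE_MHZ, halved_bus_bits)
required_cycle_ns = 1000 / required_rate_mhz

print(halved_bus_bits, required_rate_mhz, required_cycle_ns)  # 72 200.0 5.0
```

Under these assumptions, halving the bus to 72 bits corresponds to four-way rather than eight-way multiplexing, which is one way to read the simplification of the multiplexer/demultiplexer logic described above.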
Particularly disclosed herein is a packet-based integrated circuit device comprising at least one dynamic random access memory bank having associated row and column decoders for specifying memory locations therein in response to externally supplied row and column addresses. The device includes at least one sense amplifier circuit coupled to the column decoder for reading data from the memory bank, a row register coupled to the sense amplifier circuit for retaining at least a portion of the data read out from the memory bank, a multiplexer circuit coupling the row register to an external data bus for supplying the read out data thereon and a demultiplexer circuit coupling the external data bus to the sense amplifier circuit for supplying data applied to the external data bus to the memory bank.