The present invention generally relates to electronic systems utilizing memory devices. More particularly, this invention relates to a method and memory architecture that enables multiple pages to be open on the same bank of a memory device.
High-density random access memories (RAM) are widely used in computers, communications equipment, and consumer and industrial products. These memories are mostly volatile (they lose stored data when power is switched off). The cost-sensitive, medium-performance (high bandwidth with moderate latency) segments of this market have been served by multibank SDRAM's (synchronous dynamic random access memories). SDR SDRAM's, DDR-I SDRAM's, DDR-II SDRAM's, QDR SDRAM's, Rambus™ DRAM's, Network DRAM's™, FCRAM's™, and RLDRAM's™ are among the numerous varieties of DRAM available in the commercial market. In addition to DRAM, other types of memory in use include high-density flash memory products, which provide nonvolatile storage at reasonably high densities. High-density flash memories are used in portable electronic appliances such as cell phones, digital cameras, etc.
Emerging applications in computer and communications processors (CPU's, NPU's) will increasingly require (in addition to bandwidth and fairly low latency at low cost) the ability to open new rows in the currently active banks in these SDRAM's (single-chip IC's, DIMM modules, RIMM modules, or other memory subsystems comprising multiple SDRAM's) for superior performance. Specifically, the advent of hyperthreading, multithreading, multichannel-DRAM access, shared memory and other processor innovations is making essential a different SDRAM architecture with a multiple open-page capability. However, current DRAM architectures and timings require “closing” the currently active row/page in a bank before a new row/page can be “opened” in the same bank.
Current leading-edge general-purpose SDRAM's are multibank devices designated as “DDR-I” or just “DDR” (double data rate) for the current generation of devices and “DDR-II” or “DDR2” for the emerging new generation. Four internal banks are the current standard; eight-bank DDR-II devices are now being introduced. Herein, “DDR” refers generically to both DDR-I and DDR-II, unless otherwise indicated. References to SDRAM's are to all SDRAM's including but not limited to DDRs, unless otherwise indicated.
As is true of SDRAM's in general, and assuming proper memory controller support, any one or more of the internal banks can be activated by an Active command at one (but only one) row address at a given time. The row addresses can be the same or different for any or all of the banks. This basic “one open row per bank at a time” limitation arises from the inherent properties of the one-transistor, one-capacitor (1T/1C) storage cells that comprise the basic storage unit of the SDRAM core, combined with the peripheral architecture of the multibank SDRAM. Under that peripheral architecture, all banks have to share the same, single data path to the external world. Even with separate data I/O circuitry for each bank, only one bank at a time can accept Read or Write commands from the memory controller. The 1T/1C memory cell, which is the essence of SDRAM architecture in general for cost and density reasons, imposes the further “one open row per bank” limitation. These limitations apply to SDR SDRAM's as well as DDR SDRAM's, and to other DRAM's using the 1T/1C memory cell as the basic storage element. If a memory subsystem has memory devices mounted in separate “ranks” (on a module) or in two (or possibly more) modules inserted into separate slots on the motherboard, the capacity of the memory is correspondingly increased and the number of open pages can be increased to a maximum of the number of ranks times the number of internal banks per device. In addition to “restore after read,” a sense/read/write amplifier dedicated to precharge of bit line(s) as well as sense/restore functions aggravates the latency limitation.
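The upper bound on open pages described above can be illustrated with a minimal sketch. The function name and the example configuration (two ranks of four-bank devices) are hypothetical and chosen only for illustration.

```python
def max_open_pages(ranks: int, banks_per_device: int) -> int:
    """Upper bound on simultaneously open pages.

    Each bank can hold one open row, and each rank contributes its own
    set of banks, so the bound is ranks * banks_per_device.
    """
    if ranks < 1 or banks_per_device < 1:
        raise ValueError("ranks and banks_per_device must be positive")
    return ranks * banks_per_device


# Hypothetical subsystem: 2 ranks of four-bank devices.
print(max_open_pages(ranks=2, banks_per_device=4))  # 8
```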
In order to activate a bank for reading from or writing to one of its rows, the bank must first be charged in order to allow the chip to sense the particular row and amplify the signal from that row. Rather than waiting for a Read/Write request before being charged, a memory bank is usually precharged in advance by a “Precharge” command. Opening a row in an SDR or DDR SDRAM activates a “page,” which is a 2-dimensional array of bits defined by the length of the row in one dimension, and the “width” of the device (being equal to the number of I/O pins in the device) in the second dimension. The length of the row is determined by the number of columns that intersect the row, and the number of column addresses equals 2 to the power of the number of column address lines. In total number of bits, the size of the page is, therefore, equal to the number of column addresses multiplied by the number of I/O pins. Whenever data is read from or written to a device, the specific column number within a row address is furnished and a string of bits equal in length to the number of I/O pins can be retrieved from or stored in the device. This I/O is combined with I/O from or to the other devices on the memory module(s) comprising the SDRAM memory subsystem, so as to deliver the total package of bits (generally 64 bits in the case of conventional SDRAM memory subsystems, though with important exceptions) onto the memory bus in the case of a Read or store it in the module(s) in the case of a Write.
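The page-size relationship described above can be sketched as follows. The device parameters used in the example (10 column address lines, 8 I/O pins) are assumptions made for illustration and are not taken from any particular data sheet.

```python
def page_size_bits(column_address_lines: int, io_pins: int) -> int:
    """Page size in bits: (2 ** column address lines) * number of I/O pins."""
    columns = 2 ** column_address_lines
    return columns * io_pins


def devices_per_bus(bus_width_bits: int, io_pins: int) -> int:
    """Number of devices whose I/O is combined to fill the memory bus."""
    if bus_width_bits % io_pins != 0:
        raise ValueError("bus width must be a multiple of the device width")
    return bus_width_bits // io_pins


# Hypothetical x8 device with 10 column address lines.
bits = page_size_bits(column_address_lines=10, io_pins=8)
print(bits)                     # 8192 bits per page
print(bits // 8)                # 1024 Bytes per page
print(devices_per_bus(64, 8))   # 8 devices for a 64-bit bus
```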
Closing of any bank is initiated by a Precharge command, which means that the command bus will be busy during the Precharge command. In SDRAM's in general, precharging introduces a delay or latency before the new bank can be given an Activate command whereby the process of opening the new bank is started. This “Precharge Latency” is often referred to symbolically as tRP, and is typically in the range of 2 to 5 clock cycles. There are numerous timing parameters that define the actual performance of SDRAM's in general and DDR-II devices in particular. The interaction of these parameters varies considerably depending on the exact circumstances of a given access to the memory subsystem. The interaction is extremely complex, and requires many timing diagrams for a rigorous explanation. Only certain timing parameters will be discussed herein, and their explanations will necessarily be simplified and instructive only in their coverage of some of the various combinations of the timing parameters. Even if the Precharge can be hidden behind an ongoing output burst, the command bus shared by all devices will be busy, and thus prevent the issuing of other commands. Conflicting needs for commands during the same cycle are usually referred to as “command bus contention” and are a significant component of tRP and other limitations.
Following satisfaction of this latency and upon issuance of the Activate command, there begins a second latency period, often described as the “Active to Read or Write command delay,” or “RAS-to-CAS delay,” or a similar term, which is generally symbolized by tRCD. This latency is generally in the range of 2 to 5 clocks, and must be satisfied before a Read or Write command can be issued. In order to avoid command bus contention, starting with DDR-II, the Read or Write command can be issued as early as the next clock cycle following the Activate command. However, since tRCD needs to be satisfied before the execution of the Read or Write command can commence, the command needs to be pushed out internally within the device according to a predefined “Additive Latency” (AL). This mode of operation is referred to as postponed or “posted” CAS mode.
Following this comes another latency period, generally known as “/CAS Latency” (CL), a programmed value generally 2 to 4 clocks in length, which begins with the commencement of internal execution of the Read or Write command, and extends until the data is placed on the memory bus for sending to the processor in the case of a Read or written into the SDRAM in the case of a Write. In the nomenclature of the current JEDEC DDR-II Standard, and commonly in industry specification sheets also, the sum of AL and CL equals the “Read Latency” or “Read Access of the first Critical Word.” The equivalent latency parameter for Writes, “Write Latency,” equals Read Latency minus 1.
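The relationships among AL, CL, Read Latency and Write Latency stated above can be written out as a short sketch. The function names and the example values (AL of 2 clocks, CL of 3 clocks) are illustrative assumptions only.

```python
def read_latency(additive_latency: int, cas_latency: int) -> int:
    """Read Latency (RL) = Additive Latency (AL) + /CAS Latency (CL), in clocks."""
    return additive_latency + cas_latency


def write_latency(additive_latency: int, cas_latency: int) -> int:
    """Write Latency (WL) = Read Latency - 1, in clocks."""
    return read_latency(additive_latency, cas_latency) - 1


# Hypothetical programmed values: AL = 2 clocks, CL = 3 clocks.
print(read_latency(2, 3))   # 5
print(write_latency(2, 3))  # 4
```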
In a DDR device, depending on speed grade and other factors, the access latencies typically total in the range of 4 to 7 or more clocks to which the Precharge latencies (2 to 4 clocks) need to be added. The total is generally referred to as the “Active-to-Active command interval—Auto—Precharge,” the “bank cycle time,” or a similar term, generally symbolized as tRC. In short, the bank cycle time refers to the number of clocks required between two consecutive accesses to different rows in the same Bank, irrespective of whether they are Reads or Writes.
To be 100% efficient, an SDRAM memory subsystem would need to deliver data to the memory bus (in the case of Reads) or to the SDRAM array (in the case of Writes) on every clock cycle so as to respond fully to the processor's need to load (Read) or store (Write) data. In the case of DDR, the requirement is to deliver 2 bits of data per clock cycle at each I/O pin in the memory subsystem. If it took 9 to 12 or more clocks to deliver this quantity of data, efficiency would be near or below an unacceptable 10%. To improve on this in the case of consecutive accesses to the same bank, the bank is left activated/open after the first access, without Precharge. The first access to the bank incurs the full latencies, but accesses to the same bank thereafter can be made contiguous or nearly so by using the “burst” technique. Under this technique, just the starting column address for a burst of 2, 4 or 8 column addresses (4 or 8 for DDR-II) is strobed. The tRP and tRCD latencies are not incurred again as long as the row/page remains open, and because of the extremely high speed of internal SDRAM operation, successive column accesses equal to the length of the burst are enabled. Since only a single column address is strobed, /CAS Latency also need not be incurred again. Using this technique, once all latencies have been satisfied one time and accesses thereafter are confined to sequential column segments in the same row/page, the device can keep pace with the processor's demands for Reads and Writes and processor clock cycles are not wasted. If the device is programmed for “interleaved” burst accesses, the columns within a burst sequence are actually accessed in a numerically non-consecutive, although pre-determined, order.
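Two points from the paragraph above can be made concrete with a short sketch: the efficiency figure when every access pays the full latency, and the column order within a burst. The function names are hypothetical. The sequential (wrap-around) and interleaved (exclusive-OR) orderings shown follow the convention commonly tabulated in SDRAM data sheets and are assumed here, since the exact ordering for a given device is defined by its own data sheet.

```python
def unburst_efficiency(clocks_per_access: int) -> float:
    """Fraction of clocks carrying data if each access costs the full latency."""
    return 1.0 / clocks_per_access


def burst_order(start_column: int, burst_length: int, interleaved: bool = False):
    """Column addresses accessed by one burst (assumed data-sheet convention)."""
    if burst_length not in (2, 4, 8):
        raise ValueError("burst length must be 2, 4 or 8")
    mask = burst_length - 1
    base = start_column & ~mask    # burst-aligned block containing the start
    offset = start_column & mask   # position of the start within that block
    if interleaved:
        # Interleaved: offset XOR counter, non-consecutive but pre-determined.
        return [base | (offset ^ i) for i in range(burst_length)]
    # Sequential: count upward and wrap within the aligned block.
    return [base | ((offset + i) & mask) for i in range(burst_length)]


print(round(unburst_efficiency(9), 3))       # 0.111
print(round(unburst_efficiency(12), 3))      # 0.083
print(burst_order(5, 4))                     # [5, 6, 7, 4]
print(burst_order(5, 4, interleaved=True))   # [5, 4, 7, 6]
```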
If the burst length is 4 or 8, as in the case of a DDR-II device, subsequent commands and their related latencies (including tRP when a Precharge command is involved) can be partially or (especially if the burst length is 8) entirely hidden behind the burst operation.
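A first-order sketch of how much of a latency a burst can hide is given below. It assumes that a DDR burst of a given length occupies half that many clocks (two transfers per clock) and that a latency is hidden clock for clock behind the burst. It ignores command bus contention and every other timing constraint, and the tRP value of 3 clocks is a hypothetical example.

```python
def exposed_latency_clocks(latency_clocks: int, burst_length: int,
                           transfers_per_clock: int = 2) -> int:
    """Clocks of a latency that remain visible after hiding it behind a burst."""
    burst_clocks = burst_length // transfers_per_clock
    return max(0, latency_clocks - burst_clocks)


# Hypothetical tRP of 3 clocks.
print(exposed_latency_clocks(3, burst_length=4))  # 1 (partially hidden)
print(exposed_latency_clocks(3, burst_length=8))  # 0 (entirely hidden)
```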
On the basis of the above, largely uninterrupted back-to-back transactions to or from the processor are possible, assuming the bursted data is fully usable by the processor. However, this assumption is not correct in many cases. Typically, a data word is 8 Bytes (64 bits) wide from a memory DIMM or RIMM. A burst length of 8 therefore requires instruction execution on 64 Bytes of data in the same page to take maximum advantage, a very unlikely event in most applications. Moreover, in the case of burst Writes, a Write Recovery Latency (tWR), typically of 3 clocks after the last burst Write access, must be satisfied before a subsequent Precharge command can be issued, and that Precharge must precede the access of any other row in the same bank. This is a “non-hideable” 3-clock latency imposing a 6-bit interruption (in the case of DDR) per I/O pin of the data flow, a significant performance loss. Before the current leading edge DDR-II, the AC operating characteristics of the DDR SDRAM were somewhat simpler and entailed use of several fewer timing parameters, but the basic issues described above also arose in substantially the same way.
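The two figures quoted above (64 Bytes per burst and a 6-bit interruption per I/O pin) follow from simple arithmetic, sketched here. The function names are hypothetical, and the default values restate the numbers given in the text.

```python
def burst_bytes(word_bytes: int = 8, burst_length: int = 8) -> int:
    """Data delivered by one burst: word width times number of transfers."""
    return word_bytes * burst_length


def lost_bits_per_pin(stall_clocks: int, transfers_per_clock: int = 2) -> int:
    """Bits per I/O pin that could have been transferred during a stall."""
    return stall_clocks * transfers_per_clock


print(burst_bytes())          # 64 Bytes for an 8-Byte word and a burst of 8
print(lost_bits_per_pin(3))   # 6 bits per pin for a 3-clock tWR on DDR
```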
If the row/page to be newly opened is located in a different bank of the device, the time period between the Activate commands in the old and new banks is often called the “Active bank A to active bank B command period” or “Row-to-Row Delay” (tRRD), which is needed to satisfy the row decoder latency. tRRD specifies the latency of jumping from an open row in one bank to an open row in a different bank within the memory subsystem space. tRRD is typically on the order of 2 clocks and can usually be hidden behind ongoing data bursts, and therefore is likely to constitute a problem only in cases of random, single-word accesses.
Current SDRAM's do offer a programmable Read or Write Auto-Precharge function which in some cases reduces the tRP latency by hiding part of it. However, this is at the cost of closing the currently open row, thereby negating the advantage of the open page architecture, namely the extremely fast Reads/Writes of sequences of data which are located contiguously (i.e., in the same open page) in the SDRAM. Therefore, while the Auto-Precharge function can hide tRP latency and permit faster random accesses, it also defeats the principal advantage of the open page organization.
In view of the above, the performance of RAM devices can be limited by the inability to hold multiple pages open on the same bank of the memory subsystem. A worst case scenario would be encountered if data from two pages are requested where, after closing the first page and moving to the second page, the first page is needed again for additional data. The possibility of incurring this access pattern also constitutes the main performance limitation of pseudo-SRAM (pseudo static random access memory, or PSRAM) since it disallows single random accesses without satisfying the minimum RAS pulse width. Flash memory devices, both NOR and NAND types, also use a “page” architecture for accessing the memory device. Therefore, as with DRAM devices, flash memory devices also suffer from the inability to hold multiple pages open on a single bank.
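The worst-case pattern described above, in which accesses alternate between two pages of the same bank, can be illustrated with a toy model that counts how many times a row must be opened. The model is an assumption made purely for illustration: it tracks a single bank, treats every opening of a row as one miss, and uses a least-recently-used replacement policy for the hypothetical case of more than one open row. It is not a description of any particular device.

```python
from collections import OrderedDict


def count_row_misses(row_sequence, open_rows_per_bank: int = 1) -> int:
    """Count row openings for a sequence of row accesses to one bank."""
    open_rows = OrderedDict()  # rows currently open, least recently used first
    misses = 0
    for row in row_sequence:
        if row in open_rows:
            open_rows.move_to_end(row)      # page hit: row is already open
            continue
        misses += 1                         # page miss: row must be opened
        if len(open_rows) >= open_rows_per_bank:
            open_rows.popitem(last=False)   # close the least recently used row
        open_rows[row] = True
    return misses


pattern = ["A", "B", "A", "B", "A", "B"]
print(count_row_misses(pattern, open_rows_per_bank=1))  # 6: every access misses
print(count_row_misses(pattern, open_rows_per_bank=2))  # 2: only the first two miss
```

With one open row per bank, each access in the alternating pattern closes the page that the next access needs, so every access pays the Precharge and Activate latencies. With two open rows in this toy model, only the first access to each page does.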