Dynamic random access memory circuits (DRAMs) are used in computers and other electronic machines needing temporary storage of data. These circuits have advantages over other types of memory circuits in that they provide the greatest density of memory cells for a given area of semiconductor, a low relative cost-per-bit of stored data, and relatively high speed. DRAMs have increased in both size and operating speed to match the demands of system designers using modern microprocessors, which often have clock rates in excess of 100 MHz. Indeed, with each new generation of DRAM, the number of memory cells on the integrated circuit increases by a factor of four. In an effort to accommodate systems that demand more and faster data, the industry has turned to DRAMs that synchronize the transfer of data, addresses, and control signals with a clock signal, one that is typically tied to the microprocessor if the system is a computer.
While it is desirable to tie the functioning of the memory to an external clock to speed data transfer and synchronize data input and output, the complexity of array access and the routing parasitics that result from the size of the circuits that must be accessed to store or retrieve data make it difficult for a DRAM to respond on every cycle of a high-frequency clock. A solution to this problem is to allow the memory operation to be delayed by a given number of clock cycles, while still having the memory store or retrieve data on the clock cycle expected by the system designer. This delay in synchronous DRAMs is referred to as "latency." It is a common design practice for the latency of the memory circuit to be selectable by the system designer, typically as 1, 2, 3, or 4 clock cycles, depending, for example, upon the operating frequency of the microprocessor upon which a computing system is based.
The conventional method for implementing latency in memory circuits has been to insert memory registers, similar to D-type flip-flops, in the input/output data paths. For example, if the system latency requirement is three system clock cycles for receiving read data, two registers are placed in each of the memory circuit's output data paths (the latency is three because during one cycle the data occupies a sense amplifier). For a system latency of two, only one register per data path is required. Latency in the write function is achieved through separate registers in the input data path. The memory circuit designer makes the latency selectable by providing circuitry that either includes more registers in the data path to increase the latency of the memory circuit, or bypasses a set of registers when less latency is needed.
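The register-based scheme described above can be illustrated with a small behavioral sketch. This is only a toy model under stated assumptions, not a description of any actual circuit: the function names `registers_in_read_path` and `simulate_read_path` are hypothetical, the sense amplifier is modeled as the first pipeline stage, and each register as one further stage that shifts on every clock edge.

```python
from collections import deque


def registers_in_read_path(latency):
    """Registers needed per output data path: one cycle is spent in the sense amplifier."""
    if latency < 1:
        raise ValueError("latency must be at least one clock cycle")
    return latency - 1


def simulate_read_path(latency, sensed_words):
    """Return the value seen at the output on each clock cycle.

    sensed_words[i] is the word entering the sense amplifier on cycle i
    (None when no read is in flight). Assumption: every stage shifts once
    per clock edge, so a word sensed on cycle i appears on cycle i + latency.
    """
    # Stage 0 models the sense amplifier; the remaining stages model the registers.
    stages = deque([None] * (1 + registers_in_read_path(latency)))
    outputs = []
    for word in sensed_words:
        outputs.append(stages.pop())  # value leaving the last stage on this edge
        stages.appendleft(word)       # newly sensed word enters the sense amplifier
    return outputs


if __name__ == "__main__":
    reads = ["A", "B", None, None, None]
    print(simulate_read_path(3, reads))  # two registers per path
    print(simulate_read_path(2, reads))  # one register per path
```

In this model, selecting a lower latency corresponds to building the pipeline with fewer register stages, which mirrors bypassing a set of registers in the data path.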
While conceptually simple, the circuitry for implementing the conventional latency scheme is cumbersome and occupies more die space than is desired. For example, each register requires approximately ten transistors for implementation. A memory circuit with 32 data lines, a four-cycle read latency, and a one-cycle write latency would require approximately 128 registers, or 1280 transistors, for implementation using the conventional approach. An added complication is that each data path on every die should be thoroughly tested prior to shipment to a customer. The large number of transistors involved in the latency circuitry adversely affects the yield and increases the test time of DRAMs. These problems motivate the need for a new approach to memory circuit designs.
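The register and transistor counts quoted above can be reproduced with a short calculation. The sketch below assumes that each read path needs one register fewer than the read latency (the remaining cycle being spent in the sense amplifier) and that each write path needs one register per cycle of write latency; the second assumption is not stated in the text but is consistent with the total of 128 registers. The function and constant names are hypothetical.

```python
TRANSISTORS_PER_REGISTER = 10  # approximate figure given in the text


def latency_register_count(data_lines, read_latency, write_latency):
    """Estimate the registers needed by the conventional latency scheme."""
    read_registers = read_latency - 1  # one read cycle is spent in the sense amplifier
    write_registers = write_latency    # assumption: one register per write-latency cycle
    return data_lines * (read_registers + write_registers)


if __name__ == "__main__":
    registers = latency_register_count(data_lines=32, read_latency=4, write_latency=1)
    transistors = registers * TRANSISTORS_PER_REGISTER
    print(registers, transistors)  # 128 1280
```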