The present invention relates to memories, and in particular to memory read operations.
To increase the read operation bandwidth, multiple data items can be prefetched in parallel from a memory array for a serial output. For example, in DDR2 (double data rate 2) synchronous dynamic random access memories (DRAMs), four data bits are prefetched in parallel for a serial output on the rising and falling edges of a clock signal in a burst read operation. DDR2 is defined in the DDR2 standard JESD79-2A (JEDEC Solid State Technology Association, January 2004), incorporated herein by reference. The DDR2 memory is pipelined, and the next read command can be issued to the memory before completion of the data output for the previous read command. Therefore, care must be taken to ensure that the prefetched data does not overwrite the data from the previous prefetch operation. Further, the DDR2 specification requires the memory to provide a variable, user-programmable latency (“CAS latency”) defined as a latency between the receipt of the read command and the start of the serial data output. See FIG. 1 showing the data timing for the CAS latency (“CL”) values 2, 3, 4, and 5 and a burst length of 4 for three read commands Ra, Rb, Rc issued on the rising edge of respective clock cycles 0, 2, and 4. Terminal DQ is an output terminal (actually an input/output terminal). The read data D0–D3 are marked as “A DATA” for command Ra, “B DATA” for command Rb, and “C DATA” for command Rc. The data are driven on the DQ terminal beginning in cycle 2 for CL=2, beginning in cycle 3 for CL=3, beginning in cycle 4 for CL=4, and beginning in cycle 5 for CL=5. (The data can actually be driven slightly earlier to ensure that the data are valid on the rising edge of the respective CLK cycle.) The programmable CAS latency requirement complicates the data output pipeline.
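The timing relationship described above can be illustrated with a short behavioral sketch. It is not part of the DDR2 standard or of any circuit described herein; the function name `burst_schedule` and the `COMMANDS` table are hypothetical, and the sketch assumes that the first bit D0 is driven on the rising edge of the cycle equal to the issue cycle plus the CAS latency, with subsequent bits alternating between falling and rising edges.

```python
def burst_schedule(issue_cycle, cas_latency, burst_length=4):
    """List (clock cycle, clock edge, data bit) for one burst read.

    Assumes the first bit is driven on the rising edge of cycle
    issue_cycle + cas_latency and that bits alternate between the
    rising and falling edges (two bits per clock cycle).
    """
    first_cycle = issue_cycle + cas_latency
    schedule = []
    for bit in range(burst_length):
        cycle = first_cycle + bit // 2
        edge = "rising" if bit % 2 == 0 else "falling"
        schedule.append((cycle, edge, f"D{bit}"))
    return schedule


# Read commands Ra, Rb, Rc issued in clock cycles 0, 2, and 4.
COMMANDS = (("A", 0), ("B", 2), ("C", 4))

if __name__ == "__main__":
    for cl in (2, 3, 4, 5):
        print(f"CL={cl}")
        for label, issue_cycle in COMMANDS:
            slots = ", ".join(
                f"{bit}@{cycle}{edge[0]}"
                for cycle, edge, bit in burst_schedule(issue_cycle, cl)
            )
            print(f"  {label} DATA: {slots}")
```

Under these assumptions, for CL=2 the A DATA occupy cycles 2 and 3 and the B DATA occupy cycles 4 and 5, so consecutive bursts are output back to back on the DQ terminal. For the larger CL values, a later read command is received before the data output for an earlier command has started, which is one way to see why the prefetched data must be buffered without overwriting earlier data.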
U.S. Pat. No. 6,600,691 B2 issued Jul. 29, 2003 to Morzano et al. describes a data output pipeline circuit with two stages, each stage having four latches for the respective four prefetched data bits. The four bits are written in parallel to the first stage, and from the first stage to the second stage. Then the data are converted to the serial format and written out to the output terminal. Control signals are generated to control the two stages and the parallel-to-serial conversion to provide the required timing for different CAS latencies and ensure that the subsequent data do not overwrite the previous data.
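The data flow of such a two-stage arrangement can be sketched behaviorally as follows. This is a simplified illustration only and does not reproduce the control signals or circuitry of the cited patent; the class name `TwoStageOutputPipeline` and its methods are hypothetical, and modeling an overwrite as a raised error is an assumption made for clarity.

```python
class TwoStageOutputPipeline:
    """Toy model: two stages of four latches, then serial output."""

    WIDTH = 4  # four prefetched data bits per stage

    def __init__(self):
        self.stage1 = None  # None means the stage holds no pending data
        self.stage2 = None

    def load_stage1(self, bits):
        """Write four prefetched bits in parallel into the first stage."""
        bits = tuple(bits)
        if len(bits) != self.WIDTH:
            raise ValueError("expected four prefetched bits")
        if self.stage1 is not None:
            raise RuntimeError("stage 1 still holds the previous prefetch")
        self.stage1 = bits

    def advance(self):
        """Transfer the four bits from the first stage to the second."""
        if self.stage1 is None:
            raise RuntimeError("stage 1 is empty")
        if self.stage2 is not None:
            raise RuntimeError("stage 2 still holds unsent data")
        self.stage2, self.stage1 = self.stage1, None

    def serialize(self):
        """Convert the second-stage bits to serial order and release them."""
        if self.stage2 is None:
            raise RuntimeError("stage 2 is empty")
        bits, self.stage2 = self.stage2, None
        return list(bits)


if __name__ == "__main__":
    pipe = TwoStageOutputPipeline()
    pipe.load_stage1([1, 0, 1, 1])   # data for a first read command
    pipe.advance()
    pipe.load_stage1([0, 0, 1, 0])   # next prefetch while stage 2 is busy
    print(pipe.serialize())
    pipe.advance()
    print(pipe.serialize())
```

In this sketch, the second prefetch can be loaded into the first stage while the second stage still holds the earlier data, and each stage refuses a write until its contents have been passed on. In an actual circuit, this ordering would be enforced by the timing of the control signals rather than by error checks.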