The invention relates to a circuit arrangement for time delaying read data read from a semiconductor memory with a predetermined read latency. The invention also relates to a semiconductor memory circuit and a method.
In modern computer and software applications, there is an increasing need to process ever greater volumes of data in ever shorter times. To store the data, highly integrated memories such as synchronous dynamic random access memories (S-DRAMs) are used. S-DRAMs are standard memory chips which consist of highly integrated transistors and capacitors and which allow memory access without additional wait cycles. The data are transferred between the S-DRAM and an external data bus synchronously with an external clock signal.
FIG. 1 of the drawing shows a section of an S-DRAM as disclosed in German Patent No. 102 10 726 B4, in particular in FIG. 1 thereof; only the read path of the S-DRAM is shown in FIG. 1. The S-DRAM 1 contains a memory cell array 2. Data are read out of the memory cell array 2 via a read amplifier 3 and an internal data bus 4 clocked via an internal clock signal CLK. For the synchronous data output, the read data path contains a data buffer FIFO 5. The read data temporarily stored in the data buffer FIFO 5 are read out of the data buffer FIFO 5 via an off-chip driver (OCD) 6 and supplied via an external data bus 7 to other communication parties, for example a microcontroller, for further processing.
The data buffer FIFO 5 is driven by the read amplifier 3 via an input pointer INP and by a read latency generator 8 by means of an output pointer OUTP. The output pointer OUTP acts as a time-delayed data release signal. For controlling, and thus for adjusting, the read latency, the read latency generator 8 is connected at its input, via a decoder (not shown), to a mode register in which the latency information for the various operating modes of the S-DRAM is stored.
FIG. 2 of the drawing shows a schematic signal/time diagram for a read-out process initiated by a read command RD. In the case of a read access, the read data D0-D3 are read out of the memory cell array 2 with a known signal delay and pass via the read amplifier 3, the internal data bus 4 and the data buffer FIFO 5 to the inputs of the OCD driver 6. In FIG. 2, tAA designates this signal delay, that is to say the read-out time tAA which is needed for reading the read data D0-D3 out of the memory cell array 2 and supplying them to the OCD driver 6. The read-out time tAA is typically several clock pulses of the internal clock signal CLK long. The following thus applies: tAA > k*tCK, where tCK is the duration of a single clock pulse of the internal clock signal CLK and k is an integer.
The OCD driver 6 forwards the read data D0-D3 to the external data bus 7 with a further signal delay, which is also known. Here, tDP designates the propagation time through the OCD driver 6.
On the basis of the known signal delays tAA, tDP, the so-called read latency ΔT is defined. The read latency designates the period of time ΔT which is needed at a minimum for reading read data D0-D3 out of the memory cell array 2 and providing them at the output of the OCD driver 6, taking into consideration the signal delays tAA, tDP. This read latency ΔT is an integral multiple of one clock pulse of the internal clock signal CLK and thus of the period tCK, so that the following applies: ΔT = n*tCK.
However, the data D0-D3 are read out of the memory cell array 2 by using the internal clock signal CLK, whereas the data D0-D3 are read out at the output of the OCD driver 6 by using an external clock signal DLL-CLK. This external clock signal DLL-CLK is typically generated via a DLL circuit specially provided for this purpose. The internal clock signal CLK is typically synchronous with the external clock signal DLL-CLK. The read latency ΔT is therefore typically greater by ΔT1 than the sum of the read-out time tAA and the propagation time tDP, so that the following applies: ΔT = n*tCK > tAA + tDP, and thus ΔT = tAA + tDP + ΔT1.
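The relations ΔT = n*tCK and ΔT = tAA + tDP + ΔT1 can be illustrated with a short calculation. The following is only a sketch: the function names and the picosecond values are hypothetical, and it assumes that n is simply the smallest integer for which n*tCK covers tAA + tDP. In an actual device, n is taken from the mode register rather than computed.

```python
def read_latency_cycles(t_aa_ps: int, t_dp_ps: int, t_ck_ps: int) -> int:
    """Smallest integer n with n * tCK >= tAA + tDP (assumed rounding rule).

    Integer picoseconds are used so that the ceiling division is exact.
    """
    path_delay_ps = t_aa_ps + t_dp_ps
    return -(-path_delay_ps // t_ck_ps)  # ceiling division on integers


def latency_slack_ps(t_aa_ps: int, t_dp_ps: int, t_ck_ps: int) -> int:
    """The remainder ΔT1 = n * tCK - (tAA + tDP)."""
    n = read_latency_cycles(t_aa_ps, t_dp_ps, t_ck_ps)
    return n * t_ck_ps - (t_aa_ps + t_dp_ps)


# Hypothetical timing values, chosen only for illustration.
T_AA_PS, T_DP_PS, T_CK_PS = 13_750, 2_000, 2_500

n = read_latency_cycles(T_AA_PS, T_DP_PS, T_CK_PS)
delta_t1 = latency_slack_ps(T_AA_PS, T_DP_PS, T_CK_PS)
print(f"n = {n}, read latency = {n * T_CK_PS} ps, slack = {delta_t1} ps")
```

With these assumed values, tAA + tDP = 15750 ps spans 6.3 clock periods, so n = 7 and ΔT1 = 1750 ps. If tAA + tDP happened to be an exact multiple of tCK, this rounding rule would give ΔT1 = 0, whereas the text above states a strict inequality.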
This read latency ΔT is known. The read latency ΔT is typically generated by the read latency generator 8, which shifts the output pointer OUTP by the n clock pulses of the read latency ΔT with respect to the input pointer INP. These n clock pulses of the clock signal CLK are counted by a read latency counter 8 specially provided for this purpose.
In the implementation of a read latency counter, a FIFO-based concept is mostly used in which the chip-internal read signal RDint is shifted by the programmed read latency ΔT, controlled by the read latency generator 8, and is converted into the domain of the external clock signal DLL-CLK. FIG. 3 of the drawing shows this concept by means of a block diagram. FIG. 3 shows a data buffer FIFO 9 with four individual FIFO cells 9a, that is to say the FIFO depth is FT = 4. The clock domain is shifted, for example, by the input pointer INP0 opening the cell “0” of the data buffer FIFO so that the internal data signal RDint is then read in. At the same time, for example, the output pointer OUTP1 is active. The consequence is that the internal data signal RDint is read out of the cell 0 only three clock pulses later (see FIG. 3A), the premise being that the input pointers INP0-INP3 and the output pointers OUTP0-OUTP3 are each active in succession, one at a time, for the duration of one clock pulse.
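The pointer scheme described above can be sketched as a small simulation. This is a behavioural model only, not the circuit of FIG. 3: the function name and the token labels are hypothetical, and both pointer rings are assumed to advance by one cell per clock pulse. With four cells, and with OUTP1 active while INP0 is active, every token written into a cell leaves it three clock pulses later.

```python
def simulate_pointer_fifo(depth, first_out, tokens):
    """Model a ring of `depth` cells driven by rotating pointers.

    At clock t the input pointer opens cell t % depth and the output pointer
    opens cell (first_out + t) % depth. `first_out` must be in 1..depth-1 so
    that no cell is written and read in the same clock.
    Returns a list of (write_clock, read_clock, token) tuples.
    """
    if not 0 < first_out < depth:
        raise ValueError("first_out must be in 1..depth-1")
    cells = [None] * depth
    released = []
    for t in range(len(tokens) + depth):
        out_idx = (first_out + t) % depth
        if cells[out_idx] is not None:
            write_clock, token = cells[out_idx]
            released.append((write_clock, t, token))
            cells[out_idx] = None  # cell is free again after read-out
        if t < len(tokens):
            cells[t % depth] = (t, tokens[t])  # input pointer writes
    return released


# Four cells, OUTP1 active while INP0 is active (as in the example above).
for write_clock, read_clock, token in simulate_pointer_fifo(
    4, 1, ["RD_a", "RD_b", "RD_c", "RD_d"]
):
    print(f"{token}: written at clock {write_clock}, read at clock {read_clock}")
```

In this model the delay equals (depth - first_out) clock pulses, so the largest delay that can be produced is depth - 1.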
During a read-out process, the number of FIFO cells 9a necessary for reading out data corresponds, for example, to the maximum programmable read latency ΔT, depending on the implementation. The consequence is that for very high read latencies ΔT, the number of FIFO cells 9a must be selected to be correspondingly large. As shown in FIG. 3, however, the outputs of these individual FIFO cells 9a are all short-circuited to one another in order to combine the data signal RDint, read out internally, to form a common external data signal RDout. Thus, each output of a FIFO cell is connected to an external load which, overall, leads to increasingly worse edges, that is to say edges which are becoming flatter, of the data signals read out. An additional aggravating factor is that the respective output lines of the FIFO cells 9a can become very long with a very large FIFO depth, which can lead to undesirably high parasitic capacitances. This, too, has an increasingly negative effect, especially with a very large number of FIFO cells 9a.
The higher the operating frequency of the memory chip, the more these problems predominate, since with constant read-out times tAA and propagation times tDP the read latency ΔT, expressed in clock pulses of the frequency-dependent clock signal, becomes larger. This leads directly to a greater number of FIFO cells and thus to a greater FIFO depth. The read signal RDout read out at the output has increasingly flatter edges, particularly at very high operating frequencies and the associated large FIFO depth. This, in turn, makes it more difficult to synchronize the data signal RDout with the external clock signal DLL-CLK and thus to decode the external data signal RDout in a defined manner.
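The growth of the FIFO depth with the operating frequency can be made concrete with a short sweep. This is a sketch under stated assumptions: the path delay tAA + tDP and the clock periods are hypothetical values, the function names are illustrative, and the FIFO depth is assumed to equal the number of clock pulses needed to cover the path delay.

```python
def fifo_depth_for_clock(path_delay_ps: int, t_ck_ps: int) -> int:
    """Clock pulses needed to cover tAA + tDP, used here as the FIFO depth."""
    return -(-path_delay_ps // t_ck_ps)  # ceiling division on integers


def depth_sweep(path_delay_ps, clock_periods_ps):
    """Return (tCK, depth) pairs for a fixed path delay."""
    return [(t_ck, fifo_depth_for_clock(path_delay_ps, t_ck))
            for t_ck in clock_periods_ps]


# Hypothetical constant path delay tAA + tDP = 15750 ps; the clock period
# is halved twice, i.e. the operating frequency is doubled twice.
for t_ck_ps, depth in depth_sweep(15_750, [5_000, 2_500, 1_250]):
    print(f"tCK = {t_ck_ps} ps -> {depth} FIFO cells")
```

For these assumed values the depth rises from 4 to 7 to 13 cells as the clock period shrinks, which is the trend described above: the path delay stays the same while the number of cells whose outputs share one node increases.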