The present invention relates to a semiconductor memory device, such as a synchronous DRAM (Dynamic Random-Access Memory), which operates in synchronism with a clock signal.
Recently, synchronous DRAMs have been developed which can access data at as high a speed as the conventional SRAM (Static Random-Access Memory), thereby providing a high data bandwidth (i.e., the number of data bytes transferred per unit time). Hitherto, 16M-bit synchronous DRAMs and 64M-bit synchronous DRAMs have been put to practical use. The greatest advantage of a synchronous DRAM resides in that data can be read from it at a higher bit rate, referred to as "bandwidth," than from the ordinary DRAM. More precisely, the data latched to any bit line controlled by the column-system circuit of the memory cell array can be output to the input/output (I/O) pins within a shorter column cycle time than in the ordinary DRAM. In other words, the column cycle time (tCK) is shorter than in the ordinary DRAM.
A synchronous DRAM operates in synchronism with the leading edge of the clock signal supplied to the clock-signal input pin. In this respect the synchronous DRAM greatly differs from the conventional DRAM.
FIG. 12 shows a pipeline architecture, which is the circuit design most commonly used to reduce the above-mentioned column cycle time (tCK). This is a data-path architecture having three stages provided by dividing the data path at clock-cycle boundaries. The stages operate concurrently, each overlapping the others within the same cycle. In the first stage, a column address is designated and a column access is determined. In the second stage, a data-line pair is selected among the data-line pairs provided in the memory cell array, and data is amplified so that it may be read out. In the third stage, the amplified data is read out to the input/output pins.
In response to the first address A0 in the memory cell array, the pipeline architecture outputs data item DQ0 designated by the first address A0 and also data items DQ1, DQ2 and DQ3 that follow the item DQ0, one after another, at high speed. This high-speed data access is generally known as "burst reading."
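The overlap of the three stages during burst reading can be sketched with a toy cycle-by-cycle model. This is only an illustrative sketch, not the circuit of FIG. 12: the function name `burst_read`, the list standing in for the memory cell array, and the convention that cycle 0 is the cycle in which address A0 enters the first stage are all assumptions made for the example.

```python
def burst_read(memory, start, burst_len=4, stages=3):
    """Toy model of a pipelined burst read (hypothetical helper).

    Each entry of `pipeline` holds the address currently occupying one
    stage. One new address enters per clock cycle, and every address
    already in flight advances by one stage in that same cycle.
    Returns {cycle: data} for the cycles in which data occupies the
    final (output) stage.
    """
    pipeline = [None] * stages
    out = {}
    total_cycles = burst_len + stages - 1
    for cycle in range(total_cycles):
        # Advance every in-flight address by one stage.
        for s in range(stages - 1, 0, -1):
            pipeline[s] = pipeline[s - 1]
        # Feed the next burst address into the first stage, if any remain.
        pipeline[0] = start + cycle if cycle < burst_len else None
        # Data for the address in the last stage is driven to the I/O pins.
        if pipeline[-1] is not None:
            out[cycle] = memory[pipeline[-1]]
    return out


cells = ["DQ0", "DQ1", "DQ2", "DQ3"]
for cycle, data in burst_read(cells, start=0).items():
    print(cycle, data)
```

In this model the first data item needs three cycles to pass through the stages, after which one item is output in every cycle, which is the behavior the burst-reading description above relies on.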
A synchronous DRAM is characterized in that the column latency (CL) can be changed by means of mode-setting. The latency is the number of clock cycles between the clock cycle in which a read command is given and the clock cycle in which the data to be read is acquired. The latency is decreased in a system wherein the cycle of the clock signal cannot be shortened very much. Conversely, the latency is increased in a system to which a high-speed clock signal can be supplied. Usually, CL=2 in the first-mentioned system, and CL=3 in the second-mentioned system. In general, the cycle time tCK is inversely proportional to the latency. The shortest cycle time is 1/(100 MHz) (=10 ns) in a synchronous DRAM in which CL=3 and to which a 100 MHz clock signal is supplied, and is 1/(100 MHz*2/3) (=15 ns) in a synchronous DRAM in which CL=2 and to which a 100 MHz clock signal is supplied.
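The arithmetic relating latency and shortest cycle time can be checked with a few lines of code. The sketch below simply evaluates the expression 1/(f*CL/3) used above; the function name `min_cycle_time_ns` and its parameter names are hypothetical, and the 100 MHz default is the example figure from the text, not a property of any particular device.

```python
def min_cycle_time_ns(cl, f_cl3_mhz=100.0):
    """Shortest cycle time tCK in ns for column latency `cl`.

    Assumes, as in the example above, that the part runs at
    `f_cl3_mhz` when CL=3 and that tCK is inversely proportional
    to CL, i.e. tCK = 1 / (f_cl3 * CL / 3).
    """
    if cl <= 0:
        raise ValueError("column latency must be positive")
    # 1 / (f MHz) = 1000 / f ns; scale by 3 / CL.
    return 1000.0 * 3 / (f_cl3_mhz * cl)


for cl in (2, 3):
    print(f"CL={cl}: tCK = {min_cycle_time_ns(cl):.1f} ns")
```

For CL=3 this gives 10 ns and for CL=2 it gives 15 ns, matching the two figures quoted in the text.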
FIG. 13 shows a conventional pipeline architecture that meets the specification described above. In a synchronous DRAM, the same time elapses from the inputting of a column address to the outputting of the data to the input/output pins as in the conventional DRAM. The time is, for example, 30 ns. In the pipeline architecture of FIG. 13, the data bus is divided into two stages if CL=2 and into three stages if CL=3. In the conventional DRAM, the data path for transferring the data latched in the memory cell array to the input/output pins cannot be divided as freely as is possible in a microprocessor unit (MPU). This is why the data path is divided into three stages in most cases, as is illustrated in FIG. 12. Obviously, the conventional pipeline architecture can meet the specification of CL=3 if the data path is divided into three stages ST1, ST2 and ST3 as shown in FIG. 13. The first stage ST1 includes an address latch circuit 130a and a column decoder 130b. The second stage ST2 includes a transfer gate 130c, a latch circuit 130d, a data line 130e, and a read amplifier 130f. The third stage ST3 includes an output latch circuit 130g and an output drive circuit 130h. The address latch circuit 130a is driven by a clock signal CLK1. The transfer gate 130c is driven by a clock signal CLK2. The data line 130e is connected to the bit lines of the memory cell array (not shown). The output latch circuit 130g is driven by a clock signal CLK3.
When the pipeline architecture of FIG. 13 is set in the mode of CL=2, it is necessary to short-circuit either the first and second stages or the second and third stages, thereby reducing the number of stages to two. Generally, the second stage of the data bus of a DRAM is used for a long time to read data from the memory cell array, amplify the data thus read, and transfer the data to the input/output circuit. It is used longer than any other stage of the data bus and therefore has a small margin of cycle time. Hence, to switch the latency (CL) from 3 to 2, the power supply voltage Vcc is applied to the transfer gate 130c, instead of the clock signal CLK2. As long as the voltage Vcc drives the transfer gate 130c, the gate connects the first stage ST1 and the second stage ST2. This stage-connecting method is the most popular one for the conventional pipeline architecture, because it results in the simplest circuit structure.
However, the operating time available for each stage of the data path is, of course, limited to the cycle time tCK. In the case of a DRAM whose clock-signal frequency is 100 MHz, the operating time of each stage is 10 ns if the latency (CL) is 3, and the stage defined by short-circuiting the first stage ST1 and the second stage ST2 therefore requires an operating time of up to 20 ns (=10 ns+10 ns). If the latency (CL) is 2, however, each stage must operate within 15 ns, as may be understood from FIG. 13. Hence, with the conventional pipeline architecture, it is necessary to drive each stage at a sufficiently high speed when the latency (CL) is 3, so that each stage may operate well within 15 ns when the latency (CL) is set at 2. From the viewpoint of circuit design, however, it is difficult to drive each stage at such a high speed when the latency (CL) is 3.
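The timing problem described above can be made concrete with a small budget check. The following is a minimal sketch under stated assumptions: the helper names `effective_stages` and `meets_timing` are hypothetical, the stage delays are invented illustration values, and the CL=2 case merges ST1 and ST2 as in the conventional method in which the transfer gate is held open.

```python
def effective_stages(stage_delays_ns, cl):
    """Return the stage delays seen by the pipeline for latency `cl`.

    stage_delays_ns is (ST1, ST2, ST3). For CL=3 the stages are used
    as they are. For CL=2, ST1 and ST2 are short-circuited and
    behave as one combined stage.
    """
    st1, st2, st3 = stage_delays_ns
    if cl == 3:
        return [st1, st2, st3]
    if cl == 2:
        return [st1 + st2, st3]
    raise ValueError("only CL=2 and CL=3 are modeled")


def meets_timing(stage_delays_ns, cl, tck_ns):
    """True if every effective stage finishes within one cycle tCK."""
    return all(d <= tck_ns for d in effective_stages(stage_delays_ns, cl))


# Stages that just fit the 10 ns budget at CL=3 ...
print(meets_timing((10, 10, 10), cl=3, tck_ns=10))  # True
# ... form a 20 ns combined stage that misses the 15 ns budget at CL=2.
print(meets_timing((10, 10, 10), cl=2, tck_ns=15))  # False
# ST1 and ST2 must together fit in 15 ns (hypothetical 7 ns + 8 ns).
print(meets_timing((7, 8, 10), cl=2, tck_ns=15))    # True
```

The last case shows why the individual stages have to be considerably faster than the 10 ns that CL=3 alone would demand.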