The present invention relates to a semiconductor device in which dynamic RAM (DRAM) cells are integrated, and particularly to a DRAM which can improve data transfer efficiency in read/write mixed cycles during high-speed random access.
Of the MOS type semiconductor memory devices, the DRAM is the most highly integrated, since the memory cells constituting the device are comparatively simple in structure. Hence, at present, the DRAM is used as the main memory of virtually every type of computer equipment. Recently, as the performance of the microprocessor (MPU) has rapidly improved, various DRAMs having high-speed data cycle functions have been proposed, or their mass production has begun, in order to increase memory bandwidth. Typical examples of these DRAMs are the synchronous DRAM (hereinafter referred to as an SDRAM) and the double data rate SDRAM (hereinafter referred to as a DDR-SDRAM). The SDRAM receives and transmits input and output information in synchronism with a system clock. The DDR-SDRAM performs a similar operation but uses both the rising and falling edges of the clock as triggers.
Further, a Rambus DRAM (hereinafter referred to as an RDRAM) and the like have been developed, which can transfer data at higher speed by means of protocol-based commands. Therefore, conventional asynchronous DRAMs will inevitably be replaced by synchronous DRAMs in the future.
The synchronous DRAMs are characterized by a very high maximum bandwidth (data transfer rate). For example, the latest SDRAM achieves a maximum bandwidth of 100 Mbps.
Further, the maximum bandwidth is expected to reach 200 Mbps in a DDR-SDRAM and 800 Mbps in an RDRAM.
However, such a high bandwidth is limited to burst access along a specific row direction in the memory space.
In other words, in random access, wherein the row address is changed, the access speed is as low as that of a conventional asynchronous DRAM. To compensate for this slow access, a computer system including a DRAM as its main memory generally employs a hierarchical memory structure.
More specifically, a cache memory comprising an SRAM, which is accessible at a higher speed than a DRAM, is interposed between the MPU and the DRAM, and part of the information stored in the DRAM is cached in the SRAM. In this structure, the MPU generally accesses the faster cache memory; it accesses the DRAM only when it receives an access command for an address space which is not cached by the cache memory. With this structure, even if there is a difference in speed performance between the MPU and the DRAM, the performance of the computer system can be considerably improved.
However, in the case of a cache miss, information must be read from the DRAM. In particular, when another address in the same block of the DRAM memory space is accessed, the waiting time of the MPU becomes longest. The problem of this waiting time in, for example, an SDRAM will be described below with reference to FIG. 1.
FIG. 1 shows an example of the timing chart of a read operation of an SDRAM. In the aforementioned computer system using the hierarchical memory structure, if a cache miss occurs and the SDRAM serving as the main memory must be accessed, a precharge command (PRECHARGE) is issued from the system at a time t1 to precharge a currently active address of the memory. After a predetermined time elapses, an activate command (ACTIVE) is issued from the MPU, so that the bank corresponding to the required memory space is activated. Further, after the elapse of another predetermined time, a read command (READ) is issued. At a time t2, after a predetermined time has elapsed since the read command, data of a predetermined burst length is read from the SDRAM in synchronism with the clock.
As shown in FIG. 1, the maximum bandwidth is very high when data is read successively in synchronism with the clock. However, in the case of a cache miss, the practical bandwidth with respect to random access is considerably low. In other words, the period between the times t1 and t2, during which no data is read out, namely, the waiting time of the MPU, is long.
In the case of an SDRAM with the specification shown in FIG. 1, the maximum bandwidth during random access is only 36% of that during burst access. This slow access is highly likely to become a bottleneck for further improvement of the performance of the computer system.
In consideration of the above situation, there is an increased demand for a high-performance DRAM which realizes higher access speed and a shorter cycle time. In particular, in a multi-MPU system such as a current high-performance server machine, not only high-speed burst transfer but also high-speed random access is regarded as very important. Further, in future household multimedia systems intended mainly for real-time reproduction of moving images, there will be a similar demand for a DRAM that allows high-speed random access.
DRAMs intended to meet such a demand include the enhanced SDRAM (hereinafter referred to as an ESDRAM) shown in FIG. 2, published by Enhanced Memory Systems Inc., and the virtual channel memory (hereinafter referred to as a VCM) shown in FIG. 3, published by NEC Corporation.
In the ESDRAM, however, each bank incorporates an SRAM cache 101, as shown in FIG. 2. In the VCM, sixteen 1K caches 102 comprising register circuits are mounted. Thus, a DRAM of this kind carries a great number of cache memories in addition to the conventional DRAM memory cell array. Since the high-speed access and short cycle are realized by these many cache memories, the overhead in chip size is large. Therefore, it is difficult to lower the cost.
Both high-speed random access and low cost can be achieved by a method in which the "page cycle" function, an operation mode of the conventional DRAM, is abandoned. According to this method, as soon as the minute cell data has been detected and amplified in the DRAM operation, a precharge operation is automatically started.
More specifically, as shown in FIG. 4, when a read command (RCMD#1) is issued at a time t1, activation of a word line (WL) is started and cell data is read out to a pair of bit lines (bBL/BL). Thereafter, a sense amplifier is activated at a time t2. When the cell data is detected by the sense amplifier, a column selection line (CSL) is activated at a time t3, and the bit line data is transferred to a data line (not shown) in the chip and output through the data line to the outside of the chip. The sense amplifier amplifies the cell data to the desired voltage while the data is transferred through the data line to the output section of the chip. When the amplification is completed at a time t4, a series of precharge operations, e.g., inactivation of the word line (WL) and precharge of the bit lines, is automatically started. Thus, although the DRAM does not have a page access function, a series of access sequences can be completed in the minimum time, resulting in high-speed random access in a short cycle.
Further, an improved synchronous memory has been devised to maximize data transfer performance. In this memory, the so-called read latency (R.L.), i.e., the time between the setting of a read command and the establishment of read data, is set to the same clock cycle value as the so-called write latency (W.L.), i.e., the time between the setting of a write command and the preparation of valid write data. The No Bus Latency SRAM (NoBL SRAM) proposed by Cypress Semiconductor Corporation is an example of such a memory.
The conventional pipelined SRAM requires a period of four clocks to realize a read/write mixed cycle, as shown in FIG. 5. On the other hand, as shown in FIG. 6, the NoBL SRAM requires only two clocks, i.e., half the number required by the conventional art.
As described above, R.L. and W.L. are set to the same clock cycle value (two clock cycles in FIG. 6) in the NoBL SRAM. As a result, a data reading operation and a data writing operation can be executed without an unnecessary idle cycle, with the result that the data transfer performance can be improved.
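The benefit of equal read and write latencies can be sketched with a simplified data-bus occupancy model (illustrative only; the latency values and the collision rule are assumptions, not part of the disclosed embodiments):

```python
# Simplified model (not from the patent): a read issued in cycle c occupies
# the data bus in cycle c + R.L.; a write issued in cycle c occupies it in
# cycle c + W.L. Idle cycles must be inserted whenever two commands would
# occupy the bus in the same cycle.

def idle_cycles(commands, rl, wl):
    """Count the idle cycles needed so that no two commands of the given
    'R'/'W' stream use the data bus in the same clock cycle."""
    issue = 0      # cycle in which the next command is issued
    used = set()   # data-bus cycles already occupied
    idle = 0
    for cmd in commands:
        latency = rl if cmd == 'R' else wl
        while issue + latency in used:   # bus conflict: insert an idle cycle
            issue += 1
            idle += 1
        used.add(issue + latency)
        issue += 1
    return idle

stream = ['R', 'R', 'W', 'W'] * 2
print(idle_cycles(stream, rl=2, wl=2))  # 0: R.L. == W.L., no idle cycles
print(idle_cycles(stream, rl=2, wl=0))  # 4: unequal latencies force idle cycles
```

With equal latencies every command's bus slot falls a fixed distance after its issue cycle, so back-to-back reads and writes never collide; with unequal latencies, write data would contend with earlier read data, forcing idle cycles.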
When the above method of setting R.L. and W.L. to the same clock cycle value is applied to a DRAM, the problems described below occur.
The DRAM is different from the SRAM in internal operation of the memory. In the DRAM, data must be read out from the sense amplifier after row operations for driving a word line (WL), driving a sense amplifier, etc., as shown in FIG. 4. In other words, reading of data from the memory cell of the DRAM requires a certain limited time after the row operations are completed, i.e., after cell data is detected and amplified by the sense amplifier. An example of the internal read operation is shown in FIG. 7. In FIG. 7, the internal timing of the read operation is illustrated in association with time.
In FIG. 7, it is assumed that a period of 10 ns is required after setting of the read command until the word line is activated and cell data is read out from the memory cell to the bit line (WL Activation: W.ACT.), a period of 5 ns is required to detect the cell data by the sense amplifier (Sensing: SENSE.), a period of 10 ns is required to amplify the cell data by the sense amplifier (Restore: RSTRE.), and a period of 5 ns is required for precharging (Equalize: EQL.). In this case, the cycle time of the DRAM is 30 ns.
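The cycle-time arithmetic for FIG. 7 can be summarized as follows (the phase durations are the values stated in the text above):

```python
# Phase durations assumed for FIG. 7, in nanoseconds (values from the text):
phases_ns = {
    "W.ACT.": 10,  # word-line activation: cell data read onto the bit line
    "SENSE.": 5,   # detection of the cell data by the sense amplifier
    "RSTRE.": 10,  # amplification (restore) of the cell data
    "EQL.":   5,   # precharge (equalize)
}

cycle_time_ns = sum(phases_ns.values())
print(cycle_time_ns)  # 30, matching the 30 ns cycle time stated for FIG. 7
```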
As shown in FIG. 7, in a DRAM without the page cycle function, the operation of reading data out of the DRAM must be performed in parallel with the amplification of the cell data, immediately upon completion of detection by the sense amplifier. This is because precharging (EQL.) is started automatically upon completion of detection (SENSE.) and amplification (RSTRE.) of the cell data by the sense amplifier.
It is assumed that a period of about 8 ns is required to read the cell data out of the chip through the data line inside the chip (Data Transfer: D.TRS.). In this case, if the column selection line (CSL) is activated at the timing when the sense amplifier has completed detection of the cell data, a period (ACCESS TIME) of about 25 ns is required from the setting of the read command until the data is actually read out of the chip.
Assuming that the data is transferred to the data bus in synchronism with a rise of an external clock CLK, R.L. is 3 clock cycles as shown in FIG. 7 (this condition is defined as R.L.=3).
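The latency arithmetic above can be sketched as follows. The access time of about 25 ns is stated in the text; the 10 ns clock period is an assumed value chosen so that the result is consistent with R.L.=3:

```python
import math

# Access time from FIG. 7 (values from the text): word-line activation,
# sensing, then data transfer out of the chip, which overlaps the restore.
w_act_ns, sense_ns, d_trs_ns = 10, 5, 8
access_time_ns = w_act_ns + sense_ns + d_trs_ns  # 23 ns, "about 25 ns"

# Assumed external clock period; read data is synchronized to a CLK rise,
# so the latency is the access time rounded up to whole clock cycles.
clock_period_ns = 10
read_latency = math.ceil(access_time_ns / clock_period_ns)
print(read_latency)  # 3, i.e., R.L. = 3
```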
An operation of writing data to the DRAM will now be described. If W.L. is set to the same value as R.L., i.e., three cycles (W.L.=3), the established write data is taken into the chip and transferred to the sense amplifier through a data line in the chip. However, as is clear from FIG. 7, at the timing of the third clock after the setting of the write command, the DRAM is already in the precharge (EQL.) state. Therefore, it is impossible to write the data into the memory cell.
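A minimal timeline check (illustrative only; the 10 ns clock period is an assumption consistent with R.L.=3 for the access time of FIG. 7) shows why W.L.=3 fails:

```python
# With the FIG. 7 timing, the array can accept write data only while the
# word line is active, i.e., before precharge (EQL.) begins at
# W.ACT. + SENSE. + RSTRE. after the command.
w_act_ns, sense_ns, rstre_ns = 10, 5, 10
writable_until_ns = w_act_ns + sense_ns + rstre_ns  # 25 ns after the command

clock_period_ns = 10  # assumed clock period
write_latency = 3     # W.L. = 3, matching R.L.
data_arrives_ns = write_latency * clock_period_ns   # 30 ns after the command

# True: the write data arrives only after precharging has begun,
# so it can no longer be written into the memory cell.
print(data_arrives_ns > writable_until_ns)
```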
This problem can be overcome by making the time before precharging in a write operation longer than that in a read operation. In other words, the cycle time of the write operation need only be set longer than that of the read operation. However, if the cycle time of the write operation is increased, the data transfer efficiency is considerably reduced in a read/write mixed cycle, with the result that the merit of a high-speed random access cycle DRAM is impaired.
As described above, in a DRAM in which high-speed random access in a short cycle is realized by eliminating the page cycle function, the clock cycle values of the read latency (R.L.) and the write latency (W.L.) differ from each other. Therefore, it is difficult to improve the data transfer efficiency in operations of continuously writing or reading data to or from bits corresponding to different row addresses on the same page.
If the clock cycle values of the read latency (R.L.) and the write latency (W.L.) are made the same, the cycle time of the write operation must be longer than that of the read operation, in order to prevent the DRAM from being in a precharging state when the write data is input. Therefore, the data transfer efficiency cannot be improved.