1. Field of the Invention
The present invention relates generally to semiconductor memory devices, and more specifically, to clock synchronous semiconductor memory devices which operate in synchronization with an externally applied clock signal. More specifically, the present invention relates to a cache DRAM including a DRAM (Dynamic Random Access Memory) having dynamic memory cells and an SRAM (Static Random Access Memory) array having static memory cells.
2. Description of the Background Art
In recent years, microprocessing units (MPUs) have come to operate at very high speed, with operation clock frequencies of 25 MHz or higher. In a data processing system, a standard DRAM is often used as a main memory with a large storage capacity because of its low cost per bit. Although the access time of the standard DRAM has been reduced, it cannot keep pace with the development of MPUs in terms of high speed operation. A data processing system using the standard DRAM as a main memory faces shortcomings such as an increase in wait states. The gap between the operation speeds of an MPU and a standard DRAM is inherent in the following characteristics of the standard DRAM.
(i) A row address signal and a column address signal are time-divisionally multiplexed for application to a common address pin terminal. The row address signal is taken into the device at a falling edge of a row address strobe signal /RAS. The column address signal is taken into the device at a falling edge of a column address strobe signal /CAS.
Row address strobe signal /RAS defines the start of a memory cycle and activates row selecting circuitry. Column address strobe signal /CAS activates column selecting circuitry. A prescribed time period called "RAS-CAS delay time (tRCD)" is necessary between activation of signal /RAS and activation of signal /CAS. This address multiplexing restricts reduction of accessing time.
(ii) Once row address strobe signal /RAS is raised to set a DRAM to a standby state, row address strobe signal /RAS cannot be pulled down again to an active state or "L" until a time called RAS precharge time (tRP) has passed. RAS precharge time tRP is necessary for securely precharging various signal lines of the DRAM to prescribed potentials. The presence of RAS precharge time tRP keeps the cycle time of the DRAM from being reduced. Reducing the cycle time of the DRAM also increases the number of times the signal lines in the DRAM are charged and discharged, resulting in an increase in current consumption.
(iii) The operation speed of DRAMs can be improved by improving circuit techniques and process techniques, such as high density integration of circuits and improvement of layouts, as well as by improvements in terms of applications, such as improvement of driving methods. The operating speed of MPUs, however, is advancing much faster than that of DRAMs. The operation speed of semiconductor memories is hierarchical: there are high speed bipolar RAMs using bipolar transistors, such as ECLRAM (Emitter Coupled RAM), and static RAMs, and there are relatively low speed DRAMs using MOS transistors (insulated gate field effect transistors). A speed (cycle time) on the order of several tens of ns (nanoseconds) cannot be expected of a standard DRAM having a MOS transistor as a component.
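The effect of the timing constraints in items (i) and (ii) on wait states can be illustrated with a small calculation. The following is only a sketch under stated assumptions: the relations "cycle time = RAS active time + tRP" and "access time from /RAS = tRCD + column access time" are the usual first-order DRAM timing relations, the parameters t_ras_ns and t_cac_ns and all numeric values are hypothetical and do not appear in this description, and the wait-state count assumes that a zero-wait access completes within one MPU clock period.

```python
import math

def random_cycle_time_ns(t_ras_ns, t_rp_ns):
    """Minimum time between two row accesses: active time plus RAS precharge time tRP."""
    return t_ras_ns + t_rp_ns

def row_access_time_ns(t_rcd_ns, t_cac_ns):
    """Access time measured from the fall of /RAS: RAS-CAS delay tRCD plus column access time."""
    return t_rcd_ns + t_cac_ns

def wait_states(access_ns, clock_mhz):
    """Extra MPU clock periods needed beyond the first (assumed zero-wait = one period)."""
    period_ns = 1000.0 / clock_mhz
    return max(0, math.ceil(access_ns / period_ns) - 1)

# Hypothetical values, chosen only for illustration.
cycle = random_cycle_time_ns(t_ras_ns=60, t_rp_ns=40)
print(cycle, wait_states(cycle, clock_mhz=25))
```

With these illustrative numbers the DRAM cycle spans more than two periods of a 25 MHz clock, so the MPU would have to insert wait states on every random access.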
One method for solving the above problems and implementing a relatively inexpensive and small scale system is to build a high speed cache memory (SRAM) into a DRAM. More specifically, a one-chip memory having a hierarchical structure including a DRAM as a main memory and an SRAM as a cache memory can be considered. Such a hierarchical one-chip memory is referred to as a cache DRAM (CDRAM).
In a CDRAM, a DRAM and an SRAM are integrated on a single chip. The SRAM is accessed upon a cache hit, and the DRAM is accessed upon a cache miss. More specifically, the SRAM operating at a high speed is used as a cache memory, while the DRAM having a large storage capacity is used as a main memory.
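The hit/miss behavior described above can be sketched as a small behavioral model. This is a minimal illustration, not the circuit of any particular CDRAM: the class name CacheDramModel, the direct-mapped placement, the tag storage held inside the model, and the DRAM contents (each word holds its own address) are all assumptions introduced for the example.

```python
class CacheDramModel:
    """Behavioral sketch: an SRAM cache in front of a DRAM main memory (assumed direct-mapped)."""

    def __init__(self, dram_words, block_words, num_blocks):
        # Hypothetical DRAM contents: each word stores its own address.
        self.dram = list(range(dram_words))
        self.block_words = block_words
        self.num_blocks = num_blocks
        self.tags = [None] * num_blocks
        self.sram = [[0] * block_words for _ in range(num_blocks)]

    def read(self, addr):
        """Return (data, hit). The SRAM serves a hit; a miss first fetches the block from DRAM."""
        block_addr, offset = divmod(addr, self.block_words)
        index = block_addr % self.num_blocks
        hit = self.tags[index] == block_addr
        if not hit:
            # Cache miss: transfer one block from the DRAM into the SRAM.
            start = block_addr * self.block_words
            self.sram[index] = self.dram[start:start + self.block_words]
            self.tags[index] = block_addr
        return self.sram[index][offset], hit

model = CacheDramModel(dram_words=64, block_words=4, num_blocks=2)
print(model.read(5))   # first access to this block: miss
print(model.read(6))   # same block: hit
```

In this sketch a second access to the same block is served from the SRAM, while an access to a different block that maps to the same SRAM location forces another transfer from the DRAM.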
The so-called block size of a cache can be considered as the number of bits whose contents are rewritten through one data transfer in an SRAM. The cache hit rate generally increases with the block size. For the same cache memory size, however, the number of sets decreases in inverse proportion to the block size, and the hit rate conversely decreases. For a cache size of 4 Kbits, for example, the number of sets is 4 for a block size of 1024 bits, while the number of sets is 128 for a block size of 32 bits. Accordingly, the block size must be appropriately set.
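The relation between block size and number of sets in the example above is a simple division, which can be checked as follows. The function name number_of_sets is hypothetical, and "4K bit" is taken to mean 4096 bits.

```python
def number_of_sets(cache_bits, block_bits):
    """Number of sets for a fixed cache size: inversely proportional to the block size."""
    if cache_bits % block_bits != 0:
        raise ValueError("block size must divide the cache size evenly")
    return cache_bits // block_bits

CACHE_BITS = 4 * 1024  # 4K bits, as in the example in the text

for block_bits in (1024, 32):
    print(block_bits, number_of_sets(CACHE_BITS, block_bits))
```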
A CDRAM having an appropriate block size is shown, for example, in Japanese Patent Laying-Open No. 1-146187 by Fujishima et al.
In the prior art, a DRAM array is divided into groups of a plurality of columns. A data register is provided for each column. The data register is also divided into groups similarly to the DRAM array. Upon a cache hit, the data register is accessed. Upon a cache miss, only data in a column group in the array of the DRAM is transferred to the data register according to a block address. Data is read out from the data register in parallel with the data transfer.
In the above-described conventional CDRAM, data is transferred from the DRAM array to the data register at the time of cache miss. At the time of transfer, the CDRAM is not accessible. The external processing device must wait until transfer of valid data to the data register is completed. This degrades the performance of the system.
A CDRAM having a DRAM array and an SRAM array integrated on a single chip and a bidirectional transfer gate between the DRAM array and the SRAM array has been suggested. The DRAM array and the SRAM array can be independently addressed. The bidirectional transfer gate includes a data register, which is externally accessible. Thus, a highly functional CDRAM also applicable to graphics processing is implemented. In such a CDRAM, however, access to the data register is prohibited when data is transferred from the DRAM array to the bidirectional transfer gate. Therefore, there is still room for improvement in such a high function CDRAM.
In order to operate a semiconductor memory device at a high speed, the semiconductor memory device is operated in synchronization with an externally applied clock signal such as a system clock signal (see U.S. Pat. No. 5,083,296 to Hara, for example). The prior art provides a solution to variation in timing caused by distortion in external control signals such as signals /RAS and /CAS. Such a clock synchronous semiconductor memory device establishes the output of an input buffer receiving an external signal when the external clock signal is activated.
Therefore, since an internal signal is established after an external clock signal is activated and then an internal operation is executed, a timing for starting the internal operation is delayed. More specifically, the advantage of high speed operation with an external clock signal is impaired.
It is therefore an object of the present invention to provide a semiconductor memory device which operates at a high speed.
Another object of the present invention is to provide a semiconductor memory device which enables a high speed data processing system to be constructed.
Yet another object of the invention is to provide a synchronous semiconductor memory device capable of establishing an internal clock signal at a timing as early as possible in synchronization with an external clock signal.
A particular object of the invention is to provide a clock synchronous cache built-in semiconductor memory device which permits high speed accessing with no wait.
A semiconductor memory device according to the invention includes a memory cell array having a plurality of memory cells, a first data register for temporarily holding data from a plurality of memory cells simultaneously selected in the memory cell array, a second data register for receiving the data held by the first data register for storage, and transfer means responsive to the absence of access to the second data register and a data transfer instruction for executing data transfer from the first data register to the second data register.
In the semiconductor memory device according to the present invention, data is transferred from the first data register to the second data register when data in the second data register is not used. Therefore, the data transfer operation does not adversely affect accessing to the semiconductor memory device, and high speed operation is implemented.
The external processing device does not enter a wait state due to data transfer within the semiconductor memory device. In other words, the device can operate in a "no-wait state", and therefore a high speed data processing system can be constructed.
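The transfer condition described above, in which data moves from the first data register to the second data register only when a transfer instruction is present and the second data register is not being accessed, can be sketched behaviorally as follows. This is an illustrative model only: the class name TwoStageRegisterModel, its method names, and the per-clock evaluation of the condition are assumptions made for the example and are not taken from the description.

```python
class TwoStageRegisterModel:
    """Behavioral sketch of a first and second data register with gated transfer."""

    def __init__(self, width):
        self.first = [0] * width    # temporarily holds data read from the memory cell array
        self.second = [0] * width   # register accessed by the external device
        self.transfer_pending = False

    def load_from_array(self, row_data):
        """Latch simultaneously selected memory cell data into the first register."""
        self.first = list(row_data)

    def request_transfer(self):
        """Record a data transfer instruction."""
        self.transfer_pending = True

    def clock(self, second_accessed):
        """Evaluate one cycle; transfer only if instructed and the second register is idle."""
        if self.transfer_pending and not second_accessed:
            self.second = list(self.first)
            self.transfer_pending = False
            return True
        return False

regs = TwoStageRegisterModel(width=4)
regs.load_from_array([1, 2, 3, 4])
regs.request_transfer()
print(regs.clock(second_accessed=True), regs.second)   # access in progress: transfer deferred
print(regs.clock(second_accessed=False), regs.second)  # register idle: transfer executes
```

Because the transfer is deferred while the second register is in use, an access in progress is never interrupted in this model, which corresponds to the "no-wait" behavior stated above.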
The foregoing and other objects, features, aspects and advantages of the present invention will become more apparent from the following detailed description of the present invention when taken in conjunction with the accompanying drawings.