The present invention relates to a semiconductor memory apparatus, and more particularly to a technique effectively applicable to a cache memory built into a data processing apparatus such as a microprocessor or a microcomputer.
In recent years, with the increase in the operating frequency of the microprocessor, there has been an increasing demand for a cache memory of a higher speed. In the cache memory, data are written by transmitting signals to memory cells through bit lines, and are read from the memory cells by transmitting them to an amplifier circuit through bit lines. For a cache memory of a high operating speed to be realized, therefore, it is crucial to reduce the capacity (capacitance) of the bit lines. Memories with a reduced bit line capacity include a circuit in which the memory mat is divided and the bit lines are formed in a hierarchy (hereinafter referred to as the prior art 1), disclosed in ISSCC Digest of Technical Papers, pp. 304-305, February, 1995.
The memory according to the prior art 1 comprises a memory mat with 6-transistor memory cells arranged in an array and divided into n equal parts to form n blocks. The bit lines (BL, BLB) in each block are connected with an I/O bus formed across the bank through a sense amplifier (S/A) and an I/O circuit configured in a pair with each block.
In reading data, the data read from each memory cell is transmitted to the sense amplifier (S/A) and the I/O circuit using the bit lines (BL, BLB) thereby to output data to an I/O bus. Data are written in the memory cells by transmitting the data in the I/O bus to the bit lines (BL, BLB) using the sense amplifier (S/A) and the I/O circuit.
An object of the present invention is to provide a cache memory in which high-speed storage is possible with a reduced area of a memory cell or a memory array, and a semiconductor apparatus comprising such a cache memory.
Another object of the invention is to provide a high-speed cache memory and a semiconductor apparatus comprising such a cache memory with a reduced power consumption.
In the case where a memory according to the prior art 1 is used as a data array of a cache memory, the storage in the cache memory cannot be processed at high speed. The reason will be described below.
The storage is a process in which data are written after the data array receives, from a tag array, a hit signal constituting a write permit signal. The time required for this process is the sum of “the time before establishment of a hit signal” and “the time for writing the data”. The “time before establishment of a hit signal” is the sum of the time required for reading the tag array and the time required for comparing the address read from the tag array with the tag address. As a result, the storage process is slower than the operation of reading from or writing into an ordinary memory, which does not wait for the establishment of a hit signal. The loading from a cache memory, by contrast, can be processed in the same time as the operation of reading from an ordinary memory, because the tag array and the data array can be accessed at the same time. Specifically, in the case where the operating frequency of a microprocessor is comparatively low (say, 20 to 30 MHz or less), the resulting long machine cycle makes it possible to realize the storage in one cycle. With the increase in the operating frequency of the microprocessor (say, to 50 MHz or more, with one machine cycle of 20 nsec or less), however, the storage cannot be realized in one cycle. Especially in the case where the cache memory is accessed with an output address of a conversion buffer for converting a logical address into a physical address, the establishment of a hit signal is slower and the storage in one cycle is harder to realize. In the conventional cache memory built in the microprocessor adapted for high-frequency operation, therefore, the storage is effected in two cycles and the loading in one cycle.
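The timing relation described above can be illustrated by a minimal sketch. The delay values (10, 4 and 10 nsec) and all function names are hypothetical and are chosen only to show how the same storage time fits in one cycle at a low operating frequency but requires two cycles at 50 MHz; they are not taken from any actual memory.

```python
import math


def cycle_time_ns(freq_mhz: float) -> float:
    """Machine cycle time in nanoseconds for a clock frequency in MHz."""
    return 1000.0 / freq_mhz


def store_time_ns(tag_read_ns: float, compare_ns: float, write_ns: float) -> float:
    """Storage time = time before establishment of the hit signal + write time."""
    hit_established_ns = tag_read_ns + compare_ns  # tag read, then address compare
    return hit_established_ns + write_ns


def cycles_needed(duration_ns: float, freq_mhz: float) -> int:
    """Number of whole machine cycles needed to cover the given duration."""
    return math.ceil(duration_ns / cycle_time_ns(freq_mhz))


# Hypothetical delays, for illustration only.
TAG_READ_NS, COMPARE_NS, WRITE_NS = 10.0, 4.0, 10.0
STORE_NS = store_time_ns(TAG_READ_NS, COMPARE_NS, WRITE_NS)

for freq_mhz in (25.0, 50.0):
    print(freq_mhz, "MHz:", cycles_needed(STORE_NS, freq_mhz), "cycle(s) for storage")
```

Under these assumed delays the storage takes 24 nsec, which fits in the 40 nsec cycle of a 25 MHz clock but exceeds the 20 nsec cycle of a 50 MHz clock.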
In the case where the microprocessor employs a pipeline processing scheme, the memory access stage requires two cycles for storage, with the result that the pipeline is disturbed, thereby constituting a bottleneck to an improved speed of the microprocessor. To prevent the pipeline from being disturbed, the memory access stage is always given two cycles, i.e., the number of pipeline stages is increased. An increased number of pipeline stages, however, leads to the problem of an increased power consumption.
The time required before establishment of a hit signal is a stumbling block to an increased storage speed. The present inventor has studied a method of writing data in a data array before establishment of a hit signal as a method of processing the storage at high speed. In such a case, no problem is posed when the hit signal represents a “hit” indicating the write permission at the time point when the hit signal is established after the write operation. In the case where the hit signal represents a “mishit”, however, it is necessary to restore the value before writing the data in the data array. The result is the necessity of reading and holding the data at the write position before the write operation.
In other words, in the case where the storage is effected ignoring the hit signal to increase the speed of the storage process, the two operations of reading and writing data are required to be performed continuously in a single cycle. Unless this continuous read and write operation can be performed at high speed, a high-speed storage is impossible to achieve even if the hit signal is ignored.
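The storage performed ahead of the hit signal, with the old data read and held so that it can be restored on a mishit, can be modeled behaviorally as follows. This is a minimal sketch only: the class name, method names and the single-entry holding register are assumptions introduced for illustration and do not describe any circuit disclosed here.

```python
class SpeculativeDataArray:
    """Behavioral model of a data array written before the hit signal is established."""

    def __init__(self, size: int) -> None:
        self.cells = [0] * size
        self.held = None  # (address, old value) held until the hit signal arrives

    def store_before_hit(self, address: int, value: int) -> None:
        """Read and hold the old data, then write without waiting for the hit signal."""
        old_value = self.cells[address]   # read operation
        self.held = (address, old_value)  # hold the data at the write position
        self.cells[address] = value       # write operation, same cycle

    def resolve(self, hit: bool) -> None:
        """On a hit, keep the new data; on a mishit, restore the held value."""
        if not hit and self.held is not None:
            address, old_value = self.held
            self.cells[address] = old_value
        self.held = None


array = SpeculativeDataArray(size=4)
array.cells[2] = 7
array.store_before_hit(address=2, value=9)
array.resolve(hit=False)  # mishit: the value before writing is restored
```

The model makes explicit why a read must precede every such write: without the held value, a mishit could not be undone.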
In the memory of the prior art 1, the operating speed is increased by a reduced capacity of the bit lines when the write operation is performed as a storage operation after reading data at the same address. The need of performing the write operation after a complete read operation using the bit lines and the I/O bus, however, lengthens the processing time as compared with the normal read or write operation. In other words, the read operation requires one cycle and the write operation one cycle, thus requiring a total of two cycles for storage.
JP-A-4-85789 (hereinafter referred to as the prior art 2), on the other hand, discloses a memory in which what is called a dual-port memory cell connected to a read address signal line, a write address signal line, a read data line and a write data line is used in such a manner that the read side discharges while the write side is precharging, and vice versa, thus apparently executing the read and write operations at the same time. The use of the dual-port memory cell, however, poses the problem of an increased area of the memory cell and the memory cell array. Further, the increased area increases the bit line capacity, resulting in a longer memory access time and a longer memory cycle time.
JP-A-3-216892 (U.S. Pat. No. 5,387,827) (hereinafter referred to as the prior art 3), JP-A-3-3195 (hereinafter referred to as the prior art 4) and IEEE Journal of Solid-State Circuits, Vol. 23, No. 5, October 1988, pp. 1048-1053 (hereinafter referred to as the prior art 5), on the other hand, disclose a memory in which a common read line and a common write line are connected through a bit line and a MOS transistor. None of the prior arts 3, 4 and 5, however, describes the possibility of concurrent execution of the read and write operations. All the prior arts 3, 4 and 5 concern a memory of BiCMOS (Bipolar CMOS (Complementary Metal Oxide Semiconductor)). The prior art 4 has no direct description of the BiCMOS, but cites the prior art 5 as a conventional technique. The use of the BiCMOS circuit can realize a high-speed memory at the sacrifice of a larger power consumption than the memory of the CMOS circuit.
It is important to realize a high-speed cache memory only with a CMOS circuit. If the power consumption of circuits integrated in a single semiconductor device is not more than 1.5 W, the resin sealing with a resin mold technique or the like becomes possible, and the cost of the semiconductor device can be reduced considerably as compared with ceramic sealing used for a semiconductor device having a high power consumption.
According to the present invention, a high-speed storage process of a cache memory is realized by suppressing the increase in the area of the memory cell or the memory cell array.
Also, according to the invention, a high-speed cache memory is realized without increasing the power consumption.
The above and other objects, features and advantages will be made apparent by the detailed description taken below in conjunction with the accompanying drawings.
Representative aspects of the present invention disclosed in this specification are briefly described below.
A semiconductor memory apparatus comprises a memory array (BANK1) including a plurality of word lines (WL), a plurality of bit lines (LBL) and a plurality of memory cells (CELL) arranged at the intersections between the word lines (WL) and the bit lines (LBL), at least a first global bit line (RGBL) connected to a sense amplifier (104), at least a second global bit line (WGBL) connected to a write amplifier (102), and a selection circuit (YSW1) for selectively connecting the bit lines (LBL) to the first global bit line (RGBL) or to the second global bit line (WGBL). The first global bit line (RGBL) and the second global bit line (WGBL) are arranged on the memory array (BANK1). When reading data from the memory array (BANK1), the bit lines (LBL) are electrically connected to the first global bit line (RGBL), and the data are output through the sense amplifier (104). When writing the data in the memory array (BANK1), on the other hand, the data are input to the second global bit line (WGBL) through the write amplifier (102) with the bit lines (LBL) electrically connected to the second global bit line (WGBL).
At the time of storage when the read and write operations are carried out successively, data are read out using the read global bit line (RGBL) concurrently with the charge and discharge operation of the write global bit line (WGBL). As a result, the write operation can be completed simply by charging and discharging only the local bit lines (LBL) having a small capacity after starting the write operation upon completion of the read operation, thereby making possible a high-speed write operation.
Specifically, in view of the fact that the bit lines can be charged and discharged concurrently for the read and write operations, the continuous read and write operations can be improved in speed and can be completed in a cycle. A one-cycle storage can thus be realized.
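The benefit of charging and discharging the write global bit line concurrently with the read operation can be illustrated by a simple timing model. This is a sketch under stated assumptions: the three delays (12, 8 and 3 nsec) and the function names are hypothetical, and the model treats the write global bit line transition as fully overlappable with the read, with only the local bit lines remaining to be driven afterward.

```python
def sequential_store_ns(read_ns: float, global_charge_ns: float,
                        local_charge_ns: float) -> float:
    """Read completes first; the global and local bit lines are then driven for the write."""
    return read_ns + global_charge_ns + local_charge_ns


def overlapped_store_ns(read_ns: float, global_charge_ns: float,
                        local_charge_ns: float) -> float:
    """The write global bit line is driven during the read; only the
    small-capacity local bit lines are driven after the read completes."""
    return max(read_ns, global_charge_ns) + local_charge_ns


# Hypothetical delays, for illustration only.
READ_NS, GLOBAL_CHARGE_NS, LOCAL_CHARGE_NS = 12.0, 8.0, 3.0

SEQUENTIAL_NS = sequential_store_ns(READ_NS, GLOBAL_CHARGE_NS, LOCAL_CHARGE_NS)
OVERLAPPED_NS = overlapped_store_ns(READ_NS, GLOBAL_CHARGE_NS, LOCAL_CHARGE_NS)
print("sequential:", SEQUENTIAL_NS, "nsec; overlapped:", OVERLAPPED_NS, "nsec")
```

With these assumed values the sequential read and write takes 23 nsec, exceeding a 20 nsec machine cycle, whereas the overlapped operation takes 15 nsec and fits within it.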
Also, since a continuous read and write operation is possible at high speed, the read cycle time is not lengthened even when the cycle time of the read operation is equalized with the cycle time of the continuous read and write operation. Also, if the cycle time of the read operation is the same as the cycle time of the continuous read and write operation, the memory is more convenient to use for a device that accesses it, such as a microprocessor. Therefore, it is possible to provide a memory in which the cycle time of the read operation is equal to the cycle time of the continuous read and write operation. In other words, the timing specification of a memory can define the same cycle time for the read operation as for the continuous read and write operation.
A representative effect obtained by the invention disclosed in this specification is briefly described below.
Specifically, in view of the fact that the bit lines can be charged and discharged concurrently for read and write operations, the continuous read and write operation can be improved in speed and can be completed in a single cycle.