Because of its high speed and short cycle time, SRAM (Static Random Access Memory) is used extensively as cache memory in computer systems and network systems. Furthermore, SRAM is simple to use because it requires no refresh operation. As such, SRAM is a key component that strongly influences the speed and performance of computer systems and other systems. Research and development efforts have been directed primarily at boosting the operating speed of the memory.
FIG. 1 illustrates a circuit diagram of a conventional SRAM including a memory segment, a write circuit and a sense amplifier, as published in U.S. Pat. No. 4,712,194 and No. 6,075,729. The memory block 100 includes memory cells 110, 111, 112 and 113, each having six transistors. The memory cells are connected to bit lines 121, 122, 123 and 124, which are pre-charged by pre-charge circuits 125 and 126, respectively. During standby, the pre-charge circuits 125, 126 and 127 preset the bit lines to high. Thereafter, the bit lines are released from the pre-charge state for a read or write operation, so that the stored voltage of the memory cell is transferred to the sense amp 160 through transfer gates. When the memory cell 110 is selected, the transfer gates 141 and 142 are turned on, while the other transfer gates 143 and 144 remain turned off. In doing so, the memory cell data is read by the sense amp 160 through the common bit lines 151 and 152. The read output of the sense amp 160 is transferred to output node 190 through a transfer gate 161, while the unselected memory block 170 and the unselected sense amp 180 are in the pre-charge state and the transfer gate 181 remains turned off. During a write operation, write buffers 131 and 132 transfer input data to the write circuit 133, so that the input data is transferred to the memory cell through the bit lines when the word line of the memory cell is asserted high.
In the conventional SRAM, the six-transistor memory cell 110 is used to store data, such that a latch including two cross-coupled inverters stores the data as a voltage. In order to achieve fast access, the latch of the memory cell should be strong enough to drive the heavily loaded bit line, yet weak enough to be flipped by the write circuit 133 through the transfer gates 141 and 142. Furthermore, heavily loaded bit lines may flip unselected memory cells during read and write operations. For example, the unselected memory cell 112 receives the same word line voltage as the selected memory cell 110. Because both bit lines 123 and 124 are floating from the pre-charged high voltage while the selected bit lines 121 and 122 receive input data from the write circuit 133, the memory cell 112 will lose its data when the latch is too weak and the bit line loading is too heavy. In addition, the pass (transfer) transistor of the memory cell should be strong enough to transfer charge for reading and writing. As a result, the transistors in the memory cell are typically larger than the minimum feature size allowed by the fabrication process, which increases the chip area.
For writing data, a write data line pair 134 and 135 is connected to the write circuit 133 and another memory block 170. Conventionally, the write data line pair is heavily loaded with no buffers, so that the write data lines always drive the full length of the memory block, which increases driving current and RC delay time. For reading data, a read data line 190 is connected to sense amps 160 and 180. Alternatively, a pair of read data lines is typically used for amplifying a voltage difference. Thus, the access time depends on the location of the selected sense amp. For example, access through the sense amp 160 is faster than access through the sense amp 180, so that it is difficult to latch the sense amp output at high speed because the latching clock (not shown) is fixed. Furthermore, the read data line is also heavily loaded because it connects to multiple memory blocks with no buffers, which increases driving current and RC delay time as well.
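The location dependence of the access time described above can be illustrated with a first-order estimate. The following is a minimal sketch, assuming an ideal driver and a uniformly distributed RC wire whose delay is approximated as one half of the total resistance times the total capacitance. The function name and the per-millimeter resistance and capacitance values are hypothetical and are not taken from the cited prior art.

```python
def wire_delay(r_per_mm, c_per_mm, distance_mm):
    """First-order delay in seconds of an unbuffered distributed RC wire.

    Assumed model: delay = 0.5 * R_total * C_total, so the delay grows
    with the square of the wire length.
    """
    r_total = r_per_mm * distance_mm
    c_total = c_per_mm * distance_mm
    return 0.5 * r_total * c_total


# Illustrative wire parameters (assumptions, not measured data).
R_PER_MM = 1000.0    # ohms per millimeter
C_PER_MM = 0.2e-12   # farads per millimeter

# A sense amp near the output versus one four times farther away.
near_delay = wire_delay(R_PER_MM, C_PER_MM, 1.0)
far_delay = wire_delay(R_PER_MM, C_PER_MM, 4.0)

# Skew that a single fixed latching clock would have to tolerate.
skew = far_delay - near_delay

print(f"near: {near_delay:.3e} s, far: {far_delay:.3e} s, skew: {skew:.3e} s")
```

Under these assumptions, a fourfold increase in distance yields roughly a sixteenfold increase in wire delay, which is why an unbuffered data line makes a single fixed latching clock difficult to use at high speed.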
There have been many efforts to improve the conventional SRAM with new circuit concepts, in which the memory array is multi-divided in order to reduce the parasitic loading of the bit line by introducing a hierarchical bit line architecture and multi-stage sense amps, as published in U.S. Pat. No. 7,158,428 and U.S. Pat. No. 6,442,089. However, each memory segment including the bit line comprises additional circuits such as a cross-coupled keeper transistor circuit, a local read amplifier circuit, pre-charge transistors, and transfer transistors, which increases chip area. Another prior art is shown in “A Low Power Embedded SRAM for Wireless Applications”, IEEE Journal of Solid-State Circuits, Vol. 42, No. 7, July 2007. In this prior art, the bit lines are multi-divided but the sense amps include more transistors, so that the area may be increased, and the write circuit is enlarged as well. One more prior art is shown in “A low power SRAM Using Hierarchical Bit Line and Local read amplifiers”, Yang et al., IEEE Journal of Solid-State Circuits, Vol. 40, No. 6, June 2005, in which the local read amplifier improves the write operation but does not improve the read operation, because the local read amplifier is not activated during the read cycle. As a result, the access time is still slow and the area may be increased further.
Furthermore, in bulk CMOS SRAM, the current driving ability of the load MOS transistors drops as miniaturization of the memory cell advances further. If the operating voltage also drops, the amount of charge stored in the storage node decreases, so that the potential fluctuation of the storage node caused by alpha rays cannot be suppressed, which degrades soft error resistance. Some improvements add a capacitor to the memory cell, as published in U.S. Pat. No. 6,972,450, U.S. Pat. No. 5,780,910 and U.S. Pat. No. 5,179,033. However, these approaches address only the memory cell; they do not suggest any improvement to peripheral circuits such as sense amps that would help miniaturize the memory cell.
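The relation between operating voltage, stored charge and soft error resistance can be illustrated numerically. The following is a minimal sketch, assuming that the stored charge is simply the node capacitance times the node voltage, and that a cell is upset whenever the charge collected from an alpha particle strike exceeds the stored charge. The upset criterion, the function names and all capacitance and charge values are hypothetical simplifications, not figures from the cited patents.

```python
def stored_charge(capacitance_f, voltage_v):
    """Charge in coulombs held on a storage node: Q = C * V."""
    return capacitance_f * voltage_v


def is_upset(collected_charge_c, node_charge_c):
    """Crude criterion (assumed): the cell flips when the collected
    charge exceeds the charge stored on the node."""
    return collected_charge_c > node_charge_c


# Illustrative values (assumptions only).
NODE_CAP = 1e-15     # 1 fF intrinsic storage node capacitance
ADDED_CAP = 5e-15    # 5 fF capacitor added to the memory cell
STRIKE = 0.8e-15     # 0.8 fC collected from an alpha particle strike

q_at_1v0 = stored_charge(NODE_CAP, 1.0)               # higher voltage
q_at_0v5 = stored_charge(NODE_CAP, 0.5)               # lower voltage
q_with_cap = stored_charge(NODE_CAP + ADDED_CAP, 0.5)  # lower voltage plus capacitor

for label, q in [("1.0 V", q_at_1v0), ("0.5 V", q_at_0v5), ("0.5 V + cap", q_with_cap)]:
    print(label, "upset" if is_upset(STRIKE, q) else "retained")
```

With these assumed values, the same strike is tolerated at the higher voltage, flips the cell at the lower voltage, and is tolerated again once the added capacitor raises the stored charge.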
In this respect, there is still a need for improving the static random access memory. In the present invention, a high speed SRAM is realized in which the bit lines are multi-divided to reduce the parasitic capacitance of each bit line, which enables high speed write and read operations. For reading the divided bit line more effectively, multi-stage sense amps are used, such that a first dynamic circuit serving as a local sense amp is connected to the memory cells through two local bit lines, a second dynamic circuit serving as a segment sense amp is connected to the local sense amp through a segment bit line, and a tri-state inverter is connected to the segment sense amp through a global bit line. With dynamic sense amps, penetration current is reduced during sensing, which realizes low power consumption. Furthermore, low voltage operation is available with the dynamic circuits, because each dynamic circuit detects whether or not an amplify transistor is turned on by a selected memory cell. With the multi-stage sense amps, a time-domain sensing scheme is realized in order to differentiate low voltage data and high voltage data in the time domain. This scheme does not require the conventional sense amp, because the multi-stage sense amps convert a voltage difference on the bit line to a current difference, and the current difference is then converted to a time difference. Furthermore, a buffered data path is used for realizing fast write and read operations, and the lightly loaded bit line does not disturb the unselected cells during writing and reading. Additionally, the SRAM cell includes a stacked capacitor for preserving charge, which increases alpha ray immunity.
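The conversion chain of the time-domain sensing scheme, from voltage difference to current difference to time difference, can be illustrated with a simple behavioral model. The following is a minimal sketch, assuming a square-law amplify transistor, a constant discharge current, and a fixed reference time that separates early from late switching. The function names, the transistor parameters, the node capacitance and the reference time are all hypothetical, and the sketch does not represent the actual circuit of the invention.

```python
def amplifier_current(v_gate, v_threshold=0.3, k=2e-4, leakage=1e-9):
    """Voltage to current: assumed square-law transistor model.

    Below threshold only a small leakage current (amperes) flows.
    """
    overdrive = v_gate - v_threshold
    if overdrive <= 0:
        return leakage
    return k * overdrive ** 2 + leakage


def crossing_time(current_a, capacitance_f=20e-15, swing_v=0.5):
    """Current to time: t = C * dV / I for a constant discharge current."""
    return capacitance_f * swing_v / current_a


def sense(v_gate, t_reference_s=1e-9):
    """Return 1 if the sensed node switches before the reference time, else 0."""
    return 1 if crossing_time(amplifier_current(v_gate)) < t_reference_s else 0


# A bit line level that turns the amplify transistor on, and one that does not.
t_on = crossing_time(amplifier_current(1.0))
t_off = crossing_time(amplifier_current(0.0))

print(f"transistor on:  {t_on:.3e} s -> sensed {sense(1.0)}")
print(f"transistor off: {t_off:.3e} s -> sensed {sense(0.0)}")
```

Under these assumptions the two data states arrive about five orders of magnitude apart in time, so a timing reference placed between them distinguishes the data without a conventional voltage-comparing sense amp. Because the crossing time scales with the capacitance being discharged, a lightly loaded, divided bit line shortens the early arrival in the same proportion.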
The memory cell can be formed on the surface of the wafer, and the steps in the process flow should be compatible with the current CMOS manufacturing environment. Alternatively, the memory cell can be formed from a thin film polysilicon layer, because the lightly loaded bit line can be quickly discharged by the memory cell even though the thin film pass transistor conducts a relatively low current. In doing so, a multi-stacked memory is realized with thin film transistors, which can increase the density of the conventional CMOS process with additional process steps, because the conventional CMOS process has reached a scaling limit for fabricating transistors on the surface of a wafer. In particular, a body-tied TFT (Thin Film Transistor) can alternatively be used as the thin film transistor to alleviate the self-heating problem of the short channel TFT. In doing so, a multi-stacked SRAM is realized with short channel TFTs.