1. Field
This invention relates to sense amplifiers and, in one embodiment, sense amplifiers for deep sub-micron designs.
2. Description of Related Art
A. Overview
Sense amplifiers (“SAs”) were traditionally used in dynamic and static (cache, buffers, etc.) random access memory (RAM) to improve memory-access latency. Due to increasing clock speeds of digital logic circuits and increasing on-chip interconnect parasitics, SAs have also gained popularity in modern digital logic circuit designs. In static and dynamic memory, SAs are used to improve the read-access time. In digital logic, SAs are used to improve the performance of flip-flops (“FFs”), the most commonly used digital building block.
Flip-flops are an integral part of digital design because of their use in retime, deskew and receive circuitry. Previous work on flip-flops included efforts at utilizing SA front-ends to improve sensitivity and speed of conventional flip-flops. The basic SA flip-flop (“SAFF”) circuit contains an SA front-end followed by a latch stage.
Previous work on high-performance SAFFs has concentrated on improving the latch stage and has mostly ignored the SA front-end. The SAFF reported in J. Montanaro, R. T. Witek, et al., “A 160 MHz, 32-b 0.5 W CMOS RISC microprocessor,” IEEE J. Solid-State Circuits, vol. 31, pp. 1703–1714 (November 1996), uses a NAND set-reset (SR) latch; H. Partovi, R. Burd, et al., “Flow-through latch and edge-triggered flip-flop hybrid elements,” ISSCC Dig. Tech. Papers, pp. 138–139 (1996), uses a hybrid-latch flip-flop; the semi-dynamic flip-flop in F. Klass, “Semi-dynamic and dynamic flip-flops with embedded logic,” Symp. VLSI Circuits Dig. Tech. Papers, pp. 108–109 (1998), uses dynamic-style flip-flops; B. Nikolic, V. Oklobdzija, V. Stojanovic, W. Jia, “Improved sense-amplifier-based flip-flop: Design and measurements,” IEEE J. of Solid-State Circuits, vol. 35, pp. 876–884 (June 2000), uses a cross-coupled inverter latch; and J. C. Kim, Y. C. Jang and H. J. Park, “CMOS sense-amplifier-based flip-flop with N-C2MOS output latches,” Electronics Letters, vol. 36, pp. 498–500 (March 2000), uses an N-C2MOS latch.
Common to all these designs, which are incorporated herein by reference, is a standard sense-amplifier front-end. Each of these types of sense-amplifiers is referred to in this application as a conventional voltage mode sense-amplifier (“CVSA”).
Research on SA designs has analyzed the sensitivity of the CVSA design to process variations such as transistor mismatch and has also improved the original design such as in the case of a PMOS cross-coupled SA (“pCCSA”). These approaches to SA design assumed and were optimized for relatively large transistor geometries and higher voltage supply-rails than commonly in use today.
For a higher-voltage supply-rail, it is important to eliminate DC current paths (zero static power consumption) to achieve low power. In both CVSA and pCCSA, this is achieved by including clocked transistors in the evaluation path. Over the years, despite some optimization that has been made to earlier designs, the core architecture of the clocked voltage SA has remained similar to the initial designs with clocked transistors in the evaluation path.
Today, deep sub-micron CMOS technology has relatively low-voltage supply rails. As a consequence, it is important to reduce the evaluation chain depth to improve circuit performance and process scaling. To employ SAs in high-speed designs, they should also be able to achieve a multi-GHz operating frequency without compromising power, sensitivity and clock-load. Unfortunately, conventional voltage sense-amplifier designs carry a high clock-load burden that limits their frequency scaling. The experimental results for earlier SA designs have therefore been limited to sub-GHz operating frequencies. See B. Nikolic, V. Oklobdzija, V. Stojanovic, W. Jia, “Improved sense-amplifier-based flip-flop: Design and measurements,” IEEE J. of Solid-State Circuits, vol. 35, pp. 876–884 (June 2000), the content of which is incorporated herein by reference.
To improve scaling of SAs into deep sub-micron designs, it is therefore important to reduce the evaluation chain depth, reduce the noise generated by the SA, and reduce the SA clock-load.
B. Detailed Description of Prior Art CVSA and pCCSA
The operation of a clocked sense-amplifier consists of a pre-charge/discharge phase, also known as an equalization phase, and an evaluation phase. To eliminate DC power consumption, commonly used sense-amplifier architectures have a clocked transistor in the evaluation chain. Two of the most popular SA architectures are the conventional voltage sense amplifier (“CVSA”) and the pMOS cross-coupled sense-amplifier (“pCCSA”), which is a modified version of the CVSA.
(1) Conventional Voltage Sense Amplifier (CVSA)
FIG. 1 is a schematic of a prior art conventional voltage-mode sense-amplifier (CVSA). The sense-amplifier has a clocked NMOS transistor MNC in the evaluation path 104 and an evaluation path depth of three. One path is MN3, MN1 and MNC; the other is MN4, MN2 and MNC.
As is well understood by those skilled in the art, VDD is a reference to the supply voltage; φ is a reference to the clock input, D is a reference to the data input; {overscore (D)} is a reference to the complement of the data input; OUT is a reference to the data output; {overscore (OUT)} is a reference to the complement of the data output; the first letter “M” in the labels signifies a transistor; the second letter “P” or “N” specifies that the transistor is a “P” or “N” channel transistor; the “C” subscript specifies that the transistor is managing the clock signal; the “D” subscript specifies that the transistor is managing the data signal; and the subscript number represents an arbitrary number to distinguish each transistor from others of the same type and management function. This same nomenclature is used throughout all of the figures in this application.
During the equalization phase, the clock φ is low; transistor MNC is switched off; transistors MPC1 and MPC2 are switched on; and the nodes OUT and {overscore (OUT)} are precharged to VDD. The high output keeps transistors MN3 and MN4 switched on. Since there is no current path to ground, the intermediate nodes 102 and 103 are precharged to VDD−VthN, where VthN is the threshold voltage of the N transistors.
In addition, node 100 is precharged to VIN_HIGH−VthN, where VIN_HIGH is the higher of the two voltages at inputs D and {overscore (D)}. The lower limit on the size of transistors MPC1 and MPC2 is determined by their ability to fully precharge the nodes in half a clock cycle, i.e., in a single clock phase. The precharge speed of the CVSA in FIG. 1 is proportional to the conductivity of precharge transistors MPC1 and MPC2 and is inversely proportional to the SA load and parasitic capacitance.
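The precharge levels described above can be tabulated with a short sketch. The supply, threshold and input voltages used here (1.8 V, 0.5 V, 1.0 V and 0.9 V) are hypothetical values chosen only for illustration, and the function name is not from the source. Body effect on VthN is ignored, which is a simplifying assumption.

```python
def precharge_levels(vdd, vth_n, v_d, v_d_bar):
    """Node voltages at the end of the equalization phase of the CVSA in FIG. 1.

    Simplified: a single threshold voltage vth_n is used for every
    nMOS transistor and body effect is ignored (assumption).
    """
    v_in_high = max(v_d, v_d_bar)  # higher of the two input voltages
    return {
        "OUT": vdd,                     # precharged through MPC1
        "OUT_bar": vdd,                 # precharged through MPC2
        "node_102": vdd - vth_n,        # one threshold drop below VDD
        "node_103": vdd - vth_n,
        "node_100": v_in_high - vth_n,  # set by the higher input
    }


# Hypothetical operating point, for illustration only.
levels = precharge_levels(vdd=1.8, vth_n=0.5, v_d=1.0, v_d_bar=0.9)
for node, volts in levels.items():
    print(f"{node}: {volts:.2f} V")
```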
As the clock φ goes high, the SA enters the evaluation phase. During the evaluation phase, transistors MPC1 and MPC2 are switched off and transistor MNC is switched on. For small input signals (D−{overscore (D)}=±ΔV), transistors MN1 and MN2 remain switched on and, as a result, there is a current path to ground from both OUT and {overscore (OUT)}. However, due to the voltage differential between the inputs, one of the paths will sink more current than the other. The evaluation path 104 connected to the higher input voltage will have higher current-sinking capability, and the output of that evaluation path will get pulled down to ground.
From the previous description, it will be understood by those skilled in the art that the evaluation speed of the SA in FIG. 1 is proportional to the input voltage differential and to the conductivity of the evaluation path, and is inversely proportional to the SA load and parasitic capacitance. The conductivity of the evaluation chain is proportional to the nMOS transistor size, and is inversely proportional to the number of series transistors. As seen in FIG. 1, the CVSA has a 3-transistor deep evaluation path 104.
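The proportionalities stated above can be collected into a first-order figure of merit. The following is a minimal sketch, not a circuit simulation: the function name, the arbitrary units, and the numeric inputs are assumptions made for illustration, and the parasitic capacitance is held fixed when the path depth changes, which a real design would not guarantee.

```python
def relative_eval_speed(delta_v, width, series_depth, c_load, c_parasitic):
    """First-order evaluation-speed figure of merit (arbitrary units).

    speed ~ delta_v * conductivity / (c_load + c_parasitic), where
    conductivity ~ transistor width / number of series transistors.
    """
    conductivity = width / series_depth
    return delta_v * conductivity / (c_load + c_parasitic)


# Illustrative inputs (assumed): 0.1 V input differential, unit-width
# devices, 10 fF load and 7.5 fF parasitic capacitance.
base = relative_eval_speed(0.1, 1.0, series_depth=3, c_load=10.0, c_parasitic=7.5)
shallow = relative_eval_speed(0.1, 1.0, series_depth=2, c_load=10.0, c_parasitic=7.5)

print(f"depth 3 -> depth 2 speed ratio: {shallow / base:.2f}")
```

Under these assumptions, the only quantity that changes between the two calls is the series depth, so the ratio isolates the benefit of a shallower evaluation path.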
The pMOS and nMOS transistors can be interchanged in the CVSA design to generate a predischarge sense-amplifier. When the evaluation path is changed to pMOS, the transistor sizes must be increased to compensate for the reduced speed of pMOS (nMOS is typically 3–4 times faster than PMOS). Since the parasitic capacitance of the SA increases as an unfortunate side effect of increased pMOS transistor sizes, the actual increase in the pMOS transistor sizes will be greater than the mobility ratio between nMOS and pMOS.
If the parasitic contribution from the evaluation chain is assumed to remain constant, moving from precharge to predischarge results in a 3 times reduction in the predischarge transistor size. As a result, the capacitive contribution from predischarge transistors to the output node capacitance and clock-load will be reduced by a factor of 3. Unfortunately, since the pMOS evaluation transistors need to be resized to achieve the required evaluation speed, moving from precharge to predischarge would be likely to increase the parasitic capacitance of the evaluation chain by more than 3 times. Therefore, an increase in total clock-load capacitance and SA node capacitance is expected. Thus, the predischarge-type CVSA will in general underperform its precharge counterpart.
For slower clock speeds, where the parasitic capacitance of the CVSA is smaller than the load capacitance, transistors can be resized to achieve better performance without significantly increasing the total output node capacitance. As a result, at slower clock speeds there is a linear relationship between the CVSA performance and the clock-load. A CVSA designed in 0.18 μm CMOS technology optimized to operate at 1.0 GHz clock frequency has a clock-load that is 1.8 times greater than a CVSA optimized for a 500 MHz clock frequency.
Hspice simulation results for conventional voltage SAs (CVSA) optimized for different operating frequencies simulated in 0.18 μm CMOS are shown in Table 1. The output load is 10.0 fF and the target input sensitivity is 100 mVPP.
TABLE 1
Optimized operating frequency    Delay (ps)    Clock-load (fF)    RMS Power (μW)
1.5 GHz                          96            10                 240
2.0 GHz                          72            18                 410
2.5 GHz                          63            27                 655
3.0 GHz                          57            39                 980
For frequencies above 1.0 GHz, on the other hand, transistor resizing is of decreasing benefit, because the CVSA output parasitic capacitance becomes comparable to the load capacitance. Resizing to improve CVSA performance also increases the total CVSA capacitance, limiting the overall performance improvement. At a clock frequency of 1.5 GHz, the ratio of CVSA output parasitic capacitance to load capacitance is 0.75. When the transistors are doubled in size, the output node capacitance increases by a factor of 1.4, yet there is still a 1.4 times improvement in speed. If the transistors are instead resized by a factor of 4, the output capacitance increases 2.3 times and the speed improves by only 1.75 times. This analysis agrees with the Hspice simulation results given in Table 1. Compared to the 1.5 GHz CVSA, the 3.0 GHz CVSA achieved a 1.7 times improvement in the evaluation delay at the cost of a 4 times increase in the clock-load. Clearly, the CVSA has serious performance limitations as clock frequency is increased.
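The resizing figures quoted above can be reproduced with a simple model. This is a sketch under stated assumptions: the load capacitance is fixed, the parasitic capacitance and the drive strength both scale linearly with the resize factor, and the function and variable names are illustrative rather than taken from the source. The Table 1 values are copied from the table for the comparison at the end.

```python
PARASITIC_TO_LOAD = 0.75  # ratio quoted for the 1.5 GHz CVSA


def resize(k, ratio=PARASITIC_TO_LOAD):
    """Return (capacitance growth, speedup) when all transistors scale by k.

    Assumes a fixed load, parasitic capacitance proportional to k,
    and drive strength proportional to k.
    """
    cap_growth = (1.0 + ratio * k) / (1.0 + ratio)
    speedup = k / cap_growth
    return cap_growth, speedup


for k in (2, 4):
    cap, speed = resize(k)
    print(f"resize x{k}: output capacitance x{cap:.2f}, speed x{speed:.2f}")

# Table 1 rows: frequency (GHz) -> (delay ps, clock-load fF, RMS power uW)
table1 = {
    1.5: (96, 10, 240),
    2.0: (72, 18, 410),
    2.5: (63, 27, 655),
    3.0: (57, 39, 980),
}
delay_gain = table1[1.5][0] / table1[3.0][0]
clock_load_cost = table1[3.0][1] / table1[1.5][1]
print(f"1.5 GHz -> 3.0 GHz: delay improves x{delay_gain:.2f}, "
      f"clock-load grows x{clock_load_cost:.2f}")
```

In this model the speedup can never exceed (1 + ratio) / ratio, about 2.33 for a ratio of 0.75, however large the transistors are made, which is one way to read the diminishing returns described above.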
(2) pMOS and nMOS Cross-coupled Sense-amplifiers (pCCSA and nCCSA)
FIG. 2(a) is a schematic of a prior art PMOS cross-coupled SA design. This is a modification to the conventional architecture and does not include a cross-coupled transistor pair in the evaluation path 201a. Since the CVSA design of FIG. 1 and the cross-coupled SA design in FIG. 2(a) share the same basic design, the two SAs operate similarly and have similar drawbacks.
Eliminating the cross-coupled transistors from the evaluation chain reduces the evaluation path depth and the number of parasitic nodes in the SA. In a pCCSA, when the NMOS cross-coupled transistors MN3 and MN4 in FIG. 1 are eliminated from the evaluation path, the PMOS cross-coupled pair MP1 and MP2 needs to be resized to compensate. The resulting increase in the size of the pMOS cross-coupled pair MP1 and MP2 increases the SA parasitic capacitance and reduces the SA frequency response.
The frequency response of the cross-coupled SA can be improved by swapping the pMOS and nMOS transistors, resulting in the prior art nMOS cross-coupled sense amplifier (nCCSA) shown in FIG. 2(b), with an evaluation path 201b.
Compared with a pMOS cross-couple load, an nMOS cross-couple load is able to provide stronger positive feedback. Moreover, since the nCCSA uses nMOS predischarge, its equalization phase has a faster response time and lower capacitance than that of its pMOS counterpart. Analysis of the clock-load characteristics also shows that the predischarge architecture reduces the equalization phase clock-load significantly.
Although there is a significant reduction in equalization clock-load, it does not translate into a significant reduction in the total clock-load. In the nCCSA, the clocked PMOS evaluation transistor MPC dominates the clock-load: the capacitive contribution of MPC is 4 times greater than the capacitive contribution of MNC1 and MNC2 for a 3.0 GHz SA in 0.18 μm technology.
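The effect of this 4-to-1 ratio on the total clock-load can be illustrated with simple arithmetic. The sketch below assumes that the ratio compares MPC against the combined contribution of MNC1 and MNC2, and that these three transistors are the only clocked devices; both are interpretive assumptions, and the function name is illustrative.

```python
def clock_load_shares(mpc_to_equalization_ratio=4.0):
    """Split the total clock-load between MPC and the equalization pair.

    Assumes MNC1 + MNC2 together contribute one unit of capacitance and
    MPC contributes `mpc_to_equalization_ratio` units (assumption).
    """
    equalization = 1.0
    evaluation = mpc_to_equalization_ratio * equalization
    total = equalization + evaluation
    return evaluation / total, equalization / total


mpc_share, eq_share = clock_load_shares()
print(f"MPC share of clock-load: {mpc_share:.0%}")
print(f"MNC1 + MNC2 share of clock-load: {eq_share:.0%}")
```

Under these assumptions, even removing the equalization clock-load entirely would reduce the total clock-load by only the smaller of the two shares printed above.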