1. Field of the Invention
The present invention relates to a novel logic cell, and more particularly, to a storage element or a latch using the novel logic cell.
2. Background of the Related Art
VLSI technology allows the use of powerful hardware for sophisticated computer applications and multimedia capabilities, such as real-time speech recognition and full-motion video. Recent changes in the computing environment have created a variety of high speed electronics applications. However, there is an increased user desire for portability of computational equipment which places severe restrictions on size, weight, and power. Power consumption is a major consideration in mobile applications since a number of portable applications require low-power and high-throughput simultaneously. For example, notebook and laptop computers require almost the same computational speed and capabilities as desktop machines. Equally demanding are developments in Personal Communications Services (PCS's) such as the digital cellular telephony networks which employ complex speech compression algorithms and sophisticated radio modems.
Further, portable multimedia systems supporting full-motion digital video require still more power. Power for video compression and decompression and for speech recognition must be supplied on top of an already lean power budget. These portable systems have greater capabilities than fixed workstations, yet are required to operate in a low-power portable environment.
Even in non-portable systems, low power consumption is becoming more important. Until recently, power consumption has not been a great concern since the heat generated on-chip can be sufficiently dissipated using a proper package. However, the reduction in the minimum feature size allows implementation of more functional units in a single chip by increasing the number of integrated transistors.
These functional units are usually computation-intensive and operate concurrently. Power consumption increases dramatically in complex VLSI systems such as high performance microprocessors and general-purpose digital signal processors (DSP's). Since the power dissipated in a CMOS digital circuit is proportional to the clock frequency, higher operational speed further increases power consumption.
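As a rough illustration of this proportionality, the widely used first-order estimate of CMOS switching power, P = a * C * V^2 * f, can be sketched in Python. The function name, the activity factor a, and the numeric values below are illustrative assumptions and are not taken from this disclosure.

```python
def switching_power(c_load, v_dd, f_clk, activity=1.0):
    """First-order estimate of CMOS switching power in watts.

    c_load   : switched capacitance in farads (assumed value)
    v_dd     : supply voltage in volts (assumed value)
    f_clk    : clock frequency in hertz
    activity : fraction of the capacitance switched per cycle (assumption)
    """
    return activity * c_load * v_dd ** 2 * f_clk


if __name__ == "__main__":
    # Hypothetical operating points, chosen only to show the trend.
    base = switching_power(c_load=1e-9, v_dd=3.3, f_clk=50e6, activity=0.5)
    fast = switching_power(c_load=1e-9, v_dd=3.3, f_clk=100e6, activity=0.5)
    print(f"50 MHz: {base:.4f} W, 100 MHz: {fast:.4f} W")
```

Under this simplified model, doubling the clock frequency doubles the estimated switching power. Short-circuit and leakage components are ignored.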
Further, adequate cooling techniques, such as fins and fans, are required to handle the increased internal heat. Such techniques increase cost and/or limit the amount of functionality that can be integrated in a single chip. Hence, reducing power consumption has become a critical concern in designing complex VLSI systems.
There are a variety of considerations that must be taken into account for low power design which include the style of logic used, the technology incorporated, and the architecture employed. Among these, choosing a proper logic style is an important factor for low power since the power consumed in the arithmetic and logical units is greatly dependent on the way in which these blocks are implemented. The logic circuit choice also affects the architectural selection. Hence, there is a need for full exploitation of existing logic circuits to optimize and create a new logic circuit for low power operation.
There are a number of options available in choosing the basic circuit approach and topology for implementing various logic and arithmetic functions. In general, logic families can be divided into two broad categories, depending on the type of operation. The first category is static logic, including standard CMOS logic and pass-transistor logic, in which all the internal nodes are static and the noise margin is therefore high. The second category is dynamic logic, which uses a precharge technique to improve speed. However, cost increases because of the greater design complexity needed to eliminate problems, such as charge sharing, that arise from dynamic operation. U.S. application Ser. No. 08/688,881, which is commonly assigned to the assignee of this application, describes and illustrates the numerous problems of different static and dynamic logic circuits. The disclosure of U.S. application Ser. No. 08/688,881 is incorporated herein by reference.
Although the conventional logic circuits attempt to reduce the amount of charge consumed in each cycle, power consumption remains large, since the charge is repeatedly moved from the supply voltage to the ground voltage within a given cycle. Younis and Knight at MIT proposed a method of charge recovery via a new logic family, called Charge Recovering Logic (CRL), which was described in the article entitled "Practical Implementation of Charge Recycling Asymptotically Zero Power CMOS," Research on integrated systems; Proc. 1993 Symp., Cambridge, Mass. 1993.
The charge recovery technique can achieve energy savings of over 99% when the devices are switched sufficiently slowly. The concept is to create a mirror image of a circuit that computes the inverse of the original, as shown in FIG. 1A. As each stage in the circuit finds an answer, it passes the result on to its mirror image, which computes the inverse. In the main circuit, charge moves toward the end, while in the mirror circuit charge is recycled back to the beginning. However, the logic design for implementing the CRL is quite impractical, and the anticipated power saving is nearly impossible to realize in ordinary applications.
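The dependence of the saving on switching speed can be illustrated with a common first-order model of slow (adiabatic) charging, which is not taken from the cited article: charging a capacitance C to a voltage V through a resistance R with a ramp of duration T much longer than RC dissipates roughly (RC/T) * C * V^2, compared with (1/2) * C * V^2 for an abrupt transition. The function names and component values in the following Python sketch are assumptions chosen for illustration.

```python
def abrupt_loss(c, v):
    """Energy dissipated when C is charged abruptly to V (joules)."""
    return 0.5 * c * v ** 2


def ramp_loss(r, c, v, t_ramp):
    """Approximate energy dissipated when C is charged through R by a
    slow ramp of duration t_ramp. Valid only for t_ramp >> r * c."""
    return (r * c / t_ramp) * c * v ** 2


def fractional_saving(r, c, v, t_ramp):
    """Fraction of the abrupt-switching loss avoided by the slow ramp."""
    return 1.0 - ramp_loss(r, c, v, t_ramp) / abrupt_loss(c, v)


if __name__ == "__main__":
    r, c, v = 1e3, 1e-12, 3.3      # hypothetical 1 kOhm, 1 pF, 3.3 V
    tau = r * c
    for multiple in (100, 200, 1000):
        s = fractional_saving(r, c, v, multiple * tau)
        print(f"T = {multiple} RC: saving = {s:.4f}")
```

In this model the saving is 1 - 2RC/T, so a saving above 99% requires a ramp longer than about 200 RC, which is consistent with the requirement that the devices be switched sufficiently slowly.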
Succeeding refinements that save and reuse only a fraction of the charge seem to be compatible with conventional CMOS technology. An example is a Reduced-Power Buffer (RPB), illustrated in FIG. 1B, which uses a storage capacitor to save some of the charge that would otherwise be dissipated. This circuit includes a driver with an additional storage capacitor attached to the output node through a switch TI. During a high-to-low transition, the circuit saves some of the charge in the storage capacitor Cs instead of dissipating it to ground. Just before the next low-to-high transition, the saved charge is recycled to the output node.
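The benefit of such a buffer can be estimated with an idealized charge-sharing model. The following Python sketch assumes a lossless switch, complete charge equalization each time the switch closes, and a full rail-to-rail output swing; the function names and the cycle-by-cycle simplification are illustrative assumptions, not a description of the actual circuit of FIG. 1B.

```python
def rpb_saving(c_load, c_store, v_dd=1.0, cycles=200):
    """Fraction of supply energy saved per cycle, found by iterating an
    idealized charge-sharing model until the storage voltage settles."""
    total = c_load + c_store
    v_store = 0.0      # storage capacitor starts empty
    v_restore = 0.0
    for _ in range(cycles):
        # High-to-low: output (at v_dd) shares charge with the storage cap.
        v_shared = (c_load * v_dd + c_store * v_store) / total
        # Output is then discharged to ground; storage holds v_shared.
        # Before low-to-high: output (at 0 V) shares with the storage cap.
        v_restore = c_store * v_shared / total
        v_store = v_restore
    # The supply only has to lift the output from v_restore to v_dd.
    return v_restore / v_dd


def rpb_saving_closed_form(c_load, c_store):
    """Steady-state result of the same model: k / (1 + k)."""
    k = c_store / (c_load + c_store)
    return k / (1.0 + k)


if __name__ == "__main__":
    for ratio in (0.5, 1.0, 4.0, 9.0):
        print(f"Cs/CL = {ratio}: saving = {rpb_saving(1.0, ratio):.3f}")
```

Under these assumptions the saving is one third when the storage and load capacitances are equal and approaches, but never reaches, one half as the storage capacitor grows, so the storage capacitor must be large relative to the load for the saving to be significant.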
This scheme is useful only in applications dominated by the switching of large capacitive loads, and the storage capacitor must be larger than the load capacitor to obtain sufficient power savings. Another example is a refresh scheme in DRAM that recycles the charge used to refresh cells in one array for use in the other array, which is described in an article entitled "A Charge Recycle Refresh for Gb-Scale DRAM's in File Applications," IEEE Journal of Solid State Circuits, Vol. 29, No. 6, June 1994, by Kawahara et al. However, there is no practical charge recycling scheme for general use in logic circuit design.
Synchronous design approaches, which are popularly used in current VLSI design, rely on the clock to synchronize function blocks and storage elements. An efficient clock scheme is always important for designing high performance systems. Currently, there are a variety of different clocking schemes according to several different types of storage elements and logic families.
One of the most popular clocking strategies is a non-overlapping pseudo two-phase clocking scheme, which is implemented with a Clocked CMOS (C²MOS) latch. The circuit diagram and the clock waveform of this clocking scheme are shown in FIGS. 1C and 1D. The clocking scheme consists of two pairs of clock phases, each pair comprising an inverting signal and a noninverting signal. Thus, up to four clock signals CK1, /CK1, CK2 and /CK2 have to be distributed, and a possible skew between these phases can cause serious problems. A great deal of design effort is required to prevent race problems due to the clock skew. A non-overlapping period is introduced as a margin to prevent the skew problems. This non-overlapping period does not contribute to operation time and remains as dead time, which makes it difficult to increase clock speed. Moreover, distributing multiple clocks uniformly throughout a system increases the design costs, especially in high-speed applications.
The NORA dynamic CMOS technique uses true two-phase clock signals CK and /CK instead of the pseudo two-phase clock signals. The logic structure and the associated clock waveforms are shown in FIGS. 1E and 1F. The technique can avoid race problems caused by clock skews, subject to some constraints on logic composition. The most important constraint is that between two C²MOS latches there must be an even number of inversion blocks. If there are static blocks between a precharge block and a C²MOS latch, there must also be an even number of them.
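The first of these composition constraints lends itself to a simple mechanical check. The following Python sketch assumes a hypothetical description of a pipeline as a list of block labels, where "latch" marks a C²MOS latch and "inv" marks an inversion block; the labels and the function name are invented for illustration, and the second constraint on static blocks is not modeled.

```python
def even_inversions_between_latches(blocks):
    """Return True if every pair of consecutive latches in the list is
    separated by an even number of inversion blocks."""
    count = None  # None until the first latch has been seen
    for block in blocks:
        if block == "latch":
            if count is not None and count % 2 != 0:
                return False   # odd number of inversions since last latch
            count = 0          # start counting for the next latch pair
        elif block == "inv" and count is not None:
            count += 1
        # Any other label is treated as non-inverting and ignored.
    return True


if __name__ == "__main__":
    good = ["latch", "inv", "logic", "inv", "latch"]
    bad = ["latch", "inv", "latch"]
    print(even_inversions_between_latches(good))
    print(even_inversions_between_latches(bad))
```

Blocks that precede the first latch are ignored in this sketch, because the constraint as stated applies only between two latches.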
The true single-phase clock dynamic CMOS circuit technique, which is shown in FIG. 1G, uses only one clock signal CK. No clock skew exists except for clock delay problems, and an even higher clock frequency can be realized without the constraints on logic implementation imposed by the NORA technique. However, this circuit requires a PMOS logic block (p-section), which may degrade the speed of the entire system. To solve this problem, a True-single-phase All-N-logic Differential Logic (TADL), which uses only NMOS-logic blocks in a pipeline configuration, has been proposed by H. Y. Huang et al. in an article entitled "True-single-phase All-N-logic Differential Logic (TADL) for very high-speed complex VLSI," in Proc. IEEE ISCAS, May 1996. However, the proposed circuit merely replaces the PMOS transistors in the logic network with NMOS transistors to drive a logic "high" value. Hence, the speed improvement is not as high as expected. Moreover, all the functionality in a pipeline section using the TADL technique must be implemented in one stage, which may decrease the logic flexibility.
The above references are incorporated by reference herein where appropriate for appropriate teachings of additional or alternative details, features and/or technical background.