VLSI technology enables powerful hardware for sophisticated computer applications and multimedia capabilities, such as real-time speech recognition and full-motion video. Recent changes in the computing environment have created a variety of high-speed electronics applications. However, users increasingly desire portable computational equipment, which places severe restrictions on size, weight, and power. Power consumption is a major consideration in mobile applications, since a number of portable applications require low power and high throughput simultaneously. For example, notebook and laptop computers require almost the same computational speed and capabilities as desktop machines. Equally demanding are developments in personal communications services (PCS's), such as the digital cellular telephone networks which employ complex speech compression algorithms and sophisticated radio modems.
Further, more power is required for portable multimedia systems supporting full-motion digital video. Power for video compression and decompression and for speech recognition is required on top of the already lean power budget. These portable systems have greater capabilities than fixed workstations, yet are required to operate in a low-power portable environment.
Even in non-portable systems, low power consumption is becoming more important. Until recently, power consumption was not a great concern, since the heat generated on-chip could be sufficiently dissipated using a proper package. However, the reduction in the minimum feature size allows implementation of more functional units in a single chip by increasing the number of integrated transistors.
These functional units are usually computation-intensive and operate concurrently, so power consumption increases dramatically in complex VLSI systems, such as high performance microprocessors and general-purpose digital signal processors (DSP's). Since the power dissipated in a CMOS digital circuit is proportional to the clock frequency, higher operational speed further increases power consumption.
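The proportionality between clock frequency and power noted above can be illustrated with the widely used first-order switching-power model, P = a * C * Vdd^2 * f. The following Python sketch is illustrative only: the model, the function name `dynamic_power`, and the example values are assumptions introduced here and are not taken from any circuit described in this background.

```python
def dynamic_power(activity, c_switched, vdd, f_clk):
    """First-order CMOS switching power in watts.

    activity   : average fraction of the capacitance switched per cycle (assumed)
    c_switched : total switched capacitance in farads (assumed)
    vdd        : supply voltage in volts
    f_clk      : clock frequency in hertz
    """
    return activity * c_switched * vdd ** 2 * f_clk


# Hypothetical example values: 1 nF switched capacitance, 3.3 V supply.
p_100mhz = dynamic_power(0.5, 1e-9, 3.3, 100e6)
p_200mhz = dynamic_power(0.5, 1e-9, 3.3, 200e6)
print(p_100mhz, p_200mhz)  # doubling the clock doubles the switching power
```

Under this model, halving the supply voltage reduces switching power by a factor of four, whereas halving the clock frequency only halves it.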
Further, adequate cooling techniques, such as the use of fins and fans, are required to handle the increased internal heat. Such techniques increase cost and/or limit the amount of functionality which can be integrated in a single chip. Hence, reducing power consumption has become a critical concern in designing complex VLSI systems.
There are a variety of considerations that must be taken into account for low power design, which include the style of logic used, the technology incorporated, and the architecture employed. Among these, choosing a proper logic style is an important factor for low power, since the power consumed in the arithmetic and logical units is greatly dependent on the way in which these blocks are implemented. The logic circuit choice also affects the architectural selection. Hence, there is a need to fully exploit existing logic circuits and to create a new logic circuit optimized for low power operation.
There are a number of options available in choosing the basic circuit approach and topology for implementing various logic and arithmetic functions. In general, logic families can be divided into two broad categories, depending on the type of operation. The first category is static logic, including standard CMOS logic and pass-transistor logic, in which all the internal nodes are static, and thus, the noise margin is high. The second category is dynamic logic, which uses a precharge technique to improve speed performance. However, cost increases because of the higher design complexity needed to eliminate problems, such as charge sharing, that arise from dynamic operation. U.S. application Ser. No. 08/688,881, which is commonly assigned to the assignee of this application, describes and illustrates the numerous problems of different static and dynamic logic circuits. The disclosure of U.S. application Ser. No. 08/688,881 is incorporated herein by reference.
Although the conventional logic circuits attempt to reduce the amount of charge consumed in each cycle, power consumption remains large, since the charge is repeatedly moved from the supply voltage to the ground voltage within a given cycle. Younis and Knight at MIT proposed a method of charge recovery via a new logic family, called Charge Recovering Logic (CRL), which was described in the article entitled "Practical implementation of charge recycling Asymptotically zero power CMOS," Research on integrated systems; Proc. 1993 Symp., Cambridge, Mass. 1993. The charge recovery technique can achieve energy savings of over 99% when the devices are switched sufficiently slowly. The concept is to create a mirror image of a circuit that computes the inverse of the original, as shown in FIG. 1A. As each stage in the circuit finds an answer, it passes the result on to its mirror image, which computes the inverse. In the main circuit, charge moves toward the end, while in the mirror circuit, charge is recycled back to the beginning. However, the logic design for implementing the CRL is quite impractical, and the anticipated power saving is nearly impossible to realize in ordinary applications.
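The dependence of the energy saving on switching speed can be illustrated with the textbook approximation for charging a capacitance C through a resistance R using a slow linear voltage ramp of duration T, for which the dissipated energy is approximately (RC/T) * C * V^2 when T is much larger than RC. The Python sketch below compares this with the (1/2) * C * V^2 dissipated by conventional abrupt charging. It is a simplified model assumed here for illustration, not the analysis given by Younis and Knight; the function names and component values are hypothetical.

```python
def conventional_energy(c, v):
    """Energy dissipated when C is charged abruptly from a fixed supply."""
    return 0.5 * c * v ** 2


def ramp_energy(r, c, v, t_ramp):
    """Approximate energy dissipated when C is charged through R by a
    linear ramp of duration t_ramp (valid only for t_ramp >> r * c)."""
    return (r * c / t_ramp) * c * v ** 2


def energy_saving(r, c, v, t_ramp):
    """Fraction of the conventional dissipation avoided by slow ramping."""
    return 1.0 - ramp_energy(r, c, v, t_ramp) / conventional_energy(c, v)


# Hypothetical values: R = 1 kOhm, C = 1 pF, so RC = 1 ns.
for t_ramp in (20e-9, 200e-9, 2000e-9):
    print(t_ramp, energy_saving(1e3, 1e-12, 3.3, t_ramp))
```

In this model the saving equals 1 - 2RC/T, so a 99% saving requires a ramp roughly 200 times longer than the RC time constant, which illustrates why such savings are available only at low switching speeds.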
Subsequent refinements, which save and reuse only a fraction of the charge, appear to be compatible with conventional CMOS technology. An example is a Reduced-Power Buffer (RPB), illustrated in FIG. 1B, which uses a storage capacitor to save some of the charge that would otherwise be dissipated. This circuit includes a driver with an additional storage capacitor attached to the output node through a switch T1. During a high-to-low transition, the circuit saves some of the charge in the storage capacitor Cs, instead of dissipating it to ground. Just before the next low-to-high transition, the saved charge is recycled to the output node.
This scheme is useful only for applications dominated by the switching of large capacitive loads, and the storage capacitor must be considerably larger than the load capacitor to obtain a sufficient power saving. Another example is a refresh scheme in DRAM that recycles the charge used to refresh cells in one array for use in another array, which is described in an article entitled "A charge Recycle Refresh for Gb-Scale DRAM's in File Applications," IEEE Journal of Solid State Circuits, Vol. 29, No. 6, June 1994, by Kawahara et al. However, there is no practical charge recycling scheme for general use in logic circuit design.
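The sizing constraint on the storage capacitor of the Reduced-Power Buffer can be illustrated with an idealized charge-sharing model. The Python sketch below assumes ideal switches, complete charge sharing between the load capacitance and the storage capacitor Cs on each transition, and a full discharge of the output to ground in between; these assumptions and the function names are introduced here for illustration and are not taken from FIG. 1B.

```python
def rpb_saving_simulated(c_load, c_store, cycles=2000):
    """Steady-state fraction of supply charge saved per cycle (ideal model)."""
    vdd = 1.0          # normalized supply voltage
    v_store = 0.0      # storage capacitor starts empty
    v_pre = 0.0        # output voltage reached before the supply takes over
    for _ in range(cycles):
        # High-to-low: output at Vdd shares charge with the storage capacitor.
        v_shared = (c_load * vdd + c_store * v_store) / (c_load + c_store)
        # Output is then discharged to ground; storage capacitor holds v_shared.
        # Before low-to-high: storage capacitor shares charge with the output.
        v_pre = c_store * v_shared / (c_load + c_store)
        v_store = v_pre
    # The supply only has to lift the output from v_pre to Vdd.
    return v_pre / vdd


def rpb_saving_closed_form(c_load, c_store):
    """Closed-form steady state of the same idealized model."""
    return c_store / (c_load + 2.0 * c_store)


for ratio in (0.1, 1.0, 10.0):
    print(ratio, rpb_saving_simulated(1.0, ratio), rpb_saving_closed_form(1.0, ratio))
```

Under these assumptions the saving is Cs / (CL + 2 Cs), where CL is the load capacitance: about one third when the two capacitors are equal, approaching but never exceeding one half as Cs grows. This is consistent with the observation that a large storage capacitor is needed for a worthwhile saving.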
A conventional storage element, device or latch, such as a true single-phase clock (TSPC) latch, is also prone to noise and substantial leakage current. The TSPC latch has a severe noise margin problem when it is used with conventional circuits. For example, an internal node may be tristated `high` with an output node being tristated `low` by driving a logic `one` signal to an input electrode when the latch is opaque. Such a condition is similar to a dynamic node driving a dynamic gate, which is very sensitive to noise. Thus, noise can reduce the voltage on the internal node enough to cause a substantial leakage current through a PMOS transistor and change the logic state at the output node. This problem and a solution are addressed in an article by D. W. Dobberpuhl et al. entitled "A 200-MHz 64-b dual-issue CMOS microprocessor," published in IEEE J. Solid-State Circuits, Vol. 27, No. 11, pages 1555-1564, November 1992.