The present invention relates to a logic circuit design and, more particularly, to a logic circuit design for combinatorial and asynchronous logic circuits.
A large body of research has been performed to develop and improve traditional Complementary Metal Oxide Semiconductor (CMOS) techniques for the production of integrated circuits (ICs). The object of this research is to develop a faster, lower power, and reduced area alternative to standard CMOS logic circuits (see A. P. Chandrakasan, S. Sheng, R. W. Brodersen, “Low-Power CMOS Digital Design”, IEEE Journal of Solid-State Circuits, vol. 27, no. 4, pp. 473-484, April 1992, and in A. P. Chandrakasan, R. W. Brodersen, “Minimizing Power Consumption in Digital CMOS Circuits”, Proceedings of the IEEE, vol. 83, no. 4, pp. 498-523, April 1995.) This research has resulted in the development of many logic design techniques during the last two decades. One popular alternative to CMOS is pass-transistor logic (PTL).
Formal methods for deriving pass-transistor logic are known for Negative-channel Metal Oxide Semiconductor (NMOS) transistors. These known methods yield an NMOS PTL logic circuit having a set of control signals applied to the gates of the NMOS transistors, and a set of data signals applied to the sources of the NMOS transistors. Many PTL circuit implementations have been proposed in the literature (see also W. Al-Assadi, A. P. Jayasumana and Y. K. Malaiya, “Pass-transistor logic design”, International Journal of Electronics, 1991, vol. 70, no. 4, pp. 739-749; K. Yano, Y. Sasaki, K. Rikino, K. Seki, “Top-Down Pass-Transistor Logic Design”, IEEE Journal of Solid-State Circuits, vol. 31, no. 6, pp. 792-803, June 1996; R. Zimmermann, W. Fichtner, “Low-Power Logic Styles: CMOS Versus Pass-Transistor Logic”, IEEE Journal of Solid-State Circuits, vol. 32, no. 7, pp. 1079-1090, June 1997; and K. Bernstein, L. M. Carrig, C. M. Durham and P. A. Hansen, “High Speed CMOS Design Styles”, Kluwer Academic Press, 1998).
Some of the main advantages of PTL over standard CMOS design are: high speed due to the small node capacitances; low power dissipation as a result of the reduced number of transistors; and lower interconnection effects due to a small area.
Most PTL implementations, however, have two basic problems. First, the threshold drop across the single-channel pass transistors results in reduced current drive and hence slower operation at reduced supply voltages. This drop is particularly important for low power design since it is desirable to operate at the lowest possible voltage level. Second, since the input voltage for a high logic level at the regenerative inverters is not VDD, the PMOS device in the inverter is not fully turned off, and hence direct-path static power dissipation can be significant.
There are many PTL techniques that attempt to solve the problems mentioned above. Some of them are: Transmission Gate CMOS (TG), Complementary Pass-transistor Logic (CPL), and Double Pass-transistor Logic (DPL). TG uses transmission gate logic to realize complex logic functions using a small number of complementary transistors. TG solves the problem of low logic level swing by using PMOS as well as NMOS transistors. CPL features complementary inputs/outputs using NMOS pass-transistor logic with CMOS output inverters. CPL's most important feature is the small stack height and the internal node low swing, which contribute to lowering the power consumption. The CPL technique suffers from static power consumption due to the low swing at the gates of the output inverters. To lower the power consumption of CPL circuits, latched complementary pass-transistor logic (LCPL) and swing restored pass-transistor logic (SRPL) circuit styles are used. These styles contain PMOS restoration transistors or cross-coupled inverters respectively. DPL uses complementary transistors to keep full swing operation and reduce the DC power consumption, eliminating the need for restoration circuitry. One disadvantage of DPL is the large area required by the presence of PMOS transistors.
An additional problem of existing PTL is the top-down logic design complexity, which prevents the pass-transistors from capturing a major role in real logic large-scale integration technology (LSI). One of the main reasons for this is that no simple and universal cell library is available for PTL based design. Not all variations of input values to a basic PTL cell produce well-defined logic values. This creates difficulties in the development of automatic design systems for PTL logic, and in the verification of PTL logic circuit performance.
Asynchronous logic design has been established as a competitive alternative to synchronous circuits thanks to the potential for high-speed, low-power, reduced electromagnetic interference, and timing modularity (see J. Sparsø and S. Furber (eds.), Principles of asynchronous circuit design—A systems perspective, Kluwer Academic Publishers, 2001). Asynchronous logic has been developed in the last decade to deal with the challenges posed by the progress of very large-scale integration (VLSI) technologies, together with the increasing number of gates on chip, high density, and GHz operation frequencies. These problems are expected to appear in future high-performance technologies operating at the 10 GHz barrier, due to the increased influence of interconnect on signal delay, uncertainty in the delay of a given gate, and on-chip parameter variations. These factors create difficulties in the design of fast digital systems controlled by a single general clock, due to considerations of delay skew between distant logic blocks, as well as the complexity of design of structures controlled by multiple clocks.
Asynchronous design provides digital systems based on self-timed circuits, which demand no control of a general clock, along with fast communication protocols in which speed depends only on the self delay of the logic gates. The absence of a general clock contributes to low power operation, by eliminating the concentrated power consumption of certain chip areas where numerous transactions occur with arrival of each clock signal.
However, these desirable characteristics usually come at a cost of either silicon area, or speed, or power, and cannot be achieved all at once. Furthermore, asynchronous circuits are typically more complicated than their synchronous counterparts.
In U.S. patent application Ser. No. 10/648,474 Morgenshtein et al. present a fast and versatile logic circuit, denoted GDI, with reduced area and power requirements, and capable of implementing a wide variety of logic functions. The GDI logic technique is based upon a basic GDI logic cell, which is shown in FIG. 1 and described in detail below. However, the general applicability of the previously proposed GDI logic cell for logic circuit design is limited to a narrow range of CMOS technologies due to bulk effects, as discussed below. Many GDI cell topologies cannot be implemented in standard p-well or n-well CMOS technology.
Reference is now made to FIG. 1, which is a simplified block diagram of a logic circuit. The logic circuit, which uses a previously proposed GDI design, is based upon two complementary transistor networks, which connect to the previously proposed GDI circuit logic inputs and outputs, and implement the desired logic function. The relationship between the structures of the two transistor networks and the overall function of the previously proposed GDI circuit is discussed below, for the general case and for specific transistor network configurations.
Logic circuit 100 contains P logic block 110, N logic block 120, first and second logic inputs, 130 and 140, and three logic terminals: first and second dedicated logic terminals, 150 and 160, and common diffusion logic terminal 170. The first and second dedicated logic terminals, 150 and 160, and the common diffusion logic terminal 170 can each serve as either a logic signal input terminal or a logic signal output terminal, depending upon the specific logic circuit implementation. The examples given below illustrate several logic circuit terminal configurations.
The P logic block 110 contains a network of p-type transistors 180 which are interconnected to implement a given logic function. The P logic block 110 has three logic connections: an outer diffusion connection 181 (at an outer diffusion node of one of the p-type transistors), a gate connection 182 (at the gate of one of the p-type transistors), and an inner diffusion connection 183 (at the second inner diffusion node of one of the p-type transistors). Outer diffusion connection 181 connects to the first dedicated logic terminal 150, and gate connection 182 connects to the first logic input 130. The N logic block 120 contains a network of n-type transistors 190 which implement the complementary logic function, and is structured similarly to the P logic block 110. The inner diffusion connections of the P and N logic blocks, 183 and 193, are connected together to form the common diffusion logic terminal 170.
The p-type and n-type transistors may be field effect transistors (FET), CMOS transistors (p-well, n-well, or twin-well), SOI transistors, SOS transistors, or the like. However, p-well and n-well CMOS transistors may be used only for a limited number of logic circuit configurations. Note that the customary distinction between the source and drain of the transistor cannot be made with the previously proposed GDI structure, since for any given transistor the relative voltages between the transistor diffusion nodes change depending upon the logic input and output voltages. This is in contrast with the standard complementary CMOS structure, in which the source or drain is tied to a constant voltage. Thus, for previously proposed GDI logic circuits one of the two transistor diffusion nodes (not the gate) is arbitrarily selected to serve for the inner diffusion connection, and the other to serve for the outer diffusion connection. Many of the previously proposed GDI cell topologies cannot be implemented in standard p-well or n-well CMOS technology, due to interference of bulk effects under certain input/output conditions. Previously proposed GDI logic circuits are therefore preferably implemented in either twin-well CMOS or silicon-on-insulator/silicon-on-sapphire (SOI/SOS) technologies, which do not suffer from these limitations.
In the previously proposed GDI logic circuit, the first and second logic inputs, 130 and 140, are connected together to form a common logic input 196. Thus a logic signal at the common logic input 196 is applied to both the P and N logic blocks, 110 and 120. In one configuration, known as a previously proposed double-gate-input GDI circuit (previously proposed GDI*), the logic inputs, 130 and 140, are not connected, and each logic block has an independent logic input. The previously proposed GDI* circuit is discussed in greater detail below (see FIG. 8).
A dual-transistor configuration of the previously proposed GDI logic circuit is designated herein as the previously proposed GDI logic cell. Reference is now made to FIG. 2, which is a simplified circuit diagram of a standard previously proposed GDI logic cell. In the standard previously proposed GDI logic cell 200, the p-type and n-type transistor networks each contain a single transistor, 210 and 220 respectively. The previously proposed GDI cell has a common input terminal (G) 230 connected to the gates of both the NMOS and PMOS transistors, a first dedicated logic terminal (P) 240 at the outer diffusion node of the PMOS transistor, and a second dedicated logic terminal (N) 250 at the outer diffusion node of the NMOS transistor 220. The common diffusion logic terminal (D) 260 is connected to the inner diffusion nodes of both transistors. The first and second dedicated logic terminals, 240 and 250, and the common diffusion logic terminal 260 may be used as either input or output ports, depending on the circuit structure. FIG. 2 omits bulk connections, although such connections may be required for some transistor technologies, including CMOS. The circuit diagrams for the previously proposed GDI logic circuits presented below have transistor bulk connections, and are therefore appropriate only for technologies with four-terminal transistors (i.e. transistors having gate, drain, source and bulk terminals), such as twin-well CMOS and SOI. Bulk connections may not be needed for some transistor technologies, such as floating bulk SOI.
Table 1 shows six logic functions which can be implemented with a single previously proposed GDI logic cell. The most general case is the multiplexer (MUX), where logic signal A is applied to the common input 230. Signal A selects one of the dedicated logic terminals, 240 or 250, and the logic cell outputs the selected logic signal at the common diffusion logic terminal 260. Other configurations listed in the table implement OR, AND, and inverter logic gates. The logic cell also implements the F1 function (ĀB) and the F2 function (Ā+B). Both the F1 and F2 functions are complete logic families, which can be used to realize any possible logic function.
TABLE 1
N (1st dedicat.)   P (2nd dedicat.)   G (Cmn.)   D         Function
Low                B                  A          ĀB        F1
B                  High               A          Ā + B     F2
High               B                  A          A + B     OR
B                  Low                A          AB        AND
C                  B                  A          ĀB + AC   MUX
Low                High               A          Ā         NOT
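The rows of Table 1 can be checked against an idealized logic-level model of the cell. The sketch below assumes, for illustration only, that the cell behaves as an ideal selector (D follows P when G is low and N when G is high) and ignores threshold drops and bulk effects. The names `gdi_cell`, `TABLE_1` and `check_table_1` are hypothetical and do not come from the text.

```python
from itertools import product


def gdi_cell(g: int, p: int, n: int) -> int:
    """Idealized GDI cell: PMOS passes P when G is low, NMOS passes N when G is high."""
    return n if g else p


# For each row of Table 1: (N terminal, P terminal, expected D), all as functions
# of the logic signals (a, b, c). G is always driven by signal A.
TABLE_1 = {
    "F1":  (lambda a, b, c: 0, lambda a, b, c: b, lambda a, b, c: (1 - a) & b),
    "F2":  (lambda a, b, c: b, lambda a, b, c: 1, lambda a, b, c: (1 - a) | b),
    "OR":  (lambda a, b, c: 1, lambda a, b, c: b, lambda a, b, c: a | b),
    "AND": (lambda a, b, c: b, lambda a, b, c: 0, lambda a, b, c: a & b),
    "MUX": (lambda a, b, c: c, lambda a, b, c: b,
            lambda a, b, c: ((1 - a) & b) | (a & c)),
    "NOT": (lambda a, b, c: 0, lambda a, b, c: 1, lambda a, b, c: 1 - a),
}


def check_table_1() -> dict:
    """Return {function name: True if the model matches Table 1 for all inputs}."""
    results = {}
    for name, (n_of, p_of, expected) in TABLE_1.items():
        results[name] = all(
            gdi_cell(a, p_of(a, b, c), n_of(a, b, c)) == expected(a, b, c)
            for a, b, c in product((0, 1), repeat=3)
        )
    return results


if __name__ == "__main__":
    for name, ok in check_table_1().items():
        print(name, "matches Table 1" if ok else "MISMATCH")
```

Every row reduces to the same selector expression, D = ĀP + AN, with different signals or constants tied to P and N.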
Many of the logic circuits presented below are based on the F1 and F2 functions. The reasons for this are as follows. First, as mentioned, both F1 and F2 are complete logic families. Additionally, F1 is the only GDI function usable for higher-level circuit design that can be realized in a standard n-well CMOS process. In the F1 function implementation, the bulks of all NMOS transistors are constantly and equally biased, since the N terminal (first dedicated logic terminal) is tied low for all logic input levels. In the other configurations listed in Table 1 the N terminal is either tied high (OR gate), or varies according to the logic input levels (F2, AND, and MUX). Similarly, F2 can be realized in p-well CMOS. Finally, when the N input is driven at a high logic level and the P input is at a low logic level, the diodes between the NMOS and PMOS bulks and the logic circuit output are directly polarized, and the two dedicated logic terminals are shorted together. Driving the cell in this way causes static power dissipation and an output voltage Vout ≈ 0.5 VDD. Using the OR, AND and MUX implementations as building blocks for more complex logic circuits, in standard CMOS with a VBS=0 configuration, is therefore problematic. The polarization effect can be reduced if the design is performed in floating-bulk SOI technologies, in which case floating-bulk effects have to be considered.
The previously proposed GDI cell 200 differs significantly from the standard CMOS inverter, which it resembles structurally. Dedicated logic terminals 240 and 250 serve as logic signal inputs, not for applying pull-up and pull-down voltages as in the CMOS case. By extending the complementary structure to a three-input structure, a much more versatile logic cell is obtained. A simple change of the input configuration of the previously proposed GDI cell 200 corresponds to a different Boolean function. Most of these functions are complex (6-12 transistors) in CMOS, as well as in standard PTL implementations, but require only 2 transistors as a previously proposed GDI logic circuit. Additionally, the bulks of transistors 210 and 220 may be connected to dedicated logic terminals 240 and 250 respectively, so that the transistors 210 and 220 can be arbitrarily biased. This is in contrast with a CMOS inverter, which cannot be biased.
The previously proposed GDI cell structure provides advantages over both CMOS and PTL logic circuits in design complexity, transistor count and power dissipation. An operational analysis of the previously proposed GDI logic cell is now presented, in which previously proposed GDI circuit transient behavior, swing restoration, and switching characteristics are analyzed.
One of the common problems of PTL design methods is the low swing of output signals because of the threshold drop across the single-channel pass transistors. In existing PTL techniques additional buffering circuitry is used to overcome this problem. The following analysis of the low swing performance of the previously proposed GDI cell is based on the F1 function, and can be easily extended for other previously proposed GDI functions. Table 2 presents a full set of logic states and the related functionality modes for the F1 function.
TABLE 2
G   P   Functionality     D
0   0   PMOS Trans Gate   VTp
0   1   CMOS Inverter     1
1   0   NMOS Trans Gate   0
1   1   CMOS Inverter     0
As can be seen from Table 2, G=0, P=0 is the only state where low swing occurs in the output value. In this case the voltage level of F1 is VTp (instead of the expected 0V), because of the poor high-to-low transition characteristics of PMOS pass-transistors (see W. Al-Assadi, A. P. Jayasumana and Y. K. Malaiya, “Pass-transistor logic design”, International Journal of Electronics, 1991, vol. 70, no. 4, pp. 739-749, contents of which are hereby incorporated by reference). The only case (from amongst all the possible transitions) where the effect occurs is the transition from G=0, P=VDD to G=0, P=0.
Note that in approximately half of the cases (for P=1) the previously proposed GDI cell operates as a regular CMOS inverter, which is widely used as a digital buffer for logic level restoration. In some of these cases, when VDD is high and there is no swing drop from the previous stages, the previously proposed GDI cell functions as an inverter buffer and recovers the voltage swing. Although this creates a self swing-restoration effect in certain cases, the previously proposed GDI logic circuits shown below assume worst-case swing effects, and contain additional circuitry for swing restoration.
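The four states of Table 2 can be written as a small lookup that returns both the functionality mode and the worst-case output level. This is only an illustrative sketch: the function name `f1_output` and the default values of `vdd` and `vtp` are assumptions, not values taken from the text, and `vtp` is treated as the magnitude of the PMOS threshold voltage.

```python
def f1_output(g: int, p: int, vdd: float = 1.8, vtp: float = 0.4):
    """Return (functionality mode, voltage at D) for the F1 configuration (N tied low).

    vdd and vtp are illustrative assumptions; vtp is the magnitude of the
    PMOS threshold voltage.
    """
    if p == 1:
        # P at VDD and N at ground: the cell acts as a CMOS inverter driven by G.
        return ("CMOS Inverter", vdd if g == 0 else 0.0)
    if g == 0:
        # The PMOS conducts but passes a low level poorly: D settles at VTp.
        return ("PMOS Trans Gate", vtp)
    # G high: the NMOS passes the low level at N without a swing drop.
    return ("NMOS Trans Gate", 0.0)


if __name__ == "__main__":
    for g in (0, 1):
        for p in (0, 1):
            mode, level = f1_output(g, p)
            print(f"G={g} P={p}: {mode}, D = {level} V")
```

Only the G=0, P=0 entry deviates from a full-swing logic level, which is the state the swing restoration circuitry has to handle.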
The exact transient analysis for the basic previously proposed GDI cell is, in most cases, similar to that of a standard CMOS inverter. CMOS transient analysis is widely presented in the literature. The classic analysis is based on the Shockley model, where the drain current ID is expressed as follows:
$$
I_D =
\begin{cases}
I_{D0}\left(\dfrac{W}{L}\right)\lambda^{\left(\frac{qV_{GS}}{KT}\right)} & (V_{GS} \le V_{TH}:\ \text{sub-threshold region}) \\[6pt]
K\left\{(V_{GS}-V_{TH})V_{DS} - 0.5V_{DS}^{2}\right\} & (V_{DS} < V_{GS}-V_{TH}:\ \text{linear region}) \\[6pt]
0.5K(V_{GS}-V_{TH})^{2} & (V_{DS} \ge V_{GS}-V_{TH}:\ \text{saturation region})
\end{cases}
\tag{1}
$$
where K is a drivability factor, VTH is a threshold voltage, W is a channel width and L is a channel length.
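The piecewise model (1) translates directly into a small function. The sketch below is illustrative only. It assumes that the sub-threshold term is a natural exponential in qVGS/KT, with KT/q passed in as a thermal voltage `kt_over_q`, and all parameter names and numeric values are hypothetical.

```python
import math


def shockley_drain_current(v_gs, v_ds, v_th, k, i_d0, w_over_l, kt_over_q=0.02585):
    """Piecewise drain current following the three regions of (1).

    k is the drivability factor K. The sub-threshold branch ASSUMES a natural
    exponential in q*V_GS/KT; kt_over_q (in volts) is an illustrative default.
    """
    if v_gs <= v_th:
        # Sub-threshold region (assumed exponential form, see docstring).
        return i_d0 * w_over_l * math.exp(v_gs / kt_over_q)
    v_ov = v_gs - v_th  # gate overdrive, V_GS - V_TH
    if v_ds < v_ov:
        # Linear region.
        return k * (v_ov * v_ds - 0.5 * v_ds ** 2)
    # Saturation region.
    return 0.5 * k * v_ov ** 2


if __name__ == "__main__":
    common = dict(v_th=0.5, k=1e-4, i_d0=1e-15, w_over_l=2.0)
    print(shockley_drain_current(v_gs=1.8, v_ds=0.3, **common))  # linear region
    print(shockley_drain_current(v_gs=1.8, v_ds=1.5, **common))  # saturation region
```

The linear and saturation expressions agree at the boundary VDS = VGS - VTH, where both evaluate to 0.5K(VGS - VTH)^2.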
In contrast with the CMOS inverter analysis (see V. Adler, E. G. Friedman, “Delay and Power Expressions for a CMOS Inverter Driving a Resistive-Capacitive Load”, Analog Integrated Circuits and Signal Processing, 14, 1997, pp. 29-39, contents of which are hereby incorporated by reference), where VGS is used as an input voltage, in most previously proposed GDI circuits the voltage input variable to the Shockley model is VDS, the drain-source voltage. The following analysis presents the aspects in which previously proposed GDI differs from CMOS.
Reference is now made to FIG. 3, which shows the previously proposed GDI circuit diagram and transient response when a step signal is supplied to the first dedicated logic terminal 310 of the previously proposed GDI cell 300. The applied step signal causes a response, during which the NMOS transistor 330 passes from the saturation to the sub-threshold region, and a swing drop in output occurs. The transient analysis assumes a fast input transition, so that the linear region is ignored. Analytical expressions that describe the transient response can be derived from (1), for a capacitive load, CL 350, at the output. The capacitive current is:
$$
I_C = C\,\frac{dV_S}{dt} = I_D \tag{2}
$$
where C is the output capacitance, VS is the voltage across the capacitance CL, and IC is the current charging the capacitor, which is equal to ID, the drain current through the N-channel device.
The expression for VS as a function of time is:
In the saturation region:
$$
C\,\frac{dV_S}{dt} = 0.5\,k\,(V_{GS}-V_T)^{2} = 0.5\,k\,(V_{DD}-V_T-V_S)^{2} \tag{3}
$$
where, in the case of previously proposed GDI cells linked through diffusion inputs, the capacitance C includes both diffusion and well capacitances of the driven cell.
The integral form of (3) is:
$$
\int \frac{dV_S}{0.5\,k\,(V_{DD}-V_T-V_S)^{2}} = \int \frac{dt}{C} \tag{4}
$$
The same expression can be written as:
$$
\int \frac{dV_S}{aV_S^{2} + bV_S + c} = \int dt \tag{5}
$$
where
$$
a = \frac{0.5\,k}{C}, \qquad b = \frac{-k\,(V_{DD}-V_T)}{C}, \qquad c = \frac{0.5\,k\,(V_{DD}-V_T)^{2}}{C} \tag{6}
$$
a, b and c in (6) are constants of the process or the given circuit. The final expression for the transient response in the saturation region is:
$$t + k_1 = \frac{1}{\sqrt{b^2 - 4ac}} \ln\left(\frac{2aV_S + b - \sqrt{b^2 - 4ac}}{2aV_S + b + \sqrt{b^2 - 4ac}}\right) \qquad (7)$$
where t is the time spent in the saturation region, and k1 is a constant of integration calculated from the initial conditions (t=0, VS=0). The solution of (7) is obtained numerically (e.g. in MATLAB) for specific values of a, b, and c.
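As an illustration of the numerical route mentioned above, the following minimal Python sketch integrates the differential form of (5), dVS/dt = a·VS² + b·VS + c, from the initial condition VS(0)=0 using a fixed-step fourth-order Runge-Kutta scheme. The function names, the step count, and the parameter values in the usage note are assumptions chosen for illustration; they do not come from the text.

```python
def saturation_coefficients(k, c_load, vdd, vt):
    """Return the constants (a, b, c) of equation (6)."""
    a = 0.5 * k / c_load
    b = -k * (vdd - vt) / c_load
    c = 0.5 * k * (vdd - vt) ** 2 / c_load
    return a, b, c


def integrate_vs(k, c_load, vdd, vt, t_end, steps=10_000):
    """Integrate dV_S/dt = a*V_S**2 + b*V_S + c from V_S(0) = 0 up to t_end.

    Fixed-step RK4; t_end and steps are hypothetical choices, and the
    result is only meaningful while the transistor stays in saturation.
    """
    a, b, c = saturation_coefficients(k, c_load, vdd, vt)

    def rate(v):
        return a * v * v + b * v + c

    dt = t_end / steps
    v = 0.0
    for _ in range(steps):
        s1 = rate(v)
        s2 = rate(v + 0.5 * dt * s1)
        s3 = rate(v + 0.5 * dt * s2)
        s4 = rate(v + dt * s3)
        v += dt * (s1 + 2.0 * s2 + 2.0 * s3 + s4) / 6.0
    return v
```

For example, `integrate_vs(1e-4, 1e-14, 3.3, 0.6, 1e-9)` returns the source voltage after 1 ns for an assumed k of 100 µA/V², a 10 fF load, VDD of 3.3 V and VT of 0.6 V.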
After entering the sub-threshold region, VS continues rising while the output capacitance is charged by ID according to (1):
In the sub-threshold region:
$$C \frac{dV_S}{dt} = I_{D0}\left(\frac{W}{L}\right) e^{qV_{GS}/kT} = I_{D0}\left(\frac{W}{L}\right) \frac{e^{qV_{DD}/kT}}{e^{qV_S/kT}} \qquad (8)$$
$$\int dV_S \, e^{qV_S/kT} \cdot A = \int dt \qquad (9)$$
where T is the temperature in kelvins, k is Boltzmann's constant, q is the charge of an electron, and A is a constant:
$$A = \frac{C}{I_{D0}\left(\frac{W}{L}\right) e^{qV_{DD}/kT}} \qquad (10)$$
The expression for the response in the sub-threshold region is:
$$t + k_2 = \frac{e^{qV_S/kT}}{q/kT} \cdot A \qquad (11)$$
$$k_2 = \frac{e^{q[V_{DD} - V_T]/kT}}{q/kT} \cdot A \qquad (12)$$
where k2 is a constant of integration defined by the initial conditions, A is calculated in (10), and VT is the threshold voltage.
The analysis of propagation delay of a basic previously proposed GDI cell given by equations (2-7) can be refined by taking into account the effect of the diode between the NMOS source and body. This diode is forward biased during the transient (see FIG. 2). By conducting an additional current, the diode contributes to charging the output capacitance CL. The diode's current contribution can be calculated as:
$$I_{BS} = I_0\left(e^{q[V_{DD} - V_S]/nkT} - 1\right) \qquad (13)$$
where IBS is the diode current, I0 is the reverse current, and n is a factor between 1 and 2. The IBS current should be added to equation (2) to derive an improved propagation delay, indicating a faster transient operation of the previously proposed GDI cell.
The swing restoration performance of previously proposed GDI circuits is calculated taking into account the area (power) and circuit frequency (delay) constraints. The simplest method of swing restoration is to add a buffer stage after every previously proposed GDI cell. The addition of a buffer stage prevents the voltage drop, but requires greater previously proposed GDI circuit area and increases circuit delay and power dissipation, making such a simplified method highly inefficient. Various buffering techniques are presented in the literature.
Given a clocked logic circuit with known Tcycle and Tsetup, buffering of cascaded previously proposed GDI cells is optimal if the following effects are taken into consideration:
1. Successive Swing Restoration—When cascading previously proposed GDI cells, each cell contributes a voltage drop in the output, that is equal to Vdrop. Assuming 0.3 VDD as a maximal allowed voltage drop of the whole cascade, the number of linked previously proposed GDI cells between two buffers is limited by:
$$N_1 = \frac{0.3\,V_{DD}}{V_{drop}} \qquad (14)$$
As shown in FIG. 3, after exiting the saturation area, the value of Vdrop is equal to VTH, and decreases with time as follows, using (9):
$$V_{drop} = V_{DD} - V_S = V_{DD} - \frac{\ln\left(\dfrac{(t + k_2) \cdot q/kT}{A}\right)}{q/kT} \qquad (15)$$
Equation (15) applies to the sub-threshold region only, namely for VS<VDD.
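The time dependence in (15) can be sketched in a few lines of Python, taking the exponentials in (10) and (12) to be natural exponentials, consistent with the logarithm in (15). The function name, the default temperature of 300 K, and every device value in the usage note are hypothetical and serve only to illustrate the calculation.

```python
import math

K_BOLTZMANN = 1.380649e-23      # J/K
Q_ELECTRON = 1.602176634e-19    # C


def subthreshold_vdrop(t, vdd, vt, c_load, i_d0, w_over_l, temp_k=300.0):
    """Evaluate V_drop(t) from equation (15), with A from (10) and k2 from (12).

    t is measured from the moment the transistor enters the sub-threshold
    region; all device parameters are illustrative assumptions.
    """
    beta = Q_ELECTRON / (K_BOLTZMANN * temp_k)                      # q/kT
    a_const = c_load / (i_d0 * w_over_l * math.exp(beta * vdd))     # (10)
    k2 = math.exp(beta * (vdd - vt)) / beta * a_const               # (12)
    v_s = math.log((t + k2) * beta / a_const) / beta                # from (11)
    return vdd - v_s                                                # (15)
```

At t=0 the function returns VT, matching the statement that Vdrop equals the threshold voltage on exiting saturation; for larger t the returned drop shrinks logarithmically, e.g. `subthreshold_vdrop(1e-6, 3.3, 0.6, 1e-14, 1e-17, 2.0)` is smaller than the value at t=0.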
According to (15), remaining in the sub-threshold region for (t+k2) assures a significant decrease of Vdrop, and as a result an increase in the number of linked cells, N1. Successive swing restoration can thus be achieved with fewer buffers. FIG. 4 presents Cadence Spectre simulation results of the response of a previously proposed GDI AND gate to a 0-3.3 V step input, for a gate operating in the sub-threshold region with a VDD of 3.3 V.
Interconnection effects can cause a drop in signal potential level, particularly over long interconnects. Where maintaining signal levels is essential, expression (15) may be extended to take into account the interconnection drop IR (where R is the interconnect resistance and I is the current through the interconnect).
Accordingly, suppose the VDD voltage is applied to the drain input of the NMOS transistor through a long wire. For a wire with given width, W, and length, L, the resistance of the interconnect wire is given by:
$$R = \rho_{square} \cdot \frac{L_{wire}}{W_{wire}} \qquad (16)$$
where ρsquare is the metal sheet resistance per square.
The current flowing through the wire Iwire and causing the voltage drop is given by:
$$I_{wire} = \frac{V_{DD} - V_{drain}}{R} \qquad (17)$$
Vdrain is determined by the equalization between the wire and NMOS transistor currents as follows:
$$\frac{V_{DD} - V_{drain}}{R} = I_D(V_{drain}) \qquad (18)$$
where ID(Vdrain) is found from (1) according to the operation region of the transistor. Equation (18) can be solved numerically, and its contribution to the final voltage drop expression is given by:
$$V'_{drop} = V_{drop} + (V_{DD} - V_{drain}) \qquad (19)$$
where Vdrop is given by (15).
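One possible way to solve (18) numerically is bisection on the difference between the wire current and the transistor current. The sketch below assumes that the transistor current is supplied as a callable `i_d` that does not decrease with Vdrain on the interval from 0 to VDD; that callable, the function names, and the tolerance are illustrative assumptions, and the model of (1) would have to be substituted for a real calculation.

```python
def wire_resistance(rho_square, l_wire, w_wire):
    """Interconnect resistance from equation (16)."""
    return rho_square * l_wire / w_wire


def solve_vdrain(vdd, r_wire, i_d, tol=1e-12, max_iter=200):
    """Solve (VDD - Vdrain)/R = I_D(Vdrain), equation (18), by bisection.

    i_d is a user-supplied (hypothetical) current model, assumed to be
    non-decreasing in Vdrain so that the residual changes sign only once.
    """
    def residual(v):
        return (vdd - v) / r_wire - i_d(v)

    lo, hi = 0.0, vdd
    if residual(lo) < 0.0 or residual(hi) > 0.0:
        raise ValueError("no sign change on [0, VDD]")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            lo = mid        # wire current still exceeds transistor current
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def total_vdrop(v_drop, vdd, v_drain):
    """Total drop of equation (19): cell drop plus interconnect IR drop."""
    return v_drop + (vdd - v_drain)
```

With a purely illustrative linear load `i_d = lambda v: 1e-4 * v` and a 100 Ω wire, `solve_vdrain(3.3, 100.0, i_d)` converges to VDD/(1 + 1e-4·100), which gives a quick sanity check of the solver before a realistic transistor model is plugged in.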
Operation in the sub-threshold region increases delay. The above method is therefore primarily suitable for low-frequency design.
Scaling, namely VDD reduction and threshold non-scalability, influences the number of buffers required for a previously proposed GDI circuit architecture according to (14). As a result, additional buffers may be required when operating at lower supply voltages while remaining with the same technology and VT. The direct impact of adding buffers is primarily on circuit area and the number of gates.
Finally, the following points are noted concerning the buffer insertion topology in previously proposed GDI. Buffer insertion need be considered only when linking previously proposed GDI cells through diffusion inputs. No buffers are needed before gate inputs of previously proposed GDI cells. Due to this feature, the “mixed path” topology can be used as an efficient method for buffer insertion. The number of buffers may be reduced by alternately involving diffusion and gate inputs in a given signal path. The circuit designer can trade off between buffer insertion, and delay, area and power consumption, to achieve efficient swing restoration.
2. Impacts of process variation on swing restoration—In every VLSI process there are variations in parameters such as threshold tracking, and ID0. The process dependence of VTH and ID0 influences the value of Vdrop and the swing restoration in previously proposed GDI. This effect can be best described by defining a sensitivity of Vdrop to the mentioned parameter variations as follows:
$$\text{Current sensitivity of } V_{drop} = \frac{\partial V'_{drop}}{\partial I_{D0}} \qquad (20)$$
$$\text{Threshold sensitivity of } V_{drop} = \frac{\partial V'_{drop}}{\partial V_{TH}} \qquad (21)$$
where the total voltage drop V′drop is given by (19).
3. Maximal cascade delay constraint—The signal path in a cascade of previously proposed GDI cells can be represented by a single-branch RC tree. FIG. 5 shows a previously proposed GDI cascade represented as an RC tree, where Ri are the effective resistances of the conducting transistors, and Ci are the capacitive loads caused by following devices.
A resistance Rii is defined as the resistance of the path between the input and the output (for an RC tree without side branches). Rkk is the resistance between the input and node k. Ck is the capacitance at node k.
The following times are defined in order to derive bounds for the delay of the RC tree:
$$T_D = \sum_k R_{kk} C_k \qquad (22)$$
$$T_R = \left(\sum_k R_{kk}^2 C_k\right) / R_{ii} \qquad (23)$$
The maximal delay of the RC tree can be derived numerically from the bounds on the time of equations (22) and (23), and is given by the following equation:
$$t \le T_D - T_R - T_D \ln[1 - v_i(t)] \qquad (24)$$
The number of stages N2 in a previously proposed GDI cascade can be found for a maximal total delay time Tdelay, while using the condition:
$$T_{cycle} - T_{setup} \ge T_{delay} \qquad (25)$$
Notice that (25) can be checked only after a value for N2 has been assumed and a suitable RC tree has been built.
In order to obtain satisfactory performance the number of stages between buffers should be limited to satisfy both the successive swing restoration and the maximal delay requirements. The maximal number of stages in cascade between two buffers is therefore the minimal value between N1 (given by (14)) and N2.
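The selection procedure described above can be sketched as follows. The sketch makes several simplifying assumptions that are not part of the text: all stages have the same effective resistance and load capacitance, the delay is measured at a normalized output level of 0.5, N1 is rounded down to an integer, and the search for N2 stops at a hypothetical limit of 64 stages.

```python
import math


def rc_chain_delay_bound(resistances, capacitances, v=0.5):
    """Upper delay bound of equation (24) for an RC tree without side branches.

    resistances[k] and capacitances[k] are the stage values R_k and C_k;
    R_kk is the accumulated resistance from the input to node k.
    """
    r_kk = []
    total = 0.0
    for r in resistances:
        total += r
        r_kk.append(total)
    r_ii = r_kk[-1]                                                 # input-to-output resistance
    t_d = sum(rk * ck for rk, ck in zip(r_kk, capacitances))        # (22)
    t_r = sum(rk ** 2 * ck for rk, ck in zip(r_kk, capacitances)) / r_ii   # (23)
    return t_d - t_r - t_d * math.log(1.0 - v)                      # (24)


def max_stages_between_buffers(vdd, v_drop, r_stage, c_stage,
                               t_cycle, t_setup, v=0.5, n_limit=64):
    """Return min(N1, N2), with N1 from (14) and N2 from (24) and (25)."""
    n1 = math.floor(0.3 * vdd / v_drop)                             # (14), rounded down
    n2 = 0
    for n in range(1, n_limit + 1):
        bound = rc_chain_delay_bound([r_stage] * n, [c_stage] * n, v)
        if bound <= t_cycle - t_setup:                              # (25)
            n2 = n
        else:
            break
    return min(n1, n2)
```

For a single stage the bound reduces to R·C·ln 2, the familiar 50% delay of one RC section, which is a convenient check of the implementation. In this sketch a tight timing budget makes N2 the limiting value, while a generous budget leaves the swing constraint N1 in control.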
A comparison was also made between the switching characteristics of previously proposed GDI vs. CMOS. Due to the complexity of logic functions that can be implemented in a previously proposed GDI cell by using only two transistors, the previously proposed GDI cell's switching characteristics were compared to a CMOS gate whose logic function is of the same order of complexity. While the previously proposed GDI cell's structural characteristics are close to those of a standard CMOS inverter, the gate with equivalent functional complexity in CMOS is a NAND gate. A comparison of switching characteristics was therefore performed between the previously proposed GDI cell and a CMOS NAND gate. The switching behavior of the inverter can be generalized by examining the parasitic capacitances and resistances associated with the inverter. This comparison can be used as a base for delay estimation in early stages of circuit design.
Reference is now made to FIG. 6, which shows the structure of a previously proposed GDI (or prior-art CMOS) inverter 600, along with its equivalent digital model 610. The digital model of the previously proposed GDI inverter consists of three parallel branches between VDD and ground. Two of the branches each consist of two capacitors in series (Cinn and Cinp for the first branch, and Coutn and Coutp for the second branch), with an inverter input between Cinn and Cinp. The third branch consists of two resistors (Rn and Rp) in series, with the inverter output between the two resistors. The propagation delay for an inverter driving a capacitive load is:
$$t_{PHL} = R_n \cdot C_{tot} = R_n \cdot (C_{out} + C_{load}) \qquad (26)$$
where Ctot is the total capacitance on the output of the inverter, that is, the sum of the output capacitance of the inverter, any capacitance of interconnecting lines, and the input capacitance of the following gate(s).
Reference is now made to FIG. 7 which shows a circuit diagram of a CMOS NAND gate 700, along with its equivalent digital model 710. The NAND gate consists of identical n-channel metal-oxide-semiconductor FETs (MOSFETs), 720.1 to 720.n, connected in series. As shown in R. J. Baker, H. W. Li and D. E. Boyce, “CMOS Circuit Design, Layout, and Simulation”, IEEE Press Series on Microelectronic Systems, pp. 205-242, contents of which are hereby incorporated by reference, the intrinsic switching time of series-connected MOSFETs with an external load capacitance may be estimated by:
$$t_{PHL} = N \cdot R_n \cdot \left(\frac{C_{out}}{N} + C_{load}\right) + 0.35 \cdot R_n \cdot C_{inn} (N - 1)^2 \qquad (27)$$
The first term in (27) represents the intrinsic switching time of the series connection of N MOSFETs, while the second term represents the RC delay caused by Rn charging Cinn.
For Cinn equal to 3/2·Cox, and assuming two serial n-MOS transistors, the propagation delay of the NAND gate is:
$$t_{PHL} = 1.52 \cdot R_n \cdot C_{out} + 2 \cdot R_n \cdot C_{load} \qquad (28)$$
The ratio of the delay of a CMOS NAND to the delay of a previously proposed GDI cell, $t_{PHL(CMOS)} / t_{PHL(GDI)}$, is approximated by:
$$1.52 \le \frac{t_{PHL(CMOS)}}{t_{PHL(GDI)}} \le 2 \qquad (29)$$
The delay ratio is bounded above by 2 for a high load, and is bounded below at 1.52 for a low load.
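The two limits in (29) follow directly from dividing (28) by (26), as the short Python sketch below illustrates. It assumes the same Rn and Cout in both expressions, as the equations themselves do; the function names and the capacitance values in the usage note are hypothetical.

```python
def t_phl_gdi(r_n, c_out, c_load):
    """Inverter-like GDI cell delay, equation (26)."""
    return r_n * (c_out + c_load)


def t_phl_nand2(r_n, c_out, c_load):
    """Two-input CMOS NAND delay, equation (28)."""
    return 1.52 * r_n * c_out + 2.0 * r_n * c_load


def delay_ratio(c_out, c_load, r_n=1.0):
    """Ratio t_PHL(CMOS) / t_PHL(GDI); R_n cancels out of the result."""
    return t_phl_nand2(r_n, c_out, c_load) / t_phl_gdi(r_n, c_out, c_load)
```

With no external load, `delay_ratio(1e-14, 0.0)` evaluates to 1.52; as the load grows far beyond Cout, for instance `delay_ratio(1e-14, 1e-9)`, the ratio approaches but never reaches 2.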
Note that this ratio improves if the effect of the body-source diode in the previously proposed GDI cell is considered (14), and if the delay formula in (7) is refined by including a bulk-source conduction current in (13).
For the analysis of fan-out bounds, the dual-transistor previously proposed GDI cell is compared to CMOS gates with equivalent functional complexity. This approach allows definition of fan-out bounds using the logic-effort concept of I. Sutherland, B. Sproull and D. Harris, “Logical Effort—Designing Fast CMOS Circuits”, Morgan Kaufmann Publishers, p. 7, contents of which are hereby incorporated by reference. The relationship between the logic effort, fan-out, and effort delay of a logic gate is given by:
$$f = g \cdot h \qquad (30)$$
where f is the effort delay, g is the logic effort, and h represents the fan-out of the gate. For a desired delay, reducing the logic effort results in an improved fan-out by the same ratio.
Values of logic effort are given by Sutherland for the inputs of various static CMOS gates, normalized relative to the logic effort of an inverter. While a previously proposed GDI cell's logic effort is close to that of a standard inverter, the equivalent logic functions in CMOS are NAND, NOR or MUX, depending upon the previously proposed GDI cell input configuration (see Table 1). Using Sutherland's logic effort values, the fan-out improvement factors for a previously proposed GDI cell over CMOS are as follows: 4/3 for F1 and F2 vs. CMOS NAND; 5/3 for F1 and F2 vs. CMOS NOR; 2 for previously proposed GDI MUX vs. CMOS MUX.
The above fan-out improvement values are correct for the gate input of a previously proposed GDI cell, for which the previously proposed GDI cell characteristics are similar to those of the CMOS inverter. If the diffusion input is considered, an additional factor is applied to represent the capacitance ratio between the gate and diffusion inputs, and the factors given above are multiplied by CGate/CDiff. Both capacitance parameters are defined by the design technology.
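The fan-out factors quoted above can be reproduced from (30) with a small Python sketch. It assumes that the logic effort of the GDI gate input equals exactly 1 (the inverter value), whereas the text only states that it is close to that of an inverter; the dictionary, the function name, and the capacitance arguments are illustrative.

```python
from fractions import Fraction

# Logic effort per input of static CMOS gates, normalized to an inverter,
# as quoted in the text for NAND, NOR and MUX.
CMOS_LOGIC_EFFORT = {
    "NAND": Fraction(4, 3),
    "NOR": Fraction(5, 3),
    "MUX": Fraction(2),
}

# Assumption: the GDI gate input behaves exactly like an inverter input.
GDI_GATE_INPUT_EFFORT = Fraction(1)


def fanout_improvement(cmos_gate, c_gate=None, c_diff=None):
    """Fan-out improvement of a GDI cell over the named CMOS gate.

    For equal effort delay f = g*h, the fan-out h scales as 1/g, so the
    improvement is g_CMOS / g_GDI. When both capacitances are given, the
    result is scaled by C_Gate / C_Diff for a diffusion input.
    """
    factor = CMOS_LOGIC_EFFORT[cmos_gate] / GDI_GATE_INPUT_EFFORT
    if c_gate is not None and c_diff is not None:
        factor *= Fraction(c_gate) / Fraction(c_diff)
    return factor
```

Calling `fanout_improvement("NAND")` yields the 4/3 factor for the gate input, and passing a technology-dependent pair such as `c_gate=2, c_diff=1` (purely hypothetical values) shows how the diffusion-input correction multiplies that factor.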
Previously proposed GDI cell fan-in analysis is based on the structural similarity of previously proposed GDI and complementary CMOS logic gates. As shown below, an (n+2)-input previously proposed GDI cell can be implemented by the extension of any n-input CMOS structure. While the stack of series-connected MOSFET devices, and hence the fan-in of a CMOS gate, is limited by body-effect considerations, the addition of the diffusion inputs (i.e. the dedicated logic terminals) to a previously proposed GDI gate with the same structure results in improved fan-in, given by:
$$\text{Fan-in}_{GDI} = \text{Fan-in}_{CMOS} + 2 \qquad (31)$$
Note that for the F1 and F2 functions, where only one additional dedicated diffusion input is used, the fan-in increases by 1 relative to CMOS.
In summary, the GDI logic cell shows improvement over comparable CMOS logic in terms of delay, number of transistors, area, and power consumption. GDI logic circuits, however, have certain drawbacks, which are primarily related to input connections to MOSFET wells. Firstly, GDI logic circuits may experience a threshold drop, and, in some cases, an increased diffusion input capacitance. Both effects exist in PTL techniques as well, and were considered in the simulations and analysis presented herein. Secondly, there is a relative increase of circuit area due to separated MOSFET wells (comparisons based on actual logic gate layouts are presented below).
The previously proposed GDI cell shown in FIG. 2 has a connection between the two common logic terminals. Reference is now made to FIG. 8, which is a circuit diagram of a logic circuit having separate common logic terminals. The logic cell of FIG. 8 is designated herein as a double-gate-input previously proposed GDI cell (prior-art GDI*). The previously proposed GDI* logic cell 800 has two transistor networks, p-type transistor network 810 and n-type transistor network 820, which each contain a single transistor. The previously proposed GDI* cell has two logic input terminals, I (830.1) and I* (830.2), which are connected to the gates of the PMOS and NMOS transistors respectively, a first dedicated logic terminal (P) 840 at the outer diffusion node of the PMOS transistor, and a second dedicated logic terminal (N) 850 at the outer diffusion node of the NMOS transistor 820. The common diffusion logic terminal (D) 850 is connected to the drains of both transistors. As shown in FIG. 8, in the previously proposed GDI* logic cell there is a separate input to each gate, I and I′, instead of a common input to the gates of both p-type and n-type transistors as in FIG. 2. For proper operation, the common logic inputs, I and I′, are provided with mutually exclusive signals. Ensuring that the input signals are mutually exclusive can be achieved by an appropriate circuit environment, as in a previously proposed GDI-latch, or by applying an inverter to one of the inputs.
Reference is now made to FIG. 9, which shows the structure of a latch based upon the previously proposed GDI* cell of FIG. 8. The latch consists of two previously proposed GDI* cells, 910 and 920, and inverter 930, with logic inputs at logic terminals 920.1 and 920.2 respectively. The logic output is at the common diffusion terminal 920.5 of previously proposed GDI* cell 920. The two cells are connected by inverter 930, through which the common diffusion outputs, 910.5 and 920.5, of the two cells are connected. The two dedicated logic terminals, 920.3 and 920.4, of previously proposed GDI* cell 920 are respectively connected to logic inputs 910.1 and 910.2 of the previously proposed GDI* cell 910. Dedicated logic terminals, 910.3 and 910.4, of previously proposed GDI* cell 910 are tied to VDD and ground respectively.
In the previously proposed GDI* latch an inverter is used to obtain in-circuit swing restoration. Table 3 shows the performance of the previously proposed GDI* latch.
TABLE 3
A    B    Q
0    0    no change
0    1    Q′
1    0    no change
1    1    no change
Reference is now made to FIGS. 10a-10e, which are simplified diagrams of previously proposed GDI latches. FIG. 10a shows a T-latch based upon the previously proposed GDI* latch of FIG. 9. T-Latch 1000 consists of a previously proposed GDI flip-flop 1012 and inverter 1014. The logic signal is input at terminal T 1013, and is fed through inverter 1014 to input A 1015 of TFF 1000, and directly to input B 1016 of flip-flop 1012. The inputs of the T-Latch are connected through inverter 1014, so that an efficient 8-transistor implementation is achieved.
Reference is now made to FIG. 10b, which shows a T-latch 1020 based on the standard previously proposed GDI cell. FIG. 10b is a circuit diagram of a previously proposed GDI T-latch. T-latch 1020 consists of previously proposed GDI cell 1030, and three inverters 1041 to 1043. The logic signal is input to the common logic input (G) of previously proposed GDI cell 1030. The output at the common diffusion terminal (D) of previously proposed GDI cell 1030 is connected to the T-Latch output Q via inverter 1043. Inverters 1041 and 1042 feed back the output signal to the dedicated logic terminals (P and N) of previously proposed GDI cell 1030. Note that in FIG. 10b inverters INV2 1042 and INV3 1043 are added for swing restoration and can be eliminated in zero-VTH technologies. In any case the implementation is effective, and more compact than CMOS alternatives. The presented circuit can be extended to TFF by adding an edge detector circuit containing two previously proposed GDI cells (NOT and AND).
Three previously proposed GDI D latches are shown in FIGS. 10c, 10d, and 10e. Reference is now made to FIG. 10c which shows the structure of a previously proposed GDI F1-based D-latch 1050. This circuit is compatible with implementation in standard CMOS technology. D-latch 1050 consists of two previously proposed GDI cells, 1060 and 1062, AND gates, 1070 and 1072, and inverter 1074. The common diffusion terminal of previously proposed GDI cell 1060 is connected to the common logic input of previously proposed GDI cell 1062. The D and CLK latch inputs are connected via AND gates 1070 and 1072, and inverter 1074 to the first dedicated logic terminals of the previously proposed GDI cells, 1060 and 1062. The second dedicated logic terminals of the previously proposed GDI cells, 1060 and 1062, are tied to ground.
Reference is now made to FIG. 10d which shows the structure of a previously proposed GDI F2-based D-latch 1070. D-latch 1070 is structured similarly to D-latch 1050 of FIG. 10c, but has the AND gate outputs connected to the second dedicated logic terminals of the two previously proposed GDI cells, and the first dedicated logic terminals tied high.
Reference is now made to FIG. 10e which shows the structure of a previously proposed GDI D-Latch based on previously proposed GDI cells. D-latch 1090 consists of two previously proposed GDI cells, 1092 and 1093, and inverters, 1094 and 1095. Inverter 1094 is connected between the common diffusion output of previously proposed GDI cell 1093 and the second dedicated logic terminal of previously proposed GDI cell 1092. Inverter 1095 is connected between the common diffusion terminal of previously proposed GDI cell 1092 and the second dedicated logic terminal of previously proposed GDI cell 1093. The D-latch inputs and outputs are at the first dedicated logic terminals of the two previously proposed GDI cells, 1092 and 1093, and the inverter inputs. Note that D-latch 1050 and D-latch 1080 latch on the falling edge of the clock, and that D-latch 1090 latches on the rising edge of the clock. The edge used to latch the data is selected by the circuit designer by providing the proper logic at the clock input.
FIGS. 2-10 are based on a dual-transistor previously proposed GDI (or previously proposed GDI*) logic cell, which has a single transistor in each of the two logic blocks. In the multi-transistor previously proposed GDI logic circuit, each logic block contains a transistor network composed of multiple transistors. The logic blocks may have more than one common logic input, where each additional common logic terminal is connected to the gates of complementary transistors in both of the transistor networks.
Table 1 lists the various logic functions which can be provided by a single previously proposed GDI cell. The previously proposed GDI cell is an extension of a single-input CMOS inverter structure into a triple-input logic structure. The two additional inputs of the previously proposed GDI cell are provided by the first and second dedicated logic terminals, which in the CMOS cell do not serve as logic terminals but instead are tied to a fixed voltage.
Reference is now made to FIG. 11, which is a simplified block diagram of a comparison between an n-input CMOS logic gate and an (n+2)-input previously proposed GDI logic circuit. Previously proposed GDI circuit 1100 consists of two n-input logic blocks, 1110 and 1120, with additional logic inputs at the P and N terminals, yielding a total of n+2 logic inputs. CMOS circuit 1140 is similarly composed of two n-input logic blocks, 1150 and 1160; however, the P and N terminals are tied to VDD and VSS respectively, and do not serve as logic terminals. Extension of any n-input CMOS structure to an (n+2)-input previously proposed GDI cell can be done by introducing a logic input at the first dedicated logic terminal (P) of the PMOS block 1110 (instead of the supply voltage VDD), and a second logic input at the second dedicated logic terminal (N) in the NMOS block 1120 (instead of VSS). A previously proposed GDI circuit having more than one transistor in the P and N logic blocks, 1110 and 1120, is designated herein as a multi-transistor GDI circuit. (A comparable extension can be made to any complementary transistor structure, and is not limited to CMOS.)
Previously proposed GDI circuit implementations can be represented by the following logic expression:
$$\mathrm{Out} = \overline{F}(x_1 \ldots x_n)\,P + F(x_1 \ldots x_n)\,N \qquad (32)$$
where F(x1 . . . xn) is the logic function of the n-MOS block (not of the whole original n-input CMOS structure). An example of such an extension can be seen in FIG. 12, which shows a previously proposed GDI circuit 1200, having logic blocks 1210 and 1220, consisting of triple-input transistor networks (inputs A, B, and C). The two logic blocks implement complementary logic functions. Since the P and N terminals of previously proposed GDI logic circuit 1200 serve as logic inputs, there are five logic terminals in all. A complementary CMOS logic circuit having the same structure would have only three logic inputs (A, B, and C).
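The relationship between equation (32) and the original CMOS gate can be illustrated with a minimal behavioural sketch in Python. It assumes ideal switches (threshold drops are ignored) and assumes that the PMOS block conducts when F is low, so that Out follows P when F=0 and N when F=1. The names `gdi_out` and `nmos_block`, and the choice of a series network F=A·B·C, are hypothetical and are not taken from FIG. 12.

```python
from itertools import product

def gdi_out(f_value: int, p: int, n: int) -> int:
    """Ideal-switch model of equation (32): Out = (not F)*P + F*N."""
    return n if f_value else p

def nmos_block(a: int, b: int, c: int) -> int:
    """Hypothetical triple-input n-MOS network: three series transistors, F = A*B*C."""
    return a & b & c

# With P tied high and N tied low, the cell degenerates to the ordinary
# CMOS gate built from the same networks (here a 3-input NAND).
cmos_like = {abc: gdi_out(nmos_block(*abc), 1, 0)
             for abc in product((0, 1), repeat=3)}

# With P and N used as logic inputs there are five logic terminals in all,
# so the truth table has 2**5 rows.
five_input_rows = [
    (a, b, c, p, n, gdi_out(nmos_block(a, b, c), p, n))
    for a, b, c, p, n in product((0, 1), repeat=5)
]
```

Under these assumptions, `cmos_like` reproduces the inverse of the n-MOS block function, which is why F in equation (32) is described as the function of the n-MOS block rather than of the whole CMOS structure.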
The expression in equation (32) can be used to implement a Shannon expansion (see E. Shannon, W. Weaver, “The Mathematical Theory of Information”, University of Illinois Press, Urbana-Champaign, Ill., 1969, contents of which are hereby incorporated by reference). A function Z with inputs {x1, . . . , xn} can be expanded as:
$$Z(x_1 \ldots x_n) = H(x_2 \ldots x_n)\,x_1 + J(x_2 \ldots x_n)\,\overline{x_1} \qquad (33)$$
where the functions H and J are:
$$H = Z|_{x_1=1}, \qquad J = Z|_{x_1=0} \qquad (34)$$
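As a short worked instance of this expansion, consider the function Z = x1x2 + x3, which is chosen here purely for illustration and does not correspond to any of the figures. Setting x1 high and low gives the two cofactors, and recombining them recovers Z:

$$
\begin{aligned}
H &= Z|_{x_1=1} = x_2 + x_3, \qquad J = Z|_{x_1=0} = x_3,\\
H\,x_1 + J\,\overline{x_1} &= x_1 x_2 + x_1 x_3 + \overline{x_1}\,x_3 = x_1 x_2 + x_3 = Z.
\end{aligned}
$$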
Shannon expansion is a very useful technique for precomputation-based low-power design of sequential logic circuits due to its multiplexing properties (see M. Alidina, J. Monteiro, S. Devadas, A. Ghosh, and M. Papaefthymiou, “Precomputation-Based Sequential Logic Optimization for Low Power”, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 2, no. 4, pp. 426-435, December 1994), contents of which are hereby incorporated by reference. In multiplexer-based precomputation, input X1 can be used as an enable line for the H and J functions, and as the select line of a multiplexer which chooses between the data of the H and J functions. For a given value of X1 only one of the H or J blocks will operate, significantly reducing the power dissipation of the circuit.
Reference is now made to FIG. 13, which is a simplified block diagram of an extended previously proposed GDI circuit. The previously proposed GDI architecture illustrated in FIG. 13 is based on equation (32). Extended previously proposed GDI circuit 1300 consists of an n-input switching block 1330 (which may be either a previously proposed GDI cell or a multi-transistor previously proposed GDI circuit). Further logic inputs are provided to logic gates 1310 and 1320. The logic output of logic gate 1310 is connected to the first dedicated input of switching block 1330, and the logic output of logic gate 1320 is connected to the second dedicated input of switching block 1330. Extended previously proposed GDI circuit 1300 operates essentially as a multiplexer, selecting between logic gate A 1310 and logic gate B 1320. Logic gates 1310 and 1320 implement functions A(Xn+1 . . . Xp) and B(Xp+1 . . . Xr) respectively, in any technologically compatible manner. Switching block 1330 connects between the logic gates and the following logic block C 1340. Depending on the value of F(x1 . . . xn), only one of the functions will drive the data computed as a result of its input transitions, while the data transitions from the other function are prevented from propagating to the next logic block C.
The previously proposed GDI logic circuits (i.e. previously proposed GDI cell, previously proposed GDI* cell, multi-transistor previously proposed GDI circuit, and extended previously proposed GDI circuit) described above can serve as building blocks for more complex logic circuits. The applicability of the Shannon expansion (equations (33) and (34)) to any logic function allows a previously proposed GDI implementation of any digital circuit, thereby achieving a low power implementation of the logic function. Due to their special properties, previously proposed GDI logic circuits can be used for the design of low-power combinatorial circuits. Two or more previously proposed GDI logic circuits are interconnected to form a higher order previously proposed GDI logic circuit. Several higher order logic circuits composed of interconnected previously proposed GDI logic cells are given below, along with performance data.
A method for the design of combinatorial logic circuits consisting of interlinked previously proposed GDI cells is now presented. The combinatorial circuit design combines two approaches: (1) Shannon expansion and (2) combinational logic pre-computation, where transitions of logic values are prevented from propagating through the circuit if the final result does not change as a result of those transitions. Previously proposed GDI logic circuits can be realized using only the standard previously proposed GDI cell. This is in contrast to PTL-based logic, which has no simple and universal cell library available. The development of circuit synthesis tools for PTL is consequently problematic.
The design of previously proposed GDI logic circuits is based on Shannon expansion (27), where any function F can be written as follows:
$$F(x_1 \ldots x_n) = x_1 H(x_2 \ldots x_n) + \overline{x_1}\,J(x_2 \ldots x_n) = x_1 F(1, x_2 \ldots x_n) + \overline{x_1}\,F(0, x_2 \ldots x_n) \qquad (35)$$
As shown above, the output function of a previously proposed GDI cell (where A, B and C are inputs to G, P and N respectively) is:
$$\mathrm{Out} = AC + \overline{A}B \qquad (36)$$
The similarity of form between equations (35) and (36) makes the standard previously proposed GDI cell suitable for implementation of any logic function that can be written by Shannon expansion. Thus, if $A = x_1$, $C = F(1, x_2 \ldots x_n)$ and $B = F(0, x_2 \ldots x_n)$, then:
$$\mathrm{Out} = F(x_1 \ldots x_n) = x_1 F(1, x_2 \ldots x_n) + \overline{x_1}\,F(0, x_2 \ldots x_n) \qquad (37)$$
Reference is now made to FIG. 14 which is a simplified flowchart of a recursive algorithm for implementing logic functions by previously proposed GDI cells. The algorithm synthesizes any combinatorial function by means of 3-input previously proposed GDI cells. The algorithm's steps may be summarized as follows:
Given a function F with n variables:
    Step 1400 Check whether function F is equal to 1, 0 or a non-inverted single variable.
    Step 1410 If so, provide a connection to a high logic signal, a connection to a low logic signal, or a logic input, respectively.
    Step 1420 If not, expand F into two functions H and J according to the Shannon expansion (35) of F for a selected variable Xn.
    Step 1430 Go to step 1400 to find a previously proposed GDI implementation for both H and J.
    Step 1440 Use a previously proposed GDI cell as a MUX to implement F, with variable Xn at the common input, and the H and J implementations each connected to a separate dedicated logic terminal.
The algorithm of FIG. 14 can also be expressed in pseudo-code as follows, where G(d1,g,d2)=not(g)*d1+g*d2:
Algorithm SyntGDI(f,n)
    If (f==1) then return(‘1’)
    else if (f==0) then return(‘0’)
    else return(G(SyntGDI(f|xn=0,n-1),xn,SyntGDI(f|xn=1,n-1)));
As an example, if F(x1,x2,x3)=XOR(x1,x2,x3), the above procedure returns:
NG(G(NG(0,x3,1),x2,NG(1,x3,0)),x1,G(NG(1,x3,0),x2,NG(0,x3,1)))
where ‘G’ stands for previously proposed GDI and ‘NG’ for an inverted previously proposed GDI cell that is inserted post-process in order to maintain signal integrity. This approach can be used in combination with existing cell library-based synthesis tools to achieve an optimized design.
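The recursion can be made concrete with a minimal Python sketch. It is an illustration under stated assumptions, not the synthesis tool itself: the function is given as a truth table, the expansion is always taken on the leading remaining variable (an assumption made so that the result mirrors the variable order of the XOR example), and the names `synt_gdi`, `render` and `evaluate` are hypothetical. Since G(d1,g,d2) passes d1 when g is low and d2 when g is high, the sketch places the cofactor for the selected variable at 0 in d1 and the cofactor for the selected variable at 1 in d2, in agreement with equation (37). The post-process insertion of inverting NG cells is not modelled.

```python
from itertools import product

def synt_gdi(table, var=1):
    """Return a nested G(d1, g, d2) expression for a truth table.

    table[i] is the function value for the input whose bits, read in the
    order x_var, x_var+1, ..., spell i in binary (leading variable first).
    """
    if all(v == 1 for v in table):
        return "1"
    if all(v == 0 for v in table):
        return "0"
    half = len(table) // 2
    f0, f1 = table[:half], table[half:]   # cofactors for x_var = 0 and x_var = 1
    # d1 is passed when g = 0 and d2 when g = 1, so f0 goes first.
    return ("G", synt_gdi(f0, var + 1), f"x{var}", synt_gdi(f1, var + 1))

def render(expr):
    """Format the nested expression in the G(d1,g,d2) notation of the text."""
    if isinstance(expr, str):
        return expr
    _, d1, g, d2 = expr
    return f"G({render(d1)},{g},{render(d2)})"

def evaluate(expr, assignment):
    """Evaluate the expression with ideal cells: Out = not(g)*d1 + g*d2."""
    if expr == "1":
        return 1
    if expr == "0":
        return 0
    _, d1, g, d2 = expr
    return evaluate(d2, assignment) if assignment[g] else evaluate(d1, assignment)

# Three-input XOR as a truth table over (x1, x2, x3).
xor3 = tuple(a ^ b ^ c for a, b, c in product((0, 1), repeat=3))
net = synt_gdi(xor3)
```

For the three-input XOR, `render(net)` yields a seven-cell tree with x1 at the root and x3 at the leaves, which has the same shape as the expression given above with every NG replaced by G. The `evaluate` helper can be used to confirm that the tree reproduces the truth table for every input combination.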
Reference is now made to FIG. 15, which is a simplified flowchart of a method for designing a logic circuit. FIG. 15 presents the method of FIG. 14 in more detail, but essentially involves the same recursion, to progressively simplify the logic function. Each recursion reduces the number of function variables by one, until eventually the required function can be represented as an interconnected network of simple previously proposed GDI multiplexing cells. Once a single variable representation has been reached, the recursion cycles end, combining the previously proposed GDI cells into a structure that performs the specified logic function. The method thus provides a logic circuit design consisting of interconnected previously proposed GDI logic cells. The logic cells are dual-transistor previously proposed GDI cells, as shown in FIG. 2.
In step 1500 a logic function having at least one logic variable is received. The logic function to be synthesized, F, is set equal to the received logic function in step 1510. The synthesis recursion cycle begins at step 1515. In step 1520 the synthesized function is checked to determine if it is a non-inverted single logic variable X. If so, a connection for a logic input is provided in step 1525. The synthesis recursion cycle is then discontinued.
In step 1530 the synthesized function is checked to determine if it is a high logic level. If so, a logic design consisting of a connection to a high logic level is provided in step 1535. The synthesis recursion cycle is then discontinued.
In step 1540 the synthesized function is checked to determine if it is a low logic level. If so, a logic design consisting of a connection to a low logic level is provided in step 1545. The synthesis recursion cycle is then discontinued.
If the logic function being synthesized is not a high logic level, a low logic level, or a non-inverted logic variable, a Shannon expansion of F is performed to reduce the number of logic variables by one. In step 1550 a first logic function H and a second logic function J are extracted from a Shannon expansion of the synthesized function for a selected logic variable Xn. A recursion cycle is then performed for each of the extracted functions, to obtain a circuit design for functions H and J.
The recursion cycle for function H involves setting the synthesized function to H in step 1560, and entering a new recursion cycle at step 1515. When the recursion ends, a sub-circuit design of interconnected previously proposed GDI cells is provided for function H.
Next, a recursion cycle for function J is performed. In step 1570 the synthesized function is set to J, and a new recursion cycle is entered at step 1515. When the recursion ends, a sub-circuit design of interconnected previously proposed GDI cells is provided for function J.
In step 1580 the sub-circuit designs obtained for functions H and J are combined using a previously proposed GDI cell. A final logic circuit design is provided consisting of a logic element with the selected logic variable at the common logic terminal G, the output of the first sub-circuit connected to the first dedicated logic terminal P, and the output of the second sub-circuit connected to a second dedicated logic terminal N. The logic circuit output is at the logic element common diffusion terminal. The synthesis recursion cycle then ends.
The Shannon expansion of the logic function being synthesized is performed in step 1550. Reference is now made to FIG. 16, which is a simplified flowchart of a method for extracting the first and second logic functions (H and J) from the synthesized function. In step 1600, H is extracted from F by setting the selected variable to High, that is H=F{X1 . . . Xm|Xn=1}. In step 1610, J is extracted from F by setting the selected variable to Low, that is J=F{X1 . . . Xm|Xn=0}.
The previously proposed circuit design method includes the further step of inserting buffers into the logic circuit design. An analysis was presented above to determine the maximum number of previously proposed GDI cells which can be cascaded without requiring a buffer to stabilize signal levels. Equations (14) and (25) are used to calculate the values of N1 and N2, and the maximal number of stages which can be cascaded between two buffers equals the minimal value between N1 and N2. N1 and N2 depend on process parameters, frequency demand, and output loads. For example, given a 0.35 um technology process (with VTH=0.5V), a frequency demand of 40 MHz, and a load capacitance of 100 fF, the maximal number of stages is dictated by equation (14), where N1 is calculated with Vdrop=VTH. The resulting value indicates that a buffer is required after every two cascaded previously proposed GDI cells. Buffer elements are inserted between previously proposed GDI cells to prevent the occurrence of chains that exceed a specified length. The buffer elements may consist of one or more inverters.
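The buffer-insertion rule can be sketched in Python as follows. The sketch assumes that N1 and N2 have already been obtained from equations (14) and (25), which are not evaluated here, and it represents a cascade simply as a list of cell labels. The function names and the `"BUF"` marker are hypothetical.

```python
def max_stages_between_buffers(n1: int, n2: int) -> int:
    """The cascade limit is the smaller of the two limits N1 and N2."""
    return min(n1, n2)

def insert_buffers(chain, max_stages):
    """Return a copy of `chain` with a 'BUF' entry placed so that no run of
    consecutive unbuffered cells is longer than `max_stages`."""
    if max_stages < 1:
        raise ValueError("max_stages must be at least 1")
    out = []
    run = 0  # number of cells since the last buffer (or since the chain start)
    for cell in chain:
        if run == max_stages:
            out.append("BUF")
            run = 0
        out.append(cell)
        run += 1
    return out

# Hypothetical limits: N1 = 2 is the binding constraint, N2 = 3 is not.
limit = max_stages_between_buffers(2, 3)
buffered = insert_buffers(["c1", "c2", "c3", "c4", "c5"], limit)
```

With a limit of two stages, a buffer is placed after every second cell of the five-cell chain, and no buffer is appended after the final cell.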
Reference is now made to FIG. 17, which is a simplified flowchart of a method for providing a previously proposed GDI logic circuit. In step 1700 a previously proposed GDI logic circuit is designed for a specified function by the method of FIG. 15. In step 1710 the required previously proposed GDI cells are provided, and in step 1720 the previously proposed GDI cells are connected as specified by the circuit design.
One advantage of the above-described methods is the ability to calculate the maximal number of transistors needed for implementation of an n-input function, before the actual logic circuit design. The maximal number of transistors is calculated as:
$$M = 2 \cdot 2^{n-1} = 2 \cdot N = 2^n \qquad (38)$$
where M is the maximal number of transistors that are needed to implement the function, N is the maximal count of previously proposed GDI cells and n is the number of variables in the given function. Knowledge of the maximal number of previously proposed GDI cells required firmly determines the final maximal area of the circuit.
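The bound of equation (38) is simple to tabulate. The following Python sketch assumes that the exponents in the equation are read as N = 2^(n-1) cells and M = 2·N = 2^n transistors; the function names are hypothetical.

```python
def max_gdi_cells(n: int) -> int:
    """Maximal cell count N for an n-variable function, read as 2**(n-1)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return 2 ** (n - 1)

def max_transistors(n: int) -> int:
    """Maximal transistor count M = 2*N, two transistors per cell."""
    return 2 * max_gdi_cells(n)

# (N, M) for functions of one to five variables.
bounds = {n: (max_gdi_cells(n), max_transistors(n)) for n in range(1, 6)}
```

Under this reading, a three-variable function is bounded by four cells and eight transistors, and each additional variable doubles both figures.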
Using the Shannon expansion in regular logic circuits results in reduced power dissipation but requires significant area overhead. The area overhead is caused by the additional precomputation circuitry that is required. The Shannon-based previously proposed GDI design does not require special precomputation circuitry because of the MUX-like nature of the previously proposed GDI cell, so that most of the area overhead is eliminated.