Flip-flops are the general-purpose clocked storage elements used in digital electronic circuitry, and they are what make sequential and state logic design feasible. Their uses include storage of logic states, parameters, and digital control signals. Microprocessors, for example, typically contain thousands of flip-flops. A few well-known types of flip-flops include D, set, reset, set-reset, JK, toggle enable, and scan type flip-flops.
Because flip-flops affect the integrated circuits in which they are used, it is highly desirable to improve their design and performance. Flip-flops may affect these circuits in the following ways:
1. The switching speed of flip-flops is a fundamental limiting factor of logic circuits. Flip-flop setup and hold times along with clock-to-output times are fundamental limits in setting the maximum logic clocking speed. Because the setup and hold times to store a logic 0 value or a logic 1 value are different, it is generally desirable to minimize the difference in order to reduce the overall switching time of the flip-flop.
2. Flip-flops are used to set the basic design speed of an integrated circuit cell library from which digital circuits are made. The flip-flop maximum toggle rate defines the maximum clock frequency of the library.
3. Flip-flops define the speed and phase noise of digital phase locked loops.
4. The time gap between the latest usable setup time and the subsequent earliest hold time defines a metastable window. Reducing the length of this metastable window improves the performance of state logic and other synchronous applications.
5. The flip-flop layout configuration defines the cell "height" (rail-to-rail distance) of an entire integrated circuit cell library. An asynchronously resettable edge-triggered data flip-flop is perhaps the most often used large cell in a digital library. Reducing the digital library cell height, as determined by the flip-flop height, directly reduces the chip area and results in fewer interconnect parasitic effects.
6. The transient power consumption of a flip-flop is instrumental in setting the width of the power busses required in the cell library's layout so that adequate power can be supplied for a given transient voltage drop.
7. The energy a flip-flop consumes during toggling and the load it places on the clock input line are significant contributors to the overall circuit power dissipation.
8. The flip-flop switching speed defines the time window in which transient current passes through its complementary switching devices. Faster switching produces less pass-through charge for lower power operation. Low activity within the flip-flop when it is clocked but not toggled also reduces power consumption.
9. Race conditions during flip-flop toggling add to the pass-through current. Eliminating the race tends to eliminate this current component.
10. Switched capacitance internal to the flip-flop is a major transient current component. A flip-flop in which switched capacitance is minimized has less switching current while achieving high switching speeds.
11. The ratio of the transistor switching strength to the amount of the switched parasitic capacitance determines the flip-flop's internal speed.
12. A small number of series gate delays from the data (D) input of a flip-flop to the output (Q) is desirable for fast setup and hold times. A small number of series gate delays from clock (CK) to output (Q) provides fast flip-flop response time.
13. A balance in the delay paths from the data (D) and clock (CK) inputs to the output (Q) reduces the asymmetric delay times. The difference between the positive and negative going response times should be included in the flip-flop's switching time specification. It also biases the probability of the switching response to random inputs for circuits that synchronize random signals.
14. A small number of series devices driving the output path, especially weaker p-channel devices, increases the output drive and thus reduces transition time.
15. A minimum of two series transistors is required to implement a logic function. By using this number of series transistors as a maximum in a flip-flop, its power, speed, area, and low voltage performance are improved.
16. The low voltage performance of a flip-flop generally defines the minimum operating voltage of logic circuitry. Improving it not only allows low voltage operation, but also greatly reduces power dissipation, which follows a square law given by P ∝ V².
17. The static power dissipation of ultra low power integrated circuits arises from the "off" state leakage current of the metal-oxide semiconductor (MOS) transistors as well as from leakage of the MOS diffusion areas. It is desirable to minimize these parameters.
18. Advanced flip-flop configurations can simplify the logic that is connected to them. This extends the logic circuit operation and functionality and reduces the delay and total area consumed.
Complex-gates, which are described later in this section, eliminate internal nodes. The advantages of doing so include the following:
1. Propagation delay from any input to the output is one gate delay instead of multiple gate delays. This results in faster gates, although the delay of the more complicated complex-gate is somewhat longer than a single simple gate delay.
2. Equalization of propagation delays--There is only one complex-gate propagation delay from any input to the output. Normal gate combinations of the same logic function have a variable number of gate delays from different inputs to the final output. However, different inputs can have different output drive strengths if the individual complex-gate transistors are not sized to compensate for this.
3. Elimination of node bounce as logic signals propagate through the levels of logic in the array of gates that are being replaced by a complex-gate--Temporary intermediate logic states exist from propagation delays through the array of gates. By eliminating the nodes within the array of gates through conversion to a complex-gate, there are no nodes to bounce. This technique removes the power consumed by the additional nodes, especially when the lack of gate output bouncing is taken into account.
4. Lower power consumption and faster speed due to only one output node and no internal nodes of the complex-gate.
5. Lower power consumption, smaller area, and faster speed offered by fewer switching devices within the complex-gate.
6. Lower power consumption and faster speed offered by less internal interconnect within the complex-gate--Strap connections that form internal nodes by connecting the n- and p-channel devices together are eliminated.
7. Tighter gate structure layout--This is particularly advantageous in newer technologies, which offer many levels of interconnect to get logic signals into the complex-gate. These newer technologies define the logic cell area primarily by the active area on which the gates are formed, since the internal interconnect within the gate layout is above, in the multiple-level metal interconnect. In the older one- and two-metal level integrated circuit processes, the use of large complex-gates was restricted by the routing congestion of a high concentration of input wires into the complex-gates. With numerous levels of routing interconnect, this restriction is eliminated, and the advantages of complex-gates can be fully realized.
8. In some newer technologies such as silicon on insulator (SOI), the spacing between complementary devices is eliminated, since there are no wells for isolation. Complex-gates can take better advantage of this to reduce the cell area and internal cell interconnect.
For these and other reasons, it is desirable to have improved flip-flop configurations and design techniques and related digital logic circuitry.
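Two of the flip-flop considerations listed above lend themselves to a short numerical illustration: the limit that setup and clock-to-output times place on clock speed (item 1) and the square-law dependence of power on supply voltage (item 16). The following is a minimal sketch only. The relation between clock period and the sum of clock-to-output, logic, and setup delays is the usual textbook timing budget rather than a formula stated in this description, and every function name and delay value is a hypothetical chosen for illustration.

```python
def max_clock_frequency_hz(t_clk_to_q_s, t_logic_s, t_setup_s):
    """Highest clock rate at which data launched by one flip-flop can be
    captured by the next (textbook budget; clock skew is ignored)."""
    min_period_s = t_clk_to_q_s + t_logic_s + t_setup_s
    return 1.0 / min_period_s


def dynamic_power_ratio(v_new, v_old):
    """Relative switching power after a supply change, using P ~ V^2."""
    return (v_new / v_old) ** 2


if __name__ == "__main__":
    # Hypothetical delays: 150 ps clock-to-output, 600 ps logic, 50 ps setup.
    f_max = max_clock_frequency_hz(150e-12, 600e-12, 50e-12)
    print(f"maximum clock frequency: {f_max / 1e9:.2f} GHz")
    # Hypothetical supply change from 1.8 V to 0.9 V.
    print(f"relative power: {dynamic_power_ratio(0.9, 1.8):.2f}")
```

Under these assumed numbers, shortening the flip-flop's own setup and clock-to-output times raises the attainable clock rate directly, and halving the supply voltage cuts the switching power to one quarter.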
Flip-flops are generally made of latches. Latches typically form the master or slave half of an edge-triggered flip-flop, or both. Thus, a flip-flop is often constructed from a master latch and a slave latch, in which the output of the master latch is the input of the slave latch, and the output of the slave latch provides the output of the flip-flop. Instead of being edge-sensitive to the clock control input, latches are level-sensitive to a clock-equivalent control input customarily called "enable." When the enable control signal is active, the latch accepts the logic-input signal on the data line. During this time, the data input signal is passed through to the output Q, which is known as the pass-through state of the latch. When the enable control signal is in the inactive condition, the data input line is locked out of the latch, and the Q output reflects the logic state contained in the latch at the time the enable signal was taken inactive. Latches have an impact similar to that of flip-flops on the integrated circuits in which they are used. They are often used in an array such as a register file, where they have a special data path layout that shares resources. Special design and layout considerations enhance their use in this role.
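The latch and master-slave behavior described above can be sketched as a zero-delay behavioral model. This is an illustration only and not a circuit from this description: the class names are hypothetical, and the clock polarity (master transparent while the clock is low, slave transparent while it is high) is an assumption that yields a rising-edge-triggered flip-flop.

```python
class Latch:
    """Level-sensitive latch: passes d to q while enable is active,
    and holds the last value while enable is inactive."""

    def __init__(self, q=0):
        self.q = q

    def update(self, d, enable):
        if enable:
            self.q = d  # pass-through state
        return self.q   # otherwise the stored state is returned


class MasterSlaveFlipFlop:
    """Edge-triggered flip-flop built from a master and a slave latch.
    Assumed polarity: master enabled on clk low, slave on clk high."""

    def __init__(self):
        self.master = Latch()
        self.slave = Latch()

    def update(self, d, clk):
        m = self.master.update(d, enable=not clk)
        return self.slave.update(m, enable=bool(clk))


if __name__ == "__main__":
    ff = MasterSlaveFlipFlop()
    # (d, clk) pairs; Q changes only when clk goes from 0 to 1.
    for d, clk in [(1, 0), (1, 1), (0, 1), (0, 0), (0, 1)]:
        print(f"d={d} clk={clk} q={ff.update(d, clk)}")
```

Because the two latches are enabled on opposite clock levels, a change on the data input while the clock is high never reaches the output until the next rising edge.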
FIG. 1 shows a very simple form of a static latch cell, which is a pair of cross-coupled inverters. Overdriving the latch outputs, using additional transistors, performs the set and reset control. This approach with address selection transistors can be used to form static Random Access Memory (RAM) cells. FIG. 2 shows a Set-Reset latch formed by replacing the inverters of FIG. 1 with NOR logic gates. Replacing the inverters with NAND logic gates forms an active-low SetN-ResetN cross-coupled latch. Note that with respect to the Q and Qn outputs, the NAND gate SetN and ResetN inputs are on the opposite gates from the NOR gate Set-Reset latch.
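The cross-coupled NOR arrangement can be illustrated with a small gate-level sketch that re-evaluates both gates until the outputs stop changing. The wiring assumed here is the conventional one, in which Reset feeds the gate that drives Q and Set feeds the gate that drives Qn; the figures themselves are not reproduced in this text, so the assignment, like the function names, is an assumption.

```python
def nor(a, b):
    """Two-input NOR gate on 0/1 values."""
    return int(not (a or b))


def sr_latch_nor(set_, reset, q=0, qn=1, max_iterations=10):
    """Settle a cross-coupled NOR latch from the state (q, qn).
    Assumed wiring: Q = NOR(Reset, Qn), Qn = NOR(Set, Q)."""
    for _ in range(max_iterations):
        new_q = nor(reset, qn)
        new_qn = nor(set_, q)
        if (new_q, new_qn) == (q, qn):
            return q, qn          # outputs are stable
        q, qn = new_q, new_qn
    raise RuntimeError("latch outputs did not settle")


if __name__ == "__main__":
    state = (0, 1)
    for name, s, r in [("set", 1, 0), ("hold", 0, 0), ("reset", 0, 1), ("hold", 0, 0)]:
        state = sr_latch_nor(s, r, *state)
        print(f"{name:5s} -> Q={state[0]} Qn={state[1]}")
```

With both inputs inactive the loop returns the stored state unchanged, which is the holding behavior of the cross-coupled pair.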
The design of the flip-flop is the fundamental starting point of an integrated circuit library. First, the desired speed/power-consumption tradeoff is chosen. The flip-flop is then designed to meet this criterion using an estimated output loading, which is based on the routing complexity and the expected integrated circuit core size. The core size is in turn set by the proportions of the flip-flop and of the other cells being designed in this process. Iterative processes of estimation, simulation, and back-annotation are used to arrive at the solution. The cell library design sets the speed and power performance of the library and, with it, of the entire integrated circuit. Thus, the flip-flop is the dominant factor in the size and performance of a digital or mixed mode integrated circuit.
Interconnect parasitic effects have become more dominant as integrated circuit process feature size has decreased. The pitch is closer, but the interconnect is becoming relatively thicker to keep the resistance of its cross section low, as required for high speed. This combination greatly increases the internodal parasitic capacitance. Many layers of metal for interconnect and power distribution are commonplace, making this interconnect parasitic loading the real limiting factor. Hence, reduced chip size is very desirable. The goal is to make the library denser, and the density of the flip-flop is a key to accomplishing this, since the flip-flop sets the cell row pitch of the library. The goal is to use the entire occupied chip area for compact active area and to minimize the chip area used just for interconnections inside or outside of the cells. Cells that are rarely used should be kept at the same cell height by making them wider to accommodate their interconnection. Shared active-area power connections between cells increase density, and their use can be incorporated into routers.
When the two clock phases of the sequential latches in a flip-flop get too close together, there can be a critical "race" between the master latch and the slave latch that is produced by the data and the clock that controls them. Suppose that the master latch is in the mode of holding the flip-flop-input data acquired from the previous clock phase. In this mode, the slave latch is in its transparent (or pass-through) mode. This means that the slave latch passes the data being held by the master latch through to the flip-flop's Q output. When the clock state is reversed, the master latch switches from its hold mode to its acquire new data (or pass-through) mode. At the same time the slave latch changes from acquiring the master latch's output data to its hold mode. The slave latch must switch to its hold mode first or the flip-flop's Q output can change state here in the middle of its cycle. In other words, the slave latch must switch to its hold mode before the master latch switches to its sample mode and passes new data through to the slave latch Q output.
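The ordering requirement described above can be illustrated with a simple event-ordering sketch. It is a simplification made for illustration only: each latch is assumed to react to the clock reversal after a fixed delay in arbitrary time units, the master is given an optional propagation delay, and all names and values are hypothetical.

```python
def q_after_clock_reversal(held_by_master, new_d,
                           slave_close_delay, master_open_delay,
                           master_prop_delay=0):
    """Return the flip-flop Q output after the clock reverses.

    Before the reversal the master holds `held_by_master` and the slave is
    transparent, so Q equals that value. After the reversal the slave enters
    hold at `slave_close_delay`, and new data reaches the slave input at
    `master_open_delay + master_prop_delay`.
    """
    q = held_by_master
    new_data_arrival = master_open_delay + master_prop_delay
    if new_data_arrival < slave_close_delay:
        # The slave is still transparent when the new data arrives,
        # so the data races through to Q in the middle of the cycle.
        q = new_d
    return q


if __name__ == "__main__":
    # Slave closes first: Q keeps the value held by the master.
    print(q_after_clock_reversal(1, 0, slave_close_delay=1, master_open_delay=2))
    # Master opens first: the new data corrupts Q.
    print(q_after_clock_reversal(1, 0, slave_close_delay=3, master_open_delay=1))
```

In this sketch the output is preserved only when the slave reaches its hold mode before new data can arrive from the master, which is the condition the paragraph above states.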
Various approaches can be used to control this race condition. First, separate non-overlapping clock signals can be used to separate the clocking times of the master and slave latches from each other. This is not normally practical due to the extra interconnect and signal generation required, as well as the extra time it takes to guarantee operation under worst-case conditions. Second, flip-flops can be designed with an internal speed bias that drives this race condition in the correct direction. This bias must be guaranteed to produce the correct results in the worst-case conditions, including slow clock transitions and minimum operating power supply voltage. It often has to operate correctly as a battery is depleted. Third, the logic within the flip-flop can be designed to eliminate the race. This may be the most desirable approach. Here, the flip-flop's internal logic steps through two sequential states. The first state places the slave latch into the hold mode. From this state, the logic proceeds to a second state that switches the master latch to its acquire mode. This sequential state operation guarantees that the critical race is avoided. In other words, the master output signal, where the race occurs, is prevented from activating the slave by gating it with the clock. This type of design has been referred to as a "race-free" flip-flop. It is desirable to have improved race-free flip-flop designs.
Low voltage performance is an important feature of flip-flops. As process dimensions are reduced, the physical dimensions that separate two voltages decrease. The gate oxide thickness is decreased along with the active area dimensions making up transistors. Accordingly, the electric field approaches the dielectric breakdown limit of the SiO₂ gate insulator between gate and drain. To avoid breakdown, the power supply voltage must be limited. For current technology, this scaling means that for a 0.1-micron source-drain spacing, the physical voltage that the transistor can tolerate is limited to 1.0 volt. In order to switch quickly and efficiently, CMOS transistor thresholds are normally set to be one quarter of the power supply voltage, so that the supply provides an n-channel threshold, a p-channel threshold, plus an additional amount of voltage to guarantee high saturated drive of these transistors during their active switching operation. If the power supply voltage is lowered below one volt, not only is there insufficient voltage for circuit headroom, but there is not even enough voltage to fully turn the transistors on.
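The voltage budget described above can be made concrete with a few lines of arithmetic. This is a rough sketch under stated assumptions: the function name is hypothetical, the thresholds are taken as one quarter of a 1.0 volt supply, and the second case assumes that those thresholds stay fixed while the supply is lowered to 0.6 volt, a value chosen purely for illustration.

```python
def drive_headroom_v(vdd, vt_n, vt_p):
    """Supply voltage left over after one n-channel and one p-channel
    threshold have been accounted for."""
    return vdd - vt_n - vt_p


if __name__ == "__main__":
    vt = 1.0 / 4  # thresholds at one quarter of a 1.0 V supply (assumed)
    print(f"headroom at 1.0 V: {drive_headroom_v(1.0, vt, vt):.2f} V")
    # Same thresholds with a hypothetical 0.6 V supply.
    print(f"headroom at 0.6 V: {drive_headroom_v(0.6, vt, vt):.2f} V")
```

With the thresholds held fixed, the voltage left to drive the transistors shrinks much faster than the supply itself, which is why low voltage operation is difficult.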
One possible approach is to lower the threshold voltages by shifting the device characteristics in voltage. This results in transistors that do not fully turn off. The MOS drain current around the off state is exponentially related to the gate voltage. Here, the MOS device is in the weak inversion region of operation. To decrease the voltage distance between off and on transistor operation, the slope (or gain) of the transistors must be increased. Higher gain MOS devices have always been a device design goal, so that approach will not likely be fruitful. Some processes, such as Silicon-On-Insulator (SOI), can increase the weak inversion slope factor (gain), which lowers the off state leakage current, but device designs that address this are not presently known. Only minor improvements to the slope of the turn-on curve are available, such as using the back gate formed by the well body below the device. The outcome is that circuits that perform better at low voltage are extremely important.
Flip-flops can also be subject to internal race conditions. In particular, note that the output of an edge-triggered flip-flop is not valid when there is a transition of the output logic states. As mentioned above, flip-flops are typically made up of two latches, e.g., a master latch and a slave latch. The master latch is used to sample the input data signal. The slave latch is used to hold the output so that it is valid at all times except for logic transitions. In order to accomplish this, the two latches are clocked out of phase from each other. Two non-overlapping clock signals need to be used for this, one for the master and one for the slave. These non-overlapping clock signals have the disadvantage of requiring generation and distribution of two clock signals, along with their complements. Their worst-case timing tolerances limit the maximum clocking rate and thus the flip-flop's maximum useful speed, not to mention the relatively large area, power, and complexity incurred in achieving this.
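The cost of non-overlapping clocking can be illustrated with a sampled two-phase clock sketch. It is not a circuit from this description: time is divided into integer samples, `gap` is a hypothetical guard interval inserted after each phase, and the function and signal names are assumptions.

```python
def two_phase_clocks(period, gap, n_samples):
    """Return (phi_master, phi_slave) as lists of 0/1 samples.
    Each phase is followed by `gap` samples in which both are low."""
    half = period // 2
    if gap < 0 or half - gap < 1 or period - gap <= half:
        raise ValueError("gap leaves no active time for a phase")
    phi_master, phi_slave = [], []
    for t in range(n_samples):
        pos = t % period
        phi_master.append(1 if pos < half - gap else 0)
        phi_slave.append(1 if half <= pos < period - gap else 0)
    return phi_master, phi_slave


if __name__ == "__main__":
    phi_m, phi_s = two_phase_clocks(period=10, gap=1, n_samples=20)
    print("master:", "".join(map(str, phi_m)))
    print("slave: ", "".join(map(str, phi_s)))
    dead = sum(1 for m, s in zip(phi_m, phi_s) if not m and not s)
    print("samples with neither latch enabled:", dead)
```

The guard interval must be wide enough to cover the worst-case timing tolerances, and every sample spent in it is time in which neither latch is enabled, which is one way to see why this scheme limits the maximum clocking rate.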
Another important type of device commonly used in electronic circuitry is the complex-gate. Complex-gates are device-level simplifications of combinations of logic gates used to derive a logic output function. Complex-gates reduce the number of switching devices and internal nodes within the gate. FIGS. 3D and 3F show an example of a complex-gate, representing a selector (multiplexer). In particular, FIG. 3D shows a logic gate representation and FIG. 3F shows a schematic representation, in which the intermediate nodes are removed. FIGS. 3A, 3B and 3C show a graphical reduction from the standard logic into a complex-gate. FIG. 3E is a schematic diagram of the selector for the non-complex-gate implementation.
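The relationship between a selector built from separate gates and the same selector as a complex-gate can be sketched behaviorally. The figures are not reproduced in this text, so the gate network below (one inverter, two AND gates, one OR gate) is an assumed conventional implementation rather than the one shown in FIG. 3E, and the function names are hypothetical. The sketch checks only that both forms compute the same logic function.

```python
from itertools import product


def selector_from_gates(a, b, sel):
    """Selector built from separate gates. Each intermediate value below
    corresponds to an internal node that must switch."""
    sel_n = int(not sel)   # inverter output
    pick_a = a & sel_n     # first AND gate output
    pick_b = b & sel       # second AND gate output
    return pick_a | pick_b  # OR gate output


def selector_complex_gate(a, b, sel):
    """The same function evaluated as a single stage, with no
    intermediate nodes exposed between the inputs and the output."""
    return int((a and not sel) or (b and sel))


if __name__ == "__main__":
    for a, b, sel in product((0, 1), repeat=3):
        y = selector_complex_gate(a, b, sel)
        assert y == selector_from_gates(a, b, sel)
        print(f"a={a} b={b} sel={sel} -> y={y}")
```

In the gate network the select input passes through three gate levels and the data inputs through two, whereas the complex-gate form presents a single stage from every input to the output.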
Nodes within the complex-gate that appear in the logic diagram but are not used as outputs are often eliminated. The advantages of eliminating complex-gate internal nodes are those enumerated earlier in this section, in the second numbered list (items 1 through 8).
Complex-gates have not been widely used in the past, because the advantage of reduced cell layout area is often offset by the congestion of signals routed to the complex-gate when the cell is used. An array of gates distributes the interconnect, which prevents this congestion. However, when multiple levels of metal interconnect are considered, complex-gates become more attractive. This is especially true when it is realized that the active area used by the gates is greatly reduced by the use of a complex-gate. In addition, a high density of cell I/O pins placed within a multi-level metal complex-gate cell does not necessarily increase the cell area. Previously, each complex-gate input needed a Metal-2 track width to enter the cell in a two-metal system. Because of this, most integrated circuit logic cell libraries do not contain many complex-gates or large ones. Thus, the art of complex-gates has not been well developed.