1. Field of the Invention
This invention generally relates to electronic circuitry and, more particularly, to a two-gate delay flip-flop with a shadow stage, which is immune from corruption, to support the latch in its opaque (hold) phase.
2. Description of the Related Art
Whether it is a phase-based design implemented with latches as a fundamental memory element, or edge-based design implemented with back-to-back latches (flip-flops) as the fundamental memory element, latches are an essential building block in modern very large scale integration (VLSI) designs. With conflicting properties of delay, area, power, and robustness, it is difficult to design latches that satisfy all design requirements.
Latches are commonly used in VLSI designs either by themselves or as part of an edge-triggered Flip-Flop (FF) due to their memory holding function. A latch has two phases of operation: in the transparent phase, data flows freely from D to Q, and the amount of time for this to occur is its native delay (Tdq). In the opaque phase, data may toggle on the input D, but Q holds its previous value. Which phase the latch operates in is determined by the phase of the clock input. In the context of being a FF building block, there are setup time (Tsu) and clock delay (Tcq) characteristics of the FF. However, those two parameters together form the Tdq native delay, and it is useful to discuss this value as the metric for performance.
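The two phases of operation described above can be sketched behaviorally. The following is a minimal illustration only, not a circuit from this disclosure: the class name `Latch`, the `transparent_level` parameter, and the zero-delay update are all assumptions made for the example.

```python
class Latch:
    """Behavioral level-sensitive latch (hypothetical model, zero delay)."""

    def __init__(self, transparent_level=1, initial=0):
        # Clock level at which the latch is transparent (assumed parameter).
        self.transparent_level = transparent_level
        self.q = initial

    def step(self, clk, d):
        if clk == self.transparent_level:
            # Transparent phase: data flows from D to Q.
            self.q = d
        # Opaque phase: D may toggle, but Q holds its previous value.
        return self.q


latch = Latch()
# (clk, d) pairs: capture a 1, toggle D while opaque, then capture a 0.
stimulus = [(1, 1), (0, 0), (0, 1), (1, 0)]
trace = [latch.step(clk, d) for clk, d in stimulus]
print(trace)
```

In this sketch Q follows D only in the first and last steps; the two middle steps show D toggling while Q holds.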
FIGS. 1A and 1B are schematic diagrams of a pass gate (prior art). As configured, when the clock signal (CLK) is low, the device is in a tri-state mode, meaning the output impedance is high. When CLK is high, the input signal (D) is passed to the output. Alternatively, the CLK signal can be connected to the gate of the PMOS transistor and the inverted CLK signal (CK1) connected to the gate of the NMOS transistor, in which case the input is passed when CLK is low. The device of FIG. 1A may also be depicted as shown in FIG. 1B.
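As a rough illustration of the pass gate behavior just described, the sketch below models the output as either the input value or a high-impedance marker. The function name `pass_gate`, the `active_high` flag, and the `"Z"` marker are assumptions introduced for the example.

```python
HIGH_Z = "Z"  # marker for a tri-stated (high-impedance) output, assumed notation


def pass_gate(d, clk, active_high=True):
    """Return d when the gate conducts, otherwise the high-impedance marker.

    active_high=True  models CLK on the NMOS gate (passes when CLK is high).
    active_high=False models the alternative hookup with CLK on the PMOS gate
    (passes when CLK is low).
    """
    conducting = (clk == 1) if active_high else (clk == 0)
    return d if conducting else HIGH_Z


print(pass_gate(1, clk=1))                     # conducting, input passed
print(pass_gate(1, clk=0))                     # tri-state
print(pass_gate(0, clk=0, active_high=False))  # alternative hookup, input passed
```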
FIG. 2 is a schematic diagram of a conventional latch design using pass gates (prior art). The latch is a clocked state element (from D to D1), which is protected from the output by an inverter (from D1 to Q). This design has the benefit of being simple to understand and extremely robust. However, it has two gate-delay elements, which limits its performance. Note: CLK and CK1 are opposite phases of a binary clock signal.
FIG. 3 is a schematic diagram of a conventional latch design with improved gate delay (prior art). As an alternative to the design of FIG. 2, the output inverter is removed to provide a faster Tdq. However, this design has a major flaw in that the memory state element is exposed to the output. The memory state is the output value maintained by pass gate 302 when pass gate 300 is in its opaque phase. When the memory state is protected by the inverter, as in FIG. 2, the effects of external coupling are minimized. When the memory state is exposed, as in FIG. 3, uncontrolled external routes and coupling events can directly affect the feedback loop's ability to maintain the state. If this happens, the memory state becomes corrupted and unrecoverable.
FIG. 4 is a timing diagram contrasting the differences in delay between the circuits of FIG. 2 and FIG. 3. In terms of gate delay, the right-most figure, associated with the latch of FIG. 3, is one gate faster than the left-most figure, which is associated with the latch of FIG. 2.
FIG. 5 is a timing diagram depicting the differences in memory state corruption between the circuits of FIG. 2 and FIG. 3. The diagram illustrates a glitch event from external routing upon the output pin Q. If the state-node is exposed as in FIG. 3, an external aggressor net “Agg” can potentially flip the state of the latch (left-most figure). In the design of FIG. 2, a noise event can only produce a glitch on Q instead of flipping the state of the latch (right-most figure).
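The contrast between an exposed and a protected state node can be caricatured in a few lines. This is a deliberately coarse toy model, not a circuit simulation: it assumes a worst-case glitch that is strong enough to overwrite whatever node it reaches, and it ignores signal polarity. The class `LatchInHold` and its method names are hypothetical.

```python
class LatchInHold:
    """Toy model of a latch in its opaque phase (assumed abstraction)."""

    def __init__(self, state, output_buffered):
        self.state = state                      # value the feedback loop holds
        self.output_buffered = output_buffered  # True: FIG. 2 style, False: FIG. 3 style

    def q(self):
        """Settled output value once any disturbance has passed."""
        return self.state

    def couple_noise_onto_q(self, glitch_value):
        """Apply a worst-case glitch to pin Q; return the value seen during it."""
        if not self.output_buffered:
            # Exposed state node: the glitch lands on the storage node and the
            # feedback loop then regenerates the wrong value.
            self.state = glitch_value
        # Buffered output: only Q is disturbed; the stored state is untouched.
        return glitch_value


exposed = LatchInHold(state=1, output_buffered=False)
protected = LatchInHold(state=1, output_buffered=True)
exposed.couple_noise_onto_q(0)
protected.couple_noise_onto_q(0)
print(exposed.q(), protected.q())
```

Under these assumptions the exposed latch settles to the flipped value, while the protected latch returns to its stored value after the glitch.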
FIG. 6 is a schematic drawing of a conventional edge-triggered flip-flop using pass gates (prior art). Edge-triggered flip-flops are commonly used in high-performance synchronous designs due to their robustness and ease of use. A FF is made up of two latches, conventionally described as master and slave latches. As shown, each latch is based upon the design depicted in FIG. 2. Each of these latches is transparent in alternating clock phases, and this creates the functionality of a FF. The delay characteristics of a FF are described by its delay through the master latch (Tsu), the delay through the slave latch (Tcq) and hold time (Thd). Of the three characteristics, Tsu and Tcq are sometimes combined as the total FF delay (Tdq) to describe the overall delay characteristic of the FF.
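The way two latches that are transparent in alternating clock phases produce edge-triggered behavior can be sketched as follows. This is a zero-delay behavioral illustration that assumes a positive edge-triggered arrangement (master transparent while the clock is low) and ignores the inverter polarities of the figure; the class name `MasterSlaveFF` is hypothetical.

```python
class MasterSlaveFF:
    """Behavioral flip-flop built from two level-sensitive stages (assumed model)."""

    def __init__(self):
        self.master = 0  # master latch state node
        self.slave = 0   # slave latch state node

    def step(self, clk, d):
        if clk == 0:
            # Master transparent, slave opaque: track D, hold the output.
            self.master = d
        else:
            # Master opaque, slave transparent: pass the captured value on.
            self.slave = self.master
        return self.slave


ff = MasterSlaveFF()
# D changes while the clock is high (third step) and must not reach the output.
stimulus = [(0, 1), (1, 1), (1, 0), (0, 0), (1, 0)]
trace = [ff.step(clk, d) for clk, d in stimulus]
print(trace)
```

In the trace, the output changes only on steps where the clock has just gone high, which is the edge-triggered behavior the paragraph above describes.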
The key elements of the flip-flop are its master latch state nodes (MS) and its slave latch state nodes (SS). The state nodes of latches are made up of clocked cross-coupled pass gates to provide a feedback loop. This feedback loop maintains the state of this memory element when the latch is opaque. Therefore, these state nodes must be carefully designed to prevent any noise related glitch event from corrupting the state of the latch.
FIG. 19 is a schematic drawing of a positive edge-triggered true single-phase clocking (TSPC) flip-flop (prior art). The TSPC flip-flop (TSPC-FF) has one of the lower latencies (tDQ=tSU+tCQ) and, moreover, can incorporate complex logic, but it suffers from a number of well-known structural problems which render it unsafe for large-scale use in commercial integrated circuits.
During the low phase of the clock, the flip-flop master is transparent, that is, changes at input D appear inverted on node mDb. But since MN2, gated by the clock, is off, the transfer of mDb to the slave is blocked; DbMF is in pre-charge, and held at Vdd by MP2. Since DbMF is high, MP3, as well as the clock-gated MN3, is off; QB is in high impedance (floating) and holds state dynamically by the charge stored on its endemic capacitance and external load.
CLK→Vdd
On the rising edge of the clock, the master becomes opaque and enters “high-impedance”; MP1 turns off, cutting off “mDb” from Vdd. However, because the pull-down of the master is not clocked, mDb can still transition low. Input D should therefore hold low beyond the rising clock edge for a period of time (tH, D=0) sufficient for DbMF to fall.
As the master enters “high-impedance”, the slave becomes transparent; pre-charger MP2 turns off, MN2 and MN3 turn on. Thus:
If mDb is low, DbMF remains at Vdd but floats;
If mDb is high, DbMF monotonically falls but can float low if D→1 after tH, D=0 is satisfied.
In either case, the state of DbMF is inverted and transferred to the output QB.
After a short time beyond the clock edge (characterized by tH, D=0), subsequent changes at D do not change the flip-flop state (mDb is cut off from Vdd).
During the high phase of the clock, QB is driven, but mDb and DbMF are in high impedance and hold their levels dynamically. Thus, to ensure robust operation, keepers must be placed on the mDb and DbMF nodes.
CLK→0
Once the clock falls, DbMF is driven to Vdd and the clock-gated MN3 is turned off, placing QB in high impedance; its state dynamically held by the stored charge on endemic capacitance and output load. Coupling to, and leakage at this node can disturb its state. As opposed to internal nodes of mDb and DbMF, a regenerative keeper placed at QB does not ensure robust operation as the state node of the slave (QB) remains exposed to external noise. The aforementioned operational characteristics of the circuit prove it to be a positive edge-triggered flip-flop.
Thus far, the absence of keepers on the mDb, DbMF, and QB nodes, as well as the exposure of the state node to output disturbance, have been pointed out as problems related to the TSPC-FF. However, there are additional structural issues, such as sensitivity to clock slope, which gives rise to internal races. On the falling edge of the clock, as the flip-flop master becomes transparent, the slave is turning opaque. In the following, two race conditions relating to this transition are outlined:
Race between master and slave: if D=0 when the clock falls and the clock-gated MP1 turns on, mDb will transition high, activating MN1; concurrently, MN2 is turning off. For a sufficiently low clock edge rate, both transistors in the MN1/MN2 pull-down stack will be on briefly, potentially disturbing DbMF when it holds a logical value of “1”.
Intra-slave race: if MP2 is large, that is, if DbMF pre-charges too quickly and activates MN4 before MN3 shuts off, a logical value of “1” at QB may be disturbed through the MN3/MN4 pull-down stack. Similarly, a sufficiently low clock slew rate will disturb the aforementioned level regardless of the size of MP2.
In addition, there may be tCQ imbalance between transitions. As DbMF is held at Vdd prior to the rising edge of the clock, there is a pronounced difference between tCQ 1→0 and tCQ 0→1 transitions: the latter must first discharge DbMF to ground. While the slower of the two transitions determines the latency of the flip-flop, the shorter tCQ places a more stringent limit on the minimum gate-delay budget between flip-flops so as to avoid race.
Further, there may be an output glitch when data does not transition. Assume that D=0 in two consecutive cycles. On the rising edge of the first cycle, DbMF is discharged causing QB to transition to a logical “1”. When clock falls, DbMF is driven back to Vdd and QB holds state. On the rising edge of the next cycle, MN2 and MN3 come on. DbMF starts to discharge, but since MN4, gated by DbMF, is initially on, QB begins to discharge. Once DbMF is below the trip-point of the final stage, QB is returned to Vdd. Thus, the output exhibits a low-going glitch: QB 1→0→1 for D=0. Though this glitch is non-destructive, it causes additional power dissipation for downstream logic.
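The glitch sequence described above can be reproduced with a unit-delay, switch-level sketch of the three TSPC stages. The model is an illustration under stated assumptions: every stage updates once per time step from the previous node values, floating nodes hold their charge perfectly, and the first-stage pull-down is taken to be gated by D alone (as the text indicates, it is not clocked). The function name `tspc_step` and the dictionary layout are hypothetical.

```python
def tspc_step(state, clk, d):
    """Advance the three dynamic nodes by one unit gate delay (assumed model)."""
    m_db, db_mf, qb = state["mDb"], state["DbMF"], state["QB"]

    # Stage 1: pull-up MP0 (D) / MP1 (CLK); unclocked pull-down gated by D.
    if d == 1:
        new_m_db = 0
    elif clk == 0:
        new_m_db = 1
    else:
        new_m_db = m_db            # floating, holds state

    # Stage 2: pre-charge MP2 (CLK); pull-down MN1 (mDb) / MN2 (CLK).
    if clk == 0:
        new_db_mf = 1
    elif m_db == 1:
        new_db_mf = 0
    else:
        new_db_mf = db_mf          # floating, holds state

    # Stage 3: pull-up MP3 (DbMF); pull-down MN3 (CLK) / MN4 (DbMF).
    if db_mf == 0:
        new_qb = 1
    elif clk == 1:
        new_qb = 0
    else:
        new_qb = qb                # floating, holds state

    return {"mDb": new_m_db, "DbMF": new_db_mf, "QB": new_qb}


# State at the end of a cycle that captured D=0: clock low, DbMF pre-charged,
# QB dynamically holding a logical 1.
state = {"mDb": 1, "DbMF": 1, "QB": 1}
qb_trace = [state["QB"]]
for _ in range(3):                 # clock rises again while D is still 0
    state = tspc_step(state, clk=1, d=0)
    qb_trace.append(state["QB"])
print(qb_trace)
```

In this sketch QB dips low for one unit delay before DbMF has discharged, then returns high, which matches the 1→0→1 sequence for D=0 outlined in the text.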
The TSPC-FF creates a large clock load. True to its name, true single-phase clocking is devised so that the raw clock drives all three stages of the flip-flop (MP1, MN2, MP2, MN3) thus imposing a large load on the clock.
Flip-flop latency is characterized by the data-to-Q delay which is the sum of its setup time, tSU, to the rising clock edge and clock-to-Q delay, tCQ, measured from the rising edge, i.e. tDQ=tSU+tCQ.
In the case of TSPC-FF, the 1→0 data transition produces the larger delay for both tSU and tCQ. tSU is determined by the delay through the MP0/MP1 stack (mDb→1) and tCQ comprises the discharge of DbMF through MN1/MN2 stack summed with MP3 driving the output high. Therefore it can be said that the TSPC-FF latency is about 3 gate delays.
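The gate-delay count for the 1→0 data transition can be tallied as a simple sum. The sketch below assumes, purely for illustration, that each stage named in the text contributes exactly one unit gate delay; real stage delays depend on sizing and load. The dictionary keys are hypothetical labels.

```python
# Assumed: one unit gate delay per stage on the worst-case (1 -> 0) data path.
STAGE_DELAYS = {
    "master_pullup_MP0_MP1": 1,   # mDb rises; this sets tSU
    "slave_pulldown_MN1_MN2": 1,  # DbMF discharges; first part of tCQ
    "output_pullup_MP3": 1,       # QB driven high; second part of tCQ
}

t_su = STAGE_DELAYS["master_pullup_MP0_MP1"]
t_cq = STAGE_DELAYS["slave_pulldown_MN1_MN2"] + STAGE_DELAYS["output_pullup_MP3"]
t_dq = t_su + t_cq  # tDQ = tSU + tCQ
print(t_su, t_cq, t_dq)
```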
The maximum flip-flop hold time, tH, coupled with its minimum tCQ and clock skew, determines the minimum number of logic gates required between two flops to avoid hold time violation and race. The less positive the hold time, the easier it is to ensure race-free operation. Maximum hold time for the TSPC-FF is the time required for the data to remain low after the rising edge of the clock so that mDb succeeds in discharging DbMF; this amounts to slightly more than 1 gate delay.
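The relationship among hold time, minimum tCQ, and clock skew can be written as the usual hold constraint, tCQ(min) + t_logic(min) ≥ tH(max) + skew. The sketch below solves it for the minimum logic delay. The function name `min_logic_delay` and the numeric values, given in units of gate delay, are hypothetical and are not taken from this disclosure.

```python
def min_logic_delay(t_hold_max, t_cq_min, clock_skew):
    """Smallest logic delay between two flops that avoids a hold violation.

    Derived from t_cq_min + t_logic_min >= t_hold_max + clock_skew.
    A result of 0.0 means the flops may be connected back to back.
    """
    return max(0.0, t_hold_max + clock_skew - t_cq_min)


# Hypothetical numbers in units of gate delay.
required = min_logic_delay(t_hold_max=1.25, t_cq_min=1.0, clock_skew=0.5)
relaxed = min_logic_delay(t_hold_max=0.25, t_cq_min=1.0, clock_skew=0.5)
print(required, relaxed)
```

With these example numbers, lowering the hold time removes the need for any padding logic, which illustrates why a less positive hold time makes race-free operation easier to ensure.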
It would be advantageous if a latch could be designed to combine the improved gate delay of the TSPC-FF, while addressing the above-mentioned problems.