1. Field of the Invention
This invention relates to the field of electronic circuit design, and in particular to a pulse-triggered D-Flip-Flop (P-DFF) that utilizes a cascode voltage switch to achieve minimal set-up time and propagation delay, while also consuming minimal power.
2. Description of Related Art
A Data-Flip-Flop (DFF) is configured to "read" a data input at a particular point in each clock cycle. The output of the DFF provides the value that was read, independent of subsequent changes, or noise, on the data input, until the next data value is read. The data input must be stable while it is being read into the DFF, else the read value may be indeterminable. Ideally, the reading of the data input occurs instantaneously, so that the sensitivity of the DFF to changes on the data input is minimized. Also ideally, the instantaneous read occurs at exactly the same point within each clock cycle.
Pulse-triggered latches and flip-flops are commonly used in the art to approximate the ideal performance of a DFF as closely as possible [1]. In a pulse-triggered latch, a pulse generator provides a narrow pulse at each rising or falling (active) edge of a clock. While the pulse is asserted, the signal on a data input line is communicated to the output of the latch. While the pulse is not asserted, the output of the latch remains unchanged. In order to maximize the stability of the output, and to reduce the stability requirements on the data input, the width of the asserted pulse is kept as narrow as possible.
1. Vladimir Stojanovic and Vojin G. Oklobdzija, "Comparative Analysis of Master-Slave Latches and Flip-Flops for High-Performance and Low-Power Systems", IEEE Journal of Solid-State Circuits, Vol 34, No. 4, April 1999, pp 536-548, and incorporated by reference herein.
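The pulse-triggered latch behavior described above can be sketched as a small behavioral model. This is an illustrative sketch only, not a circuit-level description: the function names `pulse_generator` and `pulsed_latch`, the sampled-time representation, and the pulse widths are assumptions introduced here. The example shows why the pulse is kept narrow: with a wider pulse, a change on the data input shortly after the active edge reaches the output.

```python
def pulse_generator(clk, width):
    """Return a pulse train that is asserted for `width` samples
    starting at each rising edge of the sampled clock `clk`."""
    pulse = []
    remaining = 0
    prev = 0
    for c in clk:
        if c == 1 and prev == 0:      # rising (active) edge detected
            remaining = width
        pulse.append(1 if remaining > 0 else 0)
        if remaining > 0:
            remaining -= 1
        prev = c
    return pulse


def pulsed_latch(data, pulse, q0=0):
    """While the pulse is asserted, the data input is passed to the output;
    while the pulse is not asserted, the output is held unchanged."""
    q = q0
    out = []
    for d, p in zip(data, pulse):
        if p:
            q = d
        out.append(q)
    return out


clk  = [0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1]
data = [0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 1]

narrow = pulsed_latch(data, pulse_generator(clk, 1))
wide   = pulsed_latch(data, pulse_generator(clk, 3))
print(narrow)  # output follows only the value present at each rising edge
print(wide)    # output also follows data changes during the wider pulse
```

In this sketch the narrow pulse captures only the data value present at each rising edge, whereas the three-sample pulse lets later data changes through, which corresponds to a longer interval during which the data input must be held stable.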
The performance of a DFF is assessed in terms of its cycle delay, or "sequencing overhead", and its power consumption. The sequencing overhead is defined herein as the minimum time required to read the data into the device and to produce a stable output corresponding to this data input. This includes any set-up requirements imposed on the data input to assure a reliable read of the data value, plus the time required to propagate the data input to the output of the device. This sequencing overhead corresponds, inversely, to the maximum speed that a serial string of DFFs can be reliably operated. If the DFF includes additional internal logic, such as scan logic that is used for testing the device, the sequencing overhead includes the impact, if any, that the additional internal logic imposes on the propagation of the data input to the output of the DFF during normal (i.e. performance) operation. The power consumption of a DFF typically depends upon the energy required to change the state of the elements within the DFF, and hence, is typically dependent upon the pattern of data values read by the DFF. Generally, the power consumption of a DFF is estimated based upon an assumed random data input pattern to the DFF.
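The relationship between sequencing overhead and maximum operating speed can be illustrated with a short calculation. The sketch below assumes a serial string of DFFs with no logic between stages, and the numeric values (50 ps set-up time, 150 ps propagation delay) are hypothetical figures chosen only for illustration; the function names are likewise assumptions.

```python
def sequencing_overhead_ps(setup_ps, propagation_ps):
    """Sequencing overhead as defined above: the set-up requirement
    plus the data-to-output propagation time, in picoseconds."""
    return setup_ps + propagation_ps


def max_clock_hz(setup_ps, propagation_ps):
    """Maximum clock rate of a serial string of DFFs, assuming (for
    illustration) that no logic delay lies between adjacent stages."""
    overhead_ps = sequencing_overhead_ps(setup_ps, propagation_ps)
    return 1e12 / overhead_ps          # 1e12 picoseconds per second


# Hypothetical timing figures, not taken from any measured device.
print(sequencing_overhead_ps(50, 150))   # overhead in picoseconds
print(max_clock_hz(50, 150))             # corresponding clock rate in Hz
```

Halving the sequencing overhead in this model doubles the achievable clock rate, which is the inverse correspondence noted above.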
FIGS. 1-3 illustrate example prior art pulsed-D-Flip-Flops. In FIG. 1, an example "hybrid-latch" flip-flop (HLFF) is illustrated [2, 3] that achieves a high speed performance via a pre-charging of the internal nodes 101 of the flip-flop to avoid the delay associated with changing the value of the internal nodes to the pre-charged value when the device is clocked to read in the data. When the clock (CLK) signal is low, the p-channel device 121 conducts, thereby precharging the internal node 101 to a high state. This internal high state has no effect on the output Q, because the low clock signal also places the n-channel device 132 into a non-conducting state, thereby precluding a discharge of the voltage at Q. Also, while the clock signal is low, the inverting delay logic 110 places the n-channel devices 124 and 134 into a conducting state.
2. ibid, FIG. 17. 
3. Draper et al., "Circuit Techniques in a 266-MHz MMX-enabled processor", IEEE Journal of Solid-State Circuits, Vol 32, November 1997, pp 1650-1664, and incorporated by reference herein. See FIG. 10.
When the clock signal goes high, the p-channel device 121 is placed in a non-conducting state, and device 122 in a conducting state. Because, initially, devices 122 and 124 are in a conducting state, the value of the data signal at the gate of n-channel device 123 determines the state of the internal node 101. If the data signal is low, the internal node 101 remains at a high state; if the data signal is high, the internal node 101 is discharged through the serial path of devices 122, 123, and 124 to a low state. Also when the clock signal initially goes high, devices 132 and 134 are in a conducting state, and the inversion of the state of the internal node 101 is communicated to the output Q.
The asserted clock signal propagates through the inverting delay logic 110, and after approximately three gate-time delays, the high value at the clock produces a low value at the gates of devices 124 and 134, placing each of them in a non-conducting state. In this non-conducting state, neither the internal state 101 nor the output Q can be discharged to a low state. Because the internal state 101 cannot be discharged to a low state, the state of the p-channel device 131 cannot change. If the internal state 101 had been low, device 131 would have been conducting, and the output Q would be in a high state, and will remain in this high state because the device 134 is in a non-conducting state. If the internal state 101 had been high, device 131 would have been non-conducting, and the output Q would have been in a low state (via 132, 133, 134 when the clock initially goes high). The internal state 101 will remain in this high state because device 124 is non-conducting.
When the clock again goes low, the internal state 101 is again precharged to a high state. This precharging has no effect on the output Q, because the device 132 is non-conducting when the clock signal is low and cannot discharge the output Q if it is currently in the high state. The precharging of the internal node 101 places device 131 into a non-conducting state, and thus cannot charge the output Q if it is currently in the low state.
The internal state 101 is also precharged if the data input value is in a low state, via the p-channel device 141, regardless of the state of the clock. This precharging cannot affect the output Q unless both devices 132 and 134 are conducting, which occurs only during the intended time for the data input to be propagated to the output Q.
The cross-coupled inverters 140 provide a complementary output Qn, and provide an additional margin of stability to the output Q during transitions in the above described process, or during long periods of clock inactivity.
As described above, the state of the internal node is dependent upon the data signal only during the period that both n-channel devices 122 and 124 are conducting. This time of mutual conduction is determined by the delay block 110. The delay time of the delay block 110 is set to be as short as possible, while still assuring that the value on the data line will be propagated to the output Q. Because the internal node 101 is precharged to a high state, the delay for propagating a data low state is merely the delay of the n-channel device 132 for discharging the output Q to a low state, if it is not already in the low state. The delay for propagating a data high state is the delay of the n-channel device 122 for discharging the internal node 101, plus the delay of the p-channel device 131 for charging the output node Q to a high state, if it is not already in the high state. Note, however, that the delay of the device 110 need only be long enough for the n-channel device 122 to discharge the internal node 101 via the data-controlled device 123, or for the n-channel device 132 to discharge the output Q via the internal-node-controlled device 133. The hold time of a data high input, the time for which the data must remain high, will be slightly greater than the delay time of the device 110, so as not to place the p-channel device 141 into a conductive state until the output Q is brought to a logic high state.
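The operation of the flip-flop of FIG. 1, as described above, can be summarized in a simplified logic-level model. This is a sketch under stated assumptions: it ignores analog effects such as charge sharing and finite switching times, treats an undriven node as holding its previous value, and uses hypothetical names (`hlff_step`, `hlff_cycle`, `clk_db` for the delayed, inverted clock produced by logic 110). The device numbers in the comments refer to the description above.

```python
def hlff_step(clk, clk_db, d, x, q):
    """One evaluation of the FIG. 1 structure at logic level.
    clk    : clock level
    clk_db : output of the inverting delay logic 110
    d      : data input
    x      : current state of internal node 101
    q      : current state of output Q
    Returns the new (x, q)."""
    # Internal node 101
    if clk and clk_db and d:
        x = 0                 # discharged through devices 122, 123, 124
    elif (not clk) or (not d):
        x = 1                 # precharged via device 121 (clock low) or 141 (data low)
    # otherwise node 101 is undriven and holds its value

    # Output Q
    if not x:
        q = 1                 # charged via p-channel device 131
    elif clk and clk_db:
        q = 0                 # discharged through devices 132, 133, 134
    # otherwise Q is held (stabilized by the cross-coupled inverters 140)
    return x, q


def hlff_cycle(d_in_window, d_after_window, x, q):
    """One clock cycle: clock low, then the transparency window after the
    rising edge, then the remainder of the clock-high phase."""
    x, q = hlff_step(0, 1, d_in_window, x, q)      # clock low: precharge
    x, q = hlff_step(1, 1, d_in_window, x, q)      # window: 122/124 and 132/134 conduct
    x, q = hlff_step(1, 0, d_after_window, x, q)   # delay 110 elapsed: window closed
    return x, q


x, q = 1, 0
x, q = hlff_cycle(1, 0, x, q)   # data high in the window, then falls
print(q)                        # Q holds the value read in the window
x, q = hlff_cycle(0, 1, x, q)   # data low in the window, then rises
print(q)
```

In this model a data change after the window closes does not alter Q, which reflects the narrow interval of mutual conduction set by the delay block 110.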
The amount of energy consumed by the HLFF of FIG. 1 is dependent upon the number of times each node is charged or discharged. If the data input is a constant low state, very little energy is consumed, because the internal node 101 remains at a high state, and the output Q remains at a low state. If, on the other hand, the data input is a constant high state, the internal node will be continually pre-charged and discharged. Thus, even during periods of inactivity, energy will be consumed, if the inactive period corresponds to the data input being high. During normally active periods, the average energy consumption is comparable to conventional static (i.e. non-precharged) flip-flop structures.
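The dependence of energy consumption on the data pattern can be illustrated by counting node transitions, since each charge or discharge of a node consumes energy. The sketch below is a deliberately coarse model with hypothetical names: it counts only transitions of the precharged internal node and of the output Q, assumes one data value is read per clock cycle, and does not attempt to estimate actual energy.

```python
def count_transitions(data_pattern, q0=0):
    """Count charge/discharge events for a precharged flip-flop.
    Returns (internal_node_transitions, output_transitions)."""
    x = 1            # internal node starts precharged high
    q = q0
    node_toggles = 0
    q_toggles = 0
    for d in data_pattern:
        # Evaluation at the active clock edge: a high data input
        # discharges the precharged internal node.
        if d:
            x = 0
            node_toggles += 1
        # The output changes only if the value read differs from its state.
        if q != d:
            q = d
            q_toggles += 1
        # Clock low: the internal node is precharged again if it was discharged.
        if x == 0:
            x = 1
            node_toggles += 1
    return node_toggles, q_toggles


print(count_transitions([0] * 8))        # constant low data input
print(count_transitions([1] * 8))        # constant high data input
print(count_transitions([0, 1] * 4))     # alternating data input
```

With a constant high input the output changes at most once, yet the internal node is discharged and precharged in every cycle, which corresponds to the energy consumed during otherwise inactive periods.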
FIG. 2 illustrates an example semidynamic flip-flop SDFF [4], which also uses a pre-charging technique to achieve high speeds. The delay block 210 serves a similar function to the delay block 110 of FIG. 1 of enabling a propagation of the data input signal to the internal node 201 only during a short time period after the clock transitions from a low to high state. The NAND gate 211 is configured to place the n-channel device 222 into a non-conducting state as soon as the internal node 201 is pulled low (via a high data input), thereby eliminating the aforementioned requirement of holding the data input at a high state for a duration longer than the delay of the device 210. In effect, the device 210 is a self-regulating device that automatically limits the sensitivity of the SDFF to the pre-set delay associated with the device 210, or to the actual time required to propagate the data input to the internal node, whichever is less. The cross-coupled inverters 140, 240 serve to stabilize the output Q and the internal node 201 during transitions, or during long periods of clock-inactivity.
4. Stojanovic, op cit, FIG. 18. 
Because of the pre-charging process, the SDFF of FIG. 2 exhibits similar energy-consuming characteristics to the HLFF of FIG. 1, particularly with regard to a continuous high data input. The SDFF structure, on the other hand, is better suited for embedded logic functions than the HLFF structure. The embedded logic allows the flip-flop to effect other functions, in addition to the clocked-D-to-Q function of a flip-flop, including asynchronous or synchronous sets and resets, the inclusion of scan-test logic, and so on.
FIG. 3 illustrates an example edge-triggered latch (ETL) that includes self-resetting logic [5, 6]. In operation, the internal nodes are precharged to a logic high state, via the resetting logic 390. The resetting logic 390 has a specified delay. Whenever the Q and Qn signals differ, and after the specified delay, the resetting logic 390 places the p-channel devices 321, 331 into a conductive state, which automatically resets the internal nodes 301, 302 to a logic high state. Note that, because the Q and Qn signals are directly coupled to the internal nodes 301, 302, the outputs Q and Qn will both be reset to a logic low state, and thus devices that are configured to read the Q or Qn values associated with the information-state of the ETL must be configured to read these values before the outputs are automatically reset.
5. Draper, op cit, FIG. 12. 
6. Stojanovic, op cit, FIG. 19. 
The delay logic 310 operates similarly to the delay logic 110 of FIG. 1, and sensitizes the ETL to the data input only during the delay time of the device 310 after the rising edge of the clock (Clk). If the data input is high, the internal node 301 is brought low at the rising edge of the clock, and the output Q is brought high. If the data input is low, the internal node 302 is brought low at the rising edge of the clock, and the output Qn is brought high. The change of state of either of the outputs Q, Qn to a high state initiates the aforementioned automatic reset process, which resets the outputs Q and Qn to a low state, after the reset delay period.
When the outputs Q and Qn are both brought to the low state, and after another reset delay period, the devices 321 and 331 are brought to a non-conducting state. The cross-coupled p-channel devices 341 assure that the 'inactive' node is maintained at the high state when the opposite node is pulled low when the data input is read. The cross-coupled inverters 342, 343 stabilize the outputs Q and Qn between the rising edge of the clock and the time of reset.
Note that, because both internal nodes 301, 302 are pre-charged to a high state at every clock cycle, and one of them is discharged at every clock cycle, the ETL consumes a substantial amount of energy, independent of the pattern of values at the data input. Additionally, the dynamic operation of the ETL is incompatible with non-dynamic/static circuits that assume a stable output after the output is set to its intended state.
It is an object of this invention to provide a high-speed flip-flop that consumes minimal power. It is a further object of this invention to provide a high-speed flip-flop that is static. It is a further object of this invention to provide a flip-flop structure that facilitates additional logic functions within the flip-flop.
These objects and others are achieved by providing a differential cascode structure that is configured to propagate a data state to a static latch at each active edge of a clock. A clock generator enables the communication of the data state and its inverse to the latch for a predetermined time interval. In a first embodiment, each cascode structure includes three gates in series, the gates being controlled by the clock signal, a delayed inversion of the clock signal, and the data state or its inverse. In an alternative embodiment, each cascode structure includes two gates in series, the gates being controlled by the clock signal and the delayed inversion of the clock signal. In this alternative embodiment, each of these cascode structures is driven directly by the data signal or its inverse. The static latch obviates the need to precharge nodes within the device, thereby minimizing the power consumed by the device. The latch preferably comprises cross-coupled inverters, which, being driven by the differential cascode structure, enhance the switching speed.
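The first embodiment summarized above can be sketched at logic level as follows. This is an illustrative model only, and several details are assumptions not stated in the summary: the names `cascode_dff_step` and `cascode_dff_cycle`, the signal name `clk_db` for the delayed inversion of the clock, and the treatment of each three-gate series structure as conducting only when all three of its controlling signals are asserted. No circuit polarity or device type is implied.

```python
def cascode_dff_step(clk, clk_db, d, q):
    """Logic-level sketch of the differential cascode driving a static latch.
    clk    : clock signal
    clk_db : delayed inversion of the clock signal
    d      : data state
    q      : current state of the static latch
    Returns the new (q, qn)."""
    window = bool(clk and clk_db)   # both clock-controlled gates conduct
    if window and d:
        q = 1       # series structure controlled by the data state conducts
    elif window and not d:
        q = 0       # series structure controlled by the inverse data state conducts
    # Outside the window neither structure conducts and the static latch
    # holds its state; no node is precharged in this model.
    return q, 1 - q


def cascode_dff_cycle(d_in_window, d_after_window, q):
    """One clock cycle: clock low, the enabled interval after the active
    edge, then the remainder of the clock-high phase."""
    q, _ = cascode_dff_step(0, 1, d_in_window, q)
    q, _ = cascode_dff_step(1, 1, d_in_window, q)
    return cascode_dff_step(1, 0, d_after_window, q)


print(cascode_dff_cycle(1, 0, 0))   # data high during the enabled interval
print(cascode_dff_cycle(0, 1, 1))   # data low during the enabled interval
```

Because the latch in this sketch is static, a constant data input produces no repeated charging and discharging of an internal node from cycle to cycle, in contrast to the precharged structures of FIGS. 1-3.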