1. Field of the Invention
The present invention relates to data flip-flops that may be used in data registers of pipeline stages, and more particularly to a teacher-pupil flip-flop that has a significantly decreased register delay time, thereby increasing the amount of cycle time that is available to perform work during each cycle of a pipelined device.
2. Description of the Related Art
FIG. 1 is a block diagram illustrating the relationship between register delay and work intervals in a pipelined device 100 with successive stages separated by conventional D-type flip-flops 105, 106 and 107. A first stage 101 (STAGE N) including pipeline stage logic 102 is shown coupled to a second stage 103 (STAGE N+1) including pipeline stage logic 104. It is understood that additional stages may be included, such as prior stages before the stage 101 and later stages after stage 103. Data is passed from one stage to the next upon transition of a clock signal CLK. It is common practice within the art to transmit the output of one stage to the input of a following stage through a data register, where each data register includes one or more D-type flip-flops. Each D flip-flop handles one data bit and includes a clock input receiving the CLK signal.
As shown in FIG. 1, the first D flip-flop 105 receives a data signal X at its D input and provides a registered version of the X signal, or a data signal RX, at its Q output. The D flip-flop 105 may also include an inverted output, QB, at which it provides an inverted version of the RX signal, or signal RXB. A “B” appended to a signal or input/output (I/O) name denotes a complementary signal, which has the inverted or opposite logic state. The RX and RXB signals are provided to the pipeline stage logic 102, which develops an output signal Y. The Y signal is provided to the D input of the second D flip-flop 106 located between the stages 101 and 103, where the D flip-flop 106 generates RY and RYB signals at its Q and QB outputs, respectively. The RY and RYB signals are processed by the pipeline stage logic 104, which develops an output signal Z provided to the D input of the third D flip-flop 107. The D flip-flop 107 generates RZ and RZB signals at its Q and QB outputs, respectively, and so on.
The state of a signal on the D input of the D flip-flop just prior to the clock transition is latched on the D flip-flop's Q and QB outputs just after the transition of the CLK signal. A finite amount of time, referred to as the REGISTER DELAY, elapses while the register passes the data from one stage to the next. As shown, each of the D flip-flops 105-107 incurs a REGISTER DELAY for conveying data between stages. The CLK signal determines the total amount of time available for each cycle. Each pipeline stage logic of the pipelined device 100, including the pipeline stage logic 102 and 104, performs functions during each cycle of the CLK signal. During the REGISTER DELAY time period, however, pipeline stage logic is not able to perform any functions. The time available to perform useful work during each cycle, referred to as the WORK INTERVAL, is equal to the overall cycle time of the CLK minus REGISTER DELAY. Hence, the pipelined device 100 is limited by the REGISTER DELAY that is required between cycles of the CLK signal.
FIG. 2 is a schematic diagram illustrating a conventional master-slave D flip-flop 200 according to prior art, representing any of the D flip-flops 105-107. The master-slave D flip-flop 200 features two substantially identical stages, including a master stage 201 followed by a slave stage 203. The master stage 201 includes a complementary pass gate 205 and a pair of inverters 207 and 209. The slave stage 203 also includes a complementary pass gate 211 and a pair of inverters 213 and 215. A P-channel device P1 and an N-channel device N1 form the complementary pass gate 205, in which the source of P1 is coupled to the drain of N1 and the source of N1 is coupled to the drain of P1. The D input is formed at the connection of the source of P1 and the drain of N1. The connection of the drain of P1 and the source of N1 is coupled to the input of the inverter 207 and to the output of the inverter 209. The output of the inverter 207 is coupled to the input of the inverter 209 and forms an input DI to the slave stage 203. The complementary pass gate 211 is formed by a P-channel device P2 and an N-channel device N2 coupled to each other in the same manner as P1 and N1, where the connection of the source of P2 and the drain of N2 forms the DI input. The connection of the source of N2 and the drain of P2 is coupled to the input of the inverter 213 and to the output of the inverter 215. The Q output of the master-slave D flip-flop 200 is formed at the output of the inverter 213, which is coupled to the input of the inverter 215.
Complementary clock signals CLK and CLKB drive the successive stages of the D flip-flop 200. In particular, the CLK signal is provided to the gates of P1 and N2 and the CLKB signal is provided to the gates of P2 and N1. When CLK is low, the data on the D input is transmitted through the complementary pass gate 205 and the master inverter 207 and is set up at the DI input of the complementary pass gate 211 of the slave stage 203. The inverter 209 operates with inverter 207 as a keeper circuit to latch the data. When the CLK signal goes high, the complementary pass gate 205 closes and the complementary pass gate 211 opens, enabling the data to flow through the complementary pass gate 211 and the slave inverter 213 to the Q output. The inverter 215 operates with inverter 213 as a keeper circuit to latch the data at the Q output. The amount of time that elapses while the D input flows through the master stage 201 is called the SETUP time, and the amount of time required for the output of the master stage 201 to flow through the slave stage 203 to the output Q is called the CLOCK-TO-OUTPUT time. The sum of the SETUP and CLOCK-TO-OUTPUT times is the REGISTER DELAY for the master-slave D flip-flop 200 when used as the D flip-flops 105-107 of the pipelined device 100.
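The two-phase operation described above may be illustrated by a purely behavioral sketch, which is not a circuit-level or timing model. Each stage is modeled as a pass gate feeding an inverter whose input node is held by a keeper. The class names, the `run` helper, and the initial node values are assumptions made for illustration only and do not appear in the figures.

```python
class Latch:
    """Pass gate followed by an inverter; a keeper holds the storage node."""

    def __init__(self, node):
        self.node = node  # value held at the inverter input (assumed initial state)

    def update(self, data_in, conducting):
        if conducting:            # pass gate conducts: node follows the input
            self.node = data_in
        return 1 - self.node      # forward inverter output


class MasterSlaveDFF:
    """Behavioral stand-in for the master-slave D flip-flop 200."""

    def __init__(self):
        self.master = Latch(node=0)  # assumed power-up value
        self.slave = Latch(node=1)   # assumed power-up value, so Q starts at 0

    def step(self, d, clk):
        # Master pass gate conducts while CLK is low; its inverter drives DI.
        di = self.master.update(d, conducting=(clk == 0))
        # Slave pass gate conducts while CLK is high; its inverter drives Q.
        return self.slave.update(di, conducting=(clk == 1))


def run(samples):
    """Apply a sequence of (D, CLK) pairs and collect the Q output."""
    ff = MasterSlaveDFF()
    return [ff.step(d, clk) for d, clk in samples]


if __name__ == "__main__":
    # D=1 is set up while CLK is low, appears at Q when CLK goes high,
    # and is held at Q even though D changes while CLK stays high.
    print(run([(1, 0), (1, 1), (0, 1), (0, 0), (0, 1)]))
```

Because the data passes through two inverters (207 and 213), Q carries the same polarity as D. In this sketch a change on D while CLK is high has no effect on Q until the next low-then-high sequence of CLK.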
FIG. 3 is a timing diagram illustrating the SETUP and CLOCK-TO-OUTPUT times with respect to the CLK signal for the master-slave D flip-flop 200 of FIG. 2. The CLK signal and the states of the D input node and the Q output node are shown distributed along the vertical or Y-axis and plotted versus time along the horizontal or X-axis. As shown, successive data values DATA1 and DATA2 are asserted on the D input node. Prior to a rising edge 301 of CLK at time T1, the DATA1 value applied to the D input node must flow through the master stage 201 to the pass gate 211 of the slave stage 203. Thus, the minimum time that is required for the DATA1 value to flow through the master stage 201 is shown as the SETUP time between times T0 and T1. The DATA1 value must be valid at the D input prior to the beginning of the SETUP time at time T0. The pipeline stage logic in the previous stage must have completed its work and provided the DATA1 value to the D input prior to time T0 so that the required SETUP time of the master-slave D flip-flop 200 is met.
Similarly, following the rising clock edge 301, the DATA1 value flows through the slave stage 203 to the Q output during the CLOCK-TO-OUTPUT time from time T1 to time T2, otherwise known as the output propagation time. The DATA1 value on the Q output node is not valid until after the output propagation time has transpired, which is the amount of time required for the DATA1 value to flow through the complementary pass gate 211 and the inverter 213 of the slave stage 203. To ensure that it processes valid data, the pipeline stage logic in the following stage cannot begin work until after the output propagation time has elapsed. At the present state of the art, for CLK cycle times roughly on the order of 0.5-1.0 nanoseconds (ns), the delay through a conventional register, such as one employing the master-slave D flip-flop 200, is approximately 100 picoseconds (ps), which is evenly divided between the SETUP and CLOCK-TO-OUTPUT times.
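As a rough worked example of these figures, the WORK INTERVAL may be computed as the CLK cycle time minus the sum of the SETUP and CLOCK-TO-OUTPUT times. The sketch below assumes the approximate values quoted above (a 100 ps register delay split evenly, and cycle times of 500 ps and 1000 ps); the function and variable names are illustrative only.

```python
def work_interval_ps(cycle_ps, setup_ps, clock_to_output_ps):
    """Return the time per cycle left for stage logic, in picoseconds."""
    register_delay_ps = setup_ps + clock_to_output_ps
    return cycle_ps - register_delay_ps


# Approximate figures from the text: 50 ps SETUP + 50 ps CLOCK-TO-OUTPUT.
for cycle_ps in (500, 1000):
    work = work_interval_ps(cycle_ps, setup_ps=50, clock_to_output_ps=50)
    overhead_pct = 100 * (cycle_ps - work) / cycle_ps
    print(f"cycle {cycle_ps} ps: work interval {work} ps, "
          f"register overhead {overhead_pct:.0f}%")
```

Under these assumed values, the register delay consumes between roughly one tenth and one fifth of each cycle.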
It is clear from the discussion above with reference to FIGS. 1-3 that a reduction of the REGISTER DELAY enables logic within the pipeline stages to perform additional work. Alternatively, the overall speed of a pipelined device, including the pipelined device 100, is increased by decreasing the REGISTER DELAY between stages.
FIG. 4 is a schematic diagram of a master-slave flip-flop circuit 400, which is disclosed in U.S. Pat. No. 5,656,962, entitled “Master-Slave Flip-Flop Circuit with Bypass” to Banik. The master-slave flip-flop circuit 400 addressed the issue of REGISTER DELAY by providing a bypass stage 405 to significantly reduce the CLOCK-TO-OUTPUT time. The master-slave flip-flop circuit 400 is similar to the master-slave D flip-flop 200 and includes an identical master stage 401 followed by a slave stage 403. The slave stage 403 is similar to the slave stage 203, except it includes an additional inverter 407 followed by an additional complementary pass gate 409 inserted before the Q output node. The bypass stage 405 includes an inverter 411 having an input coupled to the intermediate junction between the complementary pass gate and inverter of the master stage 401 and an output coupled to one side of another complementary pass gate 413. The other side of the complementary pass gate 413 is coupled to the Q output node.
The bypass stage 405 essentially operates to bypass the slave stage 403 when the CLK signal goes high, thus exhibiting a CLOCK-TO-OUTPUT time equivalent to the delay through the pass gate 413 of the bypass stage 405. The slave stage 403 latches the data value applied to the D input node when the CLK signal is high and takes over driving the Q output when the CLK signal is low. The master-slave flip-flop circuit 400 has a SETUP time commensurate with that of the conventional master-slave D flip-flop 200 and has a reduced CLOCK-TO-OUTPUT time. With reference to FIG. 3, for example, the output data on the Q output node is valid relatively quickly after the rising edge 301, thereby reducing the overall REGISTER DELAY. The master-slave flip-flop circuit 400 may be useful for certain operations where CLOCK-TO-OUTPUT time is a critical factor.
Although the master-slave flip-flop circuit 400 has a reduced CLOCK-TO-OUTPUT time, this comes at the expense of valuable component real-estate and increased power consumption. Note, for example, that the master-slave flip-flop circuit 400 drives its output through the complementary pass gates 409 and 413. FIG. 5 is a schematic diagram of an exemplary output circuit 500 that may be employed by the master-slave flip-flop circuit 400. An INPUT signal is provided to the gates of complementary devices P and N coupled in series between a voltage source VDD and ground. The junction between the P and N devices is coupled to one side of a complementary pass gate 501, having its other side driving the OUTPUT signal. One of ordinary skill in the art will appreciate that the drive strength of a device is linearly proportional to device width and inversely proportional to device length. Driving an output through a pass gate effectively doubles the length of the output device. Hence, to drive a load equivalent to that of a conventional D flip-flop, such as the master-slave D flip-flop 200, the inverters 407 and 411 of the master-slave flip-flop circuit 400 must be doubled in width, resulting in a four-fold increase in size of each output inverter. Also, the master-slave flip-flop circuit 400 has two output inverters, substantially increasing overall size of each flip-flop of each register between each stage of the pipelined device 100. Practical implementations of the master-slave flip-flop circuit 400 are costly in terms of size and power consumption.
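The width-doubling argument above may be written out as a first-order relation. The symbols S (drive strength), W (device width) and L (device length) are introduced here for illustration, and the relation assumes, as a simplification, that the pass gate device has the same width and length as the output device it follows.

```latex
% Drive strength of a single output device (first-order approximation):
S \propto \frac{W}{L}

% Driving through a series pass gate of equal dimensions (assumed)
% doubles the effective length and halves the drive strength:
S_{\text{pass}} \propto \frac{W}{2L} = \frac{1}{2}\,\frac{W}{L}

% Restoring the original drive strength requires a doubled width W' = 2W:
S' \propto \frac{W'}{2L} = \frac{2W}{2L} = \frac{W}{L}
```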
It is desired to provide a register device with reduced register delay without significant increase in expense in terms of real-estate and power.