Many complex digital logic circuits, including processors, employ a technique called “pipelining” to perform more operations per unit of time (i.e., to increase throughput). Pipelining involves dividing a process into sequential steps, and performing each step in a separate, independent stage. For example, if a process can be performed via n sequential steps, a pipeline to perform the process may include n separate stages, each performing a different step of the process. Since all n stages can operate concurrently, each on a different data item, the pipelined process can potentially operate at up to n times the rate of the non-pipelined process.
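The “n times” figure above is an upper bound that is approached only once the pipeline is full. The following is a minimal sketch of that arithmetic, assuming every stage takes the same time and no stalls occur; the function names and the `step_time` parameter are illustrative and do not come from the described system.

```python
def unpipelined_time(items: int, stages: int, step_time: float = 1.0) -> float:
    """Time when each item finishes all steps before the next item starts."""
    return items * stages * step_time


def pipelined_time(items: int, stages: int, step_time: float = 1.0) -> float:
    """Time when the stages overlap (assumes equal stage times, no stalls).

    The first item needs `stages` steps to fill the pipeline; each later
    item completes one step_time after the item ahead of it.
    """
    if items == 0:
        return 0.0
    return (stages + items - 1) * step_time


def speedup(items: int, stages: int, step_time: float = 1.0) -> float:
    """Ratio of unpipelined to pipelined time for the same workload."""
    if items < 1:
        raise ValueError("speedup is only defined for at least one item")
    return unpipelined_time(items, stages, step_time) / pipelined_time(
        items, stages, step_time
    )


if __name__ == "__main__":
    for count in (1, 10, 1000):
        print(count, round(speedup(count, stages=5), 3))
```

Under these assumptions a single item sees no speedup at all, while a long stream of items approaches, but never quite reaches, a factor of n.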
Hardware pipelining involves partitioning a sequential process into stages, and adding storage elements (i.e., groups of latches or flip-flops, commonly called registers) between stages to hold intermediate results. In a typical hardware pipeline, combinational logic within each stage performs logic functions upon input signals received from a previous stage, and the storage elements positioned between the combinational logic of each stage are responsive to one or more synchronizing clock signals. The one or more clock signals control the movement of data within the pipeline.
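The role of the storage elements can be illustrated with a small software model. This is only a behavioral sketch, not a description of any particular circuit: each stage is modeled as a pure function standing in for combinational logic, each register as a list slot, and `None` is an assumed marker for an empty slot (so a stage that legitimately produces `None` is not supported). All names are hypothetical.

```python
from typing import Callable, List, Optional

Stage = Callable[[int], int]


def run_pipeline(stages: List[Stage], inputs: List[int]) -> List[int]:
    """Clock a register-separated pipeline until all inputs have drained.

    regs[i] models the register that holds the output of stage i.
    On each simulated clock edge, every register is updated at once
    from the values that were held before the edge.
    """
    n = len(stages)
    regs: List[Optional[int]] = [None] * n
    pending = list(inputs)
    outputs: List[int] = []

    for _ in range(len(inputs) + n):
        # Collect whatever the final register held before this edge.
        if regs[-1] is not None:
            outputs.append(regs[-1])

        # Compute all next-state values from the current register contents.
        nxt: List[Optional[int]] = [None] * n
        if pending:
            nxt[0] = stages[0](pending.pop(0))
        for i in range(1, n):
            prev = regs[i - 1]
            nxt[i] = stages[i](prev) if prev is not None else None

        regs = nxt  # the clock edge: all registers change together

    return outputs


if __name__ == "__main__":
    demo = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]
    print(run_pipeline(demo, [1, 2, 3]))
```

Computing the next values in a separate list before assigning them mirrors the way clocked registers keep one stage's new result from racing into the following stage during the same cycle.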
Within an integrated circuit, a single global clock signal often provides a timing reference for the movement of data. FIGS. 1 and 2 will now be used to describe a timing problem inherent in known systems that distribute a global clock signal across a surface of an integrated circuit, and use local clock buffers located at different points on the surface to generate local clock signals derived from the global clock signal.
FIG. 1 is a diagram of an integrated circuit 100 including a global clock distribution system 102, a first local clock buffer (LCB) 104, and a second local clock buffer (LCB) 106. The global clock distribution system 102 is used to distribute a global clock signal across a surface of the integrated circuit 100. As indicated in FIG. 1, the local clock buffer (LCB) 104 and the local clock buffer (LCB) 106 are located at different points on the surface, and use the global clock signal to generate a first local clock signal “CLKA” and a second local clock signal “CLKB.”
In general, the local clock signals CLKA and CLKB are used to synchronize the operations of various logic structures (e.g., gates, latches, registers, and the like) of logic circuitry of the integrated circuit 100. The local clock signals CLKA and CLKB may, for example, be the two different “phases” of a two-phase clocking scheme. As is common, the two-phase clocking scheme may be used to control the operations of master-slave latch pairs positioned between the combinational logic of each pipeline stage. Such master-slave latch pairs form flip-flops. One of the local clock signals CLKA and CLKB may be provided to control inputs of the master latches of the flip-flops, and the other one of the local clock signals CLKA and CLKB may be provided to control inputs of the slave latches of the flip-flops.
As indicated in FIG. 1, the local clock buffer (LCB) 104 uses the global clock signal to generate a local clock signal “CLKA1,” one version of the local clock signal CLKA, and a local clock signal “CLKB1,” one version of the local clock signal CLKB. The local clock buffer (LCB) 106 uses the global clock signal to generate a local clock signal “CLKA2,” another version of the local clock signal CLKA, and a local clock signal “CLKB2,” another version of the local clock signal CLKB.
FIG. 1 reflects the common situation where the internal structures of the local clock buffers (LCBs) 104 and 106 differ, and timing delays within the local clock buffers (LCBs) 104 and 106 also differ. As a result, common timing points for the local clock buffers (LCBs) 104 and 106 exist within the global clock distribution system 102 as indicated in FIG. 1.
FIG. 2 is a timing diagram illustrating timing relationships between the clock signals within the integrated circuit 100 of FIG. 1. As indicated in FIG. 2, a timing difference between the local clock signal CLKA1 generated by the local clock buffer (LCB) 104 and the local clock signal CLKA2 generated by the local clock buffer (LCB) 106 represents a “skew” of the local clock signal CLKA. A similar timing difference between the local clock signal CLKB1 generated by the local clock buffer (LCB) 104 and the local clock signal CLKB2 generated by the local clock buffer (LCB) 106 represents a “skew” of the local clock signal CLKB.
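The skew described above is simply the spread between the arrival times of corresponding edges of the different versions of a local clock signal. A minimal sketch follows, assuming edge arrival times have already been measured or simulated; the numeric values are invented for illustration and do not come from FIG. 2.

```python
from typing import Mapping


def clock_skew(edge_times: Mapping[str, float]) -> float:
    """Return the spread between the earliest and latest arrival of the
    same clock edge across all versions of one local clock signal."""
    if not edge_times:
        raise ValueError("at least one edge time is required")
    times = list(edge_times.values())
    return max(times) - min(times)


if __name__ == "__main__":
    # Hypothetical rising-edge arrival times in nanoseconds.
    clka = {"CLKA1": 1.00, "CLKA2": 1.25}
    clkb = {"CLKB1": 3.50, "CLKB2": 3.75}
    print("CLKA skew (ns):", clock_skew(clka))
    print("CLKB skew (ns):", clock_skew(clkb))
```

With only two local clock buffers the result equals the absolute difference between the two edge times; the max-minus-min form extends unchanged to any number of buffers.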
As the local clock signals CLKA and CLKB are used to synchronize the operations of logic structures, the skews of the local clock signals CLKA and CLKB may result in timing problems that cause the logic circuitry of the integrated circuit 100 to produce incorrect values. For example, as described above, the local clock signal CLKA may be provided to control inputs of master latches of flip-flops separating the combinational logic of pipeline stages, and the local clock signal CLKB may be provided to control inputs of slave latches of the flip-flops. The skews of the local clock signals CLKA and CLKB may reduce an amount of time a signal derived from an output of a first flip-flop positioned at a beginning of a pipeline stage has to propagate through the combinational logic of the stage and reach a second flip-flop positioned at an end of the pipeline stage. If a cycle time (i.e., period) of the global clock signal is not made long enough, the signal may not reach the second flip-flop before the master latch of the second flip-flop “captures” the value of the signal at its input, and the second flip-flop may capture an incorrect value of the signal. As a result, the logic circuitry of the integrated circuit 100 may produce one or more incorrect values.
In general, use of the different local clock buffers (LCBs) 104 and 106 to produce the local clock signals CLKA and CLKB results in relatively large skews of the local clock signals CLKA and CLKB. The larger skews raise the lower bound on the period of the global clock signal, and thereby lower the maximum achievable performance of the logic circuitry of the integrated circuit 100.
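The dependence of the minimum clock period on skew can be made concrete with a commonly used simplified timing model. It is offered here only as an illustrative assumption, not as the analysis used for the integrated circuit 100. The model takes the period to be at least the sum of the launching flip-flop's clock-to-output delay, the worst-case combinational logic delay, the capturing flip-flop's setup time, and the worst-case skew. All parameter names and numbers are hypothetical.

```python
def min_clock_period_ps(
    clk_to_q_ps: int, logic_delay_ps: int, setup_ps: int, skew_ps: int
) -> int:
    """Lower bound on the clock period under a simplified model that
    charges the full worst-case skew against the available cycle time."""
    return clk_to_q_ps + logic_delay_ps + setup_ps + skew_ps


def max_frequency_ghz(period_ps: int) -> float:
    """Convert a period in picoseconds to a frequency in gigahertz."""
    return 1000.0 / period_ps


if __name__ == "__main__":
    # Hypothetical delays in picoseconds.
    for skew in (0, 100):
        period = min_clock_period_ps(
            clk_to_q_ps=50, logic_delay_ps=700, setup_ps=50, skew_ps=skew
        )
        print(f"skew={skew} ps -> period>={period} ps, "
              f"f_max={max_frequency_ghz(period):.3f} GHz")
```

In this model every picosecond of skew is added directly to the minimum period, which is why reducing skew raises the attainable clock frequency.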
It would thus be advantageous to have a local clock signal generation system wherein timing differences between local clock signals (i.e., local clock signal “skews”) are reduced.