Master-slave latches are commonly employed in integrated circuit design. In a master-slave latch, a master latch latches data in response to a first clock signal, and a slave latch coupled to the master latch latches the data held by the master latch in response to a second clock signal. Typically, the first and second clock signals are approximately complementary (e.g., 180 degrees out of phase).
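By way of illustration only, the latching behavior described above may be sketched as a simple behavioral model. The names `Latch`, `MasterSlaveLatch`, and `run` are hypothetical and chosen for this sketch; the model assumes ideal level-sensitive latches and ignores propagation delay, setup time, and hold time.

```python
class Latch:
    """Ideal level-sensitive latch: transparent while clk is high, else holds."""

    def __init__(self):
        self.q = 0

    def update(self, d, clk):
        if clk:
            self.q = d  # transparent: output follows input
        return self.q   # opaque: output keeps the stored value


class MasterSlaveLatch:
    """Master latch clocked by a first clock, slave latch by a second clock."""

    def __init__(self):
        self.master = Latch()
        self.slave = Latch()

    def step(self, d, clk_master, clk_slave):
        m = self.master.update(d, clk_master)
        return self.slave.update(m, clk_slave)


def run(data, clocks):
    """Apply one (data, (clk_master, clk_slave)) sample per time step."""
    ms = MasterSlaveLatch()
    return [ms.step(d, cm, cs) for d, (cm, cs) in zip(data, clocks)]
```

With complementary clocks, for example `run([1, 1, 0, 0, 1, 1], [(1, 0), (0, 1)] * 3)`, the model returns `[0, 1, 1, 0, 0, 1]`: each value captured by the master latch appears at the slave output one clock phase later.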
While a pulsed mode of operation reduces power consumption, such a mode of operation is susceptible to a number of problems. If the pulse employed to latch data into the slave latch is too wide, the master-slave latch may be susceptible to early mode problems such as race through (e.g., as both master and slave latches are active simultaneously for the duration of the slave latching pulse). Likewise, if the pulse employed to latch data into the slave latch is too narrow, data may not be reliably latched by the slave latch. Accordingly, designing and implementing a pulsed mode of operation for a master-slave latch is difficult, and often requires multiple design and test iterations.
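The two failure modes described above may be illustrated with a deliberately simplified check. The function `classify_pulse` and its parameters are assumptions made for this sketch: `min_capture_ps` stands for the narrowest pulse that still lets the slave latch capture data reliably, and `shortest_path_ps` stands for the shortest time after which new data could reach the master latch input while both latches are transparent. Real designs would also have to account for process variation, clock skew, and hold times.

```python
def classify_pulse(pulse_width_ps, min_capture_ps, shortest_path_ps):
    """Classify a slave latching pulse width (all values in picoseconds).

    Simplified model (an assumption of this sketch, not a sign-off rule):
    - narrower than min_capture_ps: the slave latch may not latch reliably
    - wider than shortest_path_ps: new data may race through both latches
    """
    if pulse_width_ps < min_capture_ps:
        return "too narrow"
    if pulse_width_ps > shortest_path_ps:
        return "race-through risk"
    return "ok"
```

Under these assumed limits, a usable pulse width exists only when `min_capture_ps` does not exceed `shortest_path_ps`, which suggests why arriving at a workable pulse may take several design and test iterations.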
Many complex digital logic circuits, including processors, employ a technique called “pipelining” to perform more operations per unit of time (i.e., to increase throughput). Pipelining involves dividing a process into sequential steps, and performing each step in its own independent stage. For example, if a process can be performed via N sequential steps, a pipeline to perform the process may include N separate stages, each performing a different step of the process. Since all N stages can operate concurrently, each on a different item, the pipelined process can potentially operate at N times the rate of the non-pipelined process.
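The potential N-fold gain may be illustrated with a short calculation. The sketch below assumes an idealized pipeline in which every stage takes exactly one cycle, there are no stalls, and the storage elements between stages add no overhead; the function names are hypothetical.

```python
def unpipelined_cycles(n_stages, n_items):
    """Each item passes through all N steps before the next item starts."""
    return n_stages * n_items


def pipelined_cycles(n_stages, n_items):
    """The first item takes N cycles; each later item completes one cycle after."""
    if n_items == 0:
        return 0
    return n_stages + n_items - 1


def speedup(n_stages, n_items):
    """Ratio of unpipelined to pipelined cycle counts (n_items must be > 0)."""
    return unpipelined_cycles(n_stages, n_items) / pipelined_cycles(n_stages, n_items)
```

For a 4-stage pipeline processing 1000 items, the pipelined count is 1003 cycles against 4000 unpipelined, so the speedup is slightly below 4 and approaches N only as the number of items grows.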
Hardware pipelining involves partitioning a sequential process into stages, and adding storage elements (i.e., groups of latches or flip-flops, commonly called registers) between stages to hold intermediate results. In a typical hardware pipeline, combinational logic within each stage performs logic functions upon input signals received from a previous stage, and the storage elements positioned between the combinational logic of each stage are responsive to one or more synchronizing clock signals. The one or more clock signals control the movement of data within the pipeline.
Within an integrated circuit, a single global clock signal often provides a timing reference for the movement of data. Various circuits have been used to distribute a global clock signal across a surface of an integrated circuit, and local clock buffers located at different points on the surface are used to generate local clock signals derived from the global clock signal.
A global clock distribution system is used to distribute a global clock signal across a surface of the integrated circuit. In one prior art example, a first local clock buffer and a second local clock buffer are located at different points on the surface of the integrated circuit, receive the global clock signal, and generate exemplary first and second local clock signals, “CLK_A” and “CLK_B”, respectively.
In general, the local clock signals CLK_A and CLK_B may be used to synchronize the operations of various logic structures (e.g., gates, latches, registers, and the like) of logic circuitry of the integrated circuit. The local clock signals CLK_A and CLK_B may be two different “phases” of a two-phase clocking scheme. As is common, the two-phase clocking scheme may be used to control the operations of master-slave latch pairs positioned between the combinational logic of pipeline stages. Such master-slave latch pairs form flip-flops. One of the local clock signals CLK_A and CLK_B may be provided to control inputs of the master latches of the flip-flops, and the other one of the local clock signals CLK_A and CLK_B may be provided to control inputs of the slave latches of the flip-flops. The local clock buffers may also use the global clock signal to generate additional versions of CLK_A and CLK_B. The internal structures of the local clock buffers may differ, leading to timing delays between the local clock signals. Generating additional versions of the local clock signals may lead to skews, which add to the timing problems.
As the local clock signals CLK_A and CLK_B are used to synchronize the operations of logic structures, the skews of the local clock signals CLK_A and CLK_B may result in timing problems that cause the logic circuitry of the integrated circuit to produce incorrect values. For example, the local clock signal CLK_A may be provided to control inputs of master latches of flip-flops separating the combinational logic of pipeline stages, and the local clock signal CLK_B may be provided to control inputs of slave latches of the flip-flops. The skews of the local clock signals CLK_A and CLK_B may reduce the amount of time that a signal derived from an output of a first flip-flop positioned at a beginning of a pipeline stage has to propagate through the combinational logic of the stage and reach a second flip-flop positioned at an end of the pipeline stage. If a cycle time (i.e., period) of the global clock signal is not made long enough, the signal may not reach the second flip-flop before the master latch of the second flip-flop “captures” the value of the signal at its input, and the second flip-flop may capture an incorrect value of the signal. As a result, the logic circuitry of the integrated circuit may produce one or more incorrect values.
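The effect of skew on the required cycle time may be illustrated with the conventional setup-time budget for a path between two flip-flops. The sketch below is an assumption-laden illustration rather than a description of any particular circuit: the parameter names (`clk_to_q_ps`, `logic_delay_ps`, `setup_ps`, `skew_ps`) are hypothetical, all values are in picoseconds, and the skew is treated as a worst-case reduction of the time available to the signal.

```python
def min_cycle_time_ps(clk_to_q_ps, logic_delay_ps, setup_ps, skew_ps):
    """Smallest clock period that still meets setup at the capturing flip-flop.

    Budget (simplified): clock-to-output delay of the launching flip-flop,
    plus combinational logic delay, plus setup time of the capturing
    flip-flop, plus the worst-case skew that shortens the available time.
    """
    return clk_to_q_ps + logic_delay_ps + setup_ps + skew_ps


def meets_setup(period_ps, clk_to_q_ps, logic_delay_ps, setup_ps, skew_ps):
    """True if the given period leaves enough time for the signal to arrive."""
    return period_ps >= min_cycle_time_ps(
        clk_to_q_ps, logic_delay_ps, setup_ps, skew_ps
    )
```

With assumed values of 50 ps clock-to-output delay, 700 ps of logic delay, and 40 ps of setup time, a period of 800 ps suffices when there is no skew, but 60 ps of skew raises the minimum period to 850 ps, so the same 800 ps period would capture an incorrect value.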
Therefore, there is a need for programmable circuitry that reduces or compensates for the skew in local clock signals and that generates a programmable pulse clock whose pulse width may be adjusted to optimize local clock timing.