1. Field of the Invention
The present invention is related to integrated circuit (IC) clock systems and more particularly to maintaining duty cycle timing balance in ICs.
2. Background Description
Large, high performance very large scale integration (VLSI) chips, such as microprocessors, are synchronized to an internal clock. A typical internal clock is distributed throughout the chip, triggering chip registers to synchronously capture incoming data at the register latches and launch data from register latches. Ideally, each clock edge arrives simultaneously at each register every cycle, and data arrives at the register latches sufficiently in advance of the respective clock edge that all registers simultaneously latch the correct data. Unfortunately, various chip differences can cause timing uncertainty, i.e., a variation in edge arrival to different registers.
Such timing uncertainties can arise from data propagation variations and/or from clock arrival variations. Data propagation variations, for example, may result in a capturing latch that randomly enters metastability or latches invalid data because the data may or may not arrive at its input with sufficient set up time. Clock edge arrival variations include, for example, clock frequency fluctuations (jitter) and/or register to register clock edge arrival variations (skew). Both data path and clock edge arrival variations can arise from a number of sources including, for example, ambient chip conditions (e.g., local temperature induced circuit variations or circuit heat sensitivities), power supply noise and chip process variations. In particular, power supply noise can cause clock propagation delay variations through clock distribution buffers. Such clock propagation delay variations can cause skew variations from clock edge arrival time uncertainty at the registers. Typically, chip process variations include device length variations with different device lengths at different points on the same chip. So, a buffer at one end of a chip may be faster than another identical (by design) buffer at the opposite end of the same chip. Especially for clock distribution buffers, these process variations are another source of timing uncertainty.
Furthermore, as technology features continue to shrink, power bus or Vdd noise is becoming the dominant contributor to total timing uncertainty. High speed circuit switching may cause large, narrow current spikes with very rapid rise and fall times, i.e., large dI/dt. In particular, each of those current spikes causes a substantial voltage spike in the on-chip supply voltage, even with minimal supply line inductance (L). Because V=L dI/dt, these supply line spikes are also referred to as L dI/dt noise. Since current switching can vary from cycle to cycle, the resulting noise varies from cycle to cycle. When Vdd noise drops the on-chip supply voltage in response to a large switching event, it can slow the entire chip, including both the clock path (clock buffers, local clock blocks, clock gating logic, etc.) and the data path logic (combinational logic gates, inverters, etc.). Vdd noise can also be very localized in its impact, depending on many factors such as the robustness of the power distribution grid. When the noise dissipates and the on-chip supply later recovers, or even overshoots as the supply current falls, the circuits (buffers, gates, etc.) in these same paths speed up, returning to their nominal performance (with the normal stage delay) or running even faster. The number of stages that can complete changes as the data path slows down or speeds up relative to the clock path. Currently, in particular, such switching noise is the dominant component of total timing uncertainty, more even than skew or jitter (which are themselves affected by switching noise) or chip process variations. Thus, it would be useful to be able to determine switching noise and how it affects circuit performance.
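By way of illustration only, the relation V=L dI/dt can be sketched numerically as follows. The function name and all component values (10 pH of effective supply inductance, a 5 A current step, a 100 ps ramp) are hypothetical assumptions chosen for round numbers and are not taken from any particular chip.

```python
def supply_droop_volts(inductance_h, delta_current_a, delta_time_s):
    """Approximate the supply voltage spike V = L * dI/dt.

    Assumes the current changes linearly by delta_current_a over
    delta_time_s, so dI/dt is simply their ratio.
    """
    if delta_time_s <= 0:
        raise ValueError("delta_time_s must be positive")
    return inductance_h * delta_current_a / delta_time_s


# Hypothetical values, for illustration only.
L_SUPPLY_H = 10e-12   # 10 pH effective supply line inductance (assumed)
DELTA_I_A = 5.0       # 5 A switching current step (assumed)
DELTA_T_S = 100e-12   # 100 ps current ramp time (assumed)

droop = supply_droop_volts(L_SUPPLY_H, DELTA_I_A, DELTA_T_S)
print(f"Estimated L dI/dt spike: {droop:.3f} V")
```

With these assumed values the estimate is about 0.5 V, which suggests why even a small supply line inductance may produce a substantial spike when the current edge is fast.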
Clock skew and jitter, power supply noise, and chip ambient and process variations may be considered the primary sources of timing uncertainty. In particular, the overall or total timing uncertainty is a complex combination of both clock and data path uncertainty that reduces the number of combinational logic stages (typically called the fan-out of 4 (FO4) number) that can be certifiably completed in any clock cycle and so reduces chip performance. The FO4 number is the number of fan-out of 4 inverter delays that can fit in one cycle. This design parameter serves to determine chip pipeline depth, e.g., in a microprocessor. By design, register latch boundaries are determined by the maximum number of logic stages (FO4) that may be guaranteed to be completed in every clock cycle. Typically, designers apply some guard band number to the FO4 number (i.e., reduce the FO4 number by some delta) to account for timing uncertainties. Previously, this delta was a guess at how much the number of combinational logic stages that can be completed changes from cycle to cycle. If the guess was too high, chip problems would result. If not, there was no way to determine if that guess was too low and by how much.
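The effect of a guard band on the FO4 number can be sketched, for illustration only, as below. The function names, the 250 ps clock period, the 12.5 ps FO4 inverter delay and the 30 ps timing uncertainty are all hypothetical assumptions; modeling the guard band as a fixed time subtracted from the clock period is one simple choice among many.

```python
import math


def fo4_per_cycle(clock_period_ps, fo4_delay_ps):
    """Number of fan-out of 4 inverter delays that fit in one cycle."""
    if fo4_delay_ps <= 0:
        raise ValueError("fo4_delay_ps must be positive")
    return clock_period_ps / fo4_delay_ps


def guardbanded_fo4(clock_period_ps, fo4_delay_ps, uncertainty_ps):
    """Whole FO4 stages left after reserving time for timing uncertainty.

    Assumes the guard band is a fixed time (uncertainty_ps) removed
    from the usable portion of every cycle.
    """
    usable_ps = clock_period_ps - uncertainty_ps
    if usable_ps <= 0:
        raise ValueError("uncertainty consumes the entire clock period")
    return math.floor(usable_ps / fo4_delay_ps)


# Hypothetical values, for illustration only.
PERIOD_PS = 250.0       # 4 GHz clock (assumed)
FO4_DELAY_PS = 12.5     # one FO4 inverter delay (assumed)
UNCERTAINTY_PS = 30.0   # total timing uncertainty (assumed)

ideal = fo4_per_cycle(PERIOD_PS, FO4_DELAY_PS)
usable = guardbanded_fo4(PERIOD_PS, FO4_DELAY_PS, UNCERTAINTY_PS)
print(f"Ideal FO4 number: {ideal:.1f}, guard-banded: {usable}")
```

Under these assumptions the ideal FO4 number of 20 falls to 17 usable stages, i.e., a delta of 3 stages per cycle.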
Furthermore, state of the art microprocessors, for example, use what is known as clock doubling for additional performance improvement. Typical clock doubling triggers circuits off each clock transition, with the on-chip clock period being the time between such transitions. Clock duty cycle is the percentage of the clock cycle that the clock signal is high. A duty cycle that is 50% is balanced, with the time between transitions being equal. Consequently, these state of the art microprocessors, especially, require a well-controlled, balanced duty cycle. Unfortunately, while typical state of the art phase locked loop (PLL) circuits rely on analog duty cycle monitoring/correction of the clock signal output, these typical PLLs do not correct duty cycle distortion that the clock distribution tree/buffers introduce. This requires designers to account for expected duty cycle imbalance, e.g., by “guardbanding” or foreshortening the logic paths to accommodate the expected half cycle foreshortening. So, while the clock frequency may have doubled, performance is frequently lost by guardbanding for an unbalanced duty cycle.
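As a rough illustration of half cycle foreshortening, the sketch below computes the duty cycle and the time lost in the shorter half cycle relative to a balanced 50% clock. The function names and the example numbers (a 500 ps cycle with a 225 ps high phase) are hypothetical assumptions.

```python
def duty_cycle_percent(high_time_ps, period_ps):
    """Percentage of the clock cycle that the clock signal is high."""
    if period_ps <= 0 or not 0 <= high_time_ps <= period_ps:
        raise ValueError("require 0 <= high_time_ps <= period_ps, period_ps > 0")
    return 100.0 * high_time_ps / period_ps


def half_cycle_loss_ps(high_time_ps, period_ps):
    """Time by which the shorter half cycle falls short of period / 2.

    When circuits trigger off each clock transition, logic must fit in
    the shorter of the high and low phases, so this shortfall is what a
    designer would have to guard band for.
    """
    low_time_ps = period_ps - high_time_ps
    shortest_ps = min(high_time_ps, low_time_ps)
    return period_ps / 2 - shortest_ps


# Hypothetical values, for illustration only.
PERIOD_PS = 500.0   # full clock cycle (assumed)
HIGH_PS = 225.0     # time the clock is high per cycle (assumed)

print(f"Duty cycle: {duty_cycle_percent(HIGH_PS, PERIOD_PS):.1f}%")
print(f"Half cycle foreshortening: {half_cycle_loss_ps(HIGH_PS, PERIOD_PS):.1f} ps")
```

With these assumed numbers the duty cycle is 45% and the shorter half cycle is 25 ps shorter than the balanced 250 ps, which is time that guardbanding would remove from every half cycle logic path.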
Thus, there is a need for a way to measure clock duty cycle and adjust on-chip clocks to maintain a balanced duty cycle.