1. Field of the Invention
The present invention relates generally to the field of digital circuit design; and, more particularly, to a method for designing a synchronous digital circuit which exploits clock skew so as to reduce EMI and IR-drop in the circuit.
2. Description of the Prior Art
In the design of synchronous digital circuits, clock signals are used to synchronize computations performed in the circuits. The function of the clock signals is to reduce the uncertainty in delay between sending and receiving storage elements in the circuit. Storage elements, such as latches and flip-flops, sample output signals of combinational logic, internally preserve the values as the state of the circuit, and make the state available for new computations after a certain delay.
A storage element makes its internal state available by driving its output signal to a corresponding voltage level. When a new voltage level is higher than a previous voltage level, current is briefly drawn from a voltage supply to charge the signal capacitance. Conversely, when the new level is lower than the previous level, current is briefly dumped into the ground network.
Most currently utilized schemes for distributing clock signals to storage elements in digital circuits concentrate on ensuring a high degree of synchronicity of all clock signals. To achieve this, clocks are typically distributed in a tree-like structure in such a manner that delays in different branches of the structure can be balanced to a high degree. The major benefit of such schemes is that uniformity brings predictability and simplifies the overall problems associated with designing the circuit.
Alternatives to maximally-balanced clock distribution networks are also known in the art, but are less frequently utilized. For example, in unidirectional pipelines, it is common practice to distribute the clock signal in a direction opposite to the data flow. Complex ASIC (Application Specific Integrated Circuit) designs, however, are rarely suitable for this method, inasmuch as their data flow is complex and irregular. Performance tuning through intentional clock skew is also used, either through explicit designer decisions to re-distribute computation time between two pipeline stages, or through the use of special CAD tools, such as the tool "ClockWise" offered by Ultima Interconnect Technologies.
An effect that is encountered in highly balanced clock distribution networks is that the outputs of all storage elements in the network are caused to toggle virtually simultaneously. As a result, the capacitive loads driven by the storage element outputs are also charged virtually simultaneously, thus briefly drawing a large current (a "current spike") from the supply. Such current spikes are undesirable for several reasons. Significant problems caused by current spikes include the following:
Metal migration in supply wires is a major reliability problem. The rate of migration depends strongly on the maximum current density which occurs in the wire. Large current spikes thus require wider supply wires with the concomitant cost in area.
Large current spikes feature large values of dI/dt. Together with the parasitic inductance present in the IC package, the current spikes thus cause voltage fluctuations on the supply lines. These fluctuations can cause both a malfunction of the digital circuits and reduced performance level of co-located RF circuitry. Means to address these problems include advanced packaging and on-chip decoupling capacitance, both of which increase costs.
The large current spikes themselves can couple inductively into other parts of the design and cause a malfunction or a reduction in performance.
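The effect of simultaneous toggling on the supply current can be illustrated with a minimal sketch. It assumes, purely for illustration, that each storage element draws one triangular charging pulse of unit height on a unit time grid; the function names, the pulse shape, and the numeric values are hypothetical and are not taken from the invention. The largest step in the summed current serves as a rough stand-in for dI/dt, which, multiplied by the parasitic inductance, gives the supply voltage fluctuation mentioned above.

```python
def supply_current(insertion_delays, pulse_width=4, peak=1.0, horizon=40):
    """Sum one triangular charging pulse per storage element on a unit time grid.

    The triangular pulse shape is an assumption made for illustration only.
    """
    total = [0.0] * horizon
    half = pulse_width / 2
    for d in insertion_delays:
        for k in range(pulse_width + 1):
            t = d + k
            if 0 <= t < horizon:
                total[t] += peak * (1 - abs(k - half) / half)
    return total


def max_slope(current):
    """Largest change between adjacent samples, a crude proxy for dI/dt."""
    return max(abs(b - a) for a, b in zip(current, current[1:]))


# Eight storage elements clocked at the same instant (balanced tree).
balanced = supply_current([0] * 8)
# The same eight elements with staggered clock arrival times (skewed tree).
skewed = supply_current([0, 2, 4, 6, 8, 10, 12, 14])

print(max(balanced), max_slope(balanced))  # 8.0 4.0
print(max(skewed), max_slope(skewed))      # 1.0 0.5
```

In this toy model the total charge delivered is the same in both cases; only its distribution over time changes, which is why both the peak current and the steepest current step fall when the clock arrivals are spread out.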
In addition to the above operational problems, highly balanced clock distribution schemes frequently cause problems during design of the circuits. For example, the arrival times of the clock signals at different registers depend on the detailed layout of the circuits and are, therefore, difficult to predict. Practical design methods, therefore, usually include a margin of error to account for this imprecision. The introduction of this margin of error, however, reduces the maximum performance of the circuits; and, thus, high-performance circuits tend to be designed with small uncertainty margins for clock signal arrival, increasing the demands on the layout extraction process and complicating timing convergence.
The present invention recognizes that by not balancing the clock tree, or by deliberately making the clock tree unbalanced or skewed, problems such as those described above can be significantly reduced.
In particular, the present invention provides a method for designing a digital circuit that includes a plurality of storage elements connected to combinational logic, each of the plurality of storage elements being driven by a clock signal distributed to the plurality of storage elements from a clock device. A method according to the invention comprises the step of substantially maximizing clock skew in the circuit subject to one or more constraints on the design of the circuit.
Basically, the present invention uses the set of permissible ranges for clock skew, preferably calculated by means of Static Timing Analysis, to calculate a robust clock skew schedule that gives the circuit good EMI and IR-drop properties. In effect, a certain amount of robustness is decided upon, and then the rest of the permissible range is used to reduce EMI and IR-drop.
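A permissible range for the clock skew across a single register-to-register path can be sketched as follows. The sketch uses the conventional setup and hold inequalities of static timing analysis, which are assumed here rather than quoted from the invention; the function name, parameter names, and numeric values are hypothetical. Clock-to-output delay is assumed to be folded into the path delays `d_min` and `d_max`, and `margin` models the amount of robustness that is reserved before the remainder of the range is made available for scheduling.

```python
def permissible_skew_range(d_min, d_max, period,
                           t_setup=0.0, t_hold=0.0, margin=0.0):
    """Return (lo, hi) bounds on s = t_launch - t_capture for one path.

    t_launch and t_capture are the clock arrival times at the sending and
    receiving storage elements. All quantities share one time unit.
    """
    # Hold: t_launch + d_min >= t_capture + t_hold
    lo = t_hold - d_min + margin
    # Setup: t_launch + d_max + t_setup <= t_capture + period
    hi = period - d_max - t_setup - margin
    if lo > hi:
        raise ValueError("no permissible skew: margin too large for this path")
    return lo, hi


lo, hi = permissible_skew_range(d_min=1.0, d_max=6.0, period=10.0,
                                t_setup=0.5, t_hold=0.25, margin=0.5)
print(lo, hi)  # -0.25 3.0
```

Increasing `margin` narrows the range from both ends, which reflects the trade-off described above: robustness is chosen first, and whatever range remains is available for spreading the clock arrival times.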
According to an embodiment of the present invention, the clock signal is subjected to certain insertion delays as it is distributed to each of the plurality of storage elements, and the step of substantially maximizing clock skew comprises selecting values for the insertion delay to each of the plurality of storage elements such that the variance of the insertion delays among the storage elements is maximized subject to the one or more constraints.
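One way to write this objective down is shown below. The notation is an assumption introduced for illustration: t_i denotes the insertion delay to storage element i, n the number of storage elements, and l_ij and u_ij the lower and upper bounds of the permissible skew range between elements i and j. The population form of the variance (division by n) is likewise an assumed convention.

```latex
\[
\max_{t}\;\operatorname{Var}(t)
  = \frac{1}{n}\sum_{i=1}^{n}\left(t_i - \bar{t}\right)^{2}
  = \frac{1}{n}\, t^{\mathsf T}\!\left(I - \tfrac{1}{n}\mathbf{1}\mathbf{1}^{\mathsf T}\right) t ,
\qquad
\bar{t} = \frac{1}{n}\sum_{i=1}^{n} t_i ,
\]
\[
\text{subject to}\quad l_{ij} \le t_i - t_j \le u_{ij}
\quad\text{for every constrained pair } (i, j).
\]
```

The matrix in the middle expression is symmetric and positive semidefinite, so the objective is a convex quadratic in t. Maximizing it over a region bounded by linear constraints therefore tends to push the solution toward the boundary of the permissible ranges.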
According to a presently preferred embodiment of the invention, the one or more constraints include robustness constraints constraining the insertion delays such that overall global robustness reaches a maximum value allowed by loops and combinational logic blocks in the circuit, and external scheduling constraints. Since the variance is a quadratic form, Quadratic Programming is preferably used to find a set of values for the insertion delays which maximize the variance given the robustness constraints such as indicated above.
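The selection of insertion delays can be illustrated on a very small circuit. The sketch below is not a Quadratic Programming solver; it substitutes an exhaustive search over integer delay values, which is practical only for a handful of storage elements but makes the objective and the constraints explicit. The function names, the constraint format, and the example numbers are hypothetical.

```python
from itertools import product


def variance(t):
    """Population variance of the insertion delays."""
    mean = sum(t) / len(t)
    return sum((x - mean) ** 2 for x in t) / len(t)


def max_variance_schedule(n, ranges, max_delay):
    """Brute-force stand-in for a quadratic-programming solver.

    ranges maps (i, j) to (lo, hi), meaning lo <= t[i] - t[j] <= hi.
    Delays are restricted to the integers 0..max_delay.
    """
    best, best_var = None, -1.0
    for t in product(range(max_delay + 1), repeat=n):
        feasible = all(lo <= t[i] - t[j] <= hi
                       for (i, j), (lo, hi) in ranges.items())
        if feasible and variance(t) > best_var:
            best, best_var = t, variance(t)
    return best, best_var


# Three storage elements in a pipeline 0 -> 1 -> 2, with assumed skew ranges.
ranges = {(0, 1): (-1, 2), (1, 2): (-1, 2)}
schedule, var = max_variance_schedule(3, ranges, max_delay=4)
print(schedule)  # (4, 2, 0)
```

A fully balanced schedule such as (0, 0, 0) also satisfies these constraints but has zero variance; the search instead returns the schedule that spreads the clock arrivals as far apart as the permissible ranges allow.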
According to further embodiments of the present invention, maximizing the variance of the insertion delays among the storage elements can be made subject to other constraints on the design of the circuit. For example, constraints may be included to reduce gate count and routing congestion. Also, some circuits designed according to the present invention may contain loops which are too closely coupled, and this may limit the amount of robustness that can be reached. According to other embodiments of the invention, this problem can be mitigated by adding buffers to the circuit at appropriate locations to make the shortest paths in the circuit longer, by optimizing the logic to maximize the min-delays, or by inserting opposite-edge devices (flip-flops or latches) at appropriate locations in the circuit to increase scheduling freedom.
A system that uses a clock distribution system designed in accordance with a method of the present invention will be more robust against uncontrollable clock skew than a system that utilizes the maximally-balanced schemes that prevail in the prior art. The system will also have better EMI and IR-drop properties than a system designed using the maximally-balanced scheme or a system designed such that clock skew is optimized for performance only.
Yet additional objects, features and advantages of the present invention will become apparent hereinafter in conjunction with the following detailed description of presently preferred embodiments.