Distribution of clocks, data and other signals is an important aspect of electronic circuit design. For example, in a conventional approach to synchronous circuit design, a designer generally strives to make the clock signal arrive at all memory elements simultaneously. This approach will be described in conjunction with FIG. 1.
FIG. 1 shows a synchronous electronic circuit 100 having three memory elements, namely, D-type edge-triggered flip-flops (FFs) denoted F1, F2 and F3. These memory elements may represent embedded elements of an FPGA, FPSC, ASIC or other type of circuit. The circuit 100 further includes three 1 nanosecond (ns) delay elements 102-1, 102-2 and 102-3 arranged in series between the Q output of F1 and the D input of F2, and a single 1 ns delay element 102-4 between the Q output of F2 and the D input of F3. Elements 104-1 and 104-2 denote respective signal delays x1 and x2 associated with distribution of the clock signal to the respective clock inputs of F1 and F2.
In the circuit 100, if the clock signal arrives at the clock inputs of F1, F2 and F3 at the same time (i.e., x1=x2=0 ns), and if it is assumed for simplicity that both the clock-to-Q time and the setup time of the FFs are 0 ns, the circuit will operate correctly at a clock period of 3 ns, which is the delay of the longest path, from F1 to F2. If on the other hand there is a difference in the clock arrival times, a situation commonly referred to as “clock skew,” the performance of the circuit may be degraded. For example, if the clock arrives 1 ns earlier at F2 than at F1 (i.e., x1=1 ns, x2=0 ns), then the clock period must be increased to 4 ns to ensure correct operation of the F1 to F2 path.
Non-zero clock skew can also improve circuit performance. For example, if x1=0 ns and x2=1 ns in the circuit 100, the clock period can be reduced from 3 ns to 2 ns: because F2 is clocked 1 ns later, the F1 to F2 path has one clock period plus 1 ns in which to complete, while the F2 to F3 path has one clock period minus 1 ns. This is an example of a type of technique commonly referred to as “cycle stealing.” In the example, the technique lowers the clock period by transferring cycle time from a path that has a surplus allotment (the F2 to F3 path) to a path with a deficit (the F1 to F2 path). Cycle stealing is also referred to as clock skew optimization, clock skew scheduling, or time stealing. It is typically implemented at a point in a circuit design or configuration process after completion of place and route operations, when the timing of clock and data paths is very accurately known.
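The three clock periods quoted above (3 ns, 4 ns and 2 ns) can be checked with a short calculation. The following Python sketch is illustrative only: the function name `min_clock_period` is hypothetical, the clock delay of F3 is taken as 0 ns because FIG. 1 shows no delay element on that clock input, and only the setup requirement is modeled (clock-to-Q and setup times are 0 ns, as assumed in the text; hold requirements are not checked).

```python
def min_clock_period(x1, x2, x3=0.0):
    """Smallest clock period (ns) satisfying setup on both paths of circuit 100.

    x1, x2, x3 are the clock arrival delays (ns) at F1, F2 and F3.
    x3 defaults to 0 as an assumption: FIG. 1 shows no delay on F3's clock.
    """
    # Each entry: (data path delay, launch clock delay, capture clock delay).
    paths = [
        (3.0, x1, x2),  # F1 -> F2 through delay elements 102-1..102-3
        (1.0, x2, x3),  # F2 -> F3 through delay element 102-4
    ]
    # Data launched at time `launch` must arrive before the capturing edge
    # at time `period + capture`, so period >= delay + launch - capture.
    return max(delay + launch - capture for delay, launch, capture in paths)


print(min_clock_period(0.0, 0.0))  # zero skew
print(min_clock_period(1.0, 0.0))  # clock arrives earlier at F2 than at F1
print(min_clock_period(0.0, 1.0))  # cycle stealing
```

Under these assumptions the three calls reproduce the 3 ns, 4 ns and 2 ns periods of the examples, and in the cycle stealing case both paths become equally critical at 2 ns.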
In the foregoing example, cycle stealing is implemented on a localized, ad hoc basis. However, it is preferable in many applications to optimize the performance of a sequential circuit by treating substantially all of its clock delays as variables under the control of a single algorithm. One algorithm used for this purpose is the Bellman-Ford algorithm, described in, e.g., T. H. Cormen et al., “Introduction to Algorithms,” McGraw-Hill, 1990, and R. B. Deokar et al., “A graph-theoretic approach to clock skew optimization,” Proc. ISCAS, pp. 1.407-1.410, 1994, which are incorporated by reference herein.
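By way of illustration, one common way to place all clock delays under a single algorithm is to write the setup and hold requirements of every path as difference constraints of the form x_a - x_b <= c, test a candidate clock period for feasibility with Bellman-Ford relaxation (a negative cycle means no set of clock delays exists), and search for the smallest feasible period. The Python sketch below is a minimal example under stated assumptions and is not taken from the cited references: the function names are hypothetical, times are integer picoseconds, the minimum and maximum delay of each path are taken to be equal, clock-to-Q, setup and hold times are 0, and the delays x1 and x2 are constrained to be no smaller than the delay at F3, reflecting that FIG. 1 shows no delay element on the clock input of F3.

```python
def feasible_skews(nodes, constraints):
    """Solve x[a] - x[b] <= c for every (a, b, c) by Bellman-Ford relaxation.

    Returns a dict of clock delays, or None if the constraint graph has a
    negative cycle (no assignment exists).
    """
    # Starting every node at 0 is equivalent to a virtual source with
    # zero-weight edges to all nodes.
    x = {n: 0 for n in nodes}
    for _ in range(len(nodes)):
        changed = False
        for a, b, c in constraints:
            if x[b] + c < x[a]:
                x[a] = x[b] + c
                changed = True
        if not changed:
            return x
    return None  # still relaxing after len(nodes) passes: negative cycle


def build_constraints(paths, period, extra):
    cons = []
    for src, dst, dmin, dmax in paths:
        cons.append((src, dst, period - dmax))  # setup: x_src + dmax <= x_dst + period
        cons.append((dst, src, dmin))           # hold:  x_dst <= x_src + dmin
    return cons + list(extra)


def min_period(nodes, paths, extra, upper):
    """Binary search for the smallest feasible integer period in [0, upper]."""
    lo, hi, best = 0, upper, None
    while lo <= hi:
        mid = (lo + hi) // 2
        sol = feasible_skews(nodes, build_constraints(paths, mid, extra))
        if sol is not None:
            best, hi = (mid, sol), mid - 1
        else:
            lo = mid + 1
    return best


# Circuit 100 of FIG. 1, in picoseconds: (source, destination, dmin, dmax).
nodes = ["F1", "F2", "F3"]
paths = [("F1", "F2", 3000, 3000), ("F2", "F3", 1000, 1000)]
# Assumption: x1 >= x3 and x2 >= x3, i.e. x3 - x1 <= 0 and x3 - x2 <= 0.
extra = [("F3", "F1", 0), ("F3", "F2", 0)]

period, raw = min_period(nodes, paths, extra, upper=3000)
skews = {n: raw[n] - raw["F3"] for n in nodes}  # express delays relative to F3
print(period, skews)
```

For the circuit 100 this search settles on a 2000 ps period with a 1000 ps delay at F2 and no delay at F1, matching the cycle stealing example above. The search relies on feasibility being monotonic in the period: enlarging the period only loosens the setup constraints.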
A problem with these and other conventional implementations of cycle stealing is that in certain circumstances they may fail to provide sufficient performance improvements, particularly for applications involving FPGAs and FPSCs. A need therefore exists for improved cycle stealing techniques which overcome the drawbacks associated with the conventional Bellman-Ford algorithm and other similar algorithms.