Programmable logic devices (PLDs), or field programmable gate arrays (FPGAs), are integrated circuit devices with configurable logic networks linked together by programmable interconnection resources. The configurable logic networks may include device elements such as logic cells (e.g., look-up tables (LUTs) or product term logic), memory cells, and input-output cells. Registers (e.g., D-type flip-flops) may be associated with one or more of the device elements. The registers hold and transfer data signals (i.e., variables) between the device elements during PLD operation.
The number of device elements in modern-day PLDs can be very large. These device elements are often architecturally organized into blocks of programmable logic (e.g., gate array or logic array block (“LAB”)), blocks of input-output cells, and blocks of memory (e.g., random access memory (“RAM”)), etc. Groups of these blocks may make up larger blocks (e.g., “MEGALABs”) that are arranged, for example, in an X-Y array. The programmable interconnection resources of the PLD often are organized as rows and columns of conductors for selectively routing signals to, from, and between the logic, input-output, and memory blocks.
In addition to the programmable interconnection resources for routing data signals, the PLDs include one or more clock networks for distributing “timing” or “clock” signals across the PLD to each individual device element. See, for example, Cliff et al. U.S. Pat. No. 5,550,782, Cliff et al. U.S. Pat. No. 5,689,195, and Jefferson et al. U.S. Pat. No. 6,215,326, all of which show PLD architectures developed by Altera Corporation of San Jose, Calif.; but other examples of architectures with which the present invention can be used include those developed by other PLD manufacturers such as Xilinx, Inc., also of San Jose, Calif.
Complex logic functions (circuits) may be implemented as desired in present-day PLDs. The logic functions are implemented by interconnecting a selected configuration of device elements according to a suitable circuit design. Conventional circuit design techniques for synthesis of logic functions may be used to generate the suitable circuit design. The circuit design may be characterized by a corresponding configuration file (i.e., a netlist) that specifies the placement of selected device elements and the routing of interconnections between the selected device elements. PLDs usually have a large number of device elements that have identical functionality (e.g., AND gates) and that may be used interchangeably. Therefore, several possible circuit designs (i.e., configurations of device elements) may yield the same desired logic function.
A common measure of circuit performance, data signal propagation delay, may be used to select a particular design for implementation. The data signal propagation delay depends, inter alia, on the length of interconnections and on the number of registers between device elements traversed by data signals. For typical PLDs, most of the data signal propagation delay is due to interconnection delays. Thus, a common figure of merit of PLD circuit delay performance is the length (in units of time) of the longest register-to-register interconnection path (“the critical path”). This critical path also determines the minimum cycle time for a logic step in the PLD circuit. The minimum cycle time is inversely related to the maximum operating frequency Fmax of the PLD circuit. In PLD operation, a reference or master clock signal that times the various device elements in the PLD is set to have a period, or cycle, that is greater than the minimum cycle time.
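The relationship between the critical path and Fmax can be illustrated with a minimal Python sketch. The register names and delay values below are hypothetical and chosen only for illustration; the sketch assumes that the register-to-register path delays are already known and that the minimum cycle time equals the critical path delay.

```python
# Hypothetical register-to-register path delays in nanoseconds.
# The register names and values are illustrative assumptions, not measured data.
path_delays_ns = {
    ("reg_a", "reg_b"): 3.2,
    ("reg_b", "reg_c"): 7.5,
    ("reg_c", "reg_a"): 4.1,
}


def critical_path(delays):
    """Return the (source, destination) pair with the largest delay, and that delay."""
    pair = max(delays, key=delays.get)
    return pair, delays[pair]


def fmax_mhz(min_cycle_ns):
    """Maximum operating frequency in MHz for a minimum cycle time given in ns."""
    return 1000.0 / min_cycle_ns


pair, t_min_ns = critical_path(path_delays_ns)
f_max = fmax_mhz(t_min_ns)
print(pair, t_min_ns, round(f_max, 2))  # ('reg_b', 'reg_c') 7.5 133.33
```

In this sketch the 7.5 ns path sets the minimum cycle time, so the master clock period would have to exceed 7.5 ns regardless of how short the other paths are.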
Automated computer-aided design (CAD) algorithms may be used to implement (place and route) a synthesized circuit design. The CAD algorithms may use conventional physical models to estimate delays along individual interconnection conductors and through individual device elements. However, the optimization of PLD circuit designs with a large number of device elements and interconnections is a non-trivial task. Estimation of the signal propagation delays across a PLD circuit network may require time-consuming or otherwise expensive computations, and usually can be obtained only after the actual placement and routing of the PLD circuit design is accomplished. If the design does not meet suitable operating frequency design criteria, alternative designs often have to be generated by empirically adding placement constraints and/or resynthesizing the desired logic function in an attempt to find a circuit design with a smaller critical path length.
For digital circuits in general, additional optimization techniques may be used to further reduce an initially-designed critical path length.
An iterative optimization technique, which is commonly called “retiming,” involves repositioning registers along the path of data signals. For example, registers associated with logic cells may be repositioned from the cells' outputs to their inputs or vice versa, so that the critical path is as short as possible. However, repositioning of registers along the data path between the device elements cannot reduce the critical path length below the length of the longest interconnection that must be used in a PLD circuit.
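The effect of retiming, and its lower bound, can be illustrated with a simplified Python sketch. It assumes a hypothetical linear data path made of indivisible delay segments between an input register and an output register, with one intermediate register that may be placed between any two adjacent segments. Practical retiming operates on general circuit graphs; the function name and delay values here are illustrative assumptions only.

```python
def best_register_position(segment_delays):
    """Try every legal position for one intermediate register on a linear path.

    Position k places the register after the first k segments. Returns the
    position that minimizes the longer of the two resulting
    register-to-register paths, together with that critical path length.
    """
    best = None
    for k in range(1, len(segment_delays)):
        crit = max(sum(segment_delays[:k]), sum(segment_delays[k:]))
        if best is None or crit < best[1]:
            best = (k, crit)
    return best


# Hypothetical segment delays in nanoseconds.
balanced = [2.0, 3.0, 5.0, 1.0]
print(best_register_position(balanced))   # (2, 6.0)

# One long segment dominates: no register position gets below 6.0 ns.
dominated = [1.0, 1.0, 6.0, 1.0]
print(best_register_position(dominated))  # (2, 7.0)
```

In the second case the 6.0 ns segment cannot be split, so the critical path stays at or above 6.0 ns wherever the register is placed, which mirrors the limitation described above.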
Other optimization techniques do not involve repositioning of registers and may be based on temporal considerations. These techniques, which may be used to operate digital circuits at frequencies higher than the designed-for clock frequency, are often referred to as “clock shifting” techniques. Clock shifting involves synchronizing sequential device elements (e.g., registers) that are located at the ends of a data path to operate at different times. Skewed (i.e., phase-shifted) clock signals may be used to sequentially time the device elements. See, for example, John P. Fishburn, “Clock Skew Optimization,” IEEE Trans. Computers, Vol. 39, No. 8, pp. 945-951, July 1990. Clock shifting mitigates the operating frequency-reducing effect of the data signal delay along a long data path by advancing the operation of the data-sending device element and by postponing the operation of the data-receiving device element. The data signal is effectively given more time to propagate on the long data path than on shorter data paths. The resulting operating frequency improvement is limited by the device element timing differences (or amount of clock skew) that can be inserted or introduced without disturbing circuit function. For example, the operation of the data-receiving device element may not be postponed for so long that the input data signal has already disappeared. The tolerable amount of inserted clock skew depends on the data signal propagation delay between the device elements, on the delays in the clock signal from the clock source to the device elements, and on device parameters such as setup time and hold time, which describe the operating times that are required for proper device functioning (e.g., to register data).
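The trade-off described above can be illustrated with a minimal Python sketch based on the commonly used setup and hold constraints for a path from a sending register to a receiving register: the sending clock arrival plus the maximum data delay plus the setup time must not exceed the receiving clock arrival plus the clock period, and the sending clock arrival plus the minimum data delay must be at least the receiving clock arrival plus the hold time. The register names, delays, setup and hold values, and function names are illustrative assumptions, and clock-to-output delay is assumed to be folded into the data delays.

```python
# Illustrative values only; all names and numbers are assumptions.
T_SETUP = 0.5   # setup time, ns
T_HOLD = 0.25   # hold time, ns

# Each path: (sending register, receiving register, max data delay, min data delay), ns.
PATHS = [
    ("A", "B", 8.0, 3.0),  # long path
    ("B", "C", 4.0, 2.0),  # short path
]


def min_period(paths, skew, t_setup=T_SETUP):
    """Smallest clock period that satisfies the setup constraint on every path."""
    return max(skew[src] + d_max + t_setup - skew[dst]
               for src, dst, d_max, _ in paths)


def hold_ok(paths, skew, t_hold=T_HOLD):
    """True if new data never reaches a receiving register before its hold time ends."""
    return all(skew[src] + d_min >= skew[dst] + t_hold
               for src, dst, _, d_min in paths)


no_skew = {"A": 0.0, "B": 0.0, "C": 0.0}
shifted = {"A": 0.0, "B": 2.0, "C": 0.0}   # clock of register B postponed by 2 ns
too_much = {"A": 0.0, "B": 3.0, "C": 0.0}  # clock of register B postponed by 3 ns

print(min_period(PATHS, no_skew), hold_ok(PATHS, no_skew))    # 8.5 True
print(min_period(PATHS, shifted), hold_ok(PATHS, shifted))    # 6.5 True
print(min_period(PATHS, too_much), hold_ok(PATHS, too_much))  # 7.5 False
```

Postponing the clock of register B by 2 ns lends time from the short path to the long path and lowers the minimum period from 8.5 ns to 6.5 ns. Postponing it by 3 ns violates the hold constraint on the long path, which illustrates why the amount of tolerable skew is bounded.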
Consideration is now being given to enhancing programmable logic device architectures to make them suitable for the application of clock shifting techniques, and to ways of applying clock shifting techniques to improve PLD circuit performance.