Programmable logic devices (PLDs) are a well-known type of integrated circuit that can be programmed to perform specified logic functions. One type of PLD, the field programmable gate array (FPGA), typically includes an array of programmable tiles. These programmable tiles can include, for example, input/output blocks (IOBs), configurable logic blocks (CLBs), dedicated random access memory blocks (BRAM), multipliers, digital signal processing blocks (DSPs), processors, clock managers, delay lock loops (DLLs), and so forth.
Each programmable tile typically includes both programmable interconnect and programmable logic. The programmable interconnect typically includes a large number of interconnect lines of varying lengths interconnected by programmable interconnect points (PIPs). The programmable logic implements the logic of a user design using programmable elements that can include, for example, function generators, registers, arithmetic logic, and so forth.
The programmable interconnect and programmable logic are typically programmed by loading a stream of configuration data into internal configuration memory cells that define how the programmable elements are configured. The configuration data can be read from memory (e.g., from an external PROM) or written into the FPGA by an external device. The collective states of the individual memory cells then determine the function of the FPGA.
Another type of PLD is the Complex Programmable Logic Device, or CPLD. A CPLD includes two or more “function blocks” connected together and to input/output (I/O) resources by an interconnect switch matrix. Each function block of the CPLD includes a two-level AND/OR structure similar to those used in Programmable Logic Arrays (PLAs) and Programmable Array Logic (PAL) devices. In CPLDs, configuration data is typically stored on-chip in non-volatile memory. In some CPLDs, configuration data is stored on-chip in non-volatile memory, then downloaded to volatile memory as part of an initial configuration sequence.
For all of these programmable logic devices (PLDs), the functionality of the device is controlled by data bits provided to the device for that purpose. The data bits can be stored in volatile memory (e.g., static memory cells, as in FPGAs and some CPLDs), in non-volatile memory (e.g., FLASH memory, as in some CPLDs), or in any other type of memory cell.
Other PLDs are programmed by applying a processing layer, such as a metal layer, that programmably interconnects the various elements on the device. These PLDs are known as mask programmable devices. PLDs can also be implemented in other ways, e.g., using fuse or antifuse technology. The terms “PLD” and “programmable logic device” include, but are not limited to, these exemplary devices, and also encompass devices that are only partially programmable.
A PLD interconnect structure can be complex and highly flexible. For example, Young et al. describe the interconnect structure of an exemplary FPGA in U.S. Pat. No. 5,914,616, issued Jun. 22, 1999 and entitled “FPGA Repeatable Interconnect Structure with Hierarchical Interconnect Lines”, which is incorporated herein by reference in its entirety.
Programmable interconnect points (PIPs) are often coupled into groups that implement multiplexer circuits selecting one of several interconnect lines to provide a signal to a destination interconnect line or logic block. A routing multiplexer can be implemented, for example, as shown in FIG. 1. The illustrated circuit selects one of several different input signals and passes the selected signal to an output terminal. Note that FIG. 1 illustrates a routing multiplexer with only eight inputs, for clarity; PLD routing multiplexers typically have many more inputs, e.g., 28, 30, or 32.
The routing multiplexer of FIG. 1 includes eight input terminals IN0-IN7 and ten pass gates 100-109. Pass gates 100-103 selectively pass input signals IN0-IN3, respectively, to a first internal node INT1. (Note that in the present specification, the same reference characters are used to refer to terminals, nodes, signal lines, and their corresponding signals.) Each pass gate 100-103 has a gate terminal driven by a configuration memory cell M12-M15, respectively. Similarly, pass gates 104-107 selectively pass input signals IN4-IN7, respectively, to a second internal node INT2. Each pass gate 104-107 has a gate terminal driven by one of the same configuration memory cells M12-M15, respectively. From internal nodes INT1, INT2, pass gates 108, 109 are controlled by configuration memory cells M10, M11, respectively, to selectively pass at most one signal to a third internal node INT3.
The signal on internal node INT3 is buffered by buffer BUF to provide output signal ROUT. Buffer BUF includes two inverters 111, 112 coupled in series, and a pullup (e.g., a P-channel transistor 113 to power high VDD) on internal node INT3 and driven by the node between the two inverters.
Values stored in configuration memory cells M10-M15 select at most one of the input signals IN0-IN7 to be passed to internal node INT3, and hence to output node ROUT. If none of the input signals is selected, output signal ROUT is held at its initial high value by pullup 113. Pullup 113 also pulls signal INT3 fully to power high VDD to fully shut off the pullup of inverter 111.
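The selection behavior of the routing multiplexer of FIG. 1 can be sketched, for illustration only, as a small behavioral model. The sketch below is not part of the described circuit; the function names routing_mux and _select_one are hypothetical, signals are modeled as the integers 0 and 1, and an undriven node INT3 is simply reported as high, which approximates the "initial high value" held by pullup 113 rather than modeling the analog behavior of the pass gates and buffer.

```python
from typing import Mapping, Optional, Sequence

# Memory cells shared by the two first-stage groups (pass gates 100-103 and 104-107).
FIRST_STAGE_CELLS = ("M12", "M13", "M14", "M15")


def _select_one(signals: Sequence[Optional[int]],
                enables: Sequence[int]) -> Optional[int]:
    """Return the single enabled signal, or None if no pass gate is on."""
    chosen = [s for s, en in zip(signals, enables) if en]
    if len(chosen) > 1:
        # A valid configuration turns on at most one pass gate per node.
        raise ValueError("more than one pass gate enabled on the same node")
    return chosen[0] if chosen else None


def routing_mux(inputs: Sequence[int], cells: Mapping[str, int]) -> int:
    """Behavioral model of ROUT for inputs IN0-IN7 and cells M10-M15."""
    if len(inputs) != 8:
        raise ValueError("expected eight input signals IN0-IN7")
    enables = [cells[name] for name in FIRST_STAGE_CELLS]
    int1 = _select_one(inputs[0:4], enables)   # pass gates 100-103
    int2 = _select_one(inputs[4:8], enables)   # pass gates 104-107
    # Pass gates 108 and 109, controlled by M10 and M11.
    int3 = _select_one([int1, int2], [cells["M10"], cells["M11"]])
    # Modeling assumption: an undriven INT3 is held high by pullup 113.
    return 1 if int3 is None else int3


# Example: M13 and M11 are set, so IN5 is passed to ROUT.
cells = {"M10": 0, "M11": 1, "M12": 0, "M13": 1, "M14": 0, "M15": 0}
print(routing_mux([1, 1, 1, 1, 1, 0, 1, 1], cells))  # prints 0
```

Because memory cells M12-M15 are shared between the two first-stage groups, one of cells M10, M11 is needed to choose between internal nodes INT1 and INT2; the model rejects configurations in which both are set.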
Clearly, a circuit implemented in flexible programmable logic can potentially be slower than circuitry implemented using dedicated logic (i.e., logic designed for a specific purpose). For example, in a circuit implemented using programmable lookup tables (LUTs) and flip-flops, a signal might need to traverse a succession of LUTs and interconnections between each pair of successive flip-flops, as shown in FIG. 2. The exemplary signal path illustrated in FIG. 2 connects an output terminal of flip-flop 201 with an input terminal of flip-flop 209, and sequentially traverses interconnect 202, LUT 203, interconnect 204, LUT 205, interconnect 206, interconnect 207, and LUT 208. The path delay includes one clock-to-out delay for flip-flop 201, four interconnect delays, three LUT delays, and one setup time for flip-flop 209. The total of these delays determines the minimum clock period for the illustrated signal path.
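The delay total for the signal path of FIG. 2 can be illustrated with a short calculation. The following is a minimal sketch; the function name min_clock_period_ps is hypothetical, and all delay values are invented placeholders in picoseconds, chosen only to show the arithmetic and not taken from any actual device.

```python
from typing import Sequence


def min_clock_period_ps(clk_to_out_ps: int,
                        interconnect_ps: Sequence[int],
                        lut_ps: Sequence[int],
                        setup_ps: int) -> int:
    """Sum the delays along one register-to-register path."""
    return clk_to_out_ps + sum(interconnect_ps) + sum(lut_ps) + setup_ps


# Hypothetical delays: four interconnect segments and three LUTs, as in FIG. 2.
interconnects = [500, 500, 500, 500]
luts = [200, 200, 200]

period = min_clock_period_ps(300, interconnects, luts, 100)
print(period)                                    # prints 3000
print(round(100 * sum(interconnects) / period))  # prints 67 (percent of path)
```

With these placeholder values the four interconnect delays account for roughly two thirds of the minimum clock period, which illustrates why the interconnect portion of the path is a natural target for delay reduction.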
In non-programmable circuits, one known method of increasing circuit performance is the use of dynamic logic. In dynamic circuitry, many or all nodes (e.g., all output nodes) are pre-charged to a first known value. This state is referred to herein as the “pre-charge state”. At a later time the circuit enters the “evaluation state”, in which the pre-charge is released and some of the pre-charged nodes change to a second known value, as determined by the logic. In clocked dynamic logic, for example, all nodes can be pulled high at a falling edge of a clock, and then some of the nodes are selectively pulled low at the rising edge of the clock. Therefore, whenever the clock is low the circuit is in the pre-charge state, and whenever the clock is high the circuit is in the evaluation state. (Clearly, dynamic circuits also can be designed to operate in the opposite fashion, i.e., to be in the pre-charge state whenever the clock is high, and in the evaluation state whenever the clock is low.) Thus, only the falling edge on the pre-charged nodes is speed-critical, and circuitry can be skewed for a fast falling edge and a slow rising edge on these nodes.
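The pre-charge and evaluation states of clocked dynamic logic can be sketched as a simple behavioral model. The class below is illustrative only: the name DynamicNode and the argument pulldown_conducts are hypothetical, the node is pre-charged high while the clock is low (the first of the two conventions described above), and the two-input series pull-down used in the example is an assumed logic function, not one taken from the present specification.

```python
class DynamicNode:
    """Behavioral model of one pre-charged node in clocked dynamic logic."""

    def __init__(self) -> None:
        self.value = 1  # assume the node starts out pre-charged high

    def step(self, clk: int, pulldown_conducts: bool) -> int:
        if clk == 0:
            # Pre-charge state: the node is pulled high regardless of inputs.
            self.value = 1
        elif pulldown_conducts:
            # Evaluation state: the node can only fall, never rise.
            self.value = 0
        return self.value


# Assumed example: the pull-down network conducts when both a and b are high.
node = DynamicNode()
trace = []
for clk, a, b in [(0, 1, 1), (1, 1, 1), (1, 0, 1), (0, 0, 1), (1, 0, 1)]:
    trace.append(node.step(clk, bool(a and b)))
print(trace)  # prints [1, 0, 0, 1, 1]
```

In the third step the node stays low even though input a has fallen, because a discharged node is restored only by the next pre-charge. The only transition that occurs during evaluation is the falling edge, consistent with skewing the circuitry for a fast falling edge on the pre-charged nodes.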
One of the drawbacks of clocked dynamic logic is that distributing the heavily loaded clock signal throughout the dynamic logic consumes a lot of power. Another type of known dynamic logic avoids using a clock signal by utilizing a self-resetting technique, in which the output node is pre-charged during the pre-charge state, then is conditionally discharged (evaluated) whenever an input node of the circuit changes state. Thus, a low pulse might or might not appear at the output node, based on the values of the various input signals.
As is clear from the example illustrated in FIG. 2, interconnect delays can constitute a significant portion of the delay on a critical path in a PLD design. Therefore, it is desirable to provide circuits that facilitate the reduction of interconnect delays in a PLD. It is further desirable to provide circuits that facilitate the reduction of interconnect delays in other types of integrated circuits (ICs), including non-programmable ICs.