Programmable logic devices (PLDs) are a well-known type of integrated circuit that can be programmed to perform specified logic functions. One type of PLD, the field programmable gate array (FPGA), typically includes an array of programmable tiles. These programmable tiles can include, for example, input/output blocks (IOBs), configurable logic blocks (CLBs), dedicated random access memory blocks (BRAM), multipliers, digital signal processing blocks (DSPs), processors, clock managers, delay lock loops (DLLs), and so forth.
Each programmable tile typically includes both programmable interconnect and programmable logic. The programmable interconnect typically includes a large number of interconnect lines of varying lengths interconnected by programmable interconnect points (PIPs). The programmable logic implements the logic of a user design using programmable elements that can include, for example, function generators, registers, arithmetic logic, and so forth.
The programmable interconnect and programmable logic are typically programmed by loading a stream of configuration data into internal configuration memory cells that define how the programmable elements are configured. The configuration data can be read from memory (e.g., from an external PROM) or written into the FPGA by an external device. The collective states of the individual memory cells then determine the function of the FPGA.
Another type of PLD is the Complex Programmable Logic Device, or CPLD. A CPLD includes two or more “function blocks” connected together and to input/output (I/O) resources by an interconnect switch matrix. Each function block of the CPLD includes a two-level AND/OR structure similar to those used in Programmable Logic Arrays (PLAs) and Programmable Array Logic (PAL) devices. In CPLDs, configuration data is typically stored on-chip in non-volatile memory. In some CPLDs, configuration data is stored on-chip in non-volatile memory, then downloaded to volatile memory as part of an initial configuration sequence.
For all of these PLDs, the functionality of the device is controlled by data bits provided to the device for that purpose. The data bits can be stored in volatile memory (e.g., static memory cells, as in FPGAs and some CPLDs), in non-volatile memory (e.g., FLASH memory, as in some CPLDs), or in any other type of memory cell. The terms “PLD”, “programmable logic device”, and “programmable integrated circuit” include, but are not limited to, these exemplary devices, and also encompass devices that are only partially programmable. For example, one type of PLD includes a combination of hard-coded transistor logic and a programmable switch fabric that programmably interconnects the hard-coded transistor logic.
As noted above, advanced FPGAs can include several different types of programmable logic blocks in the array. For example, FIG. 1 illustrates an FPGA architecture 100 that includes a large number of different programmable tiles including multi-gigabit transceivers (MGTs 101), configurable logic blocks (CLBs 102), random access memory blocks (BRAMs 103), input/output blocks (IOBs 104), configuration and clocking logic (CONFIG/CLOCKS 105), digital signal processing blocks (DSPs 106), specialized input/output blocks (I/O 107) (e.g., configuration ports and clock ports), and other programmable logic 108 such as digital clock managers, analog-to-digital converters, system monitoring logic, and so forth. Some FPGAs also include dedicated processor blocks (PROC 110).
In some FPGAs, each programmable tile includes a programmable interconnect element (INT 111) having standardized connections to and from a corresponding interconnect element in each adjacent tile. Therefore, the programmable interconnect elements taken together implement the programmable interconnect structure for the illustrated FPGA. The programmable interconnect element (INT 111) also includes the connections to and from the programmable logic element within the same tile, as shown by the examples included at the top of FIG. 1.
For example, a CLB 102 can include a configurable logic element (CLE 112) that can be programmed to implement user logic plus a single programmable interconnect element (INT 111). A BRAM 103 can include a BRAM logic element (BRL 113) in addition to one or more programmable interconnect elements. Typically, the number of interconnect elements included in a tile depends on the height of the tile. In the pictured embodiment, a BRAM tile has the same height as four CLBs, but other numbers (e.g., five) can also be used. A DSP tile 106 can include a DSP logic element (DSPL 114) in addition to an appropriate number of programmable interconnect elements. An IOB 104 can include, for example, two instances of an input/output logic element (IOL 115) in addition to one instance of the programmable interconnect element (INT 111). As will be clear to those of skill in the art, the actual I/O pads connected, for example, to the I/O logic element 115 are manufactured using metal layered above the various illustrated logic blocks, and typically are not confined to the area of the input/output logic element 115.
In the pictured embodiment, a columnar area near the center of the die (shown shaded in FIG. 1) is used for configuration, clock, and other control logic. Horizontal areas 109 extending from this column are used to distribute the clocks and configuration signals across the breadth of the FPGA.
Some FPGAs utilizing the architecture illustrated in FIG. 1 include additional logic blocks that disrupt the regular columnar structure making up a large part of the FPGA. The additional logic blocks can be programmable blocks and/or dedicated logic. For example, the processor block PROC 110 shown in FIG. 1 spans several columns of CLBs and BRAMs.
Note that FIG. 1 is intended to illustrate only an exemplary FPGA architecture. For example, the numbers of logic blocks in a column, the relative width of the columns, the number and order of columns, the types of logic blocks included in the columns, the relative sizes of the logic blocks, and the interconnect/logic implementations included at the top of FIG. 1 are purely exemplary. For example, in an actual FPGA more than one adjacent column of CLBs is typically included wherever the CLBs appear, to facilitate the efficient implementation of user logic, but the number of adjacent CLB columns varies with the overall size of the FPGA.
The programmable interconnect structure of a typical PLD includes a large number of programmable multiplexers. Each programmable multiplexer selects one of two or more input signals (e.g., from interconnect lines), and passes the selected input signal to a destination (e.g., to another interconnect line).
FIG. 2 illustrates an exemplary programmable multiplexer that can be included, for example, in the programmable interconnect structure of a PLD. Note that programmable multiplexers can include more than two input signals, sometimes many more than two. However, for clarity, FIG. 2 illustrates an exemplary multiplexer having two input signals.
The programmable 2-to-1 multiplexer of FIG. 2 includes several types of transistors, coupled together as shown in FIG. 2. (Note that in some known PLDs, programmable multiplexers are implemented with fewer transistor types than those shown in FIG. 2, which is merely exemplary.) Pull-up transistors 202 and 204 are typical P-channel transistors. Pull-down transistor 205 is a typical N-channel transistor. Pull-down transistor 203 is a low threshold voltage (low Vt) N-channel transistor. (Note that low Vt transistor 203 is drawn with a triangle in the gate, indicating the low Vt implementation.) Pull-up transistor 201 is a weak pull-up, i.e., a P-channel transistor having a larger resistance than typical pull-up transistors 202 and 204. Pass transistors 206 and 207 are N-channel transistors having a “mid-ox” oxide thickness larger than the oxide thickness of transistors 201-205, but smaller than the oxide thickness of input/output transistors. (Note that mid-ox transistors 206 and 207 are drawn with two bubbles in the gate, indicating the mid-ox implementation.) The programmable 2-to-1 multiplexer of FIG. 2 also includes two configuration memory cells MA and MB controlling pass transistors 206 and 207, respectively. Memory cells MA and MB drive transistors 206 and 207 at a gate voltage VGG higher than the standard power high voltage VDD, in order to increase the speed of the programmable interconnect.
The programmable 2-to-1 multiplexer of FIG. 2 functions as follows. Transistors 206 and 207 are coupled to form a 2-to-1 multiplexer controlled by values stored in configuration memory cells MA and MB to pass one of input signals INA and INB to node N1. The value on node N1 is then buffered by two inverters coupled in series, with transistors 202 and 203 forming the first inverter, and transistors 204 and 205 forming the second inverter. The second inverter drives the output node OUT, which can be, for example, another interconnect line.
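The signal flow just described can be sketched as a purely logical model. This is an illustrative sketch only, not a circuit simulation: the function name `mux2` is hypothetical, voltages and delays are ignored, and the model assumes that exactly one of memory cells MA and MB stores a high value, which the description above does not state explicitly.

```python
def mux2(in_a: bool, in_b: bool, ma: bool, mb: bool) -> bool:
    """Logical model of the programmable 2-to-1 multiplexer of FIG. 2.

    Assumption (not from the source): exactly one of MA and MB is high,
    so exactly one pass transistor drives node N1.
    """
    if ma == mb:
        # Both pass transistors on (contention) or both off (N1 undriven).
        raise ValueError("this model requires exactly one of MA, MB high")
    n1 = in_a if ma else in_b   # pass transistor 206 (MA) or 207 (MB) drives N1
    n2 = not n1                 # first inverter: transistors 202 and 203
    out = not n2                # second inverter: transistors 204 and 205
    return out


# With MA high, INA is selected and reaches OUT uninverted.
selected_a = mux2(in_a=True, in_b=False, ma=True, mb=False)
# With MB high, INB is selected instead.
selected_b = mux2(in_a=True, in_b=False, ma=False, mb=True)
```

Because the two inverters are in series, the model is non-inverting: OUT follows whichever input is selected.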
Weak pull-up 201 is included because, as is well known, a high value passed through an N-channel transistor is reduced in voltage by one threshold voltage of the N-channel transistor. Therefore, in the absence of transistor 201, when a high value (VDD) is placed on node N1 from node INA, for example, the voltage at node N1 is the lesser of VDD and VGG−Vt (the “mid-ox” power high minus the threshold voltage of transistor 206). Note that transistors 206 and 207 have a higher threshold voltage than transistor 205, for example, because of their mid-oxide implementation. Therefore, a high voltage passed to node N1 by either of these transistors may be substantially reduced from the value of power high VGG. When node N1 rises to a sufficiently high voltage level, node N2 goes low, and weak pull-up 201 serves to pull node N1 to a fully high value of VDD. Transistor 203 is given a low threshold to compensate for the reduced power high level on node N1, and also to speed up the transition of node N2 from a high value to a low value.
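The high level reached on node N1 before weak pull-up 201 takes over can be illustrated numerically. The sketch below simply evaluates the "lesser of VDD and VGG−Vt" expression from the paragraph above. The function name and the voltage values are hypothetical, chosen only for illustration, and do not come from any particular device.

```python
def n1_high_level(vdd: float, vgg: float, vt: float) -> float:
    """Voltage passed to node N1 for a high input, without weak pull-up 201.

    An N-channel pass transistor with gate voltage VGG passes at most
    VGG - Vt, and the input itself is no higher than VDD.
    """
    return min(vdd, vgg - vt)


# Hypothetical example values (assumptions, in volts).
VDD = 1.0   # standard power high
VGG = 1.4   # elevated gate voltage from memory cells MA and MB
VT = 0.5    # threshold voltage of mid-ox pass transistor 206

# Here VGG - Vt is below VDD, so N1 stops short of a full VDD level.
level = n1_high_level(VDD, VGG, VT)
shortfall = VDD - level  # the gap that weak pull-up 201 restores
```

With these assumed values, N1 reaches roughly 0.9 V rather than the full 1.0 V. If VGG were raised so that VGG−Vt exceeded VDD, the passed level would be limited by VDD instead.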
Even including all of these different types of transistors, however, does not overcome an inherent limitation of the circuit of FIG. 2. This limitation is that the delay on a low-to-high transition through the circuit (i.e., signal OUT going from low to high) displays different timing characteristics from a high-to-low transition (i.e., signal OUT going from high to low). This difference is exacerbated, for example, by differences in the semiconductor fabrication process that typically occur during the manufacture of any integrated circuit. For example, a “slow” die (e.g., where transistors are slower than on a typical die) might display more variation between the low-to-high transition speed and the high-to-low transition speed than a “fast” die (e.g., where transistors are faster than on a typical die), because rise delays are typically more sensitive to process variation than fall delays. The difference between high-to-low (falling) transitions and low-to-high (rising) transitions can also vary based on differences in the power supply voltage supplied to two different ICs.
It is desirable to have balanced delays for rising and falling transitions in an integrated circuit (IC), because the operating speed is typically limited by the delays on the slowest paths through the circuit. It is further desirable to provide balanced delays for programmable ICs, because a user circuit can be implemented in many different ways. For example, in one implementation a critical path might include a rising transition in one interconnect line segment and a falling transition in another interconnect line segment, while another implementation might route only rising-edge-critical signals through the interconnect structure, and so forth. Therefore, it is desirable to provide structures and methods that can improve the balance between delays for rising and falling transitions in programmable ICs.
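The effect of unbalanced delays on different implementations can be illustrated with a simple additive delay model. This is a sketch under stated assumptions: each interconnect segment contributes either a fixed rise delay or a fixed fall delay, the delay values (in picoseconds) are hypothetical, and the function name `path_delay` is introduced only for this example.

```python
def path_delay(transitions, t_rise_ps, t_fall_ps):
    """Total delay of a path, given the transition direction on each segment."""
    return sum(t_rise_ps if t == "rise" else t_fall_ps for t in transitions)


# Two hypothetical implementations of the same four-segment critical path.
mixed = ["rise", "fall", "rise", "fall"]
all_rising = ["rise", "rise", "rise", "rise"]

# Hypothetical per-segment delays with the same average.
unbalanced = {"t_rise_ps": 120, "t_fall_ps": 80}
balanced = {"t_rise_ps": 100, "t_fall_ps": 100}

results = {
    ("unbalanced", "mixed"): path_delay(mixed, **unbalanced),            # 400
    ("unbalanced", "all_rising"): path_delay(all_rising, **unbalanced),  # 480
    ("balanced", "mixed"): path_delay(mixed, **balanced),                # 400
    ("balanced", "all_rising"): path_delay(all_rising, **balanced),      # 400
}
```

Under these assumptions, the unbalanced case yields path delays that depend on how the design happens to be routed (400 ps versus 480 ps), while the balanced case gives the same delay for both implementations.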