1. Field of the Invention
This invention relates generally to a programmable logic device and, in particular, to a programmable logic device having versatile and efficient logic elements and logic array blocks.
2. Description of the Related Art
A programmable logic device (“PLD”) is a digital, user-configurable integrated circuit used to implement a custom logic function. For the purposes of this description, the term PLD encompasses any digital logic circuit configured by the end-user, and includes a programmable logic array (“PLA”), a field programmable gate array (“FPGA”), and an erasable and complex PLD. The basic building block of a PLD is a logic element (“LE”) that is capable of performing logic functions on a number of input variables. Conventional PLDs combine large numbers of such LEs through an array of programmable interconnects to facilitate implementation of complex logic functions. PLDs have found particularly wide application as a result of their combined low up-front cost and versatility to the user.
A variety of PLD architectural approaches for arranging the interconnect array and LEs have been developed to optimize logic density and signal routability between the various LEs. The LEs are arranged in groups to form a larger logic array block (“LAB”). Multiple LABs are arranged in a two-dimensional array and are programmably connectable to each other and to the external input/output pins of each LAB through horizontal and vertical interconnect channels.
The typical LAB within the PLD includes a set of LEs, routing lines, and multiplexers that provide inputs to the LEs and route outputs from the LEs to routing lines both within and outside the LAB. One type of routing line is the LAB line, which lies within the LAB and is driven by a set of multiplexers that select from routing signals outside the LAB. Another type of routing line is the local line, which lies within the LAB and carries signals generated by LEs within the LAB. A set of LE input multiplexers (“LEIMs”) within the LAB programmably selects signals from any one of the LAB lines or local lines. Each LE has associated with it one LEIM per input to the LE. In one implementation, referred to as a fully populated LAB, the LEIMs can programmably select a signal from all of the LAB lines and local lines. In another implementation, each LE has LEIMs divided into two groups. One group of LEIMs selects from one pool of LAB lines and local lines, and the second group of LEIMs selects from another pool of LAB lines and local lines. In the fully populated LAB, the large number of inputs to each LEIM results in a large multiplexer, which in turn results in a PLD that requires more area and is slower.
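The area cost described above can be illustrated with a small counting sketch. It is only an illustration: the function names, the even split of lines between pools, and the example LAB dimensions (10 LEs, 4 inputs per LE, 22 LAB lines, 10 local lines) are assumptions chosen for the example and do not describe any particular device.

```python
def leim_fan_in(lab_lines: int, local_lines: int, pools: int = 1) -> int:
    """Number of inputs on one LEIM when the LAB lines and local lines
    are split evenly into `pools` pools (pools=1 models a fully populated LAB)."""
    total = lab_lines + local_lines
    if pools < 1 or total % pools != 0:
        raise ValueError("lines must divide evenly among the pools")
    return total // pools


def lab_mux_inputs(num_les: int, inputs_per_le: int,
                   lab_lines: int, local_lines: int, pools: int = 1) -> int:
    """Total multiplexer inputs across all LEIMs in the LAB
    (one LEIM per LE input)."""
    return num_les * inputs_per_le * leim_fan_in(lab_lines, local_lines, pools)


# Hypothetical LAB: 10 LEs, 4 inputs each, 22 LAB lines, 10 local lines.
fully_populated = lab_mux_inputs(10, 4, 22, 10, pools=1)
two_pools = lab_mux_inputs(10, 4, 22, 10, pools=2)
print(fully_populated, two_pools)
```

Under these assumed dimensions, every LEIM in the fully populated LAB is a 32-input multiplexer, while dividing the lines into two pools halves the fan-in of each LEIM.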
An alternative to the fully populated LAB is a partially populated LAB. In the partially populated LAB, each LEIM has access to only a subset of the LAB lines and local lines. However, this pattern of connections is constructed in a repeating form, such that the LAB lines may be divided into a small number of disjoint groups, with each group providing access to a specific subset of the pins on all LEs. For example, assuming that there are four LAB lines and each LE has four input pins labeled A, B, C, and D, a first group of half the LAB lines connects to input pins A and C on every LE, and a second group of the other half of the LAB lines connects to input pins B and D on every LE. This regular pattern facilitates implementation of the LEIMs, but at the cost of decreased routability. Routing signals that fan out to multiple LEs within a single LAB may result in contention for the input pins of the LEs, so that more LAB lines have to be provided than are used with the fully populated LAB. Elaborating on the previous example to show contention, it may be desired to send a signal on a LAB line to pin A on one LE and to pin B on another LE. Since none of the LAB lines connects to both pins A and B (in this example, the LAB lines connect to pins A and C or to pins B and D), two LAB lines are used in this case rather than a single LAB line. The greater the number of LAB lines used, the larger the size of the PLD and the greater the delay within the PLD. Increasing the number of LAB lines used also results in increased PLD cost.
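The contention example above can be reproduced with a short sketch. The line numbering (lines 0 to 3), the dictionary layout, and the function name are assumptions introduced for illustration only; the connectivity pattern itself follows the four-line, four-pin example in the text.

```python
from itertools import combinations

# Pins reachable from each LAB line; the same pattern repeats on every LE.
PARTIALLY_POPULATED = {0: {"A", "C"}, 1: {"A", "C"}, 2: {"B", "D"}, 3: {"B", "D"}}
FULLY_POPULATED = {line: {"A", "B", "C", "D"} for line in range(4)}


def lab_lines_needed(connectivity, required_pins):
    """Smallest number of LAB lines that must carry the same signal so that
    every pin in `required_pins` can be reached on some LE."""
    lines = sorted(connectivity)
    for count in range(1, len(lines) + 1):
        for chosen in combinations(lines, count):
            reachable = set().union(*(connectivity[line] for line in chosen))
            if required_pins <= reachable:
                return count
    raise ValueError("required pins cannot be reached from any set of LAB lines")


# One signal must reach pin A on one LE and pin B on another LE.
print(lab_lines_needed(PARTIALLY_POPULATED, {"A", "B"}))  # 2
print(lab_lines_needed(FULLY_POPULATED, {"A", "B"}))      # 1
```

The partially populated pattern consumes two LAB lines for a single signal, whereas the fully populated pattern needs one, which is the routability penalty described above.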
In a fully populated LAB, each of the LEIMs provides programmable connections to all of the LAB lines and local lines within the pool, resulting in the large number of inputs to the LEIM. With the partially populated LAB, the cost of the large number of inputs is somewhat reduced, but this reduction is offset by the need to increase the number of LAB lines and associated routing circuitry.
Each LE typically provides a combinational logic function such as a look-up table (“LUT”), and one or more flip-flops. The input of the flip-flop may programmably be selected to be either the output of the LUT or one of the input pins of the LE. Other multiplexing circuits may exist to dynamically select between the output of the LUT and one of the inputs of the LE using other logic signals. For example, the APEX-20K can programmably be configured to load the flip-flops from the C input of the LE, or programmably be configured to select among the LE output, the LE input, and a ground signal under the dynamic control of the two signals “synchronous load” and “synchronous clear,” which are distributed to all of the LEs in the LAB.
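The dynamic selection of the flip-flop data source can be modeled behaviorally as follows. This is a minimal sketch, not a description of the APEX-20K circuit: the function and parameter names are hypothetical, the LUT output is assumed to be the default data source, and giving synchronous clear priority over synchronous load is an assumption, since the text does not state a priority.

```python
def flip_flop_next_state(lut_out: int, le_input: int,
                         sync_load: int, sync_clear: int) -> int:
    """Value captured by the flip-flop on the next clock edge.

    Assumed priority (not stated in the text): clear, then load, then LUT.
    """
    if sync_clear:
        return 0          # ground signal selected
    if sync_load:
        return le_input   # LE input (e.g., the C input) selected
    return lut_out        # combinational result selected


# Both control signals are shared by every LE in the LAB, so one pair of
# values steers all flip-flops in the LAB at once.
lut_outputs = [1, 0, 1, 1]
le_inputs = [0, 1, 1, 0]
loaded = [flip_flop_next_state(l, d, sync_load=1, sync_clear=0)
          for l, d in zip(lut_outputs, le_inputs)]
print(loaded)
```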
Each LE can programmably select the output of the LUT, which is the combinational output, or the output of the flip-flop, which is the registered output, as one of the outputs of the LE. One or more of these outputs will be driven onto the routing structures (e.g., driver input multiplexers (“DIMs”) and drivers that drive the wires of a channel) outside the LABs. One or more of these outputs will also be driven onto the local lines of the LAB. For example, with the APEX-20K, the output may programmably be driven onto two distinct sets of local lines.
The multiplexers typically within the LE allow the LE to be programmably configured to perform a variety of useful functions. The LE may be configured to perform a combinational function in isolation. It may alternatively be configured to perform a combinational function feeding a flip-flop, and route either or both of the combinational and registered signal to the outputs. It may also be programmably configured to implement both a combinational function and an independent flip-flop, or a flip-flop that shares as its data input one of the inputs to the combinational function, or as a flip-flop in isolation. Finally, it may be programmably configured to select between the various data sources (combinational function, LE input, or logic 0) based on certain control signals.
The necessity of adding a multiplexer to select between the LUT and flip-flop adds delay to the circuit. This delay should be minimized to improve LE performance especially when the multiplexer is used within the critical path.
Current LEs provide the ability to use the flip-flop and the LUT as separate logic units within the LE; however, these units are not completely independent. If the flip-flop has its input connected to a signal that is distinct from any of those used by the LUT, then it uses one of the input connections to the LUT, reducing the number of connections available for the LUT. Similarly, if both the output from the LUT and the output from the flip-flop are used within an LE, there is only one local line connection available to route a signal from an output of that LE to inputs of other LEs within the LAB. Consequently, if both the output from the LUT and the output from the flip-flop need to drive an input of an LE within the LAB, then either the output of the flip-flop or the output of the LUT is routed outside the LAB to one of the LAB lines at a higher cost and logic delay. In addition, a LUT and a flip-flop may be merged into a single LE only if the LUT output feeds the input of the flip-flop, or one of the inputs of the LUT is not used, or the signal driving the flip-flop is also connected to one of the LUT inputs.
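The three merging conditions listed above amount to a simple predicate, sketched below. The signal names, the function name, and the default LUT width of four inputs are assumptions made for illustration.

```python
def can_merge(lut_inputs: set, lut_output: str, ff_data: str,
              lut_width: int = 4) -> bool:
    """True if a LUT and a flip-flop can share one LE under the three
    conditions described in the text."""
    if ff_data == lut_output:
        return True                    # LUT output feeds the flip-flop
    if len(lut_inputs) < lut_width:
        return True                    # a LUT input is unused and can be borrowed
    return ff_data in lut_inputs       # flip-flop shares an existing LUT input


# A LUT that uses all four inputs cannot be merged with an unrelated flip-flop.
print(can_merge({"a", "b", "c", "d"}, "f", "x"))  # False
print(can_merge({"a", "b", "c"}, "f", "x"))       # True
```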
In some LE architectures, a LUT having four inputs is implemented using two LUTs having three inputs of A, B, and carry_in. In these architectures, an arithmetic function of more than two data inputs (e.g., the data inputs “A” and “B”) cannot be performed. For example, functions such as performing the addition or subtraction of the two data inputs under the control of another input cannot be performed.
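The limitation can be seen in a small model of such an arithmetic arrangement, in which one three-input LUT produces the sum bit and the other produces the carry. The helper names, the truth-table encoding, and the ripple-carry wrapper are assumptions for illustration rather than a description of a specific device.

```python
def make_lut3(fn):
    """Build an 8-entry truth table indexed by (a << 2) | (b << 1) | c."""
    return [fn((i >> 2) & 1, (i >> 1) & 1, i & 1) for i in range(8)]


def lut3(table, a, b, c):
    """Look up the output of a three-input LUT."""
    return table[(a << 2) | (b << 1) | c]


# The two three-input LUTs of one LE, both fed by A, B, and carry_in.
SUM_LUT = make_lut3(lambda a, b, cin: a ^ b ^ cin)
CARRY_LUT = make_lut3(lambda a, b, cin: (a & b) | (a & cin) | (b & cin))


def ripple_add(x, y, width):
    """Add two unsigned values using one LE (two 3-input LUTs) per bit."""
    result, carry = 0, 0
    for bit in range(width):
        a, b = (x >> bit) & 1, (y >> bit) & 1
        result |= lut3(SUM_LUT, a, b, carry) << bit
        carry = lut3(CARRY_LUT, a, b, carry)
    return result, carry


print(ripple_add(5, 3, 4))   # (8, 0)
print(ripple_add(15, 1, 4))  # (0, 1)
```

A controlled add/subtract would need each sum bit to depend on A, B, carry_in, and the control input, for example A XOR (B XOR control) XOR carry_in. That is a function of four variables, which does not fit in a three-input table whose inputs are already A, B, and carry_in.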
For the foregoing reasons, it is desirable to have a PLD that includes versatile and efficient LEs and logic array blocks.