Field programmable gate array FPGA architectures can be described as a sea of configurable logic blocks connected by universal bi-directional interconnect. FIG. 1 illustrates this generalized concept as an array of configurable logic blocks CLB interconnected with a universal switching fabric 103.
Configurable logic computation capability is combined with a small amount of data memory and a small amount of fixed logic into a configurable logic block CLB. A configurable logic computation unit is typically a look-up table, i.e., a small memory, with the data to be computed upon applied to the address bits, and with each location in the memory providing the output required to complete the truth table specified by a logic function. The logic function is defined during the design process. The look-up table memory is programmed once upon initial power up and is static thereafter. Data memory is generally implemented as one or two register bits to store the results of the computation between clock cycles. A general configurable logic block CLB example is shown in FIG. 2. A small amount of logic may also be available to enable data connection between several localized computational units.
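The look-up table behavior described above can be illustrated with a minimal Python sketch. It is not taken from any particular device: the names program_lut, lut_read and ConfigurableLogicBlock, the bit ordering of the address, and the single register bit are all assumptions made for illustration.

```python
def program_lut(logic_function, num_inputs):
    """Fill the look-up table: one stored bit per input combination.

    Assumption: input i of the logic function is bit i of the address.
    """
    table = []
    for address in range(2 ** num_inputs):
        bits = [(address >> i) & 1 for i in range(num_inputs)]
        table.append(logic_function(*bits) & 1)
    return table


def lut_read(table, inputs):
    """Compute by reading memory: the inputs form the address."""
    address = 0
    for i, bit in enumerate(inputs):
        address |= (bit & 1) << i
    return table[address]


class ConfigurableLogicBlock:
    """Hypothetical CLB model: one look-up table plus one register bit."""

    def __init__(self, table):
        self.table = table      # programmed once, static afterwards
        self.register = 0       # data memory held between clock cycles

    def evaluate(self, inputs):
        """Combinatorial output of the look-up table."""
        return lut_read(self.table, inputs)

    def clock(self, inputs):
        """Store the look-up table result in the register on a clock edge."""
        self.register = self.evaluate(inputs)
        return self.register


def majority(a, b, c):
    """Example logic function: 3-input majority vote."""
    return (a & b) | (a & c) | (b & c)


majority_table = program_lut(majority, 3)
clb = ConfigurableLogicBlock(majority_table)
```

In this sketch, changing the logic function changes only the table contents, not the read path, which is the sense in which the block is configurable.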
Configurable logic blocks CLB are connected through a bi-directional interconnect scheme in which any configurable logic block CLB output can be connected to many different configurable logic block CLB inputs using a series of isolating connectors. FIG. 3 shows how wire segments and isolating connectors are used to provide interconnects unique to an algorithm being implemented. The direction of data flow over segments of the interconnect is determined by the digital algorithm being implemented. In general, this data flow is unidirectional. Although each interconnect is designed to function in either direction, it is programmed and used in only one direction at a time.
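One way to picture the wire segments and isolating connectors is as a graph in which each programmed (closed) connector joins two segments into a single electrical net. The Python sketch below works under that assumption only. The function names, the segment numbering and the CLB pin names are hypothetical, and a union-find structure stands in for whatever routing representation a real device uses.

```python
def build_nets(num_segments, closed_connectors):
    """Group wire segments into nets joined by closed isolating connectors."""
    parent = list(range(num_segments))

    def find(segment):
        # Follow parent links to the representative segment of the net.
        while parent[segment] != segment:
            parent[segment] = parent[parent[segment]]
            segment = parent[segment]
        return segment

    for a, b in closed_connectors:
        parent[find(a)] = find(b)
    return [find(s) for s in range(num_segments)]


def route(num_segments, closed_connectors, drivers, sinks):
    """Return which CLB output feeds each CLB input.

    drivers maps a CLB output name to the segment it drives.
    sinks maps a CLB input name to the segment it listens on.
    Each net may have only one driver, so data flows one way over it.
    """
    net_of = build_nets(num_segments, closed_connectors)
    driver_of_net = {}
    for name, segment in drivers.items():
        net = net_of[segment]
        if net in driver_of_net:
            raise ValueError("two outputs drive the same net")
        driver_of_net[net] = name
    return {name: driver_of_net.get(net_of[segment])
            for name, segment in sinks.items()}


# Five segments in a row; the connector between segments 2 and 3 is left open.
connections = route(
    num_segments=5,
    closed_connectors=[(0, 1), (1, 2), (3, 4)],
    drivers={"clb_a.out": 0, "clb_b.out": 3},
    sinks={"clb_c.in0": 2, "clb_d.in0": 4, "clb_e.in0": 1},
)
```

The segments themselves carry no direction in this model. Direction arises only from where the single driver sits on each net, which mirrors the point that the interconnect is built to be bi-directional but used in one direction.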
A specific interconnect configuration is programmed at power up, with the connections among all the configurable logic block CLB inputs and outputs specified through the interconnect. After power-up programming is completed, the interconnect is static until the next power-up cycle. The universal nature of the interconnect cannot be changed to optimize differences in interconnect requirements for various parts of the digital algorithm.
A specified set of logic functions within the configurable logic blocks CLB combined with a specified interconnect allows the field programmable gate array FPGA to compute virtually any digital logic function that can fit within the boundaries of the array.
The foregoing approach has several drawbacks including the following:
Configurable logic blocks CLBs and universal interconnect of FPGAs cannot be changed during execution. They are static. While offline, in-system reconfigurability can occur, but requires many clock cycles and occurs in a manner similar to device programming.
The universal interconnect is designed for bi-directional traffic but is used uni-directionally. As fabrication technology increases the dependence of FPGA performance on interconnect, the universal interconnect strategy becomes increasingly inefficient and dominates performance.
The universal interconnect strategy does not adapt to optimize the local and global nature of interconnect in the algorithm under consideration.
The structures needed to program the array take up a large amount of silicon, increasing the cost of the device.
Significant numbers of registers are unused. Their distributed nature makes them unavailable to other parts of the digital algorithm.
The distribution of registers requires data to flow to physically different areas of the FPGA to execute digital algorithms.
The clocking rate of the FPGA computation is determined by the implementation of the digital algorithm.
Although reprogrammability has made the FPGA a powerful solution for some applications, FPGAs remain unsuitable for many applications. Because of the foregoing drawbacks, FPGAs are unsuitable for use as dynamic reconfigurable computing structures.
Most digital algorithms are implemented in hardware using a combination of three elements: combinatorial gates to perform boolean logic; registers to store boolean logic; and interconnect to provide boolean connections between the gates and registers.
FIG. 4 shows an organization of these elements that can compute portions of a complete digital algorithm. In this example, two sets of combinatorial computation are placed between three register sets. When combined together, the sets form the complete computation of a digital algorithm. During each cycle, the boolean logic gates are used to further data computation and the registers are used to store data for use during later computation cycles. Inputs and outputs are also shown to enter and leave the combinatorial gate set. As inputs and register values change, unique computations are performed each cycle. A digital algorithm of any size can be computed using a combination of such structures.
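The register and combinatorial organization described above can be sketched in Python as follows. This is an illustrative model under stated assumptions rather than a transcription of FIG. 4: the first register set is assumed to capture the external input, the 4-bit example functions comb_a and comb_b are arbitrary, and the names clock_cycle and run are hypothetical.

```python
def clock_cycle(state, external_input, comb_a, comb_b):
    """Advance three register sets by one clock cycle.

    All next values are computed from the current state before any
    register is updated, as with registers sharing one clock.
    """
    reg0, reg1, reg2 = state
    next_reg0 = external_input   # assumption: inputs enter at the first stage
    next_reg1 = comb_a(reg0)     # first set of combinatorial gates
    next_reg2 = comb_b(reg1)     # second set of combinatorial gates
    return (next_reg0, next_reg1, next_reg2)


def run(inputs, comb_a, comb_b, state=(0, 0, 0)):
    """Apply one input per cycle and record the register state after each."""
    history = []
    for value in inputs:
        state = clock_cycle(state, value, comb_a, comb_b)
        history.append(state)
    return history


def comb_a(x):
    """Example gate set: invert bits 1 and 3 of a 4-bit value."""
    return x ^ 0b1010


def comb_b(x):
    """Example gate set: keep only bits 1 and 2 of a 4-bit value."""
    return x & 0b0110


history = run([0b0011, 0b0101, 0b1111, 0b0000], comb_a, comb_b)
```

Each input takes two further cycles to reach the last register set, and data moves in one direction only, from one register stage to the next.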
Examining this gate-level structure, we have observed that: data flows in one direction between register stages; a significant amount of logic can occur between register stages; and boolean gates provide the capability for a low level of design implementation. In an FPGA, configurable logic blocks CLB provide these capabilities.
We have also observed that, in this gate-level structure, only a small subset of gate outputs are registered and that, in an FPGA, it is the interconnect that provides this capability.