When integrated circuits (ICs) were first introduced, they were extremely expensive and limited in functionality. Rapid strides in semiconductor technology have vastly reduced the cost while simultaneously increasing the performance of IC chips. However, the design, layout, and fabrication process for a dedicated, custom-built IC remains quite costly. This is especially true where only a small quantity of a custom-designed IC is to be manufactured. Moreover, the turn-around time (i.e., the time from initial design to a finished product) can frequently be quite lengthy, especially for complex circuit designs. For electronic and computer products, it is critical to be the first to market. Furthermore, for custom ICs, it is rather difficult to effect changes to the initial design. It takes time, effort, and money to make any necessary changes.
In view of the shortcomings associated with custom ICs, field programmable gate arrays (FPGAs) offer an attractive solution in many instances. Basically, FPGAs are standard, high-density, off-the-shelf ICs which can be programmed by the user to a desired configuration. Circuit designers first define the desired logic functions, and the FPGA is programmed to process the input signals accordingly. As a result, FPGA implementations can be designed, verified, and revised in a quick and efficient manner. Depending on the logic density requirements and production volumes, FPGAs are superior alternatives in terms of cost and time-to-market.
A typical FPGA essentially consists of an outer ring of I/O blocks surrounding an interior matrix of configurable logic blocks. The I/O blocks residing on the periphery of an FPGA are user programmable, such that each block can be programmed independently to be an input or an output and can also be tri-statable. Each logic block typically contains programmable combinatorial logic and storage registers. The combinatorial logic is used to perform Boolean functions on its input variables. Often, the registers are loaded directly from a logic block input, or they can be loaded from the combinatorial logic.
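The logic block described above can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that the programmable combinatorial logic is stored as a truth table and that a single configuration bit selects whether the register is loaded from the combinatorial logic or directly from a block input. The names LogicBlock, truth_table, and reg_from_logic are hypothetical and do not correspond to any particular device.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class LogicBlock:
    """Hypothetical logic block: truth-table logic plus one storage register."""
    truth_table: Sequence[int]    # 2**n entries programmed by the user (assumption)
    reg_from_logic: bool = True   # True: register loads from the combinatorial logic
    reg: int = 0                  # current register contents

    def combinational(self, inputs: Sequence[int]) -> int:
        # Treat the input bits as an index into the truth table (first input is the MSB).
        index = 0
        for bit in inputs:
            index = (index << 1) | (bit & 1)
        return self.truth_table[index]

    def clock(self, inputs: Sequence[int], direct_in: int = 0) -> int:
        # On a clock edge, load the register from the selected source.
        if self.reg_from_logic:
            self.reg = self.combinational(inputs)
        else:
            self.reg = direct_in & 1
        return self.reg


# Program the block as a 2-input XOR.
xor_block = LogicBlock(truth_table=[0, 1, 1, 0])
xor_out = xor_block.combinational([1, 0])
```

Reprogramming the same block to another Boolean function only requires a different truth table, which is the sense in which the logic is user programmable.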
Interconnect resources occupy the channels between the rows and columns of the matrix of logic blocks and also between the logic blocks and the I/O blocks. These interconnect resources provide the flexibility to control the interconnection between two designated points on the chip. Usually, a network of metal lines runs horizontally and vertically in the rows and columns between the logic blocks. Programmable switches connect the inputs and outputs of the logic blocks and I/O blocks to these metal lines. Crosspoint switches and interchanges at the intersections of rows and columns are used to switch signals from one line to another. Often, long lines are used to run the entire length and/or breadth of the chip.
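As a rough illustration of how programmable switches determine connectivity, the following Python sketch treats block pins and metal lines as nodes and each closed switch as a link between two nodes. Two points are connected when a path of closed switches joins them. The class Interconnect and the node names are hypothetical, and the model ignores signal direction, delay, and electrical loading.

```python
from collections import deque


class Interconnect:
    """Hypothetical model: nodes are pins or metal lines, links are closed switches."""

    def __init__(self):
        self._links = {}

    def close_switch(self, a, b):
        # Programming a switch joins two nodes in both directions.
        self._links.setdefault(a, set()).add(b)
        self._links.setdefault(b, set()).add(a)

    def connected(self, src, dst):
        # Breadth-first search over closed switches only.
        seen = {src}
        queue = deque([src])
        while queue:
            node = queue.popleft()
            if node == dst:
                return True
            for nxt in self._links.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        return False


fabric = Interconnect()
fabric.close_switch("blockA.out", "h_line_0")    # pin to horizontal line
fabric.close_switch("h_line_0", "v_line_1")      # crosspoint at an intersection
fabric.close_switch("v_line_1", "blockB.in0")    # vertical line to pin
routed = fabric.connected("blockA.out", "blockB.in0")
```

In this sketch a route from one logic block to another consumes one switch per hop, which hints at why interconnect resources can become a limiting factor.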
The functions of the I/O blocks, logic blocks, and their respective interconnections are all programmable. Typically, these functions are controlled by a configuration program stored in an on-chip memory. The configuration program is loaded automatically from an external memory upon power-up or on command, or it is programmed by a microprocessor as part of system initialization.
The FPGA concept was summarized in the 1960s by Minnick, who described cells and cellular arrays as reconfigurable devices in the following documents: Minnick, R. C. and Short, R. A., "Cellular Linear-Input Logic, Final Report," SRI Project 4122, Contract AF 19(628)-498, Stanford Research Institute, Menlo Park, Calif., AFCRL 64-6, DDC No. AD 433802 (February 1964); Minnick, R. C., "Cobweb Cellular Arrays," Proceedings AFIPS 1965 Fall Joint Computer Conference, Vol. 27, Part 1, pp. 327-341 (1965); Minnick, R. C. et al., "Cellular Logic, Final Report," SRI Project 5087, Contract AF 19(628)-4233, Stanford Research Institute, Menlo Park, Calif., AFCRL 66-613 (April 1966); and Minnick, R. C., "A Survey of Microcellular Research," Journal of the Association for Computing Machinery, Vol. 14, No. 2, pp. 203-241 (April 1967). In addition to memory-based (e.g., RAM-based, fuse-based, or antifuse-based) means of enabling interconnects between devices, Minnick also discussed both direct connections between neighboring cells and the use of busing as another routing technique. The article by Spandorfer, L. M., "Synthesis of logic Function on an Array of Integrated Circuits," Stanford Research Institute, Menlo Park, Calif., Contract AF 19(628)2907, AFCRL 64-6, DDC No. AD 433802 (November 1965), discussed the use of a complementary MOS bi-directional passgate as a means of switching between two interconnect lines that can be programmed through memory means and adjacent neighboring cell interconnections. In Wahlstrom, S. E., "Programmable Logic Arrays--Cheaper by the Millions," Electronics, Vol. 40, No. 25, 11, pp. 90-95 (December 1967), a RAM-based, reconfigurable logic array consisting of a two-dimensional array of identical cells with both direct connections between adjacent cells and a network of data buses is described.
Shoup, R. G., "Programmable Cellular Logic Arrays," Ph.D. dissertation, Carnegie-Mellon University, Pittsburgh, Pa. (March 1970), discussed programmable cellular logic arrays, reiterated many of the same concepts and terminology of Minnick, and recapitulated the array of Wahlstrom. In Shoup's thesis, the concept of neighbor connections extends from the simple 2-input 1-output nearest-neighbor connections to the 8-neighbor 2-way connections. Shoup further described the use of buses as part of the interconnection structure to improve the power and flexibility of an array. Buses can be used to route signals over distances too long, or in directions too inconvenient, for ordinary neighbor connections. This is particularly useful in passing inputs and outputs from outside the array to interior cells.
U.S. Pat. No. 4,020,469 discussed a programmable logic array that can program, test, and repair itself. U.S. Pat. No. 4,870,302 introduced a coarse grain architecture without the use of neighbor direct interconnections, in which all the programmed connections are made through three different sets of buses in a channeled architecture. The coarse grain cell (called a Configurable Logic Block or CLB) contains both RAM-based look-up table combinational logic and flip-flops, and user-defined logic must be mapped into the functions available inside the CLB. U.S. Pat. No. 4,935,734 introduced a simple logic function cell in which each cell is defined as a NAND, NOR, or similar type of simple logic function. The interconnection scheme is through direct neighbor and directional bus connections. U.S. Pat. Nos. 4,700,187 and 4,918,440 defined a more complex logic function cell in which Exclusive OR and AND functions and a register bit are available and selectable within the cell. The preferred connection scheme is through direct neighbor connections. The use of bi-directional buses as connections was also included.
Current FPGA technology has a few shortcomings. These problems are reflected in the low level of circuit utilization relative to the vast number of transistors that manufacturers provide on chip. Circuit utilization is influenced by three factors. The first, at the transistor or fine grain cell level, is the function and flexibility of the basic logic element that can be readily used by the users. The second is the ease with which meaningful macro logic functions can be formed from those basic logic elements with minimum waste of circuit area. The last factor is the interconnection of those macro logic functions to implement a chip level design efficiently. The fine grained cell architectures, such as those described above, provide easily usable and flexible logic functions for designers at the base logic element level.
However, for dense and complex macro functions and chip level routing, the interconnection resources required to connect a large number of signals from the output of a cell to the input(s) of other cells can be quickly exhausted, and adding these resources can be very expensive in terms of silicon area. As a consequence, in a fine grained architecture design, most of the cells are either left unused due to inaccessibility or used as interconnect wires instead of logic, which adds greatly to routing delays in addition to lowering logic utilization. Alternatively, an excessive amount of routing resources is added, greatly increasing the circuit size. The coarse grain architecture coupled with extensive routing buses allows significant improvements for signals connecting outputs of a CLB to inputs of other CLBs. The utilization at the CLB interconnect level is high. However, the difficulty is the partitioning and mapping of complex logic functions so as to exactly fit into the CLBs. If a part of the logic inside the CLB is left unused, then the utilization (effective number of gates per unit area used) inside the CLB can be low.
Another problem with prior art FPGAs is that typically a fixed number of inputs and a fixed number of outputs are provided for each logic block. If, by happenstance, all the outputs of a particular logic block are used up, then the rest of that logic block becomes useless.
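The effect of a fixed output count can be made concrete with a small Python sketch. It assumes a hypothetical logic block described only by a gate capacity and a number of output pins, and packs candidate functions in order until either resource runs out. The function pack_block and all of the numbers are illustrative assumptions, not figures from any actual device.

```python
def pack_block(functions, gate_capacity, output_pins):
    """Greedily place (gates, outputs) functions into one hypothetical logic block.

    Returns the placed functions and the number of gates left stranded.
    """
    used_gates = 0
    used_outputs = 0
    placed = []
    for gates, outputs in functions:
        fits_gates = used_gates + gates <= gate_capacity
        fits_outputs = used_outputs + outputs <= output_pins
        if fits_gates and fits_outputs:
            placed.append((gates, outputs))
            used_gates += gates
            used_outputs += outputs
    return placed, gate_capacity - used_gates


# Three small functions, each needing 4 gates and 1 output,
# offered to a block with 20 gates but only 2 output pins.
placed, stranded_gates = pack_block([(4, 1), (4, 1), (4, 1)],
                                    gate_capacity=20, output_pins=2)
```

Under these assumed numbers the third function is rejected even though 12 gates remain, because both output pins are already consumed.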
Therefore, there is a need in the prior art for a new FPGA architecture that will maximize the utilization of an FPGA while minimizing any impact on the die size. The new architecture should provide flexibility at the lowest logic element level in terms of functionality and flexibility of use by users, high functional density per unit area at the macro level, where users can readily form complex logic functions with the base logic elements, and finally a high percentage of interconnectability with a hierarchical, uniformly distributed routing network for signals connecting macros and base logic elements at the chip level. Furthermore, the new architecture should provide users with the flexibility of having the number of inputs and outputs for each individual logic block be selectable and programmable, and a scalable architecture to accommodate a range of FPGA sizes.