When integrated circuits (ICs) were first introduced, they were extremely expensive and limited in their functionality. Rapid strides in semiconductor technology have vastly reduced the cost while simultaneously increasing the performance of IC chips. However, the design, layout, and fabrication process for a dedicated, custom-built IC remains quite costly. This is especially true where only a small quantity of a custom-designed IC is to be manufactured. Moreover, the turn-around time (i.e., the time from initial design to a finished product) can frequently be quite lengthy, especially for complex circuit designs. For electronic and computer products, it is critical to be the first to market. Furthermore, for custom ICs, it is rather difficult to effect changes to the initial design. It takes time, effort, and money to make any necessary changes.
In view of the shortcomings associated with custom ICs, Field Programmable Gate Arrays (FPGAs) offer an attractive solution in many instances. Basically, FPGAs are standard, high-density, off-the-shelf ICs that can be programmed by the user to a desired configuration. Circuit designers first define the desired logic functions, and the FPGA is programmed to process the input signals accordingly. FPGA implementations can thereby be designed, verified, and revised in a quick and efficient manner. Depending on the logic density requirements and production volumes, FPGAs are superior alternatives in terms of cost and time-to-market.
A typical FPGA essentially consists of an outer ring of I/O blocks surrounding an interior matrix of configurable logic blocks. The I/O blocks residing on the periphery of an FPGA are user programmable, such that each block can be programmed independently to be an input or an output and can also be tri-statable. Each logic block typically contains programmable combinatorial logic and storage registers. The combinatorial logic is used to perform Boolean functions on its input variables. Often, the registers are loaded directly from a logic block input, or they can be loaded from the combinatorial logic.
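As an illustration of the logic block just described, the following is a minimal sketch, not a model of any particular device. It assumes the combinatorial logic is a look-up table whose configuration bits form a truth table, and that a single configuration bit selects whether the register loads from the LUT output or directly from a block input. The class name `LogicElement` and its fields are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LogicElement:
    """Hypothetical model: one k-input LUT plus one storage register."""
    truth_table: List[int]     # 2**k configuration bits, indexed by packed inputs
    reg_from_lut: bool = True  # assumed config bit: load register from LUT or direct input
    reg: int = 0               # current register contents

    def lut(self, inputs):
        # Pack the input bits into an index; inputs[0] is the least significant bit.
        index = sum(bit << i for i, bit in enumerate(inputs))
        return self.truth_table[index]

    def clock(self, inputs, direct_in=0):
        # On a clock edge the register captures the LUT output or the direct input.
        self.reg = self.lut(inputs) if self.reg_from_lut else direct_in
        return self.reg


# Configure a 2-input LUT as XOR: table entries for packed inputs 0, 1, 2, 3.
xor_le = LogicElement(truth_table=[0, 1, 1, 0])
```

Changing only `truth_table` turns the same element into any other two-input Boolean function, which is the sense in which the logic is programmable.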
Interconnect resources occupy the channels between the rows and columns of the matrix of logic blocks and also between the logic blocks and the I/O blocks. These interconnect resources provide the flexibility to control the interconnection between two designated points on the chip. Usually, a metal network of lines runs horizontally and vertically in the rows and columns between the logic blocks. Programmable switches, called input and output connection boxes, connect the inputs and outputs of the logic blocks and I/O blocks to these metal lines. Crosspoint switches and interchanges at the intersections of rows and columns, called switch boxes, are used to switch signals from one line to another. Often, long lines are used to run the entire length and/or breadth of the chip.
The functions of the I/O blocks, logic blocks, and their respective interconnections are all programmable. Typically, a configuration program stored in an on-chip memory controls these functions. The configuration program is either loaded from an external memory, automatically upon power-up or on command, or programmed by a microprocessor as part of system initialization.
A typical FPGA architecture is shown in FIG. 1. The configurable logic block shown in the figure has its inputs connected to the routing fabric via the connection boxes (C-Box). The switch box (S-Box) can have different topologies, namely Wilton, Disjoint, or Hyper Universal, which provide enhanced routability at the expense of some extra resources.
In recent designs, the connection boxes of a logic cluster are no longer concentrated on the four adjacent channels but are placed on all four sides of a particular switch box, so that the connection box and the switch box appear as one single entity, as shown in FIG. 2.
A typical configurable logic block (CLB) is shown in FIG. 3. The logic block shown has a full matrix on the input side for its connectivity with the routing fabric, and an internal feedback matrix for merged nets. It could also have a full matrix on the output side to connect to the routing fabric. For generic FPGA structures, reference may be made to Vaughn Betz, “Architecture and CAD for Speed and Area Optimization of FPGAs,” Ph.D. thesis, University of Toronto, 1998, and to J. Rose, R. J. Francis, D. Lewis, and P. Chow, “Architecture of Field-Programmable Gate Arrays: The Effect of Logic Block Functionality on Area Efficiency,” IEEE Journal of Solid-State Circuits, Vol. 25, No. 5, October 1990, pp. 1217-1225.
More recently, designers have moved away from full crossbars, as these require large buffers, and instead use depopulated matrices as in FIG. 4. The inputs of a look-up table (LUT) are logically equivalent: they can be swapped, with a corresponding change to the configuration bits, and still implement the same logic. Given this, and given that driving more than one input of the same LUT with the same signal is unnecessary, a smaller input matrix as shown in FIG. 4 can replace the full matrix. The feedback matrix has been omitted for simplicity. Here the first 4×4 matrix serves the first inputs of all four LUTs, the second matrix serves the second inputs, and so on. The inputs of the logic block are thus split into four domains: the inputs that drive the first LUT inputs belong to one domain, those that drive the second LUT inputs belong to the second domain, and so on.
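The saving from depopulation can be checked with a short count. The sketch below is illustrative only: it assumes a cluster of four 4-input LUTs with sixteen cluster input pins, four per domain, and it ignores the feedback matrix. The numbering of pins into domains (pins 0-3 in domain 0, and so on) and all function names are assumptions made for the example.

```python
NUM_LUTS = 4         # LUTs in the cluster
LUT_INPUTS = 4       # inputs per LUT, which is also the number of domains
CLUSTER_INPUTS = 16  # cluster input pins, assumed four per domain


def full_matrix_switches():
    # Full crossbar: every cluster input can reach every LUT input pin.
    return CLUSTER_INPUTS * NUM_LUTS * LUT_INPUTS


def depopulated_switches():
    # One small matrix per LUT input position; each connects the cluster
    # inputs of one domain to that input position on every LUT.
    inputs_per_domain = CLUSTER_INPUTS // LUT_INPUTS
    return LUT_INPUTS * (inputs_per_domain * NUM_LUTS)


def domain_of(cluster_input):
    # Assumed numbering: pins 0-3 are domain 0, pins 4-7 are domain 1, etc.
    return cluster_input // (CLUSTER_INPUTS // LUT_INPUTS)


def reachable_lut_pins(cluster_input):
    # (lut, pin) pairs that a cluster input can drive in the depopulated matrix.
    d = domain_of(cluster_input)
    return [(lut, d) for lut in range(NUM_LUTS)]
```

Under these assumptions the full matrix needs 256 crosspoints and the four 4×4 matrices need 64, while each cluster input can still reach one input of every LUT.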
The disjoint switch box is very popular because it is easy to lay out. A disjoint switch box is shown in FIG. 5. A disjoint switch box has similar one-to-one connections on all sides, so a signal on a particular track remains on the same track throughout the fabric. If the logic block (CLB) in FIG. 6 is connected to a routing fabric with such a switch box via identical connection boxes on all sides, the routing resources are implicitly segregated into domains: the tracks that connect to pins of a particular domain belong to that domain, assuming that they each connect to one pin in one connection box. As shown in the figure, the sixteen input pins of a logic cluster form four different domains, and the tracks connecting to these pins are demarcated accordingly. Further, where the routing resources are not segregated into domains, a routing line on one side is connected to its corresponding routing line on the other side, as shown in FIG. 7.
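The track-preserving behaviour of the disjoint topology can be made concrete with a small model. This is a sketch under the assumption that track i on each side of the box can connect only to track i on the other three sides; the side names and function names are hypothetical and chosen for the example.

```python
SIDES = ("north", "east", "south", "west")


def disjoint_switch_box(num_tracks):
    """Return the set of possible connections as ((side, track), (side, track))."""
    connections = set()
    for track in range(num_tracks):
        for i, a in enumerate(SIDES):
            for b in SIDES[i + 1:]:
                # Same track index on both sides: the defining property.
                connections.add(((a, track), (b, track)))
    return connections


def reachable_tracks(start_track, num_tracks, hops):
    """Track indices a signal can occupy after crossing `hops` switch boxes."""
    box = disjoint_switch_box(num_tracks)
    current = {start_track}
    for _ in range(hops):
        nxt = set()
        for (_, ta), (_, tb) in box:
            # Connections are bidirectional, so check both endpoints.
            if ta in current:
                nxt.add(tb)
            if tb in current:
                nxt.add(ta)
        current = nxt
    return current
```

However many switch boxes the signal crosses, `reachable_tracks` returns only the starting track index, which is why tracks tied to pins of one domain never reach pins of another.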
These configurations suffer from greatly reduced routability. If a signal is routed on input line 705 to the logic block in a particular domain, only a limited number of tracks are available over which the signal can be routed. Where domains are not present, the signal is routed to its corresponding lines on the other sides. Specifically, only the routing tracks of the same domain are available, as shown in FIGS. 6 and 7. A set of four segments, one from each side, that form part of the disjoint switch box, e.g., 701, 702, 703, 704, belong to the same domain. This arises from the fact that all four connection boxes of a logic cluster are the same. Thus, there is a constraint on the connectivity of the routing tracks to other routing tracks.
Furthermore, if a net has sinks in more than one domain, the net has to be duplicated onto the routing fabric from the source itself. This increases the demand on routing tracks.
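The extra track demand can be estimated with a simple count. The sketch below assumes that a net needs one separate copy leaving its source for each distinct domain among its sink pins. The net names and domain assignments are made up for illustration and are not taken from any figure.

```python
def routing_copies(nets):
    """Map each net name to the number of copies that must leave its source.

    `nets` maps a net name to the list of domains of its sink pins.
    """
    return {name: len(set(domains)) for name, domains in nets.items()}


# Hypothetical nets: sink-pin domains are listed per net.
nets = {
    "clk_en": [0, 0, 0],  # all sinks in one domain: one copy suffices
    "data": [0, 2, 3],    # sinks in three domains: three copies
    "sel": [1, 3],        # sinks in two domains: two copies
}

copies = routing_copies(nets)
# Copies beyond the first are pure overhead caused by the domain constraint.
extra_copies = sum(count - 1 for count in copies.values())
```

In this made-up example three nets require six copies in total, three of which exist only because sinks fall in different domains.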