For the design of circuits on the scale of VLSI (Very Large Scale Integration) technology, designers often employ computer aided design techniques. Standard languages, such as Hardware Description Languages (HDLs), have been developed to describe circuits and to aid in the design and simulation of complex digital or analog circuits. Several hardware description languages, such as VHDL and Verilog, have evolved as industry standards. VHDL and Verilog are general purpose hardware description languages that allow definition of a hardware model at the gate level, the register transfer level (RTL) or the behavioral level using abstract data types. As device technology continues to advance, various product design tools have been developed to adapt HDLs for use with newer devices and design styles.
When designing circuits using HDL compilers, designers often describe circuit elements in HDL source code and then compile the source code to produce synthesized RTL netlists. The RTL netlists correspond to schematic representations of circuit elements. The circuits containing the synthesized circuit elements are often optimized to improve timing relationships and eliminate unnecessary or redundant elements. Such optimization typically involves substituting different gate types or combining and eliminating gates in the circuit. The optimization may be performed before mapping an RTL description, such as an RTL netlist, to a particular predetermined architecture of an integrated circuit, such as a target architecture of a Field Programmable Gate Array (FPGA). As is known in the art, different vendors of FPGAs often have different architectures, in that the circuitry within a particular integrated circuit for implementing a function may be different from one vendor to the next. Examples of vendors of FPGAs which have unique architectures include Xilinx and Altera. Various methods and systems for computer aided design of integrated circuits are described in U.S. Pat. Nos. 6,438,735; 6,449,762; and 6,973,632, all of which are incorporated herein by reference.
Often, it is desirable to implement a design in more than one integrated circuit, such as more than one FPGA integrated circuit. This is often necessary when the design is so complex and large that it cannot fit into the available resources on a single FPGA. This may also be useful for designers of ASICs (Application Specific Integrated Circuits). A growing number of ASIC designers prototype and test their designs by implementing them in several FPGAs. The complexity of an ASIC design may be such that multiple FPGAs are required to implement all the functions required by the design. This means that the designers must struggle to partition their design into multiple FPGAs, and several tools and methods exist in the prior art for doing so. An example of a technique for partitioning through an automatic process is described in U.S. Pat. No. 6,438,735, which also describes other prior art techniques for partitioning a design onto multiple FPGAs. An example of a software tool which performs automatic partitioning at the RT component level is the product Certify from Synplicity, Inc. of Sunnyvale, Calif. An example of a tool that performs automatic partitioning at the technology mapped netlist level is the Auspy Partition System II from Auspy Development Inc.
Even with existing techniques for partitioning, it can be difficult to partition a multiplexer (MUX) which can have a large width and potentially have many inputs, one of which is selected to drive the output of the multiplexer. A multiplexer is a well-known circuit element, and an example of a multiplexer is shown in FIG. 1A. FIG. 1A shows a circuit 10 which includes data drivers 12 and 14 which drive clocked data into the inputs 16 and 18 of the multiplexer 20. It will be appreciated that there may be two or more inputs into the multiplexer 20. For example, there may be 20 inputs, each being driven by a data driver into the multiplexer 20. The multiplexer 20 produces an output value at its output 22, and this output value is determined by the data present on the select lines 24 which cause the multiplexer to select one of the inputs for connection to its output 22.
Partitioning a large multiplexer typically requires creating at least three smaller multiplexers. Two of those smaller multiplexers each select one input from a different group of inputs (producing two selected inputs), and the third multiplexer then selects between those two selected inputs. FIGS. 1B and 1C show an example of such a partition of a large multiplexer into three smaller multiplexers.
In the case of the design 30 shown in FIG. 1B, a multiplexer has been partitioned into three multiplexers 40, 42 and 44 on two different integrated circuits 32 and 34. The multiplexers on the two different integrated circuits are coupled together by a multi-line bus 64 and by a wire 66. Multiplexers 40 and 44, in effect, select between two different groups of inputs to the original multiplexer to thereby provide two outputs which are selected between by the multiplexer 42. Each of the multiplexers is under control of select logic or select lines which cause the selection of an input to be connected to a corresponding output of the multiplexer in a manner known in the art. The select lines are typically driven by decode logic as is known in the art. As shown in FIG. 1B, logic 50 and 56 provide inputs to the multiplexer 40, while logic 54, through bus 62 (which includes a plurality of lines), provides inputs to the multiplexer 44. The multiplexer 44 also receives another input from bus 61 (which includes a plurality of lines). Logic 50 is coupled to the multiplexer 40 through the bus 60, and logic 56 is coupled to the multiplexer 40 through bus 64. The output of multiplexer 40 is coupled to one of the inputs of the multiplexer 42, and the output of the multiplexer 44 is coupled as the other input of the multiplexer 42 which selects between these two inputs to drive its output through line 22 to drive the logic 52. For each of the multiplexers 40, 42, and 44, each data input and each data output is a bus (which includes a plurality of lines), and the switching by the multiplexers switches between the input buses. It will be appreciated that the example shown in FIG. 1B is one of a variety of ways in which the various logic elements and multiplexers may be arranged or partitioned between two or more integrated circuits. 
However, in each case, the inputs which drive multiplexer 42 must be grouped together relative to the multiplexer 42 in order to preserve the logic implemented by the partitioned multiplexer. In other words, the inputs which drive multiplexer 42 cannot be used to drive multiplexers 40 or 44, but rather must be used to drive the final multiplexer in a tree of multiplexers in order to maintain the proper logical function of the partitioned multiplexer such that it provides the same logical multiplexing as the original multiplexer. FIG. 1C shows this relationship without the complication of multiple integrated circuits. In particular, inputs 70 and 71 must be partitioned with the multiplexer 44. Hence, for example, the drivers for those inputs 70 and 71 must be partitioned with the multiplexer 44. This tends to complicate partitioning operations.
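The three-multiplexer decomposition described above can be illustrated with a minimal Python sketch. It is not taken from any of the referenced tools. It models each input bus as an integer and assumes a simple index-based select. The names `mux`, `partitioned_mux` and `split` are hypothetical and introduced only for illustration. The sketch shows that the decomposed structure reproduces the original multiplexing only if the select decoding agrees with the way the inputs were grouped.

```python
def mux(inputs, select):
    """Flat multiplexer: pass the input chosen by `select` to the output."""
    return inputs[select]


def partitioned_mux(inputs, select, split):
    """Same function built from two leaf multiplexers and one final multiplexer.

    inputs[:split] feed the first leaf and inputs[split:] feed the second leaf.
    `split` is a hypothetical parameter; it must satisfy 0 < split < len(inputs).
    """
    group_a, group_b = inputs[:split], inputs[split:]
    pick_b = select >= split  # decode: which group holds the selected input
    # The leaf whose group is not selected gets a don't-care select (0 here).
    out_a = mux(group_a, 0 if pick_b else select)
    out_b = mux(group_b, select - split if pick_b else 0)
    # The final multiplexer chooses between the two leaf outputs.
    return mux([out_a, out_b], int(pick_b))


if __name__ == "__main__":
    buses = [0x10, 0x21, 0x32, 0x43, 0x54, 0x65]  # six example input buses
    for split in range(1, len(buses)):
        for sel in range(len(buses)):
            assert partitioned_mux(buses, sel, split) == mux(buses, sel)
    print("partitioned multiplexer matches the flat multiplexer")
```

In this sketch the grouping is fixed by `split`, so every driver in a group has to be routed to the leaf multiplexer for that group, which mirrors the grouping constraint discussed above.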
If logic synthesis is performed on a circuit containing a large multiplexer before partitioning into chips, then the resulting circuit will group drivers of the multiplexer according to the multiplexer decomposition chosen by the logic synthesis system. Many different decompositions are possible: trees of smaller multiplexers, AND-OR decompositions, and special purpose decompositions using FPGA-specific components. If the decomposed multiplexer is partitioned across chips, then excessive interconnect use between partitions may be caused by the chosen decomposition. If, on the other hand, the partitioning is done before the multiplexer is decomposed, then all signals connected to the multiplexer must be available in a single chip, which can also cause excessive interconnect between partitions. A partial solution, which is implemented in Certify, is to slice the multiplexer component into single-bit-wide multiplexers. Slicing does not help with partitioning multiplexers that have a large number of inputs, and it also creates a difficult partitioning problem for the decoding logic of the multiplexer, which will be shared across the slices. What is needed is a simultaneous solution to multiplexer decomposition and partitioning.
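The interconnect effect described above can be made concrete with a toy cost model, sketched here in Python. The model is an assumption made for illustration and is not the cost function of any referenced tool. It counts only data wires that cross a chip boundary in a two-level multiplexer tree and ignores select and decode signals. The function `cut_wires`, the driver names `d0` to `d7`, the chip labels and the 16-bit width are all hypothetical.

```python
def cut_wires(driver_chip, groups, leaf_chip, final_chip, width):
    """Count data wires crossing chip boundaries in a two-level multiplexer tree.

    driver_chip: maps each driver name to the chip it is placed on
    groups:      one list of driver names per leaf multiplexer
    leaf_chip:   chip on which each leaf multiplexer is placed
    final_chip:  chip on which the final multiplexer is placed
    width:       number of lines in each data bus
    """
    wires = 0
    for group, chip in zip(groups, leaf_chip):
        # Input buses whose driver sits on a different chip than the leaf.
        wires += width * sum(1 for d in group if driver_chip[d] != chip)
        # Leaf output bus that must reach a final multiplexer on another chip.
        if chip != final_chip:
            wires += width
    return wires


# Assumed placement: drivers d0..d3 on chip A, drivers d4..d7 on chip B.
driver_chip = {f"d{i}": ("A" if i < 4 else "B") for i in range(8)}
names = list(driver_chip)

# Decomposition that happens to match the placement of the drivers.
aligned = cut_wires(driver_chip, [names[:4], names[4:]], ["A", "B"], "A", 16)
# Decomposition chosen without regard to placement (interleaved groups).
interleaved = cut_wires(driver_chip, [names[0::2], names[1::2]], ["A", "B"], "A", 16)
# Undecomposed multiplexer kept whole on chip A.
undecomposed = cut_wires(driver_chip, [names], ["A"], "A", 16)

print(aligned, interleaved, undecomposed)  # prints: 16 80 64
```

Under these assumptions, the decomposition that matches the driver placement needs one 16-line bus between the chips, while the interleaved decomposition and the undecomposed multiplexer need five and four buses respectively.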