For the design of circuits on integrated circuits (ICs), designers often employ computer aided design techniques. Standard languages known as Hardware Description Languages (HDLs) have been developed to describe circuits to aid in the design and simulation of complex circuits. Several HDLs, such as VHDL and Verilog, have evolved as industry standards. VHDL and Verilog are general purpose hardware description languages that allow definition of a hardware model at the gate level, the register transfer level (RTL), or the behavioral level using abstract data types.
In designing circuits using HDL compilers, designers first describe circuit elements in HDL source code and then compile the source code to produce synthesized RTL netlists. The RTL netlists correspond to schematic representations of circuit elements. The circuits containing the synthesized circuit elements are often optimized to improve timing relationships and eliminate unnecessary or redundant logic elements. Such optimization typically involves substituting different gate types or combining and eliminating gates in the circuit.
FIG. 1 shows a representative flow for designing certain types of ICs, such as Field Programmable Gate Arrays (FPGAs), which have a predetermined architecture, referred to as a target architecture. Operation 101 involves receiving a description (e.g., a description written in HDL) of an IC by a compiler which, in operation 103, performs a synthesis from the HDL description to an RTL description. In operation 105, the RTL description is mapped to a target architecture, such as the architecture of a Xilinx FPGA, and optimization within the target architecture is performed. After optimization is completed, a netlist for the target architecture is generated. Various methods and systems for computer aided design of ICs are described in U.S. Pat. Nos. 6,438,735; 6,449,762; and 6,973,632, all of which are incorporated herein by reference.
FIG. 2 shows further details regarding a method, in the prior art, for optimizing a design of an IC. In operation 151, the loads which are driven by a particular component are determined, and in operation 153, a most critical load, of those loads, is determined. Then the particular critical component is replicated (operation 155) and the most critical load is connected (operation 157) to the replicated critical component. This will tend to reduce fanout at the source of the wiring net from the original critical component. The load is considered critical if it adversely affects timing constraints or requirements for the IC (or if it has negative slack), and the component is considered critical because it is driving the critical load. The slack of the IC is then recomputed in operation 159 and it is determined whether slack has improved (operation 161). If the slack has not improved, then the replication is discarded (operation 163) and processing returns to operation 151. If the slack has improved, then processing returns from operation 161 to operation 151 as shown in FIG. 2 to optimize other paths having other critical components. FIGS. 3A, 3B, and 3C show an example of the optimization method of FIG. 2.
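The loop of FIG. 2 can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the prior-art implementation: the `Design` class, the `replicate_once` function, and the two penalty constants are hypothetical, and the timing model (wire delay plus a per-load penalty at the driver, plus a penalty for every extra copy that loads the upstream logic) is a toy stand-in for a real timing analyzer.

```python
from dataclasses import dataclass

# Assumed toy timing model (not from the source):
FANOUT_PENALTY = 0.5    # delay added per load sharing a driver
UPSTREAM_PENALTY = 0.3  # delay added per extra driver copy (upstream loading)


@dataclass
class Design:
    wire_delay: dict  # load name -> intrinsic wire delay
    required: dict    # load name -> required arrival time
    drivers: dict     # driver name -> list of load names it drives

    def slack(self, load):
        """Required time minus estimated delay for one load."""
        upstream = UPSTREAM_PENALTY * (len(self.drivers) - 1)
        for loads in self.drivers.values():
            if load in loads:
                delay = (self.wire_delay[load]
                         + FANOUT_PENALTY * len(loads)
                         + upstream)
                return self.required[load] - delay
        raise KeyError(load)

    def worst_slack(self):
        return min(self.slack(load) for load in self.wire_delay)


def replicate_once(design, driver):
    """One pass through operations 151-163; True if the copy is kept."""
    loads = design.drivers[driver]               # operation 151
    if len(loads) < 2:
        return False                             # nothing to split off
    critical = min(loads, key=design.slack)      # operation 153
    before = design.worst_slack()
    copy = f"{driver}_copy{len(design.drivers)}"
    loads.remove(critical)
    design.drivers[copy] = [critical]            # operations 155 and 157
    after = design.worst_slack()                 # operation 159
    if after > before:                           # operation 161
        return True
    del design.drivers[copy]                     # operation 163: discard
    loads.append(critical)
    return False


design = Design(
    wire_delay={"L1": 3.0, "L2": 2.0, "L3": 1.0, "L4": 1.0},
    required={"L1": 4.0, "L2": 4.0, "L3": 4.0, "L4": 4.0},
    drivers={"D": ["L1", "L2", "L3", "L4"]},
)
kept_first = replicate_once(design, "D")   # splitting off L1 helps
kept_second = replicate_once(design, "D")  # a second copy costs more than it gains
```

In this toy model the first replication is kept and the second is discarded, which mirrors the two branches leaving operation 161. The figure as described loops back to operation 151 in both cases; any stopping rule around `replicate_once` would be a further assumption.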
FIG. 3A shows a representation of at least a portion of an integrated circuit which may be designed according to the method of FIG. 1, with an optimization performed according to the method of FIG. 2. The design at the stage shown in FIG. 3A includes 9 switch matrices (SM) on the representation 201 of the integrated circuit. Switch matrices are common interconnection devices used on certain types of field programmable gate arrays, such as gate arrays from Xilinx. The switch matrices 202-210 allow for the interconnection of various components, such as driver components which output signals to loads which receive those signals. The design shown in FIG. 3A includes one driver 215 and 7 loads, L1-L7. In particular, driver 215 drives loads 216-222 through the routing net shown in FIG. 3A, which includes wires W1, W2, and W3. These three wires are existing wire resources on the IC. The switches on the switch matrix 209 through which the wires are connected are labeled SW1, SW2, and SW3, and the switch at the driver is labeled SWD. The critical loads in the design are L1, L2, L3 and L4, in order of criticality with the most critical first. The delay of the net at each load depends on the wire delay, the switch delay and the fanout at each switch. The fanout at switch SWD is equal to 3 because there are 3 wires going to switches SW1, SW2, and SW3, each of which contributes its capacitance to the wire delay. The total fanout of the net from driver 215 is equal to 7, but the root fanout at the switch SWD is only 3. Further details of the root fanout at switch SWD are shown in FIG. 3B, which shows the driver component 215 providing an input to the switch matrix 209; that input is received by a driver 230, which in turn drives 3 pass gates (225, 226 and 227) in the switch SWD. Each pass gate has a parasitic capacitance, an example of which is shown as parasitic capacitance 231 on the pass gate 225.
These parasitic capacitances add to the delay at the root fanout. It will be appreciated that the driver component 215 may be one of a variety of different logic components, such as a flip-flop, a lookup table or other types of logic, including digital logic circuits.
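The difference between total fanout and root fanout, and the way pass-gate capacitance loads the root, can be made concrete with a small sketch. The assignment of loads to wires in `NET`, the function names, and the lumped RC estimate in `root_delay` are assumptions made for illustration; the description of FIG. 3A gives only the counts (3 wires at SWD, 7 loads in total).

```python
# Routing net of FIG. 3A as a tree rooted at switch SWD.
# Which loads hang off which wire is assumed here, not taken from the figure.
NET = {
    "SWD": ["W1", "W2", "W3"],
    "W1": ["L1", "L2", "L3", "L4"],
    "W2": ["L5", "L6"],
    "W3": ["L7"],
}


def root_fanout(net, root):
    """Number of wires (pass gates) driven directly at the root switch."""
    return len(net[root])


def total_fanout(net, node):
    """Number of loads reachable from a node; a node with no entry is a load."""
    children = net.get(node)
    if children is None:
        return 1
    return sum(total_fanout(net, child) for child in children)


def root_delay(net, root, r_driver, c_parasitic):
    """Lumped RC estimate: driver resistance times summed pass-gate capacitance."""
    return r_driver * c_parasitic * root_fanout(net, root)


print(root_fanout(NET, "SWD"))   # 3
print(total_fanout(NET, "SWD"))  # 7
```

Under this simplified model the root delay grows with the number of pass gates at SWD rather than with the number of loads, so cutting total fanout changes nothing at the root unless the number of wires leaving SWD also drops.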
Previous methods for replication concentrated on reducing the fanout, particularly the fanout at the root of the net, without paying attention to how the net is wired using existing wiring resources. For example, to reduce the fanout by isolating the critical loads L1 and L2, the driver 215 can be replicated so that the copied driver 215A drives the rest of the loads. The total fanout of the net driven from the driver 215 will then be 2 (down from 7) and the total fanout of the net driven from driver 215A will be 5. This is depicted in FIG. 3C. However, in terms of delay, little will change for critical loads L3 and L4, because the root fanout of the net driven from driver 215A is still 3, as before replication, and the delay of the switch 254 and of the wire W1A which connects loads L3 and L4 will be larger if the faster wire W1 was taken to connect to L1 and L2.
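The fanout figures quoted for FIG. 3C can be checked with a short sketch. The wire-to-load assignment in `BEFORE` and the helper names `fanouts` and `split_net` are hypothetical; the source states only the resulting counts. The copied net reuses the key "W1" for simplicity, where the figure labels the corresponding wire W1A.

```python
# Assumed wire-to-load assignment for the net of driver 215 before replication.
BEFORE = {
    "W1": ["L1", "L2", "L3", "L4"],
    "W2": ["L5", "L6"],
    "W3": ["L7"],
}


def fanouts(wires):
    """Return (root fanout, total fanout), ignoring wires with no loads."""
    used = [loads for loads in wires.values() if loads]
    return len(used), sum(len(loads) for loads in used)


def split_net(wires, isolated):
    """Fanout-driven replication: the original driver keeps the isolated
    loads and the copied driver takes every remaining load."""
    original = {w: [l for l in loads if l in isolated]
                for w, loads in wires.items()}
    copy = {w: [l for l in loads if l not in isolated]
            for w, loads in wires.items()}
    return original, copy


net_215, net_215a = split_net(BEFORE, {"L1", "L2"})
print(fanouts(BEFORE))    # (3, 7)
print(fanouts(net_215))   # (1, 2)
print(fanouts(net_215a))  # (3, 5)
```

The copy's total fanout falls from 7 to 5, but its root fanout stays at 3, which is why loads L3 and L4 see little improvement under this kind of replication.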
It is desirable to provide improved automated circuit design techniques, including the improved routing and optimization techniques described herein.