Designers of integrated circuit devices (“chips”), generally application-specific integrated circuits (“ASICs”), use prototyping as part of the electronic design automation process prior to manufacture of the chip. Prototyping is one type of hardware-based functional verification that allows the circuit designer to observe the behavior of the circuit design under conditions approximating its final, manufactured performance. During prototyping, a circuit design, generally written in register transfer language (“RTL”) code, is programmed into one or more programmable logic chips, frequently field-programmable gate arrays (“FPGAs”), on a prototyping board. FPGA-based prototypes are a fully functional representation of the circuit design, its circuit board, and its input/output (“I/O”) devices. Also, FPGA prototypes generally run at speeds much closer to the clock speed of the manufactured chip than other types of functional verification, e.g., software simulation, thereby allowing the circuit design to be verified under many more conditions in the same amount of time. The circuit design prototype may also be operated in another electronic circuit, e.g., the electronic circuit in which the design under verification will be used after fabrication, so that the circuit design prototype may be observed and tested in the environment in which the manufactured chip will be used. As such, circuit designers may use FPGA prototyping as a vehicle for software co-development and validation, increasing the speed and accuracy of system development.
Prototyping of a circuit design using programmable logic chips (e.g., FPGAs) can have advantages over other types of functional verification, namely emulation using a plurality of emulation processors. First, prototyping using programmable logic chips generally results in higher speed relative to emulation using emulation processors. Second, such higher-speed circuit design prototypes using programmable logic chips can sometimes even run in real-time, that is, the prototype may run at the intended clock speed of the manufactured chip, rather than a reduced clock speed. This is not always the case, notably for higher performance circuit designs that have clock speeds higher than the maximum allowed by the programmable logic chips. Third, such prototyping systems using programmable logic chips are generally of lower cost than an emulation system using processors.
Recently, RTL designs used for prototyping have become very large and generally need to be mapped/partitioned to several large FPGAs on a prototyping system. Typically, these large designs employ many clocks (e.g., one to one hundred or more clocks) for the operation of the design. With multiple FPGAs, interconnects are required between the FPGAs for signal flow from one portion of the circuit design logic on a first FPGA to another portion of the circuit design logic on a second FPGA and so forth. However, current FPGAs have a limited number of input/output (I/O) pins and interconnects, which results in overall limited bandwidth for multiple FPGA prototyping systems.
In order to reduce the number of I/O signals across the FPGA partitions to the available bandwidth between FPGAs, prototyping systems typically use time domain multiplexing. For example, suppose the inter-FPGA connectivity between two FPGAs has a bandwidth of 100 (i.e., 100 physical wires), but the circuit design is partitioned across these FPGAs such that 1000 design signals are required to cross over the interconnect. In this scenario, a selector is required to select one group of 100 of the 1000 signals for transmission at a time, with the selection repeated 10 times until all 1000 signals have been transmitted. To do so, a phase generator circuit is programmed to generate 1000/100, or 10, phases, such that each phase selects one of the groups of 100 signals to be transmitted from the first FPGA to the second FPGA.
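The arithmetic of the foregoing example can be sketched in Python as follows. This is only an illustrative model, not part of any prototyping system; the function names phase_count and tdm_schedule and the signal names are hypothetical, and the grouping of signals in consecutive order is an assumption made for simplicity.

```python
def phase_count(num_signals, num_wires):
    """Number of phases needed to carry num_signals over num_wires (rounded up)."""
    return (num_signals + num_wires - 1) // num_wires


def tdm_schedule(signals, num_wires):
    """Split the signals into per-phase groups of at most num_wires signals each.

    Assumption: signals are grouped in consecutive order; a real partitioner
    may choose groups differently.
    """
    return [signals[i:i + num_wires] for i in range(0, len(signals), num_wires)]


# The example from the text: 1000 design signals over 100 physical wires.
signals = [f"sig{i}" for i in range(1000)]  # hypothetical signal names
schedule = tdm_schedule(signals, 100)

for phase, group in enumerate(schedule, start=1):
    print(f"phase {phase}: {group[0]} .. {group[-1]} ({len(group)} signals)")
```

Under these assumptions, each entry of schedule corresponds to one phase of the phase generator circuit, and every design signal is transmitted exactly once per full cycle of phases.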
FIG. 1 illustrates a conventional pin multiplexing design for multiplexing three design signals to one output pin of an FPGA. As shown, the conventional multiplexing design 100 includes three AND gates 110A, 110B and 110C, with the output of each coupled to an inverted input of NAND gate 112. The output of the NAND gate 112 is coupled to an output terminal 114 of the FPGA, which is one I/O pin of the FPGA. Furthermore, each of the AND gates 110A, 110B and 110C has two inputs, with the first input being coupled to a phase enable signal and the second input being coupled to a design signal. The design signals I1, I2 and I3 are data signals from the user logic of the FPGA that are to be multiplexed on the interconnect line connected to output terminal 114. In particular, P1 corresponds to a first phase enable signal generated by a phase generator circuit, P2 corresponds to a second phase enable signal generated by the phase generator circuit, and P3 corresponds to a third phase enable signal generated by the phase generator circuit. Thus, phase enable signal P1 enables output terminal 114 to transmit design signal I1 over the attached interconnect, phase enable signal P2 enables output terminal 114 to transmit design signal I2 over the attached interconnect, and phase enable signal P3 enables output terminal 114 to transmit design signal I3 over the attached interconnect.
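The logic function of the multiplexing design of FIG. 1 can be modeled behaviorally as in the following Python sketch. The function names pin_mux and one_hot_phases are hypothetical, and the sketch assumes that exactly one phase enable signal is asserted at a time; it ignores timing, and therefore skew, entirely. A NAND gate with inverted inputs is logically equivalent to an OR gate, so the structure reduces to an AND-OR multiplexer.

```python
def pin_mux(phase_enables, design_signals):
    """Model of FIG. 1: AND gates 110A-110C feeding NAND gate 112 with inverted inputs."""
    # Each AND gate combines one phase enable (P) with one design signal (I).
    and_outputs = [p & i for p, i in zip(phase_enables, design_signals)]
    # NAND of the inverted AND outputs; by De Morgan's law this equals their OR.
    all_inverted_high = 1
    for a in and_outputs:
        all_inverted_high &= 1 - a
    return 1 - all_inverted_high


def one_hot_phases(n):
    """Yield (P1, ..., Pn) with exactly one phase enable asserted per phase (assumption)."""
    for k in range(n):
        yield tuple(1 if j == k else 0 for j in range(n))


design = (1, 0, 1)  # example values for I1, I2, I3
transmitted = [pin_mux(p, design) for p in one_hot_phases(3)]
print(transmitted)
```

In this model, the value appearing at output terminal 114 during each phase is the design signal whose phase enable is asserted, so the three design signals appear on the single pin in sequence.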
The FPGA pin multiplexing shown in FIG. 1 has numerous problems and limitations. In particular, the phase control lines of the pin-multiplexing logic that drive the I/O lines need to have very low skew (i.e., very little variation in the arrival time of the same signal at different loads) in order to guarantee logical correctness of operations and to achieve high performance. However, the I/O pins of current FPGAs are spread far apart, such that driving two or more I/O locations with the same driver (i.e., the phase enable output) mapped to local routing resources in each FPGA will lead to high skew. Moreover, the skew of the phase enable outputs will only increase with the number of fanout loads on each phase enable output. In addition, the FPGA fabric has a limited set of available low-skew global clock lines, and most modern FPGA place and route tools do not allow these global clock lines to be programmed to drive the combinational logic (i.e., non-clock loads) that is usually required for pin multiplexing, such as the AND gates shown in FIG. 1. And even if the FPGAs allow global low-skew clock lines to be used to drive combinational logic, there is a very limited number of these lines on current FPGAs, and the circuit design will quickly run out of them when they are used for pin multiplexing.