Programmable logic devices (PLDs) typically include a plurality of logic elements and associated interconnect resources that are programmed by a user to implement user-defined logic operations (e.g., an application specific circuit design). A PLD is typically programmed using programming software that is provided by the PLD's manufacturer, a personal computer or workstation capable of running the programming software, and a device programmer. In contrast, application specific integrated circuits (ASICs) have fixed-function logic circuits and fixed signal routing paths, and require a protracted layout process and an expensive fabrication process to implement a user's logic operation. Because PLDs can be utilized to implement logic operations in a relatively quick and inexpensive manner, PLDs are often preferred over ASICs for many applications.
FIG. 1(A) shows an example of a field programmable gate array (FPGA) 100, which is one type of PLD. Although greatly simplified, FPGA 100 is generally consistent with XC3000™ series FPGAs, which are produced by Xilinx, Inc. of San Jose, Calif.
FPGA 100 includes an array of configurable logic blocks CLB(1,1) through CLB(4,4) surrounded by IOBs IOB-1 through IOB-16, and programmable interconnect resources that include vertical interconnect segments 120 and horizontal interconnect segments 121 respectively extending between the rows and columns of CLBs and IOBs.
Each CLB includes configurable combinational circuitry and optional output registers that are programmed to implement logic in accordance with CLB configuration data stored in memory cells (not shown). Data is transmitted into each CLB on input wires 110 and is transmitted from each CLB on output wires 115. The configurable combinational circuitry of each CLB implements a portion of a logic operation responsive to signals received on input wires 110 in accordance with the CLB configuration data stored in that CLB's memory cells. Similarly, the optional output registers of each CLB transmit signals from the CLB onto a selected output wire 115 in accordance with the stored CLB configuration data. As discussed in further detail below, CLB configuration data for each CLB is generated by place-and-route software during a programming process. Typically, all of the CLBs of an FPGA include identical configurable circuitry, so configuration data stored in the memory cells of one CLB can be transferred to the memory cells of another CLB to allow "movement" of logic portions represented by the configuration data between the CLBs.
Each IOB includes configurable circuitry that is controlled by associated memory cells, which are programmed to store IOB configuration data during the place-and-route process. In accordance with the IOB configuration data, each IOB selectively allows an associated pin (not shown) of FPGA 100 to be used either for receiving input signals from other devices, or for transmitting output signals to other devices. Similar to the CLBs, all of the IOBs of an FPGA typically include identical configurable circuitry so that IOB configuration data can be easily transferred from one IOB to another IOB.
The programmable interconnect resources of FPGA 100 are configured using various switches to generate signal paths on which input and output signals are passed between the CLBs and IOBs to implement logic operations. These various switches include segment-to-segment switches, segment-to-CLB/IOB input switches, and CLB/IOB-to-segment output switches. Segment-to-segment switches include configurable circuitry that selectively couples wiring segments to form signal paths. Segment-to-CLB/IOB input switches include configurable circuitry that selectively connects the input wire 110 of a CLB (or IOB) to the end of a signal path. CLB/IOB-to-segment output switches include configurable circuitry that selectively connects the output wire 115 of a CLB (or IOB) to the beginning of a signal path.
FIG. 1(B) shows an example of a 6-way segment-to-segment switch 122 that selectively connects vertical wiring segments 121(1) and 121(2), and horizontal wiring segments 120(1) and 120(2), in accordance with 6-way switch configuration data stored in memory cells M1 through M6. 6-way switch 122 includes normally-open pass transistors that are turned on to provide a signal path (or branch) between any two (or more) of the wiring segments in accordance with the 6-way switch configuration data. For example, a signal path is provided between vertical wiring segment 121(1) and vertical wiring segment 121(2) by programming memory cell M5 to turn on its associated pass transistor. Similarly, a signal path is provided between vertical wiring segment 121(1) and horizontal wiring segment 120(2) by programming memory cell M1 to turn on its associated pass transistor. Similar signal paths between any two (or more) wiring segments are provided by selecting the relevant memory cell (or memory cells). Typically, the six-way configuration data stored in memory cells M1 through M6 is generated by a routing portion of the place-and-route software implemented during the programming process.
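The operation of a 6-way switch such as switch 122 can be illustrated with a minimal sketch in which each memory cell enables one pass transistor between a pair of wiring segments. Only the pairings for memory cells M1 and M5 are taken from the description above; the pairings shown for M2, M3, M4 and M6, and the names SWITCH_PAIRS and connected_segments, are hypothetical and are chosen only so that all six segment pairs are covered.

```python
# Pass-transistor pairings for a 6-way switch.
# M1 and M5 follow the description; M2, M3, M4 and M6 are assumed.
SWITCH_PAIRS = {
    "M1": ("121(1)", "120(2)"),
    "M2": ("121(1)", "120(1)"),  # assumed pairing
    "M3": ("121(2)", "120(1)"),  # assumed pairing
    "M4": ("121(2)", "120(2)"),  # assumed pairing
    "M5": ("121(1)", "121(2)"),
    "M6": ("120(1)", "120(2)"),  # assumed pairing
}


def connected_segments(memory_cells, start):
    """Return the set of segments joined to `start` by turned-on transistors.

    `memory_cells` maps a cell name to 1 (transistor on) or 0 (off).
    """
    reached = {start}
    frontier = [start]
    while frontier:
        seg = frontier.pop()
        for cell, (a, b) in SWITCH_PAIRS.items():
            if memory_cells.get(cell, 0) and seg in (a, b):
                other = b if seg == a else a
                if other not in reached:
                    reached.add(other)
                    frontier.append(other)
    return reached


# Programming M5 alone joins the two vertical segments.
path = connected_segments({"M5": 1}, "121(1)")
```

Programming more than one memory cell produces a branched signal path, which is why the sketch follows connections transitively rather than returning a single pair.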
FIG. 1(C) shows an example of a segment-to-CLB/IOB input switch 123 that selectively connects an input wire 110(1) of a CLB (or IOB) to one or more interconnect wiring segments in accordance with input switch configuration data stored in memory cells M7 and M8. Segment-to-CLB/IOB input switch 123 includes a multiplexer (MUX) having inputs connected to horizontal wiring segments 120(3) through 120(5) through buffers, and an output that is connected to CLB input wire 110(1). Memory cells M7 and M8 transmit control signals on select lines of the MUX such that the MUX passes a signal from one of the wiring segments 120(3) through 120(5) to the associated CLB (or IOB). As with the 6-way switch configuration data, the input switch configuration data is generated by a routing portion of the place-and-route software implemented during the programming process.
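The selection performed by the MUX of input switch 123 can be sketched as follows. The description states only that memory cells M7 and M8 drive the select lines; the particular select encoding used here (M7 as the high-order bit, with code 11 unused) and the name input_switch are assumptions made for illustration.

```python
# Wiring segments feeding the MUX, in an assumed select order.
MUX_INPUTS = ["120(3)", "120(4)", "120(5)"]


def input_switch(m7, m8, segment_values):
    """Return the signal passed to CLB input wire 110(1).

    m7, m8: stored memory cell values (0 or 1), used as select lines.
    segment_values: maps a segment name to its current signal (0 or 1).
    """
    select = (m7 << 1) | m8  # assumed encoding: M7 is the high-order bit
    if select >= len(MUX_INPUTS):
        raise ValueError("select code 11 is unused in this sketch")
    return segment_values[MUX_INPUTS[select]]


signals = {"120(3)": 0, "120(4)": 1, "120(5)": 0}
value = input_switch(0, 1, signals)  # passes the signal on segment 120(4)
```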
FIG. 1(D) shows an example of a CLB/IOB-to-segment output switch 124 that selectively connects an output wire 115(1) of a CLB (or IOB) to one or more interconnect wiring segments in accordance with output switch configuration data stored in memory cells M9 through M11. CLB/IOB-to-segment output switch 124 includes three pass transistors connected between output wire 115(1) and vertical wiring segments 120(3) through 120(5), and gates that are connected to memory cells M9 through M11. Memory cells M9 through M11 store output switch configuration data that turns on selected pass transistors to pass output signals from the CLB (or IOB) to one or more of wiring segments 120(3) through 120(5). As with other switch configuration data, the output switch configuration data is generated by a routing portion of the place-and-route software implemented during the programming process.
PLDs are typically programmed using configuration data generated during a PLD programming process. PLD programming processes are in part determined by the particular configurable circuit structure of the target PLD to be programmed. The PLD programming processes discussed below are specifically directed to FPGAs incorporating the configurable circuit structure described above.
FIG. 2(A) shows a block diagram generally illustrating the system utilized to generate configuration data for a target FPGA 100 such that the target FPGA implements a user's logic operation. The system includes a computer 200 and a programmer 210. The computer 200 has a memory 201 for storing software tools referred to herein as place-and-route software 202 and device programming software 203. The memory 201 also stores a computer-readable FPGA description 204, which identifies all of the interconnect resources, IOBs and CLBs of the target FPGA 100, including all of the memory cells associated with each of these elements.
The FPGA programming process typically begins when the user 230 enters the logic operation 205 into memory 201 via an input device 220 (for example, a keyboard, mouse or a floppy disk drive). The user then instructs the place-and-route software 202 to generate configuration data (place-and-route solution) which, when entered into the target FPGA device 100, programs the FPGA device 100 to implement the logic operation. The place-and-route software begins this process by accessing the FPGA description 204 and the logic operation 205. The place-and-route software 202 then divides the logic operation 205 into inter-related logic portions that can be implemented in the individual CLBs of the target FPGA 100 based on the description provided in the FPGA description 204. The place-and-route software 202 then assigns the logic portions to specific CLBs of the FPGA description 204. Routing data is then generated by identifying specific interconnect resources of the FPGA description 204 that can be linked to form the necessary signal paths between the inter-related logic portions assigned to the CLBs in a manner consistent with the logic operation 205. The placement and routing data is then combined to form the configuration data that is stored in a configuration data table 206.
FIG. 2(B) is a graphical representation of the configuration data table 206 that is part of memory 201. Configuration data table 206 includes several groups of configuration data associated with the various configurable circuits of the target PLD, such as the CLBs, IOBs, 6-way switches 122, input switches 123 and output switches 124. In the graphical representation shown in FIG. 2(B), each column includes data for a particular element type, and the values (1 or 0) assigned to the associated memory cells of that element. For example, the CLB column includes CLB 1,1 (located in the upper left corner of FPGA 100 in FIG. 1(A)) and values (VAL) assigned to associated memory cells M(a1), M(a2) . . . Similarly, the I/O column includes IOB-1 and the values assigned to associated memory cells M(b1), M(b2) . . . The memory cells associated with each of the 6-way switches 122, input switches 123 and output switches 124 are identified by their assigned address, and the values assigned to their memory cells are stored in a similar manner.
If the place-and-route software 202 fails to generate, for example, a routing solution for a particular target PLD (in this case, and hereinafter, FPGA 100), then the user may repeat the place-and-route operation using a different FPGA (e.g., one having more interconnect resources). Place-and-route operations and associated software are well known by those in the FPGA field.
After the place-and-route software 202 generates configuration data including memory states for all of the memory cells of target FPGA 100, the contents of configuration data table 206 are loaded into a PROM 150 (or directly into the target FPGA 100) via the programming software 203 and the programmer 210. Configuration data is loaded into PROM 150 when, for example, the target FPGA 100 includes volatile memory cells. In this case, PROM 150 is connected to special configuration pins of the target FPGA 100. When power is subsequently applied to the target FPGA 100, configuration logic provided on FPGA 100 loads the configuration data from PROM 150 into its volatile memory cells, thereby configuring FPGA 100 to implement the logic operation.
Core-based PLD programming methods were developed to shorten the PLD programming process by allowing logic designers to represent large portions of their logic operations with "black box" symbols referred to herein as "cores". Relatively simple logic operations are typically easily generated at a logic gate level by logic design personnel using, for example, CAD-based circuit design software, and then placed and routed in a relatively short amount of time using basic PLD programming methods. Conversely, the generation of relatively complex logic operations at a gate level is time consuming and tedious. To reduce the burden on logic designers, core-based programming includes a library of "cores" (sometimes referred to as "macros") that are associated with large, commonly-used logic functions, such as counters. Cores are selected from the library and incorporated into complex logic operations as single elements. Each core includes predefined configuration data that is typically implemented in several CLBs and associated interconnect resources to generate the particular logic function that is defined by that core. Therefore, core-based PLD programming reduces the time required to generate a complex logic operation by allowing a circuit designer to implement a large, commonly-used logic function of their large logic operation using a single core selected from the core library.
Core libraries are typically included in the PLD programming software provided by the manufacturer of a PLD. Examples of circuit cores available in the XACT™ programming software available from Xilinx, Inc. that are provided for programming XC4000™ series FPGAs include FIR filters, 16-bit multipliers, Fast Fourier Transforms (FFTs) and large adders. These cores are often represented by configuration data that requires the use of several CLBs arranged in a particular pattern on a target FPGA.
FIGS. 3(A) and 3(B) are diagrams illustrating examples of two hypothetical cores. The cores shown in FIGS. 3(A) and 3(B) are greatly simplified for explanatory purposes, and are not typically implemented in the manner described in this example. Further, also for explanatory purposes, each CLB is assumed to include configurable logic that can implement only one logic gate. Typically, actual cores include significantly more logic than that shown in these examples, and actual CLBs include configurable logic that can implement more than one logic gate.
FIG. 3(A) shows a first example of a core 310 implementing an RS flip flop made up of four NAND gates. Core 310 includes CLB configuration data for memory cells associated with four CLBs such that each CLB implements one two-input NAND gate. In addition, core 310 includes switch configuration data for memory cells associated with the interconnect resources required to provide signal paths for input signals S, R and CP, output signals Q and Q', and the intermediate signal paths between the CLBs. Typically, a logic designer cannot alter the configuration data associated with core 310 from its original CLB assignment pattern. For example, core 310 requires four CLBs arranged in the two-by-two CLB assignment pattern shown in FIG. 3(A). When core 310 is implemented on a target FPGA, the place-and-route software places core 310 into four CLBs of the target FPGA that are arranged in the two-by-two pattern shown in FIG. 3(A).
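The logic function of a core such as core 310 can be illustrated with a minimal behavioral sketch of a clocked RS flip flop built from four two-input NAND gates, one gate per CLB. The gate wiring used here is the conventional gated-latch arrangement and is an assumption, since the description does not spell out the connections of FIG. 3(A); the names nand and clocked_rs_latch are likewise hypothetical.

```python
def nand(a, b):
    """Two-input NAND gate on 0/1 values."""
    return 0 if (a and b) else 1


def clocked_rs_latch(s, r, cp, q, q_bar):
    """Settle a four-NAND clocked RS latch and return the new (Q, Q').

    s, r, cp: set, reset and clock inputs; q, q_bar: current outputs.
    The S = R = 1 input combination with CP high is not handled specially.
    """
    n1 = nand(s, cp)  # input-stage gate driven by S and CP
    n2 = nand(r, cp)  # input-stage gate driven by R and CP
    for _ in range(4):  # iterate the cross-coupled output pair until stable
        new_q = nand(n1, q_bar)
        new_q_bar = nand(n2, new_q)
        if (new_q, new_q_bar) == (q, q_bar):
            break
        q, q_bar = new_q, new_q_bar
    return q, q_bar


state = clocked_rs_latch(1, 0, 1, 0, 1)   # set while the clock is high
state = clocked_rs_latch(0, 0, 1, *state) # hold the stored value
```

The two input-stage gates and the two cross-coupled output gates correspond to the four CLBs of the two-by-two assignment pattern, and the intermediate values n1 and n2 correspond to signal paths routed between those CLBs.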
FIG. 3(B) shows another example of a core 320 for a four-input multiplexer that is implemented by four three-input AND gates, two inverters and one four-input OR gate. Core 320 includes the CLB configuration data associated with the memory states of the seven CLBs required to implement the logic gates. In addition, core 320 includes switch configuration data associated with the memory cells of the interconnect resources required to provide signal paths for input signals I0, I1, I2, I3, S0 and S1, output signal Y, and the intermediate signal paths between the CLBs. Similar to core 310 (see FIG. 3(A)), eight CLBs arranged in the two-by-four CLB assignment pattern shown in FIG. 3(B) are required to implement core 320 (i.e., the one unused CLB is typically not available for implementing other logic).
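The gate-level structure of core 320 can be sketched as follows, with one expression per gate (two inverters, four three-input AND gates and one four-input OR gate, for seven gates in total). The select encoding, in which S1 is the high-order select bit, and the function name mux4 are assumptions made for illustration.

```python
def mux4(i0, i1, i2, i3, s1, s0):
    """Four-input multiplexer built from seven gates; all values are 0 or 1."""
    ns0 = 1 - s0             # inverter 1
    ns1 = 1 - s1             # inverter 2
    a0 = i0 & ns1 & ns0      # AND gate selecting I0 when S1 S0 = 00
    a1 = i1 & ns1 & s0       # AND gate selecting I1 when S1 S0 = 01
    a2 = i2 & s1 & ns0       # AND gate selecting I2 when S1 S0 = 10
    a3 = i3 & s1 & s0        # AND gate selecting I3 when S1 S0 = 11
    return a0 | a1 | a2 | a3  # four-input OR gate producing Y


y = mux4(0, 1, 0, 0, 0, 1)  # selects I1
```

Each of the seven expressions would occupy one CLB under the one-gate-per-CLB assumption stated above, which leaves one site of the two-by-four pattern unused.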
Although core-based PLD programming methods provide a more convenient vehicle for circuit designers to generate complex logic operations, they introduce certain constraints that complicate the place-and-route methods typically utilized during PLD programming. These complications are described in the following paragraphs.
FIGS. 4(A) and 4(B) are graphical representations of placement arrangements showing five cores A, B, C, D and E placed on a target FPGA 400. FPGA 400 has 100 CLBs 410 arranged in 10 rows and 10 columns. The placement of core A is indicated by the dashed-line box located around the top seven rows in the right-most three columns of FPGA 400, and by the shaded CLBs located within the dashed box. Cores B, C, D and E are similarly identified.
FIGS. 4(A) and 4(B) employ graphical representations that are intended to illustrate the provisional placement and subsequent relocation (movement) of logic portions relative to CLB sites during the place-and-route portion of a PLD programming process in accordance with known methods. That is, although FIGS. 4(A) and 4(B) may appear to indicate that cores A, B, C, D and E are placed in CLBs of an actual FPGA, these figures are mere graphical representations of the provisional placement arrangements defined in configuration data table 206 (see FIG. 2(B)). Programming of an actual FPGA 400 takes place after the place-and-route software has determined an acceptable placement and routing solution.
In the graphical representations used herein, the "movement" of cores relative to the CLB sites indicates that configuration data associated with the logic portions assigned to a first group of CLB sites has been reassigned to a second group of CLB sites in configuration data table 206. For example, core A is moved from a first group of CLB sites located on the right edge of FPGA 400 (shown in FIG. 4(A)) to a second group of CLB sites located on the left edge of FPGA 400 (FIG. 4(B)). During this move, the upper left logic portion of core A is moved from CLB site CLB(1,8) in FIG. 4(A) to CLB site CLB(1,1) in FIG. 4(B). Typically, core movement is implemented by a gradual migration of the core from an initial position to a final position, where each step of the gradual migration includes an incremental shift of configuration data from one CLB site to an adjacent CLB site.
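The reassignment described above can be sketched with a simplified stand-in for the configuration data table, in which each CLB site maps to the core and logic portion assigned to it. The dictionary layout and the name move_core are hypothetical, and the sketch assumes that the target sites are not held by another core.

```python
def move_core(table, core, d_row, d_col):
    """Reassign every logic portion of `core` to sites offset by (d_row, d_col).

    `table` maps a CLB site (row, col) to a (core, portion) pair.
    Assumes no other core occupies the target sites.
    """
    moved = {site: entry for site, entry in table.items() if entry[0] == core}
    for site in moved:
        del table[site]
    for (row, col), entry in moved.items():
        table[(row + d_row, col + d_col)] = entry
    return table


# Core A as described for FIG. 4(A): rows 1 through 7 of columns 8 through 10.
table = {(r, c): ("A", (r - 1, c - 8)) for r in range(1, 8) for c in range(8, 11)}
for _ in range(7):  # seven incremental one-column shifts toward the left edge
    move_core(table, "A", 0, -1)
# The upper left logic portion now sits at CLB(1,1).
```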
As used herein, the term "annealing" refers to a part of the placement process in which initially-placed logic portions are re-allocated in an attempt to form an optimal place-and-route solution based on user-defined criteria or other considerations. For example, assume that FIG. 4(A) illustrates one possible example of an initial placement of cores A, B, C, D and E during the place-and-route process. Further, assume that FIG. 4(B) illustrates an optimal arrangement of these cores based on criteria supplied by the user (e.g., the user requires all of the cores to be closely packed along the left edge of the target FPGA 400). Annealing methods are used to move the cores between the CLBs of FPGA 400 from the initial placement arrangement shown in FIG. 4(A) in an attempt to find the optimal placement solution shown in FIG. 4(B).
Annealing methods for core-based PLD programming methods can be generally divided into two types: a non-overlapping annealing method type in which overlapping (i.e., assigning logic portions from two or more cores to a single CLB site) is not permitted, and an overlapping method type in which overlapping is permitted. These two method types are illustrated in FIGS. 5(A) through 6(D).
FIGS. 5(A) and 5(B) illustrate the first annealing method type wherein overlapping is not permitted. FIG. 5(A) is a graphical representation showing how the CLB assignment arrangements of large cores prevent free movement of the cores. As shown in FIG. 5(B), configuration data table 206(1) provides only a single memory value location for each memory cell of each CLB. Therefore, overlapping is precluded because each memory cell of a CLB site can hold data for only one core. Because cores must maintain their CLB assignment arrangement, large cores (such as core C) "block" the migration of other cores. For example, as shown in FIG. 5(A), the elongated core C is centrally located between core A and cores B, D and E. In addition, core C is mapped into nine of the ten CLB sites associated with two columns of CLBs, leaving only the first row of these columns free for the assignment of logic portions. Therefore, only cores that have CLB assignment arrangements that are one CLB high (i.e., are entirely located in a single CLB row) are permitted to migrate past core C. As a result, the movement of core A from the right side to the left side of FPGA 400 is blocked by core C because core A is assigned to seven CLB rows. Likewise, the movement of cores D and E from the left side to the right side of FPGA 400 is blocked by core C because cores D and E are assigned to three and six CLB rows, respectively.
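The blocking behavior of the non-overlapping method can be sketched as a legality test applied to each proposed migration step. The function name can_move is hypothetical, and the position assumed for core C (rows 2 through 10 of columns 5 and 6) is inferred from the description rather than read from the figure.

```python
def can_move(occupied, core_sites, d_row, d_col, rows=10, cols=10):
    """Return True if the core may shift by (d_row, d_col) without overlap.

    occupied: set of all (row, col) sites currently assigned to any core.
    core_sites: set of sites assigned to the core being moved.
    """
    own = set(core_sites)
    for row, col in core_sites:
        target = (row + d_row, col + d_col)
        if not (1 <= target[0] <= rows and 1 <= target[1] <= cols):
            return False  # the move would leave the CLB array
        if target in occupied and target not in own:
            return False  # the site holds data for another core
    return True


core_c = {(r, c) for r in range(2, 11) for c in (5, 6)}  # assumed position
core_a = {(r, c) for r in range(1, 8) for c in range(7, 10)}  # A beside C
blocked = not can_move(core_c | core_a, core_a, 0, -1)  # seven rows: blocked
one_high = {(1, 7)}                                     # a one-row core
free = can_move(core_c | one_high, one_high, 0, -1)     # passes over core C
```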
FIGS. 6(A) through 6(D) illustrate the second annealing method type in which core overlaps are permitted during migration. As shown in FIG. 6(A), as core A migrates toward the left side of FPGA 400 (as indicated by the arrow), some of the logic portions of core A overlap some of the logic portions of core C. Specifically, a group 610 of twelve CLB sites (surrounded by the oval) in the fifth and sixth columns of FPGA 400 are concurrently assigned to both core A and core C. This creates an overlap condition in which logic portions of cores A and C are provisionally mapped into the same CLB sites. Of course, this overlap condition must be removed in the final placement solution, but is tolerated during the annealing process because it allows the placement software to find the optimal placement solution (e.g., in which core A is migrated to the left of core C). As shown in FIG. 6(B), when core A is migrated fully to the left (i.e., situated above core B), an overlap is produced between core A and cores D and E. To relieve this overlap, for example, cores D and E are migrated to the right side of core C, as indicated by the arrows. Subsequent annealing procedures result in the optimal placement solution shown in FIG. 4(B).
Although overlap permissive annealing methods are desirable because they produce optimal placement solutions, they require a large amount of memory, and require a very long "run" time for the place-and-route software to find an optimal placement solution. Two such overlap permissive annealing methods are described below, and are referred to as a fixed memory annealing method and an "allocate/de-allocate on the fly" annealing method.
FIG. 6(C) shows a representation of memory storing a first modified configuration data table 206(2) in accordance with the fixed memory annealing method. In accordance with the fixed memory annealing method, several sets of configuration data memory are reserved for storing overlap information. For example, as shown in FIG. 6(C), modified configuration data table 206(2) includes five pre-defined sets of memory locations (FPGA-1 through FPGA-5) that allow up to five overlapping cores to be provisionally assigned to a single CLB site. For example, when cores A through E are in the initial placement arrangement shown in FIG. 4(A), where no overlaps exist, all of the configuration data is stored in a primary memory location (e.g., in memory set FPGA-1). Subsequently, as core A migrates from the initial placement arrangement to the overlapping condition shown in FIG. 6(A), the CLB sites initially occupied by logic portions of core C are overlapped by logic portions of core A. To accommodate this overlap, the logic portions of core A are stored in a secondary memory set (e.g., FPGA-2). Similarly, subsequent overlapping conditions are stored in secondary memory sets FPGA-3, FPGA-4 and FPGA-5.
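The fixed memory arrangement can be sketched as a set of pre-allocated arrays, one per memory set, each large enough to cover every CLB site. The names make_fixed_memory and assign_fixed are hypothetical, each site is reduced to a single stored core name rather than a full group of memory cell values, and row and column indices are zero-based for convenience.

```python
def make_fixed_memory(rows, cols, num_sets=5):
    """Reserve `num_sets` full-size memory sets (FPGA-1 through FPGA-5)."""
    return [[[None] * cols for _ in range(rows)] for _ in range(num_sets)]


def assign_fixed(memory, row, col, core):
    """Store `core` in the first memory set that is free at (row, col).

    Returns the 1-based memory set number used, or 0 if every set is
    already occupied at that site, in which case the move is illegal.
    """
    for index, memory_set in enumerate(memory, start=1):
        if memory_set[row][col] is None:
            memory_set[row][col] = core
            return index
    return 0


memory = make_fixed_memory(10, 10)
first = assign_fixed(memory, 1, 4, "C")   # stored in FPGA-1
second = assign_fixed(memory, 1, 4, "A")  # overlap stored in FPGA-2
```

Because all five sets are reserved in advance, the storage is five times that of a single table regardless of how many overlaps actually occur.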
There are several problems associated with the fixed memory annealing method. First, the number of overlapping cores is limited to the number of memory sets. For example, the modified configuration data table 206(2) shown in FIG. 6(C) is limited to five overlapping cores. If a sixth core migrates to a CLB site already occupied by five other cores, then the migration of the sixth core must be rejected as an illegal move. Another problem associated with fixed memory annealing methods is that, even when relatively few memory sets are provided, vast amounts of memory are required. The use of large amounts of memory can result in slower processing in some memory-restricted systems.
FIG. 6(D) shows a representation of memory storing a second modified configuration data table 206(3) in accordance with the "allocate on the fly" annealing method. Instead of fixed memory locations, the "allocate on the fly" annealing method uses a first (FPGA-PRIMARY) memory, and a multipurpose temporary (FPGA-TEMP) memory that is configured as necessary to store the overlap data for any given CLB site. For example, when cores A through E are in the initial placement arrangement shown in FIG. 4(A), where no overlaps exist, all of the configuration data is stored in the FPGA-PRIMARY memory. Subsequently, as core A migrates from the initial placement arrangement to the overlapping condition shown in FIG. 6(A), the CLB sites initially occupied by logic portions of core C are overlapped by logic portions of core A. To accommodate this overlap, portions of the FPGA-TEMP memory are addressed to identify the CLB sites in which the overlap occurs, and the logic portions of overlapping core A are stored in these newly-addressed memory locations. For example, assume a logic portion of core C is stored in CLB site 1,2 (identified as 1,1,2 in FIG. 6(D)) of the FPGA-PRIMARY memory set, and a subsequent migration by core A causes an overlap at CLB site 1,2. To accommodate this overlap, the placement software generates a secondary memory location in the FPGA-TEMP memory set (identified as 2,1,2 in FIG. 6(D)), then transfers the overlap data to the secondary memory location. Similarly, subsequent overlapping conditions are stored in the FPGA-TEMP memory set as memory locations 3,1,2, 4,1,2, etc.
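The addressing scheme described above can be sketched with two dictionaries, one standing in for the FPGA-PRIMARY memory and one for the FPGA-TEMP memory, in which temporary locations are created only when an overlap occurs. The names assign_on_the_fly and release_temp are hypothetical, and each location holds a core name rather than a full group of memory cell values.

```python
def assign_on_the_fly(primary, temp, row, col, core):
    """Store `core` at CLB site (row, col) and return its address.

    The address is (level, row, col): level 1 is the FPGA-PRIMARY memory
    and levels 2 and up are FPGA-TEMP locations created on demand.
    """
    if (row, col) not in primary:
        primary[(row, col)] = core
        return (1, row, col)
    level = 2
    while (level, row, col) in temp:  # find the next unused overlap level
        level += 1
    temp[(level, row, col)] = core
    return (level, row, col)


def release_temp(temp, address):
    """De-allocate a temporary location once its overlap is removed."""
    temp.pop(address, None)


primary, temp = {}, {}
addr_c = assign_on_the_fly(primary, temp, 1, 2, "C")  # primary location 1,1,2
addr_a = assign_on_the_fly(primary, temp, 1, 2, "A")  # temp location 2,1,2
```

The search for a free level and the creation of each new entry happen during placement, which illustrates where the additional run time of this method arises.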
The "allocate on the fly" annealing method has certain advantages over the fixed memory annealing method because it does not require large amounts of reserved memory, and because it can allow a virtually unlimited number of overlaps at a given CLB site. However, the "allocate on the fly" annealing method is time intensive because it requires the place-and-route software to generate and address the temporary memory locations during the placement process.
Another problem with overlap permissive annealing methods is that they sometimes produce "optimal" placement solutions that include one or more overlapping cores. As mentioned above, overlap permissive annealing methods generate infeasible intermediate states (i.e., wherein overlaps between the cores exist) in the course of iterative placement improvement. The use of infeasible intermediate states provides great flexibility to the optimization process, thereby hastening the progress towards an optimal solution. To ensure that the overlaps are completely removed by the end of the placement phase, placement algorithms dynamically increase the overlap-related weights as the optimization proceeds. While progressively increasing the penalty for overlaps during annealing successfully removes all overlaps in most cases, it cannot guarantee that overlaps will be removed in all cases. This is especially true in the programming of PLDs, such as FPGAs, where the number of sites (CLBs) into which the logic portions may be placed is fixed. Also, because very few overlaps remain at the end of the placement process, it is desirable to obtain a feasible placement solution (with no overlaps) that is as close to the original infeasible solution as possible.
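The progressively increasing overlap penalty can be sketched as a cost function whose overlap weight grows as the optimization proceeds. The linear ramp, the weight limits, and the names count_overlaps, overlap_weight and placement_cost are assumptions made for illustration; the description does not specify how the weights are increased or what other terms the cost includes.

```python
from collections import Counter


def count_overlaps(placement):
    """Count excess assignments over all CLB sites.

    placement maps a core name to an iterable of (row, col) sites.
    A site holding n cores contributes n - 1 overlaps.
    """
    usage = Counter(site for sites in placement.values() for site in sites)
    return sum(n - 1 for n in usage.values() if n > 1)


def overlap_weight(step, total_steps, w_start=1.0, w_end=100.0):
    """Assumed linear ramp of the overlap penalty weight."""
    return w_start + (w_end - w_start) * step / total_steps


def placement_cost(base_cost, placement, weight):
    """Total cost: a placement quality term plus the weighted overlap term."""
    return base_cost + weight * count_overlaps(placement)


placement = {"A": [(1, 1), (1, 2)], "C": [(1, 2), (1, 3)]}
early = placement_cost(10.0, placement, overlap_weight(0, 100))
late = placement_cost(10.0, placement, overlap_weight(100, 100))
```

The same overlapping placement is cheap early in the run and expensive late in the run, which discourages but does not forbid a residual overlap in the final solution.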
What is needed is a post-placement residual overlap removal method for use with overlap-permissive annealing processes that identifies the minimal placement change necessary to generate a feasible placement solution while removing all overlaps.