1. Field of the Invention
This invention relates in general to field programmable logic devices (FPLDs), and in particular to a method for programming an FPLD using a library-based technology mapping algorithm (LBTMA).
2. Description of the Prior Art
The term FPLD describes numerous types of field programmable integrated circuit (IC) devices, including programmable logic arrays and field programmable gate arrays. FPLDs are general purpose logic devices having a general circuit which incorporates programmable components disposed at predetermined circuit locations. The programmable components can be based on, for instance, fuse, antifuse, EPROM or transistor technologies. FPLDs are programmed to perform a specific function by opening, closing or enabling selected programmable components, thereby forming a subcircuit within the general circuit which provides the desired logic function. FPLDs are popular replacements for custom ICs in many circuit applications because they can be programmed to implement a circuit design much more quickly and less expensively than a custom IC can be produced.
Software design tools are computer programs which simplify the process of programming an FPLD to implement the logic defined by a circuit design. Software design tools execute a wide variety of tasks which typically include analysis and minimization of circuit designs, determining if a circuit design can be implemented by a target FPLD, partitioning a circuit design into logic functions which can be implemented by a target FPLD, selecting which circuit design logic function is implemented by which FPLD logic function, determining connections of FPLD interconnect resources necessary to implement the circuit design logic, providing performance information and generating a set of commands for automatic configuration of FPLDs using a device programmer. A circuit design may be first converted into a computer-readable form using a hardware description language, or using a schematic capture program. Typical software design tools then perform a "logic optimization" process which includes minimizing the number of logic elements necessary to provide the logic functions defined by the circuit design. Next, typical software design tools perform "technology mapping" in which logic of the circuit design is divided into component logic functions. These component logic functions are then compared to and matched with logic functions implemented by a target FPLD. Because the different FPLDs can implement different logic functions, technology mapping of a circuit design must be performed for each target FPLD. After technology mapping is complete, placement and routing are performed and a bit stream is typically generated which represents the programmed states of all of the programmable components of the target FPLD.
As the number of FPLD manufacturers has increased, the circuit designers' choice of FPLDs has also increased. Because an FPLD's price typically increases with the size of the FPLD, it is cost effective for a circuit designer to spend time determining the lowest priced FPLD which can implement a particular circuit design and meet performance requirements. However, the hardware description languages and software design tools of different FPLD manufacturers are typically incompatible. Therefore, to identify the most cost-efficient FPLD for implementing a particular circuit design using FPLD manufacturers' software design tools, circuit designers are required to obtain several software design tools and hardware description languages, convert the circuit design into each of the hardware description languages, execute the software design tools, and then compare the results.
The problems caused by the existence of several incompatible software design tools were solved in part by the emergence of third-party software design tool vendors. Third-party vendors provide universal design tools which can be used to perform technology mapping for a circuit design into FPLDs produced by several FPLD manufacturers, not just one. Universal tools are popular with circuit designers because they provide a convenient and time efficient method of determining the most cost-efficient FPLD for implementing a circuit design.
To sell FPLDs to circuit designers who use the services of third-party tool vendors, it is in the best interest of FPLD manufacturers to provide the third-party vendors with information regarding their FPLDs which is in a form usable by the third-party tool vendor's universal tool.
The first universal tools developed by third-party tool vendors were developed for the precursors to FPLDs, namely gate arrays and standard-cell systems. One type of software design tool developed by third-party tool vendors for gate arrays and standard cells which is now also used to program FPLDs is known as a library-based technology mapping algorithm (LBTMA). LBTMAs compare a circuit design to FPLD information stored as a library of elements, each element representing one logic function which can be implemented by an FPLD. During technology mapping, LBTMAs partition the logic of a circuit design into logic portions which are compared with the library elements until a match is found, and the matching library element is assigned to the logic portion. The library elements may have different costs. A library element that requires many FPLD resources may have a higher cost than one that requires few resources. The LBTMA attempts to find a set of library elements that implement the functions of the circuit design with the least total cost.
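The matching and cost selection described above can be illustrated with a greatly simplified sketch. It assumes the circuit design has already been partitioned into 2-input logic portions, each given as a 4-bit truth table, and it simply picks the cheapest library element with an identical truth table. The element names, costs, and the functions `cheapest_match` and `map_design` are hypothetical; an actual LBTMA also chooses the partition itself and is considerably more involved.

```python
from typing import NamedTuple, Optional


class Element(NamedTuple):
    name: str         # hypothetical library element name
    truth_table: int  # bit i is the output for address i = (b << 1) | a
    cost: int         # hypothetical cost in FPLD resources


# A toy library; two elements implement the same XOR function at different costs.
LIBRARY = [
    Element("AND2", 0b1000, 1),
    Element("OR2", 0b1110, 1),
    Element("XOR2", 0b0110, 2),
    Element("XOR2_FROM_GATES", 0b0110, 3),
]


def cheapest_match(truth_table: int, library: list) -> Optional[Element]:
    """Return the lowest-cost element implementing truth_table, or None."""
    matches = [e for e in library if e.truth_table == truth_table]
    return min(matches, key=lambda e: e.cost) if matches else None


def map_design(portions: list, library: list):
    """Assign one library element to each logic portion and total the cost."""
    chosen = []
    for truth_table in portions:
        element = cheapest_match(truth_table, library)
        if element is None:
            raise ValueError(f"no library element for {truth_table:04b}")
        chosen.append(element)
    return chosen, sum(e.cost for e in chosen)
```

Because every portion is compared against the library, the work in this sketch grows with the number of library elements, which is the dependence on library size that matters for the discussion that follows.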
One third-party tool vendor using this type of software design tool is Synopsys of Mountain View, Calif. Other LBTMAs are described in a paper by Hans-Jorg Mathony and Utz G. Baitinger entitled "CARLOS: An Automated Multilevel Logic Design System for CMOS Semi-Custom Integrated Circuits" published in IEEE Transactions on Computer-Aided Design, Vol. 7, No. 3, March 1988 at pages 346-355, and a paper by Kurt Keutzer of AT&T Bell Laboratories entitled "DAGON: Technology Binding and Local Optimization by DAG Matching" in the conference papers of the 24th ACM/IEEE Design Automation Conference, 1987 at pages 341-347.
LBTMAs are particularly well suited to gate arrays, standard cells, and certain FPLDs because manufacturers of these devices previously defined efficient implementations of specific logic functions. The number of gates which can be implemented by a field programmable device is typically on the order of several thousand, while mask programmed gate array devices typically implement 100,000 gates. The time and memory required for an LBTMA to map a circuit design into a target device such as a gate array is typically dependent upon the number of elements in the library used to program the target device. That is, for any given circuit design, the time required to perform technology mapping of the circuit design using a library of 500 elements is typically longer and takes more computer memory than to map the circuit design using a library of 200 elements.
One FPLD is a field programmable gate array (FPGA). An FPGA is typically organized as shown in FIG. 1. FPGAs are generally characterized in that they consist of a matrix of configurable logic blocks (CLBs) 11 surrounded by input/output blocks (IOBs) 12. In FIG. 1 lines 13 drawn between the rows and columns of CLBs are provided for showing a network of interconnect resources which can be configured to provide desired connections between two or more CLBs and between CLBs and IOBs. In an actual layout, these interconnect resources are not necessarily disposed between the CLBs and IOBs.
CLBs are groups of configurable logic elements connected to the interconnect resources through input lines and control lines. The configurable logic of typical CLBs includes one or more 2-, 3-, or 4-input lookup table based-function generators. An example of a CLB produced by Xilinx, Inc., of San Jose, Calif., under the series number XC4000 is shown in FIG. 2a. The CLB 11 of FIG. 2a contains F, G and H programmable function generators 21, two flip-flops 22, and an internal control section 23. There are eight logic signal inputs 24, a common clock input 25, and four control inputs 26. Logic functions of two, three, four or five inputs are generated by the F, G and H function generators 21. Data input for either flip-flop 22 within the CLB is supplied from the outputs of the lookup tables associated with function generators 21, or the control inputs 26.
Function generators typically include n input lines which are used to address $2^{n}$ single-bit memory locations (a "lookup table"). When a combination of input signals is applied to the input lines, the contents (high or low signal) of the single memory location addressed by the input signals are applied to an output line. The $2^{n}$ memory locations of the lookup table are programmed to provide output signals representing a desired Boolean function. Because all $2^{n}$ outputs are stored in the lookup table, function generators can implement any of the $2^{2^{n}}$ Boolean functions of n inputs.
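As a minimal sketch of the addressing scheme just described, the following models a function generator as a list of $2^{n}$ stored bits indexed by the input signals. The function names and the choice of the first input as the least significant address bit are assumptions made for illustration only.

```python
def lut_output(memory: list, inputs: tuple) -> int:
    """Return the stored bit addressed by the n input signals.

    memory holds 2**n bits; inputs[0] is taken as the least significant
    address bit (an assumed convention).
    """
    if len(memory) != 2 ** len(inputs):
        raise ValueError("memory must hold 2**n bits for n inputs")
    address = sum(bit << position for position, bit in enumerate(inputs))
    return memory[address]


def count_functions(n: int) -> int:
    """Number of distinct Boolean functions an n-input lookup table can store."""
    return 2 ** (2 ** n)


# Programming the four locations of a 2-input table as 0, 1, 1, 0 yields XOR.
XOR_MEMORY = [0, 1, 1, 0]
```

For a 4-input table, `count_functions(4)` gives 65,536, since each of the 16 locations is programmed independently.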
An example of a function generator 21 is disclosed in U.S. Pat. No. 5,343,406 and in "The Programmable Gate Array Data Book", published in 1991 by Xilinx, Inc., herein incorporated by reference. FIG. 2b illustrates an equivalent circuit of the function generator 21 disclosed in these references. The equivalent circuit includes a multiplexer (MUX) tree 201 made up of MUXs 1 to 15, input signals a-d connected to the select inputs of MUXs 1 to 15, and a lookup table made up of memory locations M1 to M16 connected to the data inputs of MUXs 1 to 8. Memory location M1 is connected to the upper data input of MUX 1 and M2 is applied to the lower data input of MUX 1. When input signal "a" is "0" or low, MUX 1 connects the logic state of memory location M1 to output line OL1. Similarly, when input signal "a" is "1" or high, MUX 1 connects the logic state of memory location M2 to output line OL1. Memory locations M3-M16 are similarly connected to MUXs 2 to 8 such that odd-numbered memory locations M3, M5, . . . M15 are applied to output lines OL2-OL8 when input signal "a" is low, and even-numbered memory locations M4, M6 . . . M16 are applied to output lines OL2-OL8 when input signal "a" is high. As illustrated, output line OL1 is applied to the upper data input of MUX 9. Similarly, output line OL2 from MUX 2 is applied to the lower data input of MUX 9. Therefore, when input signal "b" is low, the signal on output line OL1 is connected to output line OL9. The remainder of the MUX tree functions in a similar manner.
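The MUX tree behavior described above can be sketched as follows. The text states the select signals only for MUXs 1 to 9; the sketch assumes that MUXs 9 to 12 are selected by "b", MUXs 13 and 14 by "c", and MUX 15 by "d", which is one reading of "the remainder of the MUX tree functions in a similar manner". The function names are hypothetical.

```python
def mux(upper: int, lower: int, select: int) -> int:
    """2-to-1 MUX: pass the upper data input when select is low, else the lower."""
    return lower if select else upper


def mux_tree(memory: list, a: int, b: int, c: int, d: int) -> int:
    """Evaluate a 15-MUX tree over 16 memory locations.

    memory[0] corresponds to M1 and memory[15] to M16.
    """
    level = list(memory)
    for select in (a, b, c, d):
        # Each stage halves the number of signals: 16 -> 8 -> 4 -> 2 -> 1.
        level = [mux(level[i], level[i + 1], select)
                 for i in range(0, len(level), 2)]
    return level[0]
```

Under these assumptions the tree output is simply the memory location at address a + 2b + 4c + 8d, so the tree and a directly addressed lookup table produce the same result.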
Each of the 16 memory locations M1-M16 can be programmed to connect a "0" (low) logic state or a "1" (high) logic state to the MUXs 1 to 8. One of ordinary skill in the art would recognize that the 16 memory locations can be programmed in $2^{16}$ patterns of "1s" and "0s", representing $2^{16}$ logic functions of the four input signals a-d. However, as described in U.S. Pat. No. 5,343,406, the logic states of memory locations M1-M16 can be changed after an initial configuration process. This allows some functions of more than four input signals to be implemented by the "4-input" function generator illustrated in FIG. 2b. For example, by programming memory location M1 to represent the logic state of a fifth input signal "e", as shown in FIG. 2c, the "4-input" lookup table of FIG. 2b can implement some 5-input logic functions. Moreover, the lookup table of FIG. 2b can be redrawn as shown in FIG. 2d, wherein each of the input signals IN1-IN16 can represent either dedicated logic states (such as the memory locations of a lookup table) or variable logic states (such as additional input signals).
It should be noted that the use of a memory location to provide additional input signals does not create a lookup table implementing $2^{2^{n+m}}$ logic functions, where m is the number of memory locations replaced by input signals. Instead, the number of logic functions implemented by an (n+m)-input function generator is $2^{2^{n}-m}$ multiplied by $3^{m}$. For example, FIG. 2c illustrates a 5-input function generator where the fifth input "e" replaces memory location M1. In this example n=4, m=1, and the number of implemented logic functions is $2^{16-1} \times 3^{1}$, or $3 \times 2^{15}$ (98,304). It is clear that the number of 5-input logic functions implemented by the structure of FIG. 2c is far less than the $2^{32}$ logic functions implemented by a 5-input lookup table-based function generator addressing 32 memory locations. As a further example, if 16 input signals are applied to IN1-IN16 of FIG. 2d, in addition to the four input signals a-d, then the number of 20-input logic functions implemented would be $2^{0} \times 3^{16}$, or $3^{16}$. One of ordinary skill in the art would recognize that this (n+m)-input structure would implement logic functions which are a subset of the $2^{2^{n+m}}$ logic functions implemented by an (n+m)-input function generator addressing $2^{n+m}$ memory locations.
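The count given above can be checked numerically with a short sketch. It assumes that each replaced location can carry a fixed "0", a fixed "1", or its own additional input signal, which is the reading that reproduces the factor of $3^{m}$; the text does not spell this out. The function names are hypothetical, and the brute-force enumeration is only practical for small n.

```python
from itertools import product


def formula(n: int, m: int) -> int:
    """Number of logic functions per the stated expression 2**(2**n - m) * 3**m."""
    return 2 ** (2 ** n - m) * 3 ** m


def enumerate_functions(n: int, m: int) -> int:
    """Count distinct (n+m)-input functions by brute force.

    The first m data inputs may each be "0", "1", or "x" (their own extra
    input signal, an assumed interpretation); the remaining data inputs are
    ordinary memory locations holding "0" or "1".
    """
    slots = 2 ** n
    choices = [("0", "1", "x")] * m + [("0", "1")] * (slots - m)
    seen = set()
    for config in product(*choices):
        table = []
        for extra in product((0, 1), repeat=m):
            for address in range(slots):
                setting = config[address]
                table.append(extra[address] if setting == "x" else int(setting))
        seen.add(tuple(table))
    return len(seen)
```

With n=4 and m=1, `formula` gives 98,304, matching the FIG. 2c example, and for small cases such as n=2, m=1 the enumeration agrees with the formula.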
It should also be noted that additional input signals can be incorporated into the MUX tree of FIG. 2d by connecting one or more additional logic gates to any of the input signals a-d. In particular, if an AND gate is connected to receive input signals "a" and "a'", logic states "1" applied to both input signals "a'" and "a" would cause a "high" signal to be applied to MUXs 1 to 8. Many such combinations of fixed logic and programmable logic are possible.
One of ordinary skill in the art recognizes that the 4-input MUX tree 201 shown in FIGS. 2b-2d can be implemented by, for example, five 2-input MUX trees 211, 213, 215, 217 and 219, as shown in FIG. 2e, each MUX tree including three MUXs which address four data signals. Further, one of ordinary skill in the art also recognizes that adding logic gates to the inputs of each 2-input MUX tree, as shown in FIG. 2e, is another way to combine fixed and programmable logic.
FIG. 2f illustrates a second FPLD structure which is equivalent to the function generator structure described above. That is, the structure in FIG. 2f generates output signals which are equivalent to those output by a lookup table based-function generator. In FIG. 2f, four input signals a-d are applied to input lines a-d. AND gates 1 to 8 are connected to the input lines via programmable elements 220. The outputs of AND gates 1 to 8 are applied to OR gate 221. AND gates 1 to 8 are programmed by selecting which of programmable elements 220 is connected to its respective input signal a-d. The AND gates 1-8 generate a "1" or high signal when a predetermined "address", or input signal combination, is applied to the input signals a-d. For instance, AND gate 1 can be programmed to respond to the input signal combination "0010"; that is, input signals "a", "b" and "d" are low, and input signal "c" is high. When this input signal combination is present, AND gate 1 outputs a high signal, which is passed through OR gate 221 to an output line. The AND gates can be thought of as one or more "memory locations" of a lookup table, each AND gate "storing" one or more logic states of "1", or high. More than one "memory location" can be implemented by one AND gate because fewer than n address signals can be connected to address the AND gate. That is, if n is four and an AND gate is programmed to respond to the input signal combination "001" (input signals "a" and "b" are low, input signal "c" is high), the AND gate will output a high signal for both input signal combinations "0010" and "0011" because the AND gate is addressed whether input signal "d" is high or low. Further, addresses for which a "0" or low logic state is to be output are not programmed to be recognized by any AND gate. Therefore, although the circuit structure of FIG. 2f appears very different from the MUX tree structure of FIGS. 2b-2e, the resulting logic is equivalent to a lookup table based-function generator. The n input signals of this structure address $2^{n}$ "memory locations". Note that if a logic pattern contains more than eight high signals, the inverse of the logic pattern is generated and the output is inverted, as depicted by phantom invert function 222. Thus the AND gate array can be made up of $2^{n}/2$ or fewer AND gates (representing high logic states). In FIG. 2f, eight or fewer high logic states combine to represent the high output signals which will be provided in response to certain addresses sent by the four input signals. Therefore, just as the lookup table based-function generator discussed above implements all $2^{16}$ possible functions of four inputs, the structure of FIG. 2f also implements all $2^{16}$ possible functions of four inputs using the eight AND gates 1 to 8.
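The AND-OR structure with optional output inversion can be sketched as below. Addresses are written as strings in the order a, b, c, d, as in the "0010" example, and a "-" marks an input that is left unconnected to a gate. This encoding and the function names are assumptions for illustration; the sketch programs one AND gate per address and does not attempt to merge addresses into shared gates.

```python
from itertools import product


def program_and_array(func, n: int = 4):
    """Choose AND-gate addresses for func, inverting if more than half are high.

    func takes an address string such as "0010" and returns a truthy value
    when the output should be high. Returns (gates, invert).
    """
    addresses = ["".join(bits) for bits in product("01", repeat=n)]
    highs = [addr for addr in addresses if func(addr)]
    lows = [addr for addr in addresses if not func(addr)]
    invert = len(highs) > 2 ** n // 2
    # When inverting, the gates recognize the low addresses instead.
    gates = lows if invert else highs
    return gates, invert


def evaluate(gates, invert: bool, inputs: str) -> int:
    """OR together the AND gates, then apply the optional output inversion."""
    or_output = any(
        all(g in ("-", x) for g, x in zip(gate, inputs)) for gate in gates
    )
    return int(or_output != invert)
```

For a 4-input OR function, 15 addresses are high, so the sketch programs a single gate for "0000" and inverts the output. A gate programmed as "001-" responds to both "0010" and "0011", as in the example above.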
Unlike in the MUX tree structure described above, the addresses of the AND gates are programmable. As with the MUX tree of FIGS. 2b-2e, the circuit structure of FIG. 2f can be constructed for any number n of input signals wherein fewer than $2^{n}/2$ AND gates are used. Note that if fewer than $2^{n}/2$ AND gates have been provided for an n-input signal structure, then the number of logic functions implemented by the n inputs will be less than $2^{2^{n}}$.
One of ordinary skill in the art will recognize that there are other circuit structures which, like the structure of FIG. 2f, are equivalent to the lookup table based-function generator of FIGS. 2b-2e. The structures described above and any other equivalent circuit structures are referred to below as "lookup table based-FPLDs." In addition, the term "lookup table" refers to the memory locations or data input signals associated with the MUX tree structures of FIGS. 2b-2e and any equivalent logic, such as the equivalent logic associated with FIG. 2f.
A problem with lookup table based-FPLDs is that they are not efficiently programmed using LBTMA software design tools because the prior art libraries used to describe all logic functions implemented by a lookup table (or equivalent structure) were too large. This problem arises because an n-input lookup table can implement any Boolean function having n (or fewer) inputs by programming the associated memory locations (or data input signals) to implement the Boolean function. The $2^{n}$ memory locations associated with an n-input lookup table can be programmed to store up to $2^{2^{n}}$ memory patterns, each memory pattern representing one logic function. For example, the 16 memory cells associated with a four-input lookup table can be programmed in $2^{16}$, or 65,536, different memory patterns. In addition, two four-input lookup tables combined to form a five-input lookup table can be programmed in $2^{32}$, or 4,294,967,296, different memory patterns to represent the outputs of $2^{32}$ different five-input Boolean functions. As discussed above, the larger the number of library elements, the longer it typically takes for an LBTMA to perform technology mapping for a circuit design. Clearly, technology mapping using a library of over 4 billion elements requires an impractical amount of time. Even a library of 65,536 elements is impractical. Software design tools specifically developed to perform technology mapping of lookup table based-FPLDs operate efficiently because they do not use libraries. However, circuit designers and third-party tool vendors are typically not interested in running both LBTMAs and design tools specifically developed for lookup table based-FPLDs. In addition, they are usually not willing to pay for the impractical amount of time necessary for an LBTMA to perform technology mapping for FPLDs using libraries having tens of thousands of elements. When this happens, FPLD manufacturers having devices defined by large libraries lose potential customers.
One prior art solution to the problem associated with the excessively large number of library elements needed to define the logic functions implemented by a lookup table-based FPLD was to reduce the number of library elements by a process referred to herein as "pin swapping". Pin swapping recognizes that although a lookup table can be programmed to implement a large number of logic functions, many of those functions are duplicates which differ only in that input signals are placed differently. An example of pin swapping is shown in FIGS. 3a and 3b for a 2-input lookup table. The symbols "*" and "+" located between the two input descriptors "a" and "b" are used to indicate AND and OR operations, respectively. The symbol "!" located in front of an input descriptor indicates an invert operation. FIG. 3a lists all $2^{4}$, or 16, specific logic functions which can be implemented in a 2-input lookup table. The specific logic functions which differ only in that input signals are placed differently, or rearranged, are grouped as indicated and listed in FIG. 3b. That is, inputs "a" and "b" each represent a single input signal. Because a single input can be applied to either input pin "a" or pin "b", there is no need to include both specific logic functions in the library. Therefore, as indicated in FIG. 3b, the two specific logic functions "a" and "b" are grouped together and represented by the general logic function "a". Similarly, "!a" and "!b" are grouped and represented by the general logic function "!a". Specific logic functions "!a*b" and "a*!b" contain one inverted and one non-inverted input signal. It is clear that these two specific logic functions differ only in that non-inverted input signals and inverted input signals are applied to different input pins. Therefore, both specific logic functions "!a*b" and "a*!b" can be represented by the general logic function "!a*b". Similarly, "!a+b" and "a+!b" can be represented by the general logic function "!a+b". Through the process of pin swapping, it is shown that the number of general logic functions needed to represent all specific logic functions implemented by a 2-input lookup table is 12.
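The 2-input grouping can be reproduced with a small sketch. Each specific logic function is represented as a truth table (f(0,0), f(0,1), f(1,0), f(1,1)) over inputs (a, b), and swapping the two pins exchanges the two middle entries. The representation and the function names are assumptions made for illustration.

```python
from itertools import product


def swap_pins(table: tuple) -> tuple:
    """Truth table of the same function with inputs a and b exchanged."""
    f00, f01, f10, f11 = table
    return (f00, f10, f01, f11)


def group_by_pin_swap() -> dict:
    """Group all 16 two-input truth tables that differ only by a pin swap."""
    groups = {}
    for table in product((0, 1), repeat=4):
        # Use the smaller of the two equivalent tables as the group's key.
        key = min(table, swap_pins(table))
        groups.setdefault(key, []).append(table)
    return groups


groups = group_by_pin_swap()
merged = [members for members in groups.values() if len(members) == 2]
```

Here `groups` has 12 entries and `merged` holds the four pairs that collapse into one general logic function each, corresponding to "a"/"b", "!a"/"!b", "!a*b"/"a*!b" and "!a+b"/"a+!b"; the remaining eight functions are unchanged by the swap.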
A second example illustrating the pin swapping process for a 4-input lookup table is shown in FIGS. 4a and 4b. The 65,536 specific logic functions implemented by a 4-input lookup table include the logic functions shown in FIG. 4a, where "a", "b", "c" and "d" represent the four input signals, "*" represents an AND operation, "+" represents an OR operation and "!" represents an invert operation. FIG. 4a includes all 4-input specific logic functions wherein only two of the four inputs and one AND or OR operation are used to generate an output. Similar to the first example, above, which regards 2-input lookup tables, the bracketed groups of specific logic functions in FIG. 4a are represented by the general logic functions shown in FIG. 4b. That is, the bracketed group of specific logic functions including a*b, a*d, etc. is represented by the function c*d. Likewise, the specific logic functions !a*b, a*!b, . . . are represented by the general logic function c*!d. Other general logic functions and their corresponding groups of specific logic functions are shown in FIGS. 4b and 4a, respectively.
When the process of grouping all specific logic functions which differ only by rearranging the inputs is performed for all of the possible Boolean functions of four inputs, the 65,536 specific logic functions are represented by 3,984 general logic functions, or library elements. However, even though a library having 3,984 elements is significantly smaller than a library having 65,536 elements, an impractical amount of time is still necessary to run an LBTMA using this library.
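The reduction from 65,536 to 3,984 can be checked by brute force. The sketch below encodes each truth table as an integer, applies every rearrangement of the n inputs to it, and counts how many distinct groups result. The function name and the integer encoding are assumptions for illustration; the approach is practical only for small n.

```python
from itertools import permutations, product


def count_general_functions(n: int) -> int:
    """Count groups of n-input truth tables that differ only by input order."""
    points = list(product((0, 1), repeat=n))
    index = {point: i for i, point in enumerate(points)}
    # For each input permutation, record where every address is read from.
    maps = [
        [index[tuple(point[k] for k in perm)] for point in points]
        for perm in permutations(range(n))
    ]
    seen = set()
    count = 0
    for table in range(2 ** len(points)):
        if table in seen:
            continue  # already grouped with an earlier table
        count += 1
        for source in maps:
            permuted = 0
            for i, j in enumerate(source):
                if table >> j & 1:
                    permuted |= 1 << i
            seen.add(permuted)
    return count
```

Calling `count_general_functions(4)` yields 3,984 and `count_general_functions(2)` yields 12, in agreement with the figures stated for 4-input and 2-input lookup tables.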
Another method for further reducing the number of library elements is to simply remove elements which are not considered likely to be used. This method has been used in the FPLD-programming industry with some frequency. However, this method leaves open the possibility that a 4-input logic function could not be mapped into a single 4-input lookup table, which would reduce density of logic functions per unit of silicon area, resulting in poor programming results.