1. Field of the Invention
The invention pertains to the field of programmable logic devices. More particularly, the present invention pertains to the field of product term based simple and complex programmable logic devices, generally known as SPLDs and CPLDs, respectively.
2. Description of the Related Art
FIG. 1 shows an Arithmetic Logic Unit (hereafter ALU) 100 of a conventional macrocell. The ALU 100 of FIG. 1 includes conventional carry chain circuitry. The ALU 100 also includes a function generator 130 for carrying out arithmetic and logical operations upon the sum-of-product inputs D1 and D2. The carry chain of the conventional ALU 100 generates a Carry Output C_o to the next macrocell (not shown) from the AND gate 110, the OR gate 120 and the multiplexer referenced as MUX 140. The select line of MUX 140 is the output of the AND gate 150. The carry chain of FIG. 1 generates the Carry Output C_o by implementing the equation:

C_o = (A_i * B_i) * /C_i + (A_i + B_i) * C_i    (Eqn. 1)

where the "*" symbol indicates the logical AND operation, the "/" symbol indicates inversion and the "+" symbol indicates the logical OR operation. According to Eqn. 1, the ANDed term (A_i * B_i) is selected when the output of the AND gate 150 is low, and the ORed term (A_i + B_i) is selected when the output of the AND gate 150 is high. The output of the AND gate 150 is the logical AND of the Carry Input C_in of the current macrocell (the Carry Output C_o from the previous macrocell in the chain) and a configuration bit CB. Therefore, as long as the configuration bit CB is high, the Carry Input C_in will be propagated unchanged to the select line of the MUX 140. The output of the AND gate 150, together with the output of the function generator 130, is input to the XOR gate 160, the output of which is fed to the macrocell register (not shown).
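The selection behavior described above can be checked with a short Python sketch. This is only an illustrative model of the carry path of FIG. 1 as the text describes it; the function names `carry_out`, `eqn1` and `majority` are hypothetical and do not appear in the source. The sketch assumes single-bit integer signals (0 or 1) and confirms that, with CB high, the multiplexer form agrees with Eqn. 1 and with the familiar majority (carry) function of a full adder.

```python
from itertools import product


def carry_out(a: int, b: int, c_in: int, cb: int = 1) -> int:
    """Illustrative model of the FIG. 1 carry path (names are assumptions)."""
    and_term = a & b      # AND gate 110
    or_term = a | b       # OR gate 120
    select = c_in & cb    # AND gate 150 drives the select line of MUX 140
    # MUX 140: the ORed term when select is high, the ANDed term when it is low
    return or_term if select else and_term


def eqn1(a: int, b: int, c: int) -> int:
    """Eqn. 1 written out directly: (A*B)*/C + (A+B)*C."""
    return ((a & b) & (1 - c)) | ((a | b) & c)


def majority(a: int, b: int, c: int) -> int:
    """Carry of a full adder: high when at least two inputs are high."""
    return int(a + b + c >= 2)


# With CB high, all three formulations agree for every input combination.
for a, b, c in product((0, 1), repeat=3):
    assert carry_out(a, b, c, cb=1) == eqn1(a, b, c) == majority(a, b, c)
```

With `cb=0` the select line is held low regardless of `c_in`, so `carry_out` always returns the ANDed term.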
However, the implementation of FIG. 1 suffers from a number of disadvantages, both in terms of speed and in terms of arithmetic and logical functions. For example, the carry chain of FIG. 1 requires the Carry Input C_in signal to be input to the AND gate 150, and this C_in signal is the Carry Output C_o from the previous macrocell. To implement an adder that includes an initial Carry Input C_in (other than a 0), the carry chain logic of the previous macrocell must therefore be used. Indeed, setting the configuration bit CB to 0 ensures that the select line to MUX 140 is 0. However, propagating a carry bit of 1 to the select line of the MUX 140 requires that the configuration bit CB be set to 1 and that the Carry Input C_in signal from a previous macrocell (the Carry Output C_o signal from the ALU of the previous macrocell) be available to the AND gate 150. Therefore, the ALU 100 of FIG. 1 and its associated carry chain are not capable of implementing an initial Carry Input C_in (other than a 0) without utilizing the carry chain logic of a previous macrocell to generate the Carry Output C_o signal that serves as the Carry Input C_in signal of the current macrocell.
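The cost of an initial carry can be illustrated with a small Python sketch. It is a sketch under stated assumptions, not a description of any actual device configuration: the helper names `macrocell_carry` and `ripple_carry` are hypothetical, and the way the extra upstream macrocell produces a 1 (A = B = 1 with CB = 0, so that the ANDed term is selected) is an assumption chosen only because it is consistent with the behavior described above.

```python
def macrocell_carry(a: int, b: int, c_in: int, cb: int) -> int:
    """Carry path of one FIG. 1 style macrocell (illustrative model)."""
    select = c_in & cb                      # AND gate 150
    return (a | b) if select else (a & b)   # MUX 140


def ripple_carry(a_bits, b_bits, initial_carry: int):
    """Return (final carry, macrocells used); bit lists are LSB first."""
    used = 0
    if initial_carry:
        # Assumption: an extra upstream macrocell is set up with A = B = 1
        # and CB = 0, so its MUX selects the ANDed term and emits C_o = 1.
        carry = macrocell_carry(1, 1, 0, cb=0)
        used += 1
        first_cb = 1   # the first adder bit must now listen to its C_in
    else:
        carry = 0
        first_cb = 0   # CB = 0 forces the select line of the first bit low
    for i, (a, b) in enumerate(zip(a_bits, b_bits)):
        cb = first_cb if i == 0 else 1
        carry = macrocell_carry(a, b, carry, cb)
        used += 1
    return carry, used


# 4-bit example: 0b1111 + 0b0000, with and without an initial carry.
print(ripple_carry([1, 1, 1, 1], [0, 0, 0, 0], initial_carry=0))
print(ripple_carry([1, 1, 1, 1], [0, 0, 0, 0], initial_carry=1))
```

In this model the 4-bit chain with an initial carry of 1 occupies five macrocells rather than four, which is the overhead the paragraph above describes.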
Another disadvantage of the architecture of FIG. 1 concerns the function generator 130. Indeed, the presence of the function generator 130 in the ALU 100 adds an extra level of delay to the existing AND/OR plane. That is, to implement a full adder in the ALU 100 requires that the function generator 130 produce D1 ⊕ D2, where the "⊕" symbol indicates the logical XOR operation. The XORed output of the function generator 130 is then input to the XOR gate 160, together with the output of the AND gate 150, which is the Carry Output C_o signal from the previous macrocell. In the configuration of FIG. 1, generating the D1 ⊕ D2 term requires the inputs to the ALU 100 to pass through the function generator 130 and to suffer the propagation delay associated therewith. The delay through the function generator 130 is greater than the delay through the logic gates 110 and 120, meaning that the D1 ⊕ D2 term is not available until well after the Carry Input signal C_in and the output of AND gate 150 are available.
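The full adder data path just described can be sketched in Python as follows. The sketch is illustrative only: the function names `full_adder_bit` and `add` are hypothetical, and it assumes that the sum-of-product inputs D1 and D2 carry the same operand bits as the A_i and B_i inputs of gates 110 and 120, which the text does not state explicitly. It models logic values only and says nothing about the relative delays.

```python
def full_adder_bit(d1: int, d2: int, c_in: int, cb: int = 1):
    """One bit of a full adder built from the FIG. 1 elements (logic only)."""
    fg_out = d1 ^ d2            # function generator 130 configured as XOR
    select = c_in & cb          # AND gate 150
    sum_bit = fg_out ^ select   # XOR gate 160 feeds the macrocell register
    carry = (d1 | d2) if select else (d1 & d2)   # MUX 140 (Eqn. 1)
    return sum_bit, carry


def add(x: int, y: int, width: int) -> int:
    """Ripple the carry through `width` bit positions, LSB first."""
    carry = 0
    result = 0
    for i in range(width):
        s, carry = full_adder_bit((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    return result | (carry << width)   # n + 1 bit result including carry


# Exhaustive check against integer addition for 4-bit operands.
for x in range(16):
    for y in range(16):
        assert add(x, y, 4) == x + y
```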
The presence of the function generator 130 and the general architecture of the ALU 100 also present other disadvantages. Although an XOR gate is a potentially useful element for synthesizing a number of logic functions, the XOR gate 160 of FIG. 1 is not available for such synthesis, because one of its inputs is tied to the function generator 130 and the other is tied, through the AND gate 150, to the carry output from the previous macrocell. For example, the XOR gate 160 is not readily adaptable to T flip-flop synthesis without routing signals outside the ALU 100. Nor does the XOR gate 160 lend itself to general-purpose logic without routing signals through the function generator 130, or to such functions as polarity control.
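For reference, T flip-flop synthesis from a D flip-flop uses the standard relation D = T ⊕ Q, which is why a freely available XOR gate matters here. The Python sketch below is a minimal behavioral model of that relation only; the class name `TFlipFlop` is hypothetical, and the model does not represent the routing of FIG. 1.

```python
class TFlipFlop:
    """Behavioral T flip-flop built from a D register and one XOR gate."""

    def __init__(self) -> None:
        self.q = 0   # state of the D register

    def clock(self, t: int) -> int:
        d = t ^ self.q   # XOR of T with the fed-back Q forms the D input
        self.q = d       # the D register captures D on the clock edge
        return self.q


ff = TFlipFlop()
# Q toggles on each clock while T is high and holds while T is low.
outputs = [ff.clock(t) for t in (1, 1, 0, 1)]
assert outputs == [1, 0, 0, 1]
```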
One of the main disadvantages of CPLDs and other programmable logic devices that do not contain dedicated carry chain circuitry is the size and performance of the arithmetic function implementations. Arithmetic function implementations in CPLDs can be optimized for area and/or speed. These optimizations, however, are based only on optimizing the topology of the implementation. Without dedicated carry chain circuitry, arithmetic function implementations that are optimized for speed require a large amount of device resources. The required resources can grow to become a significant portion of the targeted device, thereby limiting the amount of resources for other portions of the design. Conversely, implementations that are optimized for area require fewer device resources, but are typically much slower than those optimized for speed. In summary, the coarse-grain nature of the CPLD does not allow for a good speed/area tradeoff when implementing arithmetic functions.
Table 1 gives examples of several different adders targeted to a CPLD that contains no dedicated carry chain circuitry, such as the CY7C375I-125AC, manufactured by the assignee of the present invention, Cypress Semiconductor Corp. of San Jose, Calif.
TABLE 1
______________________________________________________________________
Array                    Product
Size¹     Optimization   Terms     Macrocells   Passes   Speed
______________________________________________________________________
8 Bit     Speed           83       18           3        19.5 ns
12 Bit    Speed          136       47                    19.5 ns
16 Bit    Speed          193       38                    19.5 ns
32 Bit    Speed          461       78                    19.5 ns
8 Bit     Area            80       12                    25.0 ns
12 Bit    Area           126       18                    36 ns
16 Bit    Area           172       24                    47 ns
32 Bit    Area           356       48           16       91 ns
______________________________________________________________________
Note 1: Assumes an n + 1 bit result for a carry output.
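The speed/area tradeoff in Table 1 can be made concrete with a short calculation. The Python sketch below uses only the Product Terms and Speed figures from the table; the dictionary `adders` and the function `tradeoff` are hypothetical names introduced for illustration, and the Speed entry for the 32 Bit area-optimized adder is read here as 91 ns, which is an assumption about a partly garbled value.

```python
# (product terms, delay in ns) per array size and optimization, from Table 1.
adders = {
    8:  {"speed": (83, 19.5),  "area": (80, 25.0)},
    12: {"speed": (136, 19.5), "area": (126, 36.0)},
    16: {"speed": (193, 19.5), "area": (172, 47.0)},
    32: {"speed": (461, 19.5), "area": (356, 91.0)},  # 91 ns is an assumed reading
}


def tradeoff(bits: int):
    """Return (product-term ratio, delay ratio) of speed vs. area optimization."""
    pt_speed, t_speed = adders[bits]["speed"]
    pt_area, t_area = adders[bits]["area"]
    return pt_speed / pt_area, t_area / t_speed


for bits in adders:
    pt_ratio, delay_ratio = tradeoff(bits)
    print(f"{bits:2d} bit: {pt_ratio:.2f}x the product terms, "
          f"{delay_ratio:.2f}x faster than the area-optimized adder")
```

Under these figures, the resource gap between the two optimizations widens as the adder grows, while the area-optimized delay grows roughly with the array size.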
What is needed, therefore, is a dedicated carry chain method and architecture for a programmable logic device macrocell that does not require the carry chain logic of a previous macrocell to implement an initial Carry Input other than 0. Also needed are macrocells and a method of propagating a carry between macrocells of a product term based programmable logic device that do not depend upon a function generator to implement the carry chain, and that do not suffer from the propagation delay penalties associated with conventional arithmetic function implementations. Also needed are a carry chain method and macrocell architecture that are optimized for general-purpose logic, XOR input of generic product terms and polarity control without, however, sacrificing speed or flexibility. There has also been a long-felt need for a carry chain architecture and method for readily implementing a T flip-flop without having to feed the Q output of the macrocell D flip-flop back through the central interconnect, thus using up inherently limited input bandwidth.