The invention relates to Programmable Logic Devices (PLDs). More particularly, the invention relates to an ALU (Arithmetic Logic Unit) implementation for a PLD that consumes only one PLD logic cell per bit of the ALU.
Programmable logic devices (PLDs) are a well-known type of digital integrated circuit that can be programmed to perform specified logic functions. One type of PLD, the field programmable gate array (FPGA), typically includes an array of configurable logic blocks (CLBs) surrounded by a ring of programmable input/output blocks (IOBs). The CLBs and IOBs are interconnected by a programmable interconnect structure. Some FPGAs also include additional logic blocks with special purposes (e.g., DLLs, RAM, and so forth).
The CLBs, IOBs, interconnect, and other logic blocks are typically programmed by loading a stream of configuration data (bitstream) into internal configuration memory cells that define how the CLBs, IOBs, and interconnect are configured. The configuration data may be read from memory (e.g., an external PROM) or written into the FPGA by an external device. The collective states of the individual memory cells then determine the function of the FPGA.
Other types of PLDs are programmed using static memory, i.e., memory elements that are programmed once and retain that programming until erased or reprogrammed. These PLDs include, for example, CPLDs and antifuse devices. Other PLDs, called ASICs (Application Specific Integrated Circuits), are programmed by applying one or more customized metal layers to a previously manufactured standard base. Regardless of the type of PLD used, the configuration data used to program the device is generally provided in one or more computer programs.
Whatever the type of PLD used in a customer design, a significant benefit of programmable devices is the fact that the time required to design and implement a circuit is typically much shorter than the time required to design and manufacture a custom device. Therefore, in recent years PLD manufacturers have provided pre-designed "macros", i.e., files that include programming information to implement a particular function using some or all of the resources of a targeted PLD. Some macros are configurable, meaning that the user can select certain functions to be included, set parameters such as bit width, or select a target PLD from a list of supported PLDs. The macro program generates a configuration data file that varies depending on the information provided by the user.
Efficient use of PLD resources is important, because such efficiency can allow a user design to fit into a smaller (and less expensive) PLD. For some very large designs, inefficient resource usage can result in an implementation so large that it does not fit in any PLD available from a given PLD provider. Therefore, a PLD provider whose macros implement common user functions more efficiently in its own PLDs has a marketing advantage over its competitors. Hence, efficient PLD implementations of common functions are highly desirable.
One function often used in user designs is the ALU (Arithmetic Logic Unit) function. An ALU circuit typically supports several different functions, one of which is selected using operator input signals. Supported functions can include, for example, an adder function, a subtractor function, an increment function, a decrement function, a multiplexer function, and logical functions such as AND, OR, and XOR.
Patterson and Hennessy show and describe several ALU circuits in pages 182-198 of "Computer Organization and Design: The Hardware/Software Interface", published in 1994 by Morgan Kaufmann Publishers, Inc., which pages are hereby incorporated by reference.
Typically, ALU functions are provided for a single bit (e.g., two one-bit input signals are added together) in a one-bit ALU circuit. Two or more of these one-bit circuits are then combined to provide a multi-bit ALU function. The width of an ALU circuit can be, for example, 8, 16, or 32 bits. Because the one-bit circuit is replicated once per bit, an efficient implementation of the one-bit ALU function is highly desirable for conserving PLD resources.
The invention provides structures and methods that implement an ALU (Arithmetic Logic Unit) circuit in a PLD (Programmable Logic Device) while using only one PLD logic cell to implement a one-bit ALU function. The term "logic cell" is used to indicate a group of configurable logic elements including one function generator (e.g., a look-up table) and one memory storage device (e.g., a flip-flop or a latch), with supporting logic. The logic capacity of a PLD is often specified as a number of "logic cells".
The ALU circuit has two data input signals and two operator input signals that select among the adder, subtractor, and logical functions. A result bit provides the result of the addition, subtraction, or logical function as selected by the values of the two operator input signals. A carry chain is provided for combining the one-bit ALU circuits to generate multi-bit ALUs. All of this functionality is implemented in a single PLD logic cell per ALU bit.
According to a first embodiment of the invention, an ALU circuit includes a four-input function generator, an AND gate, a carry multiplexer, and an XOR gate.
The four-input function generator has as input signals first and second data input signals and first and second operator input signals. The function generator is configured to implement an XOR function, a first multiplexer function, and a second multiplexer function. The XOR function is an XOR function of the first and second data input signals and the first operator input signal. The first multiplexer function selects between first and second logical functions of the first and second data input signals, providing a result of the first logical function when the first operator input signal is high and providing a result of the second logical function when the first operator input signal is low. The second multiplexer function selects between the XOR output signal and the first multiplexer output signal, providing the XOR output signal when the second operator input signal is high and providing the first multiplexer output signal when the second operator input signal is low. The output of the second multiplexer is coupled to the function generator output terminal.
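The behavior of the function generator described above can be sketched in a few lines of Python. This is only an illustrative model, not the configuration itself: the names function_generator, f1, and f2 are hypothetical, and AND and OR are assumed as the first and second logical functions purely for the sake of a concrete example.

```python
from typing import Callable

def function_generator(
    a: int,
    b: int,
    op1: int,
    op2: int,
    f1: Callable[[int, int], int] = lambda a, b: a & b,  # assumed first logical function
    f2: Callable[[int, int], int] = lambda a, b: a | b,  # assumed second logical function
) -> int:
    """Model the four-input function generator on one-bit inputs (0 or 1)."""
    # XOR function of both data inputs and the first operator input.
    xor_out = a ^ b ^ op1
    # First multiplexer: first logical function when op1 is high, second when low.
    mux1_out = f1(a, b) if op1 else f2(a, b)
    # Second multiplexer: XOR output when op2 is high, first mux output when low.
    return xor_out if op2 else mux1_out
```

With op2 high, the output is the XOR of the data inputs (inverted when op1 is also high); with op2 low, op1 alone chooses which logical function reaches the output.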
The AND gate is coupled to the first data input terminal and the second operator input terminal of the logic cell and has an AND output terminal. The carry multiplexer has a "zero" data input terminal coupled to the AND output terminal, a "one" data input terminal coupled to the carry-in terminal of the logic cell, an output terminal coupled to the carry-out terminal of the logic cell, and a select input terminal coupled to the function generator output terminal. The XOR circuit has a first input terminal coupled to the function generator output terminal, a second input terminal coupled to the carry-in terminal, and an output terminal coupled to the result output terminal of the logic cell.
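The connections just described can be modeled as a single one-bit cell. The following Python sketch is a behavioral approximation under stated assumptions: the name alu_cell is hypothetical, and the function generator is assumed to select between AND and OR of the data inputs when the second operator input is low.

```python
def alu_cell(a: int, b: int, op1: int, op2: int, carry_in: int) -> tuple[int, int]:
    """Return (result, carry_out) for one logic cell; all signals are 0 or 1."""
    # Function generator output (AND/OR chosen here as illustrative logical functions).
    if op2:
        fg_out = a ^ b ^ op1
    else:
        fg_out = (a & b) if op1 else (a | b)
    # AND gate: first data input ANDed with the second operator input.
    and_out = a & op2
    # Carry multiplexer: "zero" input is the AND output, "one" input is carry-in,
    # and the select input is the function generator output.
    carry_out = carry_in if fg_out else and_out
    # XOR of the function generator output and carry-in drives the result terminal.
    result = fg_out ^ carry_in
    return result, carry_out

print(alu_cell(1, 1, 0, 1, 0))  # add 1 + 1 with no carry-in: (0, 1)
```

Note that in this model the result of a logical function is XORed with carry-in, so the sketch assumes carry-in is held low whenever the second operator input is low.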
In one embodiment, the first logical function is simply the first data input signal, and the second logical function is the second data input signal. In another embodiment, the logic implemented by the function generator includes logic gates coupled between the first and second data input terminals and the first multiplexer. Thus, the first multiplexer function selects between two different logical functions of the first and second data input signals. In one embodiment, the first multiplexer selects between the AND function and the OR function of the first and second data input signals.
One PLD that can be used to implement the described circuit in a single logic cell is the Virtex™-II Field Programmable Gate Array (FPGA) provided by Xilinx, Inc. The Virtex-II CLB includes four similar slices, each including two logic cells. Each logic cell includes one four-input function generator implemented as a look-up table, as well as additional logic including at least one AND gate, multiplexer, and XOR gate. Therefore, the ALU circuit of the invention can be implemented in half of one Virtex-II slice. By concatenating the carry chains of the half-slices (i.e., by coupling the carry-out terminal of one half-slice to the carry-in terminal of another half-slice), up to eight ALU bits can be implemented in a single Virtex-II CLB.
According to another aspect of the present invention, a method is provided for configuring a PLD logic cell to implement one bit of an ALU function. The PLD logic cell includes a function generator, an AND gate, a carry multiplexer, and an XOR gate. The method includes a series of steps, which can be performed in any order. When the PLD is an FPGA, the steps are often performed simultaneously, by downloading a single bitstream (an FPGA configuration data file) into the FPGA, thereby configuring the FPGA to perform the desired functions.
According to one embodiment, the method of the invention includes configuring the function generator, configuring the AND gate functionality, configuring the carry chain functionality, and configuring the XOR gate functionality.
The function generator is configured to provide a function generator output signal. The function generator output signal is the result of a first logical function when the first operator input signal is high and the second operator input signal is low. The output signal is the result of a second logical function when the first and second operator input signals are both low. The first and second logical functions are each a function of at least one of the first and second data input signals. Finally, the output signal is an XOR function of the first and second data input signals and the first operator input signal when the second operator input signal is high.
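Because the function generator is typically a look-up table, configuring it amounts to filling in a 16-entry truth table. The Python sketch below derives such a table from the three cases just described. It is an illustration only: the address ordering (a as the least significant address bit, then b, op1, op2) is an assumption and does not necessarily match the bit ordering of any particular device, and AND and OR are assumed as the first and second logical functions.

```python
def fg_output(a: int, b: int, op1: int, op2: int) -> int:
    """Function generator output for one combination of inputs."""
    if op2:                      # second operator input high: three-input XOR
        return a ^ b ^ op1
    if op1:                      # op1 high, op2 low: first logical function (assumed AND)
        return a & b
    return a | b                 # both low: second logical function (assumed OR)

def lut_contents() -> list[int]:
    """Build the 16-entry truth table under the assumed address ordering."""
    table = []
    for index in range(16):
        a = index & 1
        b = (index >> 1) & 1
        op1 = (index >> 2) & 1
        op2 = (index >> 3) & 1
        table.append(fg_output(a, b, op1, op2))
    return table

table = lut_contents()
# Pack the table into one 16-bit word, entry i at bit position i.
init_word = sum(bit << i for i, bit in enumerate(table))
print(table)
print(f"{init_word:#06x}")
```

Choosing different logical functions changes only the eight entries for which the second operator input is low; the eight XOR entries stay the same.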
The AND gate functionality is provided by configuring the logic cell such that the AND gate provides to the carry multiplexer an output signal comprising an AND function of the first data input signal and the second operator input signal.
The carry chain functionality is provided by configuring the logic cell such that the carry multiplexer selects between the AND gate output signal and a carry-in input signal of the logic cell. The selection is made based on the value of the function generator output signal. When the function generator output signal is low, the carry multiplexer provides the AND gate output signal. When the function generator output signal is high, the carry multiplexer provides the carry-in input signal. The selected signal is provided to a carry-out terminal of the logic cell.
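To see why this selection rule yields a correct carry for addition and subtraction, consider the case in which the second operator input is high. The function generator output is then low exactly when the first data input equals the (possibly inverted) second data input, in which case the carry-out equals that common value, which is what the AND gate supplies. Otherwise the carry-in is propagated. The Python check below exercises every input combination; the names carry_mux and b_eff are hypothetical, and treating subtraction as addition of the inverted second input is the usual two's-complement reading, assumed here rather than stated above.

```python
from itertools import product

def carry_mux(fg_out: int, and_out: int, carry_in: int) -> int:
    """Select carry-in when the function generator output is high, else the AND output."""
    return carry_in if fg_out else and_out

# Exhaustive check with the second operator input held high (arithmetic modes).
for a, b, carry_in, op1 in product((0, 1), repeat=4):
    op2 = 1
    b_eff = b ^ op1              # op1 high inverts b (subtract); op1 low leaves it (add)
    fg_out = a ^ b ^ op1         # function generator output in arithmetic mode
    and_out = a & op2            # AND gate passes a when op2 is high
    carry_out = carry_mux(fg_out, and_out, carry_in)
    # Carry of a one-bit full adder on a, b_eff, and carry_in.
    assert carry_out == (a + b_eff + carry_in) >> 1
```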
The XOR functionality is provided by configuring the logic cell such that the XOR circuit performs an XOR function of the function generator output signal and the carry-in input signal, and the output of the XOR circuit provides the result output signal for the logic cell.
In one embodiment, a second logic cell is configured in a manner similar to the first logic cell. The two first operator input terminals are coupled together, and the two second operator input terminals are also coupled together, so the two logic cells perform the same function. The carry-out signal of the first logic cell is provided as the carry-in signal of the second logic cell. Thus, a two-bit ALU is formed. The chain can be extended in a similar fashion to virtually any length, with the bit-width of the ALU (i.e., the length of the carry chain) being determined by the available number of logic cells or by the operating speed required of the circuit.
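Chaining cells in this way can be modeled as a simple ripple loop. The following Python sketch is a behavioral illustration under several assumptions: the names alu_cell and alu are hypothetical, AND and OR stand in for the first and second logical functions, the carry-in of the lowest bit is assumed to be 1 for subtraction (two's complement), and it is assumed to be 0 for the logical functions.

```python
def alu_cell(a: int, b: int, op1: int, op2: int, carry_in: int) -> tuple[int, int]:
    """One logic cell: returns (result, carry_out) for one-bit inputs."""
    if op2:
        fg_out = a ^ b ^ op1                   # add (op1 low) or subtract (op1 high)
    else:
        fg_out = (a & b) if op1 else (a | b)   # assumed logical functions
    and_out = a & op2
    carry_out = carry_in if fg_out else and_out
    return fg_out ^ carry_in, carry_out

def alu(a: int, b: int, op1: int, op2: int, width: int = 8, carry_in: int = 0) -> tuple[int, int]:
    """Chain `width` cells; every cell shares the same operator inputs."""
    result = 0
    carry = carry_in
    for i in range(width):
        # The carry-out of cell i becomes the carry-in of cell i + 1.
        bit, carry = alu_cell((a >> i) & 1, (b >> i) & 1, op1, op2, carry)
        result |= bit << i
    return result, carry

print(alu(100, 27, 0, 1))               # addition: (127, 0)
print(alu(100, 27, 1, 1, carry_in=1))   # subtraction: (73, 1)
print(alu(0b1100, 0b1010, 1, 0))        # AND: (8, 0)
print(alu(0b1100, 0b1010, 0, 0))        # OR: (14, 0)
```

The loop also makes the speed trade-off visible: each additional bit adds one more carry multiplexer to the path that the carry must traverse.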
According to a third aspect of the invention, a computer storage device is provided that includes configuration data for configuring a PLD logic cell to implement an ALU function. The logic cell includes a function generator, an AND gate, a carry multiplexer, and an XOR gate. The configuration data includes four sets of configuration data, which can be stored separately (i.e., in four separate files) or as a single file. If stored as a single file, the data sets can be separated out by function, or (as in the case of an FPGA) the data sets may be "mixed up" in a single configuration bitstream.
A first set of the configuration data configures the function generator to provide a function generator output signal. The function generator output signal is a result of a first logical function when a first operator input signal is high and a second operator input signal is low, a result of a second logical function when the first and second operator input signals are both low, and an XOR function of the first and second data input signals and the first operator input signal when the second operator input signal is high. The first and second functions are each a function of at least one of the first and second data input signals.
A second set of the configuration data configures the logic cell such that the AND gate provides to the carry multiplexer an output signal comprising an AND function of the first data input signal and the second operator input signal.
A third set of the configuration data configures the logic cell such that the carry multiplexer provides a carry-out signal to a carry-out terminal of the logic cell. The carry-out signal is the AND gate output signal when the function generator output signal is low, and a carry-in input signal of the logic cell when the function generator output signal is high.
A fourth set of the configuration data configures the logic cell such that the XOR circuit provides a result output signal comprising an XOR function of the function generator output signal and the carry-in input signal to a result output terminal of the logic cell.
In one embodiment, the computer storage device includes additional sets of configuration data that configure a second logic cell in a manner similar to the first logic cell. Additional sets of configuration data couple together the two first operator input terminals, and the two second operator input terminals, and (if not ensured by the PLD architecture) couple the carry-out terminal of the first logic cell to the carry-in terminal of the second logic cell.