This relates to a programmable integrated circuit device and particularly to a specialized processing block in a programmable integrated circuit device.
Take a programmable logic device (PLD) as one example of an integrated circuit device. As the applications for which PLDs are used have increased in complexity, it has become more common to design PLDs to include specialized processing blocks in addition to blocks of generic programmable logic resources. Such specialized processing blocks may include a concentration of circuitry on a PLD that has been partly or fully hardwired to perform one or more specific tasks, such as a logical or a mathematical operation. A specialized processing block may also contain one or more specialized structures, such as an array of configurable memory elements. Examples of structures that are commonly implemented in such specialized processing blocks include multipliers, arithmetic logic units (ALUs), barrel shifters, various memory elements (such as FIFO/LIFO/SIPO/RAM/ROM/CAM blocks and register files), AND/NAND/OR/NOR arrays, etc., or combinations thereof.
One particularly useful type of specialized processing block that has been provided on PLDs is a digital signal processing (DSP) block, which may be used, for example, to process audio signals. Such blocks are frequently also referred to as multiply-accumulate (“MAC”) blocks, because they include structures to perform multiplication operations, and sums and/or accumulations of multiplication operations.
Typically, the arithmetic operators (adders and multipliers) in such specialized processing blocks are fixed-point operators. If floating-point operators were needed, the user would construct them outside the specialized processing block using general-purpose programmable logic of the device, or using a combination of the fixed-point operators inside the specialized processing block with additional logic in the general-purpose programmable logic.
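As an illustration of how a floating-point operator can be composed from fixed-point operations, the following minimal Python sketch multiplies two single-precision values using only integer multiplication, addition, shifting, and masking. It is a software analogy, not a description of any particular device. The function names (`f32_bits`, `bits_f32`, `fp32_mul`) are hypothetical, the IEEE 754 single-precision format and round-to-nearest-even are assumptions not stated above, and only normal, in-range operands are handled.

```python
import struct

def f32_bits(x: float) -> int:
    """Return the 32-bit pattern of x rounded to single precision."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def bits_f32(b: int) -> float:
    """Return the single-precision value encoded by a 32-bit pattern."""
    return struct.unpack("<f", struct.pack("<I", b))[0]

def fp32_mul(a_bits: int, b_bits: int) -> int:
    """Multiply two single-precision bit patterns with integer operations only.

    Assumption: both operands and the result are normal numbers. Zeros,
    subnormals, infinities, NaNs, overflow, and underflow are not handled.
    """
    sign = (a_bits ^ b_bits) & 0x80000000
    exp_a = (a_bits >> 23) & 0xFF
    exp_b = (b_bits >> 23) & 0xFF
    # Restore the implicit leading 1 to obtain 24-bit significands.
    sig_a = (a_bits & 0x7FFFFF) | 0x800000
    sig_b = (b_bits & 0x7FFFFF) | 0x800000

    exp = exp_a + exp_b - 127      # fixed-point add, then remove one bias
    prod = sig_a * sig_b           # fixed-point multiply, up to 48 bits

    # Normalize: the product lies in [2^46, 2^48).
    if prod & (1 << 47):
        shift = 24
        exp += 1
    else:
        shift = 23

    # Round to nearest, ties to even, on the discarded low bits.
    sig = prod >> shift
    rem = prod & ((1 << shift) - 1)
    half = 1 << (shift - 1)
    if rem > half or (rem == half and (sig & 1)):
        sig += 1                   # rounding increment: another wide addition
        if sig == (1 << 24):       # rounding carried out of the significand
            sig >>= 1
            exp += 1

    return sign | (exp << 23) | (sig & 0x7FFFFF)
```

For example, `bits_f32(fp32_mul(f32_bits(1.5), f32_bits(2.5)))` evaluates to `3.75`. The integer multiply corresponds to work a fixed-point multiplier could perform, while the exponent handling, normalization, and rounding correspond to the kind of additional logic that would otherwise be built from general-purpose resources.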
One impediment to incorporating floating-point operators directly into specialized processing blocks is the need for large addition operations as part of many floating-point operations. For example, floating-point multiplication may require two carry-propagate adders. The carry-propagate adder used in a multiplication operation is the most expensive component of the multiplier in terms of both area and latency.
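The text above does not say where the two carry-propagate additions arise. One common arrangement, assumed here for illustration only, is that the partial products of the significand multiplication are reduced in carry-save form and then resolved by one carry-propagate addition, with a second wide addition needed for rounding. The sketch below models the first of these in Python. The function names are hypothetical, and Python integers stand in for hardware bit vectors.

```python
def carry_save_add(x: int, y: int, z: int) -> tuple[int, int]:
    """3:2 compression: reduce three operands to a sum word and a carry word.

    Each bit position is computed independently, so no carry ripples
    across the word. The invariant is sum_word + carry_word == x + y + z.
    """
    sum_word = x ^ y ^ z
    carry_word = ((x & y) | (x & z) | (y & z)) << 1
    return sum_word, carry_word

def multiply_with_one_cpa(a: int, b: int, width: int = 24) -> int:
    """Multiply two unsigned width-bit integers using carry-save reduction.

    All partial products are accumulated in carry-save form; only the
    final step is a full carry-propagate addition.
    """
    sum_word, carry_word = 0, 0
    for i in range(width):
        partial = (a << i) if (b >> i) & 1 else 0
        sum_word, carry_word = carry_save_add(sum_word, carry_word, partial)
    # The single carry-propagate addition that resolves the redundant form.
    return sum_word + carry_word
```

Under this assumption, the final `sum_word + carry_word` step is the wide carry-propagate addition inside the multiplier, and rounding the resulting significand would require a further addition of comparable width.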
It is within this context that the embodiments described herein arise.