This invention relates generally to integrated circuits and, in particular, to integrated circuits with multiplier circuitry.
Programmable logic devices (PLDs) include logic circuitry such as look-up tables (LUTs) and adder based logic that are designed to allow a user to customize the circuitry to the user's particular needs. This configurable logic is typically divided into individual logic circuits that are referred to as logic elements (LEs). The LEs may be grouped together to form larger logic blocks referred to as logic array blocks (LABs) that may be configured to share the same resources (e.g., registers and memory). In addition to this configurable logic, PLDs also include programmable interconnect or routing circuitry that is used to connect the inputs and outputs of the LEs and LABs. The combination of this programmable logic and routing circuitry is referred to as soft logic.
Besides soft logic, PLDs may also include specialized processing blocks that implements specific predefined logic functions and thus cannot be configured by the user. Such specialized processing blocks may include a concentration of circuitry on a PLD that has been partly or fully hardwired to perform one or more specific tasks, such as a logical or a mathematical operation. Examples of structures that are commonly implemented in such specialized processing blocks include: multipliers, arithmetic logic units (ALUs), barrel-shifters, various memory elements (such as FIFO/LIFO/SIPO/RAM/ROM/CAM blocks and register files), logic AND/NAND/OR/NOR arrays, etc., or combinations thereof.
One particularly useful type of specialized processing block that has been provided on PLDs is a digital signal processing (DSP) block. A conventional DSP block includes two 18-by-18 multipliers, which can be combined with other internal circuitry to form a larger 27-by-27 multiplier. The 27-by-27 multiplier is used as part of an IEEE 754 single precision floating-point multiplier, which requires 24 bits of precision.
Recent developments in artificial intelligence such as advancements in machine learning and deep learning involve training and inference, which have necessitated a much higher density of multiplications, especially at smaller precisions (i.e., multiplications with operands having less than 10 bits). As an example, machine learning inference might require performing a number of 8×8 multiplication operations. A conventional 18×18 multiplier is, however, only capable of supporting a single 8×8 multiplication.
It is within this context that the embodiments described herein arise.