Circuits such as programmable matrix-multiply-and-accumulate units commonly use multiplier circuits. The area and power consumption of multiplier circuits may benefit from reducing the number of partial products that are input to multiplier trees in these circuits.
One conventional way to reduce the number of partial products is to utilize a higher radix encoding, such as radix four (4). Consider a multiply operation A×B where A is the multiplicand and B is the multiplier. Some conventional designs utilize Booth encoding, which generates select signals from multiplier B; these signals select a shifted version of multiplicand A by way of an AND-OR (AO) multiplexer (mux). This selection is followed by an exclusive OR (XOR) stage that negates the output of the AO mux when a negative multiple is required. A drawback of this approach is that each partial product of the multiplication operation has an associated negating XOR, which is prone to power-consuming glitches in some circumstances.
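The following Python sketch is a behavioral illustration of the conventional radix-4 Booth scheme described above, under stated assumptions: the multiplier B is taken to be an even-width two's-complement value, and the function names booth_radix4_digits and booth_multiply are hypothetical. The negating XOR stage is modeled arithmetically as a sign change rather than as a bitwise inversion with a carry-in, so the sketch shows the selection logic and not the gate-level circuit.

```python
def booth_radix4_digits(b: int, n_bits: int) -> list[int]:
    """Recode an n_bits-wide two's-complement multiplier into radix-4
    Booth digits, each in {-2, -1, 0, 1, 2}. n_bits is assumed even."""
    assert n_bits % 2 == 0
    bits = b & ((1 << n_bits) - 1)  # two's-complement bit pattern of B
    digits = []
    prev = 0  # implicit bit to the right of the least significant bit
    for i in range(0, n_bits, 2):
        b0 = (bits >> i) & 1
        b1 = (bits >> (i + 1)) & 1
        digits.append(b0 + prev - 2 * b1)
        prev = b1
    return digits


def booth_multiply(a: int, b: int, n_bits: int) -> int:
    """Sum one partial product per pair of multiplier bits."""
    total = 0
    for i, d in enumerate(booth_radix4_digits(b, n_bits)):
        magnitude = (0, a, 2 * a)[abs(d)]  # AO mux: select 0, A, or 2A
        # Negating stage (an XOR per partial product in hardware),
        # modeled here as an arithmetic sign change.
        partial = -magnitude if d < 0 else magnitude
        total += partial * 4 ** i
    return total
```

An n-bit multiplier yields n/2 partial products in this model, and every one of them passes through the negating stage, which corresponds to the per-partial-product XOR noted above.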
A so-called Booth Mux or Booth selector may be utilized in some solutions. A Booth Mux is effectively a 5-input multiplexer that selects an output from among 0, A, 2A, −A, and −2A for every two bits of operand B. A 5-input mux is inefficient for multipliers that utilize radix-4 encoding, for which a 4-input mux is sufficient.
Some conventional solutions do utilize 4-input muxes. One such solution utilizes a mux that selects its output from among 0, A, 2A, and 3A. These solutions utilize an extra adder to generate 3A by adding A and 2A. Another conventional solution utilizes a 4-input mux that selects its output from among 0, A, 2A, and −A. Those solutions utilize a carry chain between partial product select encoders, which introduces latency, especially for wide-input multipliers.
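The two 4-input alternatives above may be illustrated with the following Python sketch. It assumes an unsigned, even-width multiplier B, and all function names are hypothetical. The recoding rule in recode_with_carry (a digit value of 3 becomes −1 with a carry into the next digit) is one plausible way to obtain the digit set {0, 1, 2, −1}; it is an assumption made for illustration and is not taken from any particular conventional design.

```python
def radix4_digits(b: int, n_bits: int) -> list[int]:
    """Split an unsigned multiplier into radix-4 digits in {0, 1, 2, 3}."""
    return [(b >> i) & 3 for i in range(0, n_bits, 2)]


def multiply_with_3a(a: int, b: int, n_bits: int) -> int:
    """4-input selection among 0, A, 2A, and 3A."""
    three_a = a + 2 * a  # the extra adder needed to form 3A
    choices = (0, a, 2 * a, three_a)
    return sum(choices[d] * 4 ** i
               for i, d in enumerate(radix4_digits(b, n_bits)))


def recode_with_carry(b: int, n_bits: int) -> list[int]:
    """Recode digits into {-1, 0, 1, 2}. Each digit depends on the carry
    from the digit below it, which forms a chain across the encoders."""
    digits = []
    carry = 0
    for d in radix4_digits(b, n_bits):
        t = d + carry  # ranges over 0..4
        if t >= 3:     # 3 becomes -1, 4 becomes 0; both carry out
            digits.append(t - 4)
            carry = 1
        else:
            digits.append(t)
            carry = 0
    digits.append(carry)  # final carry becomes one extra digit
    return digits


def multiply_with_carry_chain(a: int, b: int, n_bits: int) -> int:
    """4-input selection among 0, A, 2A, and -A."""
    choices = {0: 0, 1: a, 2: 2 * a, -1: -a}
    return sum(choices[d] * 4 ** i
               for i, d in enumerate(recode_with_carry(b, n_bits)))
```

In recode_with_carry, the select digit at each position cannot be resolved until the carry from the position below is known, so in this model the dependency spans the full width of B. That sequential dependency is the source of the latency noted above for wide-input multipliers.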
A need therefore exists for solutions utilizing 4-input muxes that do not incur the latency of conventional carry chains.