In computing, especially digital signal processing, multiply-accumulate operations are commonly used to accumulate a series of products in succession. A conventional multiply-accumulate unit 100 (MAC unit) includes a multiplier 102, an adder 104, and an accumulation register 106 coupled as shown in FIG. 1. The output of the accumulation register 106 is fed back to one input of the adder 104, so that on each clock cycle the output of the multiplier 102 is added to an accumulated total stored in the accumulation register 106.
FIG. 2 shows an example of a conventional MAC operation as a series of waveforms 200, and is discussed in the context of the MAC unit 100 of FIG. 1. In this example, which extends over ten accumulate cycles, the multiplier 102 delivers a product for each cycle by multiplying an input data value by a weighting factor. The products are then successively accumulated by the adder 104 and the accumulation register 106.
More specifically, for a first cycle 202, the multiplier 102 multiplies a data value of 7,169 by a weighting factor of −128, thus outputting a product of −917,632 for the first cycle 202. The adder 104 adds this product of −917,632 to the current value (i.e., 0) in the accumulation register 106, and thus outputs a sum of −917,632. At the end of the first cycle 202, the accumulation register 106 has latched an accumulated total of −917,632. During a second cycle 204, the multiplier 102 multiplies an input data value of 7,169 by a new weighting factor of −448, and outputs a product of −3,211,712 during the second cycle 204. The adder 104 adds −3,211,712 to the accumulated total of −917,632 stored in the accumulation register 106, such that the accumulation register 106 has latched a value of −4,129,344 at the end of the second cycle 204. Multiplication and accumulation continue in this manner until the products have been accumulated over all cycles, here resulting in a total of −7,348,225 by the tenth cycle 206. Because the input data is a signed 14-bit binary number, it is often desirable to deliver a 14-bit binary number as a MAC result at the output. Hence, the 10 least significant bits (LSBs) of the output are truncated in this example, thereby giving a final MAC result of −7,177 at the end of processing at 206.
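The arithmetic of this example can be checked with a short behavioral sketch. This is only an illustrative software model, not a description of the hardware: the function name `mac` and its parameters are assumptions introduced here, and only the first two weighting factors (−128 and −448) are stated in the text, so the remaining eight cycles are not reproduced. The sketch also assumes that truncating the LSBs of a signed value behaves like an arithmetic right shift, which is consistent with the −7,177 result given above.

```python
def mac(data_values, weights, shift=0):
    """Behavioral model of a multiply-accumulate loop (hypothetical helper).

    Returns the accumulated total with `shift` LSBs dropped, plus the
    value held in the accumulation register after each cycle.
    """
    acc = 0          # accumulation register 106 starts at 0
    history = []
    for d, w in zip(data_values, weights):
        product = d * w        # role of multiplier 102
        acc = acc + product    # role of adder 104, latched into register 106
        history.append(acc)
    # Python's >> on a negative int is an arithmetic shift (rounds toward
    # negative infinity); assumed here to model the LSB truncation.
    return acc >> shift, history


# First two cycles of the example (only these weights are given in the text).
total, history = mac([7169, 7169], [-128, -448])
print(history)            # [-917632, -4129344]

# Truncating 10 LSBs of the stated ten-cycle total.
print(-7_348_225 >> 10)   # -7177
```

Note that an arithmetic shift rounds toward negative infinity, which is why the truncated result is −7,177 rather than −7,176.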
Because the MAC unit 100 is implemented in binary logic, the MAC unit 100 for this example requires, at minimum, a 14-bit by 10-bit multiplier, a 25-bit adder, and a 25-bit accumulation register. Thus, although the conventional MAC unit 100 can compute a MAC operation quickly, it requires a large amount of combinatorial logic due to, for example, the multiplier. This large amount of combinatorial logic requires a correspondingly large area, and consumes a correspondingly large amount of power.
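The operand widths quoted above can be illustrated with a small helper that computes the minimum two's-complement width of a signed integer. The name `signed_bits` is a hypothetical helper introduced for this sketch. It confirms the 14-bit data and 10-bit weight widths and shows that a 14-bit by 10-bit product fits in 24 bits; it does not attempt to derive the 25-bit adder and register widths, since those depend on all ten weighting factors, which are not listed in the text.

```python
def signed_bits(value):
    """Minimum number of bits needed to hold `value` in two's complement."""
    if value >= 0:
        return value.bit_length() + 1        # magnitude bits plus sign bit
    return (-value - 1).bit_length() + 1     # e.g. -512 fits in 10 bits


print(signed_bits(7169))    # 14 -> data value needs a 14-bit signed input
print(signed_bits(-448))    # 10 -> weighting factor needs a 10-bit signed input

# Largest-magnitude product of a 14-bit and a 10-bit signed operand:
# (-8192) * (-512) = 4,194,304, which needs 24 bits.
print(signed_bits((-2**13) * (-2**9)))   # 24
```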
As today's consumers expect smaller and cheaper electronic devices that can operate for longer periods on a single battery charge, this disclosure provides improved MAC units that tend to require less area and consume less power than conventional MAC units.