The invention relates to a method of designing and making hardware circuits, particularly implementable in integrated circuit form, for executing multiple sum-of-products operations, and to circuits made by the method.
Many common operations found within fixed-point Digital Signal Processing (DSP) and graphics algorithms in integrated circuits can be expressed as a fixed-point sum-of-products (SOP). These include adders, subtractors, multipliers, squarers, multiply-accumulators (MACs), chained additions, decrementors and incrementors, for example. An SOP can be efficiently implemented in hardware, as the partial products of every product can all be summed in parallel in a single array.
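By way of illustration only, the following minimal software sketch models how the partial products of several products may be gathered into one array and summed together. It assumes unsigned operands whose multipliers fit in `width` bits; the names `partial_products` and `sum_of_products` are hypothetical and do not come from the cited works.

```python
def partial_products(x, y, width):
    """Return the partial-product rows of the unsigned product x * y.

    Row i is x shifted left by i when bit i of y is set, otherwise 0.
    Assumes 0 <= y < 2**width.
    """
    return [(x << i) if (y >> i) & 1 else 0 for i in range(width)]


def sum_of_products(pairs, width):
    """Evaluate sum(x * y for x, y in pairs) from one merged array of rows."""
    rows = []
    for x, y in pairs:
        rows.extend(partial_products(x, y, width))
    # In hardware the rows would feed one shared reduction array followed by
    # a single final carry propagate adder; a plain sum stands in for both.
    return sum(rows)
```

For example, `sum_of_products([(3, 5), (7, 2)], 4)` evaluates 3·5 + 7·2 from eight rows in one summation rather than from two separate multipliers followed by an adder.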
Previous work has considered improvements to the final carry propagate adder of an SOP (S. Das and S. P. Khatri, “A timing-driven synthesis approach of a fast four-stage hybrid adder in sum-of-products,” in MWSCAS: 51st Symposium on Circuits and Systems, August 2008, pp. 507-510). In S. Das and S. P. Khatri, “An inversion-based synthesis approach for area and power efficient arithmetic sum-of-products,” in VLSID: 21st International Conference on VLSI Design, January 2008, pp. 653-659, inverted partial product arrays were shown to improve quality of results. Designs implementing operations of the form Σ k_i·x_i·y_i, where the k_i are constants and the x_i and y_i are input operands, have been considered (D. Kumar and B. Erickson, “Asop: arithmetic sum-of-products generator,” in ICCD: IEEE International Conference on Computer Design: VLSI in Computers and Processors, October 1994, pp. 522-526). There, multiplication by a constant was performed using canonical signed digit recoding, and each x_i·y_i was computed in redundant carry-save form. Product-of-sum (POS) expressions have also been optimized: (a+b)c, which could be implemented as an adder and a multiplier in series, can be expanded to ac+bc, and a whole host of intermediate designs can be created, as demonstrated by S. Das and S. P. Khatri, “A timing-driven synthesis technique for arithmetic product-of-sum expressions,” in VLSID: 21st International Conference on VLSI Design, January 2008, pp. 635-640, where timing constraints are used to determine which architecture to use. There is a further wealth of design options for POS expressions: they can be incorporated into Booth-encoded multipliers in a variety of styles (R. Zimmerman and D. Q. Tran, “Optimized synthesis of sum-of-products,” in 37th Asilomar Conference on Signals, Systems and Computers, vol. 1, November 2003, pp. 867-872).
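The canonical signed digit recoding mentioned above can be sketched as follows. This is a generic textbook formulation for non-negative constants, not the implementation of the cited generator; the names `csd_digits` and `multiply_by_constant` are hypothetical.

```python
def csd_digits(k):
    """Return the canonical signed digits of k >= 0, least significant first.

    Each digit is -1, 0 or +1, and no two adjacent digits are both non-zero.
    """
    digits = []
    while k != 0:
        if k & 1:
            # k mod 4 == 1 gives digit +1; k mod 4 == 3 gives digit -1,
            # which turns a run of ones into a single carry.
            d = 2 - (k & 3)
            k -= d
        else:
            d = 0
        digits.append(d)
        k >>= 1
    return digits


def multiply_by_constant(k, x):
    """Form k * x as a signed sum of shifted copies of x."""
    return sum(d * (x << i) for i, d in enumerate(csd_digits(k)))
```

For the constant 7 (binary 111) the recoding yields 8 − 1, so the product needs two shifted terms rather than three.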
Despite the existence of efficient implementations of SOP and POS expressions, most datapath synthesis cannot exploit these highly optimized blocks because of non-SOP expressions found within the datapath. Muxing (multiplexing) and shifting found within SOP expressions prevent full and efficient merging. In A. K. Verma and P. Ienne, “Improved use of the carry-save representation for the synthesis of complex arithmetic circuits,” in ICCAD: IEEE/ACM International Conference on Computer Aided Design, November 2004, pp. 791-798, data flow graphs have been locally manipulated to increase the proportion of the datapath which can be expressed as a single SOP, hence reducing delay and area. For example, one of the transformations is (a+b+c)<<d = (a<<d)+(b<<d)+(c<<d), so that shifters can be moved through summations; a fact exploited more fully by S. Das and S. P. Khatri, “A merged synthesis technique for fast arithmetic blocks involving sum-of-products and shifters,” in VLSID: 21st International Conference on VLSI Design, January 2008, pp. 572-579.
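The shift transformation quoted above may be checked with a short sketch. It assumes non-negative, arbitrary-precision integers, so that no bits are lost to a fixed word width; the function names are hypothetical.

```python
def shift_after_sum(a, b, c, d):
    """Left-hand side: sum first, then one shifter on the result."""
    return (a + b + c) << d


def sum_after_shift(a, b, c, d):
    """Right-hand side: shift each operand, then sum the shifted terms."""
    return (a << d) + (b << d) + (c << d)
```

The two forms agree because a left shift by d is a multiplication by 2^d, which distributes over addition.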
As regards mutually-exclusive SOP expressions, an example can be found in A. K. Verma and P. Ienne, supra: sel ? a+b : c = (sel ? a : c) + (sel ? b : 0). However, such optimizations were restricted to localized regions. A fuller consideration of merging mutually-exclusive operations can be found in S. Das and S. P. Khatri, “Area-reducing sharing of mutually exclusive multiplier, mac, adder and subtractor blocks,” in IASTED: 5th International Conference on Circuits, Signals and Systems, July 2007, pp. 269-272, and “Resource sharing among mutually exclusive sum-of-product blocks for area reduction,” TODAES: ACM Transactions on Design Automation of Electronic Systems, vol. 13, no. 3, pp. 51-57, July 2008. In that work the SOP is split into partial product generation, array reduction and a final carry propagate adder, with muxing on the inputs to each of these units.
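The multiplexer identity quoted above can be illustrated with the following sketch, which models the conditional operator as a two-input selector over integers. The names `mux`, `mux_of_sum` and `sum_of_muxes` are hypothetical.

```python
def mux(sel, when_true, when_false):
    """Two-input multiplexer: models the expression sel ? when_true : when_false."""
    return when_true if sel else when_false


def mux_of_sum(sel, a, b, c):
    """Original form: an adder feeding a multiplexer, sel ? a+b : c."""
    return mux(sel, a + b, c)


def sum_of_muxes(sel, a, b, c):
    """Rewritten form: the multiplexers are pushed onto the adder inputs."""
    return mux(sel, a, c) + mux(sel, b, 0)
```

In the rewritten form the selection happens before the addition, so the addition can be merged with any surrounding summation.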