A multiplier within a data processor may be implemented using a variety of methods. One known method is to use a dedicated hardware multiplier circuit. A dedicated hardware multiplier circuit typically has an array of full adder cells connected such that the multiplicand and multiplier operands are multiplied through a series of shift and addition operations. A disadvantage of a dedicated hardware multiplier is the large physical area required to implement it. Its advantage, however, is the relatively short amount of time required to process a multiply operation.
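The series of shift and addition operations described above can be sketched in software. The following is a minimal illustration only, assuming unsigned operands and a fixed multiplier width; the function name `shift_add_multiply` and its parameters are hypothetical and do not describe any particular hardware circuit.

```python
from typing import Tuple


def shift_add_multiply(multiplicand: int, multiplier: int, n_bits: int) -> Tuple[int, int]:
    """Multiply two unsigned integers using add-shift steps.

    Returns (product, steps), where steps counts one add-shift step
    per multiplier bit, i.e. n_bits steps in total.
    """
    if multiplicand < 0 or not 0 <= multiplier < (1 << n_bits):
        raise ValueError("operands must be unsigned and fit the stated width")

    product = 0
    steps = 0
    for i in range(n_bits):
        bit = (multiplier >> i) & 1
        # Each multiplier bit selects either zero or the shifted multiplicand
        # as the partial product for this step.
        partial = (multiplicand << i) if bit else 0
        product += partial
        steps += 1
    return product, steps


if __name__ == "__main__":
    print(shift_add_multiply(13, 11, 4))
```

In an array multiplier these steps are laid out in space as rows of full adder cells rather than executed in sequence, which is where the area cost comes from.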
Another known method of implementing a multiply operation within a data processor is to utilize a multiply recoding algorithm. The purpose of a recoding algorithm is to reduce the number of addition operations, which determine the partial products, required to complete a processor multiply instruction. A conventional add-shift multiply operation that has an M-bit multiplicand operand and an N-bit multiplier operand, where M and N are integers, typically requires N addition operations to complete a processor multiply instruction. By utilizing a recoding algorithm such as Booth's recoding algorithm or Modified Booth's recoding algorithm, the number of addition operations required to complete the multiply instruction can be significantly reduced. A multiplier that utilizes Booth's recoding algorithm is taught by Tokumaru et al. in U.S. Pat. No. 4,807,175, entitled "Booth's Multiplier." Tokumaru et al. utilize two separate adder units to calculate two separate intermediate partial products, then sum the two separate intermediate partial products with a previously formed full partial product in two additional adder units to calculate each new full partial product. The Tokumaru et al. multiplier increases the multiply operation performance at the expense of requiring additional adder units and fully duplicative recoding logic for each partial product calculation. A substantial increase in die area results from the need to calculate each and every possible partial product term. The Tokumaru et al. multiplier requires a significant amount of additional hardware to calculate each partial product simultaneously, before summing the calculated partial products with an arithmetic logic unit. As a result, the processing speed of the multiply operation can be improved by a factor of two.
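To illustrate how recoding reduces the number of additions, the following sketch shows the common textbook radix-4 form of Modified Booth's recoding. It is an assumption-laden illustration, not the Tokumaru et al. circuit: it assumes an even multiplier width N and a two's complement multiplier, and the names `booth_radix4_digits` and `booth_multiply` are hypothetical. Each pair of multiplier bits, together with the bit below it, is recoded into one digit in {-2, -1, 0, 1, 2}, so N/2 partial products are summed instead of N.

```python
from typing import List, Tuple


def booth_radix4_digits(multiplier: int, n_bits: int) -> List[int]:
    """Recode an n_bits two's complement multiplier into radix-4 Booth digits.

    Returns n_bits // 2 digits, least significant first, each in
    {-2, -1, 0, 1, 2}. Assumes n_bits is even.
    """
    if n_bits <= 0 or n_bits % 2 != 0:
        raise ValueError("this sketch assumes a positive, even multiplier width")

    y = multiplier & ((1 << n_bits) - 1)  # two's complement bit pattern
    digits = []
    prev = 0  # implicit bit to the right of bit 0
    for i in range(0, n_bits, 2):
        b0 = (y >> i) & 1
        b1 = (y >> (i + 1)) & 1
        # digit = -2*y[i+1] + y[i] + y[i-1]
        digits.append(b0 + prev - 2 * b1)
        prev = b1
    return digits


def booth_multiply(multiplicand: int, multiplier: int, n_bits: int) -> Tuple[int, int]:
    """Multiply using radix-4 Booth digits.

    Returns (product, additions), where additions is the number of
    partial products summed (n_bits // 2).
    """
    digits = booth_radix4_digits(multiplier, n_bits)
    product = 0
    for k, d in enumerate(digits):
        # Each digit selects 0, +/-1x, or +/-2x the multiplicand,
        # weighted by 4**k (a left shift of 2*k bit positions).
        product += (d * multiplicand) << (2 * k)
    return product, len(digits)


if __name__ == "__main__":
    print(booth_radix4_digits(11, 8))
    print(booth_multiply(13, 11, 8))
```

For an 8-bit multiplier this sketch sums four partial products where the add-shift scheme would use eight. In hardware, the multiples 2x and -x are typically formed by shifting and complementing rather than by a general multiplication as written here.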
A disadvantage of known recoding multipliers that concurrently perform two partial product calculations is the increase in transistor circuits, and the resulting die area, required to perform the additional partial product calculations.