Decimal multiplication is a complex operation to implement in computer hardware. Typical methods for implementing decimal multiplication involve the computation and accumulation of partial product terms. The two data inputs to the operation are called the multiplier and the multiplicand. Commonly, the multiplier is separated into individual digits, and these digits are used to select multiples of the multiplicand which form the partial products. The partial products are summed to form the final product.
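The partial-product scheme described above can be sketched in software. The following is a minimal illustration only, assuming non-negative integer operands; the function name `decimal_multiply` and the precomputed `multiples` list are hypothetical and do not describe any particular hardware design.

```python
def decimal_multiply(multiplicand: int, multiplier: int) -> int:
    """Multiply two non-negative integers one multiplier digit at a time."""
    # Multiples 0x..9x of the multiplicand; each multiplier digit selects one.
    multiples = [multiplicand * d for d in range(10)]

    product = 0
    weight = 1  # decimal significance of the current multiplier digit
    while multiplier > 0:
        digit = multiplier % 10                   # separate out one digit
        partial_product = multiples[digit] * weight
        product += partial_product                # accumulate the partial product
        multiplier //= 10
        weight *= 10
    return product
```

For instance, `decimal_multiply(123, 456)` sums the partial products 738, 6150, and 49200.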
An existing method for reducing the amount of computation, or the computation time, required to generate a partial product term uses a read only memory (ROM) to store all possible products of two digits, each ranging from 0 to 9. Such a methodology therefore requires a 100-entry memory array, an equivalent programmable logic array (PLA), or combinatorial logic. Another method reduces the number of stored products by performing special tests for digits equal to zero or one. Thus, only the combinations of digits ranging in value from 2 to 9 are required, yielding 64 combinations. These methods employ a linear array of digit multipliers to form each partial product term.
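The two table sizes mentioned above can be illustrated with a short sketch. The names `DIGIT_PRODUCTS`, `REDUCED_PRODUCTS`, `digit_product`, and `partial_product` are hypothetical, and the software tables merely stand in for a ROM or PLA; this is an assumption-laden illustration rather than a description of an actual circuit.

```python
# Full table: every product of two digits 0..9 (100 entries).
DIGIT_PRODUCTS = [[a * b for b in range(10)] for a in range(10)]

# Reduced table: only digit pairs in the range 2..9 (8 * 8 = 64 entries).
REDUCED_PRODUCTS = {(a, b): a * b for a in range(2, 10) for b in range(2, 10)}


def digit_product(a: int, b: int) -> int:
    """Product of two decimal digits using special tests for 0 and 1."""
    if a == 0 or b == 0:
        return 0
    if a == 1:
        return b
    if b == 1:
        return a
    return REDUCED_PRODUCTS[(a, b)]


def partial_product(multiplicand_digits: list[int], multiplier_digit: int) -> int:
    """Form one partial product from a linear array of digit multipliers.

    multiplicand_digits is least significant digit first.
    """
    total = 0
    for position, digit in enumerate(multiplicand_digits):
        total += digit_product(digit, multiplier_digit) * 10 ** position
    return total
```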
While these methods of multiplication are simple and relatively easy to implement in hardware for decimal multiplication with a shifter, an adder, a product accumulator, and the like, it will be appreciated that in binary multiplication, where one cycle is needed to process one multiplier bit and form a partial product, an operation with an n-bit multiplier takes on the order of n cycles to finish. Such a high cycles-per-instruction (CPI) count is considered prohibitive in the current world of high-speed computing. Therefore, to achieve a shorter CPI for multiply instructions, as mentioned above, additional hardware is expended to calculate the partial products in groups and to build the adders necessary to process them simultaneously. This brute force approach does decrease CPI, but it also increases the chip area dedicated to the multiplication functions. Adders in particular are difficult to handle, especially with the area and timing constraints that usually accompany the functional specifications. Many methods have therefore been formulated to decrease adder size by reducing the number of partial products, which is achieved by processing multiple multiplier bits at a time. One of the more popular methods is the Booth recoding algorithm.
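The one-bit-per-cycle behavior described above can be modeled as follows. This is a simplified software sketch that assumes unsigned operands and treats each loop iteration as one cycle; the name `shift_add_multiply` and the `cycles` counter are illustrative assumptions.

```python
def shift_add_multiply(multiplicand: int, multiplier: int, n_bits: int) -> tuple[int, int]:
    """Unsigned shift-and-add multiply; returns (product, cycles used)."""
    accumulator = 0
    cycles = 0
    for i in range(n_bits):
        # One multiplier bit is examined per cycle.
        if (multiplier >> i) & 1:
            accumulator += multiplicand << i  # partial product for bit i
        cycles += 1
    return accumulator, cycles
```

An n-bit multiplier always costs n iterations here, which is the cycle count that multi-bit scanning aims to reduce.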
The Booth recoding algorithm is a method for reducing the number of partial products produced from a given n-bit multiplier through multiple-bit scanning. It is based on the concept that a string of binary ones, where the least significant bit of value ‘1’ has a significance of $2^n$ and the string of ones is z bits long, may alternatively be represented as $2^{n+z} - 2^n$. For example, the string 0b0111 may be represented as $2^3 - 2^0 = 7$, and the string 0b1110 as $2^4 - 2^1 = 14$.
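The identity underlying this representation can be checked numerically. The sketch below is illustrative only; the helper names `ones_string_value` and `difference_form` are hypothetical.

```python
def ones_string_value(n: int, z: int) -> int:
    """Value of z consecutive one bits whose least significant bit is at position n."""
    return sum(1 << k for k in range(n, n + z))


def difference_form(n: int, z: int) -> int:
    """The same value written as a difference of two powers of two."""
    return (1 << (n + z)) - (1 << n)


# 0b0111: n = 0, z = 3; 0b1110: n = 1, z = 3
assert ones_string_value(0, 3) == difference_form(0, 3) == 0b0111
assert ones_string_value(1, 3) == difference_form(1, 3) == 0b1110
```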
In the previous example, the weight of each bit is equal to $2^n$, where n is the positional value of the relevant bit. The detection of a string of ones is done by overlapping the scanned group of multiplier bits by one bit. Applying this counting method to multiplication, where the scanned number is the multiplier in a 1-bit scan with an overlapping bit, is as simple as assigning the following values: a bit that is at the end of a string (the least significant bit in the string), detected by a ‘1’ bit whose overlapping bit to the right is a ‘0’, is given a value of $-(2^n)$*(multiplicand); a bit that is at the beginning of the string (the most significant end of the z-bit string), detected by a ‘0’ in the position with the overlap bit equal to ‘1’, is given a value of $(2^n)$*(multiplicand); and a bit that is in the middle of a string of 0's or 1's is given a value of zero. This is summarized in the table below, where the leftmost bit is the bit in position n of the string and the rightmost bit is the overlap bit needed for string detection. The “Justified Multiplicand Value” column gives the multiplicand-multiple value; the significance of this value is implied by the position of the relevant scanned bit.
TABLE 1
Truth-Table For Radix-2 Booth Recoding

2-BIT SCAN    MULTIPLICAND VALUE    JUSTIFIED MULTIPLICAND VALUE
00            0×                    0×
01            (+$2^n$)×             +1×
10            (−$2^n$)×             −1×
11            0×                    0×
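The truth table can be exercised with a short sketch. It assumes the multiplier is an n-bit two's complement value and that the overlap bit below the least significant position is 0; the names `RADIX2_TABLE`, `booth_radix2_digits`, and `booth_radix2_multiply` are hypothetical.

```python
# (bit at position n, overlap bit to its right) -> justified multiplicand multiple
RADIX2_TABLE = {(0, 0): 0, (0, 1): +1, (1, 0): -1, (1, 1): 0}


def booth_radix2_digits(multiplier: int, n_bits: int) -> list[int]:
    """Recode an n-bit multiplier into digits in {-1, 0, +1}, least significant first."""
    digits = []
    overlap = 0  # assumed overlap bit below bit 0
    for n in range(n_bits):
        bit = (multiplier >> n) & 1
        digits.append(RADIX2_TABLE[(bit, overlap)])
        overlap = bit
    return digits


def booth_radix2_multiply(multiplicand: int, multiplier: int, n_bits: int) -> int:
    """Multiply using the recoded digits; multiplier is read as n-bit two's complement."""
    mask = (1 << n_bits) - 1
    digits = booth_radix2_digits(multiplier & mask, n_bits)
    # The digit at position n carries significance 2**n, applied here as a shift.
    return sum(d * (multiplicand << n) for n, d in enumerate(digits))
```

With a 1-bit scan, the recoding still produces one digit per multiplier bit; only runs of identical bits yield zero-valued partial products.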
The key to advantageous implementation of the Booth recoding method is in increasing the number of bits that are scanned in a group, thereby decreasing the overall number of scans of the multiplier as well as the number of partial products and the hardware necessary to combine them. A popular scan-group size is 3 bits, composed of 2 scanned bits with an overlap bit in the least significant position. Its popularity is based on the fact that the multiplicand-multiples needed to realize the recoding are simply 0×, ±1×, and ±2×, all of which are relatively easy to form using shifters, inverters, and two's complementation, whereas larger scan-group sizes necessitate adders to form higher multiples such as ±3×. It is evident that while Booth recoding simplifies the multiplication task, it too is complex and burdensome. Booth recoding, while well suited for its intended purposes, does not exhibit the same benefits for decimal multiplication. Therefore, it would be of benefit in the art to have a multiplication methodology that reduces CPI and the complexity of the hardware and algorithms used to perform multiplication.
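The 3-bit scan group discussed above can be sketched as follows. The recoding table used here is the standard radix-4 Booth table, which is not reproduced in the text and is therefore an assumption, as are an even `n_bits`, a two's complement multiplier, and the names `RADIX4_TABLE`, `booth_radix4_digits`, `select_multiple`, and `booth_radix4_multiply`.

```python
# (bit i+1, bit i, overlap bit i-1) -> multiplicand multiple (assumed standard table)
RADIX4_TABLE = {
    0b000: 0, 0b001: +1, 0b010: +1, 0b011: +2,
    0b100: -2, 0b101: -1, 0b110: -1, 0b111: 0,
}


def booth_radix4_digits(multiplier: int, n_bits: int) -> list[int]:
    """Recode an n-bit multiplier (n_bits even) into digits in {-2..+2}, LSD first."""
    mask = (1 << n_bits) - 1
    padded = (multiplier & mask) << 1  # append an overlap bit of 0 below bit 0
    return [RADIX4_TABLE[(padded >> i) & 0b111] for i in range(0, n_bits, 2)]


def select_multiple(multiplicand: int, digit: int) -> int:
    """Form 0x, +/-1x, or +/-2x using only a shift and a negation."""
    if digit == 0:
        return 0
    magnitude = multiplicand << (abs(digit) - 1)  # 1x: no shift, 2x: shift by one
    return -magnitude if digit < 0 else magnitude


def booth_radix4_multiply(multiplicand: int, multiplier: int, n_bits: int) -> int:
    """Multiply with n_bits / 2 partial products; each digit has significance 4**i."""
    digits = booth_radix4_digits(multiplier, n_bits)
    return sum(select_multiple(multiplicand, d) << (2 * i) for i, d in enumerate(digits))
```

In this sketch a 4-bit multiplier yields two partial products instead of four, and no multiple requires an adder to form.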