Many techniques for multiplying operands in computer systems have been utilized. One technique of multiplying two operands (one a multiplicand 105 and the other a multiplier 110) is shown in FIG. 1A. Each bit in the multiplicand 105 and the multiplier 110 is represented by an “X.” As shown in FIG. 1A, the product of two 17-bit binary operands can be obtained by summing all the rows in the partial products 115. Each bit in the partial products 115 is represented by a “P.” Each row of the partial products 115 is obtained by separately multiplying (“ANDing”) each bit of the multiplier 110 by all the bits in the multiplicand 105. This step is repeated with each bit of the multiplier 110. As each partial product is obtained, the partial product is offset one bit to the left from the preceding partial product. After the partial products 115 are obtained, as shown in FIG. 1A, they are summed to generate the final product 120. An “S” represents each bit in the final product 120. This multiplication technique is known as longhand multiplication. A numeric example of the longhand multiplication of an 8-bit multiplier and a 12-bit multiplicand is shown in FIG. 1B. As shown in FIG. 1B, each row of the partial products is either a copy of the multiplicand or is zero.
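The longhand procedure described above can be sketched in a few lines of Python. This is only an illustrative software model, not the circuit of FIG. 1A; the function names `partial_products` and `longhand_multiply` and the `width` parameter are assumptions introduced for the sketch.

```python
def partial_products(multiplicand: int, multiplier: int, width: int) -> list:
    """Return one partial-product row per multiplier bit, least significant first.

    Assumes non-negative operands and a multiplier that fits in `width` bits.
    """
    rows = []
    for i in range(width):
        bit = (multiplier >> i) & 1
        # ANDing one multiplier bit with every multiplicand bit yields either
        # a copy of the multiplicand or zero; row i is offset i bits to the left.
        rows.append((multiplicand if bit else 0) << i)
    return rows


def longhand_multiply(multiplicand: int, multiplier: int, width: int) -> int:
    """Sum all partial-product rows to obtain the final product."""
    return sum(partial_products(multiplicand, multiplier, width))
```

For an 8-bit multiplier the sketch produces eight rows, each either zero or a shifted copy of the multiplicand.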
The generation of the partial products can be rapidly performed by an execution unit within a computer system because the partial products can be generated in parallel. However, summing the partial products may be relatively slow.
As a result, some multiply execution units utilize a carry save adder to rapidly sum partial products. FIG. 2 presents a carry save adder column that can sum the seventeen partial product bits of column 125 (FIG. 1A). Various terms, such as Wallace tree and Dadda tree, are utilized by those of skill in the art for a carry save adder column. The carry save adder column of FIG. 2 includes fifteen full adders 201–215 that are configured into six levels. As shown in FIG. 3, full adder 300 has three inputs: Carry In, Input A, and Input B. In addition, the full adder 300 has two outputs: Carry Out and Sum. While there are many methods of constructing full adders, the full adder 300 includes a first XOR gate 305, a second XOR gate 310, and a multiplexer 315. Full adders are known by those of skill in the art.
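A full adder built from two XOR gates and a multiplexer can be modeled as follows. The text names the three components but not their wiring, so the connection used here (the first XOR output selects between Carry In and Input A) is an assumption based on one common design, and `full_adder` is a hypothetical name.

```python
def full_adder(input_a: int, input_b: int, carry_in: int) -> tuple:
    """Model a one-bit full adder; returns (sum, carry_out).

    Assumed wiring: two XOR gates plus a 2:1 multiplexer.
    """
    propagate = input_a ^ input_b      # first XOR gate
    sum_bit = propagate ^ carry_in     # second XOR gate
    # Multiplexer: when the inputs differ the carry-in is passed through;
    # when they are equal, the carry-out equals either input.
    carry_out = carry_in if propagate else input_a
    return sum_bit, carry_out
```

With this wiring, the two outputs together encode the count of inputs that are 1, with Carry Out weighted twice as heavily as Sum.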
The carry save adder column shown in FIG. 2 utilizes full adders 201–215 to rapidly sum the 17 partial product bits of column 125. This carry save adder column receives the 17 partial product bits at the input “branches” of the carry save adder column via full adders 201–205 and 209. The 17 partial product bits are reduced to just two bits 260 and 261, which are input into a carry look-ahead adder (not shown) at the output of the carry save adder column. The carry save adder column receives carry bits from other carry save adder columns via nodes, such as nodes 225, 226, 227, and 228. Similarly, the carry save adder column outputs carry bits to other carry save adder columns via nodes, such as nodes 230, 231, 232, and 233. Carry save adders and carry look-ahead adders are known by those of skill in the art.
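The reduction of many rows to two can be sketched at the word level, where each bitwise operation acts on every column at once and the left shift models a carry passing to the neighboring column. This is a simplified model under stated assumptions, not the specific topology of FIG. 2: the grouping of rows into triples is one possible arrangement, `carry_save_reduce` is a hypothetical name, and the final carry look-ahead adder is stood in for by ordinary integer addition.

```python
def carry_save_reduce(rows):
    """Reduce a list of non-negative integer rows to two rows with the same total.

    Returns (row_a, row_b, levels), where `levels` counts the layers of
    3-to-2 compression that were applied.
    """
    rows = list(rows)
    levels = 0
    while len(rows) > 2:
        next_rows = []
        full = len(rows) - len(rows) % 3
        for i in range(0, full, 3):
            a, b, c = rows[i:i + 3]
            # Per-column sum bits of a layer of full adders.
            next_rows.append(a ^ b ^ c)
            # Per-column carry bits, handed to the next more significant column.
            next_rows.append(((a & b) | (a & c) | (b & c)) << 1)
        next_rows.extend(rows[full:])  # leftover rows pass through unchanged
        rows = next_rows
        levels += 1
    rows += [0] * (2 - len(rows))
    return rows[0], rows[1], levels
```

No carry ripples along a row during the reduction; only the final addition of the two remaining rows needs a carry-propagating adder. With 17 input rows this particular grouping takes six levels, the same depth as the column described above.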
As is known in the art, fast integer multipliers can be constructed that utilize carry save adders, full adders, 4 to 2 compressors, and 5 to 3 compressors. These multipliers can rapidly multiply operands.
As discussed above, a significant amount of the time required to multiply two values is spent adding the partial products. Thus, if there are fewer partial products, then multiplication can be performed more rapidly. One method of reducing the number of partial products is known as Booth encoding. Booth encoding can reduce the number of partial products by almost half. As shown in the following table, Booth encoding generates a partial product for each pair of multiplier bits instead of a partial product for each bit in the multiplier. For example, as shown in the table below, if the first pair of multiplier bits is “00,” then the first partial product would be zero. Similarly, if the third and fourth bits of the multiplier were “01,” then the second partial product would be equal to the multiplicand.
2 Bits    Partial Product Value
00        0
01        1 * multiplicand
10        2 * multiplicand (multiplicand, left-shifted 1 bit)
11        3 * multiplicand (sum of multiplicand and multiplicand, left-shifted 1 bit)

The generation of the values 0*multiplicand, 1*multiplicand, and 2*multiplicand can be rapidly performed by a multiplexer. However, rapid generation of the value 3*multiplicand is more difficult because an addition, which takes a significant amount of time, needs to be performed. As a result, some fast multiply execution units represent the value of 3*multiplicand as 4*multiplicand − 1*multiplicand, with the −1*multiplicand value being used with the current two bits and the 4*multiplicand being used with the next two bits as an extra value of 1*multiplicand. However, if the next two bits have a value of 2*multiplicand, then an extra 1*multiplicand will result in a value of 3*multiplicand, with the same difficulty. Thus, such fast multiply execution units may represent the value of 2*multiplicand as 4*multiplicand − 2*multiplicand, with the −2*multiplicand value being used with the current two bits and the 4*multiplicand being used with the next two bits as an extra value of 1*multiplicand. As shown in the following table, such fast multiply execution units, in addition to considering the two bits of the multiplier, also consider the most significant bit of the previous two bits.
2 Bits    MSB of Previous Two Bits    Partial Product Value
00        0                            0
00        1                           +1 * multiplicand
01        0                           +1 * multiplicand
01        1                           +2 * multiplicand
10        0                           −2 * multiplicand (complement the multiplicand, left-shifted 1 bit)
10        1                           −1 * multiplicand (complement the multiplicand)
11        0                           −1 * multiplicand
11        1                            0
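The second table above can be exercised with a short software sketch. It uses signed Python integers in place of the complement-and-shift hardware the table mentions, and it assumes an unsigned multiplier, so one extra pair of zero bits is scanned to absorb a final carried-over 1*multiplicand. The names `BOOTH_TABLE` and `booth_partial_products` are hypothetical.

```python
# (current two bits, MSB of previous two bits) -> multiple of the multiplicand
BOOTH_TABLE = {
    (0b00, 0): 0,  (0b00, 1): +1,
    (0b01, 0): +1, (0b01, 1): +2,
    (0b10, 0): -2, (0b10, 1): -1,
    (0b11, 0): -1, (0b11, 1): 0,
}


def booth_partial_products(multiplicand: int, multiplier: int, width: int) -> list:
    """Return Booth-encoded partial products for an unsigned `width`-bit multiplier."""
    rows = []
    prev_msb = 0  # there are no bits below the first pair
    # Scan two multiplier bits at a time; the range runs one pair past `width`
    # so that a trailing extra 1*multiplicand is not lost.
    for i in range(0, width + 2, 2):
        pair = (multiplier >> i) & 0b11
        digit = BOOTH_TABLE[(pair, prev_msb)]
        rows.append((digit * multiplicand) << i)
        prev_msb = pair >> 1
    return rows
```

Summing the returned rows gives the ordinary product. For an 8-bit multiplier the sketch yields five rows rather than eight, and no row ever requires 3*multiplicand.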
FIG. 4 presents a numeric example of multiplying an 8-bit multiplier by a 12-bit multiplicand using Booth encoding. As shown in FIG. 4, the number of partial products is reduced from 8 to 5. Thus, utilizing Booth encoding can reduce the amount of circuitry needed to implement the multiplier execution unit and increase the speed of multiplication. As is known in the art, the above partial product values can be quickly generated by multiplexing and inversion. There are several methods, known to those skilled in the art, of dealing with the sign bits of the partial products in multipliers that utilize Booth encoding. Each of these methods can be accommodated in fast multipliers.
Integer multiplication treats each row of the partial products as the binary representation of an integer (a shifted copy of the multiplicand 105 or zero). Therefore, integer multiplication adds the bits of the partial products, propagating carries between columns. However, another type of multiplication, XOR (exclusive or) multiplication, combines the bits by XORing (instead of adding) them. In other words, XOR multiplication treats each row of the partial products as a bit string, and the bits along each vertical column of the partial products are XORed together, with no carries between columns, to obtain the result. FIG. 5 presents a numeric example of an XOR multiplication of an 8-bit multiplier and a 12-bit multiplicand.
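A minimal sketch of XOR multiplication follows. It generates the same partial-product rows as longhand multiplication but combines them with XOR instead of addition; the function name `xor_multiply` and the `width` parameter are assumptions made for illustration.

```python
def xor_multiply(multiplicand: int, multiplier: int, width: int) -> int:
    """XOR (carry-less) multiplication of two non-negative bit strings.

    Assumes the multiplier fits in `width` bits.
    """
    result = 0
    for i in range(width):
        if (multiplier >> i) & 1:
            # Each row is a shifted copy of the multiplicand; XORing the rows
            # combines every column without passing carries to its neighbor.
            result ^= multiplicand << i
    return result


# 0b11 times 0b11 is 0b1001 as integers, but 0b101 under XOR multiplication,
# because the two 1 bits in the middle column cancel instead of carrying.
assert xor_multiply(0b11, 0b11, 2) == 0b101
```

The only difference from the integer case is the operator used to combine the rows, which is why a shared partial-product generator can serve both kinds of multiplication.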
XOR multiplication is utilized in the implementation of the arithmetic operations for the binary polynomial field Elliptic Curve Cryptography (ECC). Elliptic curves have been found to provide versions of public-key cryptographic methods that, in some cases, are faster and use smaller keys than other cryptographic methods, while providing an equivalent level of security.
Thus, a need exists for a multiply execution unit that can efficiently perform traditional integer multiplication as well as XOR multiplication. A single execution unit that performs both types of multiplication requires much less circuitry than two separate units, one for integer multiplication and another for XOR multiplication. This helps reduce the power consumption of the multiplier execution unit.