Large-number multiplication is used in numerous computer algorithms known in the art. Common uses of large-number arithmetic include public key cryptography and Montgomery multiplication, where numerous multiplication operations on very large numbers (on the order of a thousand bits each) are performed.
A “large number” is an integer number of N bits used by a processor with registers or a word size of W bits in width, where N>2×W. The term “word size” refers to the number of bits in a single precision register, or the memory width of the processor in use. Common processors with a word size of W bits are capable of multiplying two words of W bits and then storing the result in a double-width register of 2×W bits. If the size of the operand to be multiplied is larger than W bits, a dedicated multiplication algorithm is required.
FIG. 1 is a demonstration of a prior art multiplication procedure 10 for multiplying two multiple-digit numbers using pen and paper. Each number is composed of two hexadecimal digits. The first number is 3B (indicated by reference numeral 12) and the second number is CA (indicated by reference numeral 14). The multiplication procedure 10 starts with multiplying only the Least Significant Digit (LSD) of the second number 14 with the two digits of the first number 12 to create a first interim result. The first interim result is indicated by reference numeral 16.
Next, a similar operation is performed by multiplying the Most Significant Digit (MSD) of the second number 14 with the two digits of the first number 12 to create a second interim result. The second interim result, indicated by reference numeral 18, is written below the first interim result 16 shifted one digit to the left.
After the multiplication procedure 10 is completed, the sum of the first interim result 16 and the second interim result 18 is the multiplication result 2E8E (indicated by reference numeral 20).
Multiplying hexadecimal numbers as shown herein is used for clarity only. Computer algorithms known in the art apply multiplication on binary numbers.
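The pen-and-paper procedure of FIG. 1 can be sketched in a few lines of Python. This is an illustrative sketch only; the function name schoolbook_interim_results and its parameters are hypothetical and do not appear in the figures. It produces one interim result per digit of the second number, already shifted to its digit position, and the sum of the interim results is the product.

```python
def schoolbook_interim_results(first, second, base=16):
    """Return the shifted interim results, one per digit of `second` (LSD first).

    Hypothetical helper for illustration; `base=16` matches the hexadecimal
    digits used in the FIG. 1 example.
    """
    interim = []
    shift = 0
    while second > 0:
        digit = second % base                        # current digit of the second number
        interim.append(first * digit * base**shift)  # interim result, shifted left
        second //= base
        shift += 1
    return interim


rows = schoolbook_interim_results(0x3B, 0xCA)
print([hex(r) for r in rows])  # ['0x24e', '0x2c40']
print(hex(sum(rows)))          # 0x2e8e
```

The first entry corresponds to the first interim result 16 (3B×A) and the second entry to the second interim result 18 (3B×C) shifted one digit to the left.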
FIG. 2 is a flow chart of a prior art computerized multiplication algorithm 30 applied for the multiplication of the two multiple-word numbers of FIG. 1. When using a computer, each numeral digit is represented by a W-bit computer word. The term digit refers to a portion of a large number that is the size of a computer word.
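As an illustration of the digit-per-word representation, the following Python sketch splits a large number into W-bit words and reassembles it. The helper names to_words and from_words, and the choice of storing the least significant word first, are assumptions made for this example and are not taken from FIG. 2.

```python
def to_words(value, w=8):
    """Split a non-negative integer into W-bit words, least significant first."""
    if value < 0:
        raise ValueError("only non-negative integers are supported")
    mask = (1 << w) - 1
    words = []
    while True:
        words.append(value & mask)  # low W bits form the next digit
        value >>= w
        if value == 0:
            return words


def from_words(words, w=8):
    """Reassemble an integer from W-bit words, least significant first."""
    return sum(word << (w * k) for k, word in enumerate(words))


print([hex(d) for d in to_words(0x2E8E, w=8)])  # ['0x8e', '0x2e']
```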
In general, the computerized multiplication algorithm 30 is applied in a similar manner to the prior art procedure 10 (see FIG. 1). However, in the computerized multiplication algorithm the result of each interim multiplication is added to a result vector as the algorithm proceeds, whereas in the multiplication procedure 10 of FIG. 1 the partial results are stored separately and are summed only at the final stage.
At the initial step 32, two multiplicand vectors X and Y and a result vector Z for holding the result are provided. Vectors X, Y and Z are of W-bit words. The lengths of input vectors X and Y are max_x and max_y, respectively. A double-width register r (composed of 2×W bits) is used to temporarily hold the multiplication result.
At step 34, the result vector Z is cleared.
At step 36, the internal variables i, j, c1, and c2 of the two multiplicand vectors X and Y are cleared, wherein i is the first number digit index, j is the second number digit index, c1 is the high word from the previous multiplication operation, and c2 is the carry from the previous addition operation.
The following steps 38, 40, 42, 44, 46, and 48 comprise the main multiplication loop. At step 38, two digits are multiplied, and the high word c1 of the previous multiplication and the carry c2 from the previous addition operation are added to the product. The result is temporarily stored in the double-width register r.
At step 40, the high word of the multiplication result r is stored in c1.
At the next step 42, the addition carry is calculated and stored in c2.
At step 44, the result vector element z[i+j] is updated by adding to it the multiplication result. Since the width of each result vector element is W, only the lower word of r is added to element z[i+j].
At step 46, the X multiplicand index i is incremented.
At the next step 48, it is determined whether the value of i is greater than the length of input vector X. In the affirmative case, the computerized multiplication algorithm proceeds to step 50. However, in the negative case, the computerized multiplication algorithm returns to step 38 and repeats the main multiplication loop.
At step 50, the next result word is updated by adding to it the values of c1 and c2.
At step 52, the second multiplicand index j is incremented, and the internal variables i, c1, and c2 are cleared.
At the next step 54, it is determined whether the internal variable j is greater than the length of input vector Y. In the affirmative case (i.e. after performing max_x times max_y multiplication operations), the computerized multiplication algorithm is terminated at step 56 and the result vector Z holds the final multiplication result. However, in the negative case, the computerized multiplication algorithm restarts the inner loop from step 38 by multiplying the first multiplicand with the next word of the second multiplicand.
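The steps described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a reproduction of FIG. 2: the function name multiply_words is hypothetical, the vectors are assumed to hold the least significant word first with zero-based indices, and the carry c2 is assumed to be the carry out of adding the low word of r to the result element. The variable names i, j, c1, c2, r, max_x and max_y follow the description.

```python
def multiply_words(x, y, w=8):
    """Multiply two unsigned multi-word numbers (least significant word first).

    Hypothetical sketch of the word-by-word algorithm; `w` is the word size W.
    """
    mask = (1 << w) - 1
    max_x, max_y = len(x), len(y)
    z = [0] * (max_x + max_y)          # step 34: clear the result vector
    for j in range(max_y):             # outer loop over the digits of Y
        c1 = 0                         # high word of the previous product
        c2 = 0                         # carry of the previous addition
        for i in range(max_x):         # main multiplication loop
            r = x[i] * y[j] + c1 + c2  # step 38: fits in 2*W bits
            c1 = r >> w                # step 40: keep the high word of r
            s = z[i + j] + (r & mask)  # add only the low word of r
            c2 = s >> w                # step 42: addition carry (0 or 1)
            z[i + j] = s & mask        # step 44: update the result element
        z[j + max_x] += c1 + c2        # step 50: update the next result word
    return z


# The example of FIG. 1 with 4-bit words: 3B x CA, one hexadecimal digit per word.
print([hex(d) for d in multiply_words([0xB, 0x3], [0xA, 0xC], w=4)])
# ['0xe', '0x8', '0xe', '0x2']  (that is, 2E8E read from the most significant word)
```

The inner loop runs max_x times for each of the max_y digits of Y, which gives the max_x times max_y single-word multiplications mentioned above.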
The prior art computerized multiplication algorithm provides multiplication of large numbers that is based on unsigned multiplication operations. In other words, the input vectors X and Y, as well as the result vector Z, all contain non-negative numbers of W bits per digit (there is no sign bit).
However, several Digital Signal Processors (DSPs) known in the art have arithmetic support for signed operations only. An example of a prior art DSP that does not support unsigned arithmetic (i.e. is limited to signed arithmetic only) is the ZSP200 DSP, available from LSI Logic Corporation.
In such types of DSPs, the Most Significant Bit (MSB) in each word is always the sign bit. Using the prior art computerized multiplication algorithm previously described as is, while performing signed multiplication operations, generates an incorrect result. For example, if the word size is eight bits (W=8), the result of an unsigned multiplication operation of the numbers 0xFF×0xFF is 0xFE01 (255×255 is 65,025). If only signed operations are supported, then the bit pattern 0xFF (255 when read as unsigned) stands for −1. The multiplication operation of the numbers −1×−1 has a result of +1, which is different from the result that was generated using an unsigned operation.
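The discrepancy in this example can be reproduced with a short Python sketch. The helper as_signed is hypothetical and assumes a two's-complement interpretation of the W-bit word, which is consistent with 0xFF standing for −1.

```python
def as_signed(word, w=8):
    """Interpret a W-bit pattern as a two's-complement signed integer."""
    return word - (1 << w) if word & (1 << (w - 1)) else word


W = 8
unsigned_product = 0xFF * 0xFF                            # 255 x 255
signed_product = as_signed(0xFF, W) * as_signed(0xFF, W)  # (-1) x (-1)

print(hex(unsigned_product))                     # 0xfe01
print(hex(signed_product & ((1 << 2 * W) - 1)))  # 0x1
```

The same input bit patterns therefore yield 0xFE01 in the double-width result with unsigned multiplication and 0x0001 with signed multiplication.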
For processors known in the art that are capable of performing signed operations only, compilers solve the problem described above by performing a sequence of operations that translates the signed multiplication result into an unsigned result.
A conversion process known in the art is typically performed as follows: