This application claims priority from Canadian Patent Application Serial No. 2,291,596, filed Nov. 30, 1999.
This invention relates to the field of arithmetic circuits and more particularly to a system and process for multiplying a pair of numbers using a unique sequence of operations in a microprocessor.
As is well known, the multiplication of two numbers may be laboriously performed by repeatedly adding the multiplicand, the number of additions being given by the multiplier. One improvement on this method is to add a list of shifted multiplicands, each of which is computed according to the digits of the multiplier. In the case of a binary representation of a number, a typical multiplication algorithm multiplies two N-bit words and produces a 2N-bit product by simply adding each shifted multiplicand, as it is created, to a partial product. It may be noted that when an N-bit number is multiplied by an M-bit number in a computer, the resulting product will occupy at most N+M bits.
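The shifted-multiplicand approach described above can be sketched as follows. This is only an illustrative sketch for non-negative integers; the function name `shift_and_add_multiply` and its parameter names are assumptions made for this example and do not come from the original text.

```python
def shift_and_add_multiply(multiplicand: int, multiplier: int) -> int:
    """Multiply two non-negative integers by adding shifted multiplicands.

    Hypothetical helper for illustration only: one shifted copy of the
    multiplicand is added to the partial product for every 1 bit of the
    multiplier.
    """
    if multiplicand < 0 or multiplier < 0:
        raise ValueError("this sketch handles non-negative integers only")

    partial_product = 0
    shift = 0
    while multiplier:
        if multiplier & 1:
            # The current multiplier bit is 1: add the multiplicand,
            # shifted left to this bit position, to the partial product.
            partial_product += multiplicand << shift
        multiplier >>= 1  # move on to the next multiplier bit
        shift += 1
    return partial_product


product = shift_and_add_multiply(13, 11)
# A 4-bit number times a 4-bit number needs at most 4 + 4 = 8 bits.
assert product.bit_length() <= 8
```

Each loop iteration handles one binary digit of the multiplier, so the number of additions is bounded by the multiplier width rather than by its value.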
Computer microprocessors include parallel multiplier circuits so that multiplication of multi-digit numbers can be performed very quickly. This is true as long as the number of digits or the word width of the numbers to be multiplied does not exceed the number of bits that can be processed in parallel by a multiplier circuit.
In many applications, such as data encryption, the encryption keys are generally on the order of 1024 bits wide. In order to multiply numbers of this magnitude or larger, the binary representation of each number is subdivided into successive segments, each having an equal number of binary digits, and the segments are processed in succession. The successive results are then concatenated or combined to produce the final result. While this method requires additional clock cycles and control circuitry, the complexity of the multiplier circuit remains the same.
While much effort has been expended in developing efficient algorithms for multiplying two arbitrary numbers, an increasing number of applications now require frequent squaring operations to be performed. For example, cryptographic processors normally perform multiple squaring operations during the encryption or decryption process. A few techniques have been proposed to efficiently square large numbers using fixed-width multiplier circuits.
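The segment-wise processing described above might be sketched as follows. The 32-bit segment width, the names `split_into_segments` and `segmented_multiply`, and the use of Python's arbitrary-precision integer as the accumulator are all assumptions made for illustration; a hardware implementation would combine the partial results with carries rather than with one wide addition.

```python
def split_into_segments(value: int, word_bits: int) -> list[int]:
    """Split a non-negative integer into segments of word_bits bits each,
    least significant segment first (hypothetical helper)."""
    mask = (1 << word_bits) - 1
    segments = []
    while value:
        segments.append(value & mask)
        value >>= word_bits
    return segments or [0]


def segmented_multiply(a: int, b: int, word_bits: int = 32) -> int:
    """Multiply two wide non-negative integers using only products of
    word_bits-wide segments, then combine the successive results."""
    a_segments = split_into_segments(a, word_bits)
    b_segments = split_into_segments(b, word_bits)

    result = 0
    for i, a_seg in enumerate(a_segments):
        for j, b_seg in enumerate(b_segments):
            # Each segment product fits a fixed-width multiplier; its
            # position in the result is set by the segment indices.
            result += (a_seg * b_seg) << (word_bits * (i + j))
    return result


# Two operands of roughly 1024 bits, built from arbitrary example values.
a = (1 << 1023) + 12345
b = (1 << 1000) + 987
assert segmented_multiply(a, b) == a * b
```

With operands of n segments each, this general scheme performs n × n segment multiplications, which is the cost that a dedicated squaring method seeks to reduce.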
For example, in U.S. Pat. No. 5,195,052, which issued Mar. 16, 1993, there is described a method and corresponding circuit for performing exponentiation. There is no description in the subject patent of a specific or optimized method for squaring numbers. The patent treats squaring as another example of a general multiplication operation.
On the other hand, in U.S. Pat. No. 5,724,280 there is described an accelerated Booth multiplier using interleaved operand loading. The patent describes an architecture for multiplying large word length operands and, more specifically, an apparatus which implements the Booth multiplication algorithm in a faster manner than currently used multipliers by interleaving the loading of the operands with the re-coding and partial product accumulation operations.
The patent describes an efficient method for squaring the sum of two numbers, (A+B)². The multiplier performs squaring operations by shifting the first operand value (A) by one bit to double that value (2A) prior to multiplying by a second operand (B) to form the product (2AB). This term is then accumulated with A² and B². Accordingly, the multiplication scheme of this patent is restricted to a specific sequence of operations in order to achieve a squaring of a sum of two numbers. The patent does not discuss or suggest how this is applied to squaring large numbers that exceed the operand register size of the multiplier.
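The sequence of operations attributed to that patent rests on the identity (A+B)² = A² + 2AB + B². A minimal sketch of the sequence follows; the function name `square_of_sum` is an assumption for this example, and the code illustrates only the arithmetic, not the circuit of the cited patent.

```python
def square_of_sum(a: int, b: int) -> int:
    """Compute (a + b)**2 as a*a + (2a)*b + b*b for non-negative integers
    (hypothetical helper for illustration)."""
    doubled_a = a << 1           # shift A left by one bit to obtain 2A
    cross_term = doubled_a * b   # multiply by B to form the product 2AB
    return a * a + cross_term + b * b  # accumulate with A^2 and B^2


assert square_of_sum(5, 7) == (5 + 7) ** 2
```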
Accordingly, there is a need for a system and method for optimizing a multiplication of large number operands in a computer processor.
In accordance with this invention there is provided a method for computing an intermediate result in squaring a number using a multiplier circuit of predetermined operand size, the method comprising the steps of:
(a) representing a number to be squared as a vector of binary digits;
(b) grouping the vector into successive segments each having a length of the predetermined operand size;
(c) multiplying a first segment value by a second segment value to generate a first product value;
(d) halving a second product value to generate a halved second product value;
(e) accumulating the first product value with the halved second product value to generate an accumulated value; and
(f) doubling said accumulated value to generate said intermediate result.
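Steps (a) to (f) may be illustrated by the following sketch, which rests on several assumptions not stated above. The "second product value" is taken here to be the upper part of the square of the low segment, the number is assumed to occupy two segments, and the bit discarded by the halving is saved and restored separately so that the final square is exact. The names `intermediate_result` and `square_two_segments` are hypothetical.

```python
def intermediate_result(first_segment: int, second_segment: int,
                        second_product: int) -> tuple[int, int]:
    """Follow steps (c) to (f); also return the bit lost by halving.

    Saving the dropped bit is an assumption of this sketch; it is not
    part of the steps listed above.
    """
    first_product = first_segment * second_segment  # step (c)
    dropped_bit = second_product & 1                # assumption: keep lost bit
    halved = second_product >> 1                    # step (d)
    accumulated = first_product + halved            # step (e)
    return accumulated << 1, dropped_bit            # step (f)


def square_two_segments(x: int, word_bits: int) -> int:
    """Square a non-negative number held in two word_bits-wide segments."""
    mask = (1 << word_bits) - 1
    a0, a1 = x & mask, x >> word_bits               # steps (a) and (b)

    low_square = a0 * a0
    carry = low_square >> word_bits  # assumed "second product value"

    doubled, dropped_bit = intermediate_result(a0, a1, carry)
    middle = doubled + dropped_bit   # equals 2*a0*a1 + carry

    return (((a1 * a1) << (2 * word_bits))
            + (middle << word_bits)
            + (low_square & mask))


assert square_two_segments(0xDEADBEEF, 16) == 0xDEADBEEF ** 2
```

In this sketch the cross product of the two segments is computed once and doubled, rather than being computed twice as in a general multiplication. Halving the value that must enter the result only once lets a single doubling be applied to the whole accumulated sum.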