Multiplication hardware is usually adapted to carry out natural multiplication (the normal arithmetic one learns in grade school), but on binary numbers. In natural multiplication two operands A and B are multiplied together to form a product C = A·B, where A, B and C are represented by binary digits $a_i$, $b_j$ and $c_k$ equal to 0 or 1:

$$A = (a_{n-1}, \ldots, a_1, a_0) = \mathrm{SUM}_i(a_i \cdot 2^i);$$
$$B = (b_{n-1}, \ldots, b_1, b_0) = \mathrm{SUM}_j(b_j \cdot 2^j);$$
$$C = (c_{2n-1}, \ldots, c_1, c_0) = \mathrm{SUM}_k(c_k \cdot 2^k).$$

Here, the indices i, j and k represent the bit significance or “weight” of the particular digit. (Similar number representations, such as twos-complement or ones-complement, are commonly used to represent negative integers, as well as the mantissa of real numbers. Multiplication using these other number representations is likewise similar, with appropriate modifications.)
In parallel multiplier architectures, the product is typically formed as a sum of cross-products, also called partial products. The partial product of two operand bits is equivalent to a logic AND operation and can be carried out in circuit hardware by using AND gates. The sum of two partial product bits of equal weight produces a sum term of the same weight and a carry term of next higher weight, where the sum term is equivalent to a logic XOR operation and the carry term is equivalent to a logic AND operation:

$$x + y = (\mathrm{carry}, \mathrm{sum}) = (\mathrm{AND}(x, y), \mathrm{XOR}(x, y)).$$

Typically, hardware adders come in two main types: full-adders, which add together three input bits, and half-adders, which add together two input bits. The input bits might be partial product bits, sum terms output from another adder, or carry terms. All of the input bits of whatever origin, including “carry” input bits, have exactly the same logic contribution to the adder outputs and are normally treated as being equivalent with respect to the result. (Note, however, that standard cell implementations of adder circuits often give carry inputs privileged timing in the adder circuit's construction in order to minimize propagation delays and excessive switching in the overall adder array architecture.) Both types of adders produce a sum term and a carry term as outputs.
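The adder behavior described above can be sketched in a few lines of Python. This is only an illustrative model, not a circuit description; the function names `half_adder` and `full_adder` are hypothetical, and building the full-adder from two half-adders plus an OR gate is one common construction assumed here for the example.

```python
def half_adder(x, y):
    """Add two bits; return (carry, sum) = (AND(x, y), XOR(x, y))."""
    return x & y, x ^ y


def full_adder(x, y, z):
    """Add three bits; return (carry, sum).

    Assumed construction: two half-adders and an OR of their carries.
    The three inputs are logically interchangeable.
    """
    carry_1, partial_sum = half_adder(x, y)
    carry_2, total_sum = half_adder(partial_sum, z)
    return carry_1 | carry_2, total_sum


# Exhaustive check: the carry has twice the weight of the sum term.
for x in (0, 1):
    for y in (0, 1):
        carry, total = half_adder(x, y)
        assert 2 * carry + total == x + y
        for z in (0, 1):
            carry, total = full_adder(x, y, z)
            assert 2 * carry + total == x + y + z
```

Permuting the arguments of `full_adder` leaves its outputs unchanged, which reflects the statement that all input bits contribute equally to the result.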
In natural multiplication, the carry terms are propagated and added to the sum terms of next higher weight. Thus, the natural product C is:
$$C = \mathrm{SUM}_{i,j}\left(a_i \cdot b_j \cdot 2^{i+j}\right) = \mathrm{SUM}_k\left(\left(\mathrm{SUM}_{i+j=k}\left(\mathrm{AND}(a_i, b_j)\right)\right) \cdot 2^k\right).$$

Parallel natural multiplier circuits come in a variety of architectures, differing mainly in the manner of arranging the partial product adder arrays.
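As a rough illustration of the natural product formula, the following Python sketch groups the partial products AND(a_i, b_j) by weight k = i + j and then sums the columns with their weights. The function name `natural_product` and the explicit operand width `n` are assumptions made for this example.

```python
def natural_product(a, b, n):
    """Form C as SUM_k((SUM_{i+j=k} AND(a_i, b_j)) * 2**k) for n-bit operands."""
    column_counts = [0] * (2 * n - 1)
    for i in range(n):
        for j in range(n):
            # Partial product of weight i + j.
            column_counts[i + j] += ((a >> i) & 1) & ((b >> j) & 1)
    # Integer addition of the weighted column sums propagates the carries.
    return sum(count << k for k, count in enumerate(column_counts))


assert natural_product(13, 11, 4) == 13 * 11
```

Here the carry propagation is hidden inside Python's integer addition; in hardware it is the job of the adder arrays discussed in this section.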
The architectures of Wallace (from “A Suggestion for a Fast Multiplier”, IEEE Trans. on Electronic Computers, vol. EC-13, pp. 14-17, February 1964) and Dadda (from a paper presented at the Colloque sur l'Algèbre de Boole, Grenoble France, January 1965) are similar. The basic structure disclosed by L. Dadda is seen in FIG. 1. The array of partial products is represented as dots aligned in zone A in vertical columns according to their weights. The number of partial products of a given weight can vary from 1 to n for two n-bit operands. Summing the partial products of a given weight is carried out by binary counters, represented in the figure by diagonal lines. The term “binary counter” is used by Dadda and elsewhere in this document in the sense that, for a given number of input lines, it produces a binary output representing the total number or “count” of ones on those inputs. (This is different from the usual sequential counter, which produces a series of incremented outputs over time.) The summing of the partial products is divided into two main steps: a first step (subdivided into several cascaded stages) reduces the partial products to a set of two numbers, and a second step comprises a single carry-propagating adder stage. The cascaded stages of the first step are shown in the figure as zones B through D. The size of the counter depends on the total number of terms of a given weight which are to be counted. For example, in zone B, column 5, there are 5 partial products of weight $2^4$ to be added (counted), which together form a 3-bit sum of weights $2^6$, $2^5$, $2^4$, respectively. Thus, there are several carry terms of different weights which are propagated to the next counting stage or zone. Zones C and D apply the same principle to the outputs of the preceding zone. The output of the zone D counters is made up of two lines only. These are handled with fast adders in the second main step (in zone E) to give the natural product.
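The two-step summation described above can be modeled loosely in Python. The sketch below is a simplification and does not reproduce FIG. 1: it assumes every column is fed to a single counter at each stage, with an output width fixed by the number of inputs. The names `count_ones` and `counter_tree_multiply` are hypothetical.

```python
from collections import defaultdict


def count_ones(bits):
    """Parallel 'binary counter': the count of ones as a list of bits, LSB first."""
    width = len(bits).bit_length()  # e.g. 5 inputs give a 3-bit count
    total = sum(bits)
    return [(total >> shift) & 1 for shift in range(width)]


def counter_tree_multiply(a, b, n):
    """Multiply two n-bit operands; return (product, number_of_counter_stages)."""
    # Zone A: partial products arranged in columns by weight.
    columns = defaultdict(list)
    for i in range(n):
        for j in range(n):
            columns[i + j].append((a >> i) & (b >> j) & 1)

    # First step: cascaded counter stages until every column has at most two bits.
    stages = 0
    while max(len(bits) for bits in columns.values()) > 2:
        reduced = defaultdict(list)
        for weight, bits in columns.items():
            for offset, bit in enumerate(count_ones(bits)):
                reduced[weight + offset].append(bit)
        columns = reduced
        stages += 1

    # Second step: one carry-propagating addition of the two remaining numbers.
    row_0 = sum(bits[0] << weight for weight, bits in columns.items())
    row_1 = sum(bits[1] << weight for weight, bits in columns.items() if len(bits) > 1)
    return row_0 + row_1, stages


product, stages = counter_tree_multiply(13, 11, 4)
assert product == 13 * 11
```

Each stage shrinks a column of h bits to roughly log2(h) bits, which is why only a few cascaded stages are needed before the final carry-propagating addition.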
Other parallel natural multipliers may use various kinds of tree structures of full-adders (or even more complex adder circuits) to rapidly reduce the partial products to a final product.
Other types of algebra have their own form of multiplication. One type commonly used in generating error-correcting codes, and more recently in elliptic curve cryptography systems (see, for example, U.S. Pat. No. 6,252,959), generates multiplication products in a finite (Galois) field. Different fields may be used, but the most common applications employ either prime number fields GF(p) or binary fields GF($2^N$). Error-correcting code applications, such as Reed-Solomon code generation, typically operate repeatedly on small words, e.g. of 8 bits, and thus might use multiplication on GF(256). Elliptic curve applications typically operate on much larger blocks with word widths of 160 bits or more. Often in either kind of application, using a polynomial representation, the product is defined as a polynomial product, subsequently reduced by residue division by an appropriate irreducible polynomial. Dedicated hardware architectures have been constructed to implement finite field multiplication.
Over GF($2^N$), a number (field element) can be represented either as an n-tuple (matrix representation) or as a polynomial with n coefficients (polynomial representation), where n = N:
$$A = (a_{n-1}, \ldots, a_1, a_0) = a_{n-1}x^{n-1} + \ldots + a_1 x^1 + a_0 x^0 = \mathrm{SUM}_i\left(a_i x^i\right)$$

The $a_i$ are members of GF(2), i.e. they can be 0 or 1. The additive and multiplicative laws over GF(2) are respectively the XOR and AND logic operations. The addition of two GF($2^N$) numbers is defined as polynomial addition, that is, addition of the coefficients of identical degree or weight:

$$C = A + B = \mathrm{SUM}_i\left(\mathrm{XOR}(a_i, b_i)\, x^i\right)$$

The multiplication of two GF($2^N$) numbers is defined as polynomial multiplication, modulo a specific irreducible polynomial P:
$$C = A \cdot B = (A * B) \bmod P = \mathrm{SUM}_k\left(\mathrm{XOR}_{i+j=k}\left(\mathrm{AND}(a_i, b_j)\right) x^k\right) \bmod P,$$

where k runs from 0 to 2N−2 in the unreduced sum, and the reduced result has coefficients for k from 0 to N−1. For notation, A*B represents the polynomial product (not reduced modulo P), whereas A·B represents the product of two GF($2^N$) numbers. A*B is a polynomial of degree 2N−2 and thus is not a member of GF($2^N$). A·B is a member of GF($2^N$).
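The definitions above can be illustrated with a short Python sketch in which a polynomial over GF(2) is stored as an integer whose bit i holds the coefficient of degree i. The function names are hypothetical. The reduction polynomial used in the check, x^8 + x^4 + x^3 + x + 1 (0x11B, the one used by AES), is an assumption chosen only to have a concrete GF(256) case; it does not come from this document.

```python
def gf_add(a, b):
    """A + B: coefficient-wise XOR, with no carries between degrees."""
    return a ^ b


def poly_multiply(a, b):
    """A*B: polynomial product over GF(2), not reduced modulo P."""
    product = 0
    shift = 0
    while b >> shift:
        if (b >> shift) & 1:
            product ^= a << shift  # XOR in a shifted copy of A for each set bit of B
        shift += 1
    return product


def poly_mod(c, p):
    """Remainder of c divided by p, both polynomials over GF(2)."""
    degree_p = p.bit_length() - 1
    while c.bit_length() - 1 >= degree_p:
        c ^= p << (c.bit_length() - 1 - degree_p)  # cancel the leading term
    return c


def gf_multiply(a, b, p):
    """A·B: polynomial product reduced modulo the irreducible polynomial p."""
    return poly_mod(poly_multiply(a, b), p)


P_EXAMPLE = 0x11B  # assumed example: x^8 + x^4 + x^3 + x + 1

assert gf_add(0x57, 0x83) == 0xD4
assert poly_multiply(0x57, 0x83).bit_length() - 1 <= 2 * 8 - 2  # degree at most 2N-2
assert gf_multiply(0x57, 0x83, P_EXAMPLE) == 0xC1
```

Keeping `poly_multiply` and `poly_mod` separate mirrors the distinction drawn above between A*B and A·B.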
Comparing polynomial addition and multiplication having coefficients in GF(2) to natural addition and multiplication, we observe that $a_k x^k$ (polynomial term of degree k) and $a_k 2^k$ (natural number bit of weight k) play a similar role in addition and multiplication, but with some differences. Polynomial addition with coefficients in the finite field GF(2) is similar to natural addition, except that the sum of terms of identical degree does not provide any carry for adjacent terms in the case of polynomial addition, while the natural addition of identical weight terms does provide a carry to the next higher weight. Polynomial multiplication with coefficients in the finite field GF(2) is also similar to natural multiplication, except that the sum of partial products of identical degree does not generate carries for the adjacent degrees in the polynomial multiplication case, while the natural sum of partial products of the same weight does provide carries to the next higher weights. Finally, we point out that the least significant bit of the natural sum of n bits is the XOR of these bits, just as in the polynomial case.
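The comparison above can be made concrete with a small Python sketch that builds the same columns of partial products for both kinds of multiplication and differs only in how each column is summed. The helper names `column_bits`, `natural_multiply` and `polynomial_multiply` are hypothetical.

```python
from functools import reduce
from operator import xor


def column_bits(a, b, n, k):
    """Partial products AND(a_i, b_j) with i + j == k, for n-bit operands."""
    return [((a >> i) & 1) & ((b >> (k - i)) & 1)
            for i in range(n) if 0 <= k - i < n]


def natural_multiply(a, b, n):
    """Columns are summed as integers, so carries reach higher weights."""
    return sum(sum(column_bits(a, b, n, k)) << k for k in range(2 * n - 1))


def polynomial_multiply(a, b, n):
    """Columns are summed with XOR, so nothing is carried to other degrees."""
    return sum(reduce(xor, column_bits(a, b, n, k), 0) << k
               for k in range(2 * n - 1))


# (x + 1) * (x + 1) = x^2 + 1 over GF(2), whereas 3 * 3 = 9 in natural arithmetic.
assert natural_multiply(0b11, 0b11, 2) == 0b1001
assert polynomial_multiply(0b11, 0b11, 2) == 0b101

# The least significant bit of a natural column sum equals the XOR of its bits.
for bits in ([1, 1, 1], [1, 0, 1, 1, 0], [0, 0], [1]):
    assert sum(bits) & 1 == reduce(xor, bits, 0)
```

In the 3 × 3 case the middle column holds two ones: the natural sum turns them into a carry, while the XOR sum simply cancels them.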
In U.S. Pat. No. 4,918,638, Matsumoto et al. describe a finite field multiplier for obtaining a product in GF($2^4$) for use in generating error correcting codes. After performing binary multiplication, a separate polynomial generator block reduces the product with division by a generator polynomial $g(x) = x^4 + x + 1$. FIGS. 5 and 9 of that patent show binary multiplier arrays for performing the finite field multiplication. AND gates are used to form the partial products, while XOR gates are used to perform bit addition on the partial products of the same weight. The multiplier is not constructed to perform natural multiplication, only GF($2^4$) finite field multiplication.
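A multiply-then-reduce arrangement of this kind can be modeled in Python as follows. This is only a behavioral sketch under the assumption of 4-bit operands and the generator polynomial g(x) = x^4 + x + 1 mentioned above; it is not a description of the circuits in the cited patent, and the function names are hypothetical.

```python
G = 0b10011  # g(x) = x^4 + x + 1


def multiplier_array(a, b):
    """AND gates form partial products; XOR adds those of the same weight."""
    columns = [0] * 7  # degrees 0 through 6 for 4-bit operands
    for i in range(4):
        for j in range(4):
            columns[i + j] ^= (a >> i) & (b >> j) & 1
    return sum(bit << k for k, bit in enumerate(columns))


def reduce_by_generator(c):
    """Separate reduction block: remainder of c divided by g(x)."""
    for degree in (6, 5, 4):
        if (c >> degree) & 1:
            c ^= G << (degree - 4)  # cancel the term of this degree
    return c


def gf16_multiply(a, b):
    return reduce_by_generator(multiplier_array(a, b))


# x^3 * x = x^4, which reduces to x + 1 because x^4 + x + 1 = 0 in the field.
assert gf16_multiply(0b1000, 0b0010) == 0b0011

# Every nonzero element has a multiplicative inverse, as expected in a field.
for a in range(1, 16):
    assert any(gf16_multiply(a, b) == 1 for b in range(1, 16))
```

Because the columns are combined with XOR only, `multiplier_array` cannot produce a natural product, which matches the limitation noted above.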
An object of the present invention is to provide parallel multiplier architectures that are capable of delivering both a natural multiplication product and also a polynomial multiplication product with coefficients over GF(2), thus helping to accomplish finite field multiplication in GF($2^N$) for any values of $N \geq 1$.