Finite fields play an important role in digital communication systems, for example in cryptographic schemes and error correction codes. Compared with ordinary number systems, a finite field has special properties, so its key operations, finite field addition and multiplication, are typically implemented as dedicated hardware. Since finite field addition can be implemented directly with XOR gates at low hardware and time complexity, the bottleneck is usually the finite field multiplier.
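As a small illustration of why addition is cheap, the sketch below models GF(2^m) addition as a bitwise XOR. It assumes a polynomial-basis representation in which bit i of an integer holds the coefficient of x^i; the function name gf_add is hypothetical and used only for illustration.

```python
def gf_add(a: int, b: int) -> int:
    """Add two GF(2^m) elements stored as integers.

    Assumption: polynomial basis, bit i = coefficient of x^i.
    Coefficients are added modulo 2, which is a bitwise XOR.
    """
    return a ^ b


# Example in GF(2^4): (x^3 + x + 1) + (x^3 + x^2) = x^2 + x + 1
print(bin(gf_add(0b1011, 0b1100)))  # 0b111
```

No carry propagates between bit positions, which is why m independent XOR gates suffice in hardware.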
There are three kinds of finite field multiplier architecture: serial, fully-parallel and partially-parallel. The serial architecture has the lowest hardware complexity but takes multiple clock cycles to complete one multiplication. However, since peripheral hardware now operates faster than before, and not every multiplication needs a very large number of iterative calculation steps, the serial architecture remains popular in some applications.
In some applications, the key operation, a Multiply Accumulate (MAC), is a combination of several finite field additions and multiplications, such as E=A×B+C×D, where A, B, C, D and E are sets of elements in the finite field. In detail, A includes m elements, a0, a1, a2, . . . and am−1. Similarly, B includes b0, b1, b2, . . . and bm−1, C includes c0, c1, c2, . . . and cm−1, D includes d0, d1, d2, . . . and dm−1, and E includes e0, e1, e2, . . . and em−1. In this case, conventionally, two finite field multiplications and one finite field addition are required, as shown in FIG. 1. The finite field multiplier on the left computes A×B while the one on the right computes C×D. Each multiplier has m−1 cells A and one cell B. Each cell A and cell B contains an AND gate, an XOR gate and a register. The only difference is that cell B does not receive data fed back from itself. The connections of the dashed arrows are defined by the primitive polynomial of the adopted GF(2^m). In addition, m XOR gates form a finite field adder that computes A×B+C×D.
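The conventional arrangement described above can be sketched behaviourally as two serial multiplications followed by an XOR. This is not a gate-level model of FIG. 1: it assumes an MSB-first bit-serial scheme, m = 4, and the primitive polynomial x^4 + x + 1, none of which are specified in the text, and the function names are hypothetical.

```python
M = 4             # field is GF(2^4) (assumed for illustration)
POLY = 0b10011    # x^4 + x + 1, a primitive polynomial for GF(2^4) (assumed)


def gf_mul_serial(a: int, b: int, m: int = M, poly: int = POLY) -> int:
    """Bit-serial multiply in GF(2^m); one bit of b per simulated clock cycle."""
    acc = 0
    for i in reversed(range(m)):      # MSB of b first, m iterations in total
        acc <<= 1                     # multiply the running result by x
        if (acc >> m) & 1:            # degree reached m: reduce by the polynomial
            acc ^= poly
        if (b >> i) & 1:              # AND of a with the current bit of b
            acc ^= a                  # XOR into the register contents
    return acc


def gf_mac_conventional(a: int, b: int, c: int, d: int) -> int:
    """E = A*B + C*D using two separate multiplications and one XOR adder."""
    return gf_mul_serial(a, b) ^ gf_mul_serial(c, d)


# x * x^3 = x^4 = x + 1 (3), (x + 1)^2 = x^2 + 1 (5), and 3 XOR 5 = 6
print(gf_mac_conventional(0b0010, 0b1000, 0b0011, 0b0011))  # 6
```

Each call to gf_mul_serial takes m loop iterations, mirroring the m clock cycles a serial multiplier needs for one product.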
In this design, the area cost is two finite field multipliers and one finite field adder. In total, the MAC includes 2m AND gates, 3m XOR gates and 2m registers. The critical path of this design is one multiplier and one XOR gate. U.S. Pat. No. 7,082,452, titled “Galois field multiply/multiply-add multiply accumulate”, provides a parallel architecture that achieves a fast calculation speed for the same operation. However, the hardware complexity of the '452 patent is too high for it to be adopted in some area-efficient designs.
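The element counts quoted above follow from the cell structure of the conventional design: each multiplier has m cells (m−1 cells A plus one cell B), each cell holds one AND gate, one XOR gate and one register, and the adder contributes m further XOR gates. A minimal tally, with a hypothetical function name, might look like this:

```python
def conventional_mac_cost(m: int) -> dict:
    """Count the elements of the conventional MAC for GF(2^m)."""
    cells_per_multiplier = (m - 1) + 1     # m-1 cells A plus one cell B
    multipliers = 2                        # one for A*B, one for C*D
    cells = multipliers * cells_per_multiplier
    return {
        "and_gates": cells,                # one AND gate per cell -> 2m
        "xor_gates": cells + m,            # one XOR per cell plus m-XOR adder -> 3m
        "registers": cells,                # one register per cell -> 2m
    }


print(conventional_mac_cost(8))
# {'and_gates': 16, 'xor_gates': 24, 'registers': 16}
```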
Here, the inventor discloses a serial MAC architecture with much lower hardware complexity and performance similar to that of the conventional MAC shown in FIG. 1. Namely, fewer elements, such as XOR gates and registers, are required to perform the same operation compared with the conventional MAC. Therefore, the present invention has the advantage of lower area cost.