Index Terms: Coding theory, finite field, power-sum, cellular-array circuit, VLSI architecture.
References:
[1] T. R. N. Rao and E. Fujiwara, Error-Control Coding for Computer Systems. Prentice-Hall, Englewood Cliffs, N.J., 1989.
[2] R. E. Blahut, Theory and Practice of Error Control Codes. Addison-Wesley, Reading, Mass., 1983.
[3] W. W. Peterson and E. J. Weldon, Jr., Error-Correcting Codes, 2nd ed. The MIT Press, Cambridge, Mass., 1972.
[4] S. Lin and D. J. Costello, Jr., Error Control Coding. Prentice-Hall, Englewood Cliffs, N.J., 1983.
[5] S. W. Wei and C. H. Wei, "High speed decoder of Reed-Solomon codes," IEEE Trans. Commun., vol. COM-41, no. 11, pp. 1588-1593, November 1993.
[6] S. R. Whitaker, J. A. Canaris, and K. B. Cameron, "Reed Solomon VLSI codec for advanced television," IEEE Trans. Circuits and Systems for Video Technology, vol. 1, no. 2, pp. 230-236, June 1991.
[7] S. W. Wei and C. H. Wei, "A high-speed real-time binary BCH decoder," IEEE Trans. Circuits and Systems for Video Technology, vol. 3, no. 2, pp. 138-147, April 1993.
[8] E. R. Berlekamp, "Bit-serial Reed-Solomon encoders," IEEE Trans. Inform. Theory, vol. IT-28, pp. 869-874, 1982.
[9] C. C. Wang, T. K. Truong, H. M. Shao, L. J. Deutsch, J. K. Omura, and I. S. Reed, "VLSI architectures for computing multiplications and inverses in $GF(2^m)$," IEEE Trans. Comput., vol. C-34, pp. 709-716, 1985.
[10] C.-S. Yeh, I. S. Reed, and T. K. Truong, "Systolic multipliers for finite fields $GF(2^m)$," IEEE Trans. Comput., vol. C-33, pp. 357-360, 1984.
[11] B. A. Laws, Jr., and C. K. Rushforth, "A cellular-array multiplier for $GF(2^m)$," IEEE Trans. Comput., vol. C-20, pp. 1573-1578, 1971.
[12] H. Okano and H. Imai, "A construction method of high-speed decoders using ROM's for Bose-Chaudhuri-Hocquenghem and Reed-Solomon codes," IEEE Trans. Comput., vol. C-36, pp. 1165-1171, 1987.
[13] K. Araki, I. Fujita, and M. Morisue, "Fast inverter over finite field based on Euclid's algorithm," Trans. IEICE, vol. E-72, pp. 1230-1234, November 1989.
[14] P. A. Scott, S. J. Simmons, S. E. Tavares, and L. E. Peppard, "Architectures for exponentiation in $GF(2^m)$," IEEE J. Selected Areas in Commun., vol. 6, no. 3, pp. 578-586, April 1988.
[15] C. C. Wang and D. Pei, "A VLSI design for computing exponentiations in $GF(2^m)$ and its application to generate pseudorandom number sequences," IEEE Trans. Comput., vol. C-39, no. 2, pp. 258-262, February 1990.
[16] A. M. Odlyzko, "Discrete logarithms in finite fields and their cryptographic significance," in Adv. Cryptol., Proc. Eurocrypt'84, pp. 224-314, Paris, France, April 1984.
Arithmetic operations over the finite field $GF(2^m)$ have recently attracted significant attention because of their important practical applications in computers and communications, such as forward error-correction codes (recommended references [1]-[4]). To configure an error-correcting decoder with a high decoding speed and low circuit complexity, well-designed basic arithmetic circuits in association with a powerful decoding algorithm are required. Improvements in the design of finite field arithmetic circuits that yield lower circuit complexity, shorter computation delay, and higher computation speed are therefore an extensive research topic. Addition, multiplication, exponentiation, multiplicative inversion, and division are the most important arithmetic operations for error-correcting codes. For example, the most popular decoding procedure for a quadruple-error-correcting binary primitive BCH code consists of three main steps (recommended references [2]-[4]): (i) calculating the syndrome values $S_i$, $i = 1, 3, 5, 7$, from the received word; (ii) determining the error-locator polynomial $\sigma(x) = x^4 + \sigma_1 x^3 + \sigma_2 x^2 + \sigma_3 x + \sigma_4$ from the syndrome values, where $\sigma_1 = S_1$, $\sigma_2 = \{S_1[S_7 + (S_1)^7] + S_3[S_5 + (S_1)^5]\} / \{S_3[S_3 + (S_1)^3] + S_1[S_5 + (S_1)^5]\}$, $\sigma_3 = [(S_1)^3 + S_3] + S_1 \sigma_2$, and $\sigma_4 = \{[S_5 + S_3 (S_1)^2] + [S_3 + (S_1)^3] \sigma_2\} / S_1$ (recommended reference [2]); (iii) solving for the roots of $\sigma(x)$, which are the error locators. Determining the coefficients $\sigma_2$, $\sigma_3$, and $\sigma_4$ of the error-locator polynomial in this way may require additions, multiplications, exponentiations, and inversions (or divisions).
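Step (ii) can be checked numerically with a small software model. The following is a minimal sketch, not a circuit description: it assumes the field $GF(2^4)$ with the polynomial $x^4 + x + 1$, uses the denominator $S_3[S_3 + (S_1)^3] + S_1[S_5 + (S_1)^5]$ that follows from Newton's identities, and introduces hypothetical helper names (`gf_mul`, `gf_pow`, `gf_div`, `syndromes`, `locator_coefficients`, `expand_roots`). It computes $\sigma_1$ through $\sigma_4$ from the syndromes and compares them with the coefficients obtained by expanding the product of $(x + X_i)$ over four hypothetical error locators $X_i$.

```python
M = 4            # field degree; GF(2^4) is an assumption made for illustration
POLY = 0b10011   # x^4 + x + 1, assumed field polynomial for the standard basis

def gf_mul(a, b):
    """Shift-and-add product of a and b in GF(2^M), standard basis."""
    r = 0
    for _ in range(M):
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> M:          # degree reached M: reduce by the field polynomial
            a ^= POLY
    return r

def gf_pow(a, n):
    """a**n by repeated multiplication (n >= 0)."""
    r = 1
    for _ in range(n):
        r = gf_mul(r, a)
    return r

def gf_div(a, b):
    """a / b, using b^(2^M - 2) as the inverse of a nonzero b."""
    if b == 0:
        raise ZeroDivisionError("division by zero in GF(2^M)")
    return gf_mul(a, gf_pow(b, (1 << M) - 2))

def syndromes(locators):
    """Power sums S_1, S_3, S_5, S_7 of the error locators."""
    out = []
    for j in (1, 3, 5, 7):
        s = 0
        for x in locators:
            s ^= gf_pow(x, j)
        out.append(s)
    return tuple(out)

def locator_coefficients(s1, s3, s5, s7):
    """sigma_1..sigma_4 from the closed-form expressions of step (ii)."""
    p3 = s3 ^ gf_pow(s1, 3)          # S_3 + S_1^3
    p5 = s5 ^ gf_pow(s1, 5)          # S_5 + S_1^5
    p7 = s7 ^ gf_pow(s1, 7)          # S_7 + S_1^7
    sig2 = gf_div(gf_mul(s1, p7) ^ gf_mul(s3, p5),
                  gf_mul(s3, p3) ^ gf_mul(s1, p5))
    sig3 = p3 ^ gf_mul(s1, sig2)
    sig4 = gf_div(s5 ^ gf_mul(s3, gf_mul(s1, s1)) ^ gf_mul(p3, sig2), s1)
    return s1, sig2, sig3, sig4

def expand_roots(locators):
    """Coefficients of prod (x + X_i), leading 1 dropped, for comparison."""
    coeffs = [1]
    for x in locators:
        nxt = coeffs + [0]
        for i in range(1, len(nxt)):
            nxt[i] ^= gf_mul(coeffs[i - 1], x)
        coeffs = nxt
    return tuple(coeffs[1:])

errors = [1, 2, 4, 8]   # hypothetical locators alpha^0..alpha^3, with alpha = x
print(locator_coefficients(*syndromes(errors)))   # (15, 3, 1, 12)
print(expand_roots(errors))                       # (15, 3, 1, 12)
```

The closed-form expressions divide by $S_1$ and by the denominator of $\sigma_2$, so the sketch raises an error when either is zero; a practical decoder would need to handle those cases separately.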
This example shows that multiplication is one of the most frequently used field arithmetic operations. However, performing some operations, e.g., exponentiation, using ordinary multiplication may be inefficient. For instance, in the above example of a quadruple-error-correcting binary primitive BCH code, calculating $[S_7 + (S_1)^7]$ in $\sigma_2$ requires several ordinary multiplications, but only two $AB^2 + C$ operations give the same result, namely $S_1 [S_1 (S_1)^2 + 0]^2 + S_7$. As this example indicates, the $AB^2 + C$ operation is an efficient tool for implementing such a computation. As will be discussed in two divisional applications, the $AB^2 + C$ operation can also be used to execute exponentiations, inversions, and divisions efficiently. The $AB^2 + C$ operations, exponentiations, inversions, and divisions are also frequently used in decoding other binary BCH and Reed-Solomon (RS) codes (recommended references [5]-[7]).
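The two-step evaluation above can be illustrated in software. The sketch below is a behavioral model only, not the power-sum circuit itself; it assumes $GF(2^4)$ with the polynomial $x^4 + x + 1$, and the names `gf_mul`, `power_sum`, and `s1_pow7_plus_s7` are hypothetical. The first power-sum call forms $(S_1)^3$ and the second forms $S_1[(S_1)^3]^2 + S_7 = (S_1)^7 + S_7$.

```python
M = 4            # field degree; GF(2^4) is an assumption made for illustration
POLY = 0b10011   # x^4 + x + 1, assumed field polynomial for the standard basis

def gf_mul(a, b):
    """Shift-and-add product of a and b in GF(2^M), standard basis."""
    r = 0
    for _ in range(M):
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> M:          # reduce when the degree reaches M
            a ^= POLY
    return r

def power_sum(a, b, c):
    """Behavioral model of the A*B^2 + C operation (addition is XOR)."""
    return gf_mul(a, gf_mul(b, b)) ^ c

def s1_pow7_plus_s7(s1, s7):
    """S_1^7 + S_7 using exactly two power-sum operations."""
    t = power_sum(s1, s1, 0)       # S_1 * S_1^2 + 0 = S_1^3
    return power_sum(s1, t, s7)    # S_1 * (S_1^3)^2 + S_7 = S_1^7 + S_7

print(s1_pow7_plus_s7(2, 0))   # 11, i.e. alpha^7 for alpha = x
```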
Many architectures over $GF(2^m)$ have already been developed on various bases, such as a bit-serial multiplier that uses a dual basis (recommended reference [8]), a multiplicative inverter that uses a normal basis (recommended reference [9]), and a systolic multiplier that uses a standard basis (recommended reference [10]). The finite field operations of the first two types need basis conversion, whereas the third does not. Each type of finite field operation possesses distinct features that make it suitable for specific applications. For decoders used in computers and digital communications, the standard basis is still the most frequently used basis. Therefore, we confine our attention to computations over the standard basis alone.
It is difficult to design a finite field arithmetic circuit that has low circuit complexity while simultaneously maintaining a high computation speed. In general, a trade-off between computation speed and circuit complexity is often necessary. Designing a standard-basis circuit that performs only addition is quite simple: it can be implemented using $m$ exclusive-OR (XOR) gates (recommended references [1]-[2]). The first parallel-in-parallel-out multiplier architecture was the static cellular-array product-sum multiplier presented by Laws in 1971 (recommended reference [11]). That circuit consists of $m^2$ identical cells, each consisting of two 2-input AND gates and one 3-input XOR gate, and it requires $2m$ logic gate delays to perform a multiplication. To improve the computation speed, Yeh presented a parallel-in-parallel-out systolic product-sum multiplier in 1984 (recommended reference [10]); this circuit is composed of $m^2$ identical cells, each consisting of two AND gates, two 2-input XOR gates, and seven latches. For successive operations, the cellular-array multiplier still requires $2m$ gate delays for each multiplication, whereas the systolic multiplier needs only one cell time unit of two gate delays per multiplication, subject to an initial computation delay of $2m$ cell time units. After Yeh's systolic multiplier was presented, it was believed that the cell time unit of two logic gate delays could not be further improved upon (recommended reference [10]).
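The gate counts quoted above can be made concrete with a bit-level software model. The sketch below is one plausible arrangement of $m^2$ cells, each built from two AND operations and one 3-input XOR; it is an assumption for illustration and is not claimed to reproduce the exact wiring of the cited multipliers. It assumes $GF(2^4)$ with $x^4 = x + 1$, and the names `gf_add`, `cell`, and `array_multiply` are hypothetical.

```python
M = 4
F = [1, 1, 0, 0]   # f_0..f_3 with x^4 = f_0 + f_1 x + ... (assumed x^4 + x + 1)

def to_bits(v):
    """Standard-basis coordinates of v, least significant first."""
    return [(v >> j) & 1 for j in range(M)]

def from_bits(bits):
    return sum(bit << j for j, bit in enumerate(bits))

def gf_add(a, b):
    """Addition in GF(2^M): one XOR gate per coordinate, M gates in total."""
    return a ^ b

def cell(p_prev, p_top, f_j, b_i, a_j):
    """One array cell: two AND gates feeding one 3-input XOR gate."""
    return p_prev ^ (p_top & f_j) ^ (b_i & a_j)

def array_multiply(a, b):
    """A*B computed by M rows of M cells (Horner form, b_{M-1} first)."""
    a_bits, b_bits = to_bits(a), to_bits(b)
    p = [0] * M
    for i in reversed(range(M)):      # one row per bit of B
        top = p[M - 1]                # feedback bit that triggers reduction
        p = [cell(p[j - 1] if j else 0, top, F[j], b_bits[i], a_bits[j])
             for j in range(M)]
    return from_bits(p)

print(gf_add(5, 3))            # 6
print(array_multiply(2, 8))    # 3, since x * x^3 = x^4 = x + 1
```

Each row computes $P \leftarrow P \cdot x + b_i A$ modulo the field polynomial, so the $M$ rows together evaluate the product in Horner form.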
In principle, division in a finite field may be performed using a multiplication and a multiplicative inverse, i.e., $A/B = AB^{-1}$, in which $A$ and $B$ are arbitrary elements of $GF(2^m)$ with $B$ nonzero. A multiplicative inverse can be implemented using read-only memory (ROM) (recommended reference [12]), Euclid's algorithm (recommended reference [13]), or a number of consecutive multiplications (recommended reference [9]). Most of the architectures for computing multiplicative inverses have been developed on the normal basis, a major reason being that squaring in the normal basis is only a simple cyclic shift (recommended reference [9]). Computing an exponentiation resembles computing a multiplicative inverse. Exponentiation can also be implemented using ROM or successive multiplications. Several architectures for computing exponentiation in $GF(2^m)$ have been developed on the standard as well as the normal bases (recommended references [14]-[15]).
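Inversion by consecutive multiplications can be sketched as follows. The model relies on the identity $B^{-1} = B^{2^m - 2} = B^2 \cdot B^4 \cdots B^{2^{m-1}}$ for nonzero $B$, and computes exponentiation by square-and-multiply. It is a software illustration over the standard basis, assuming $GF(2^4)$ with $x^4 + x + 1$ and using hypothetical function names; it does not model any of the cited architectures.

```python
M = 4            # field degree; GF(2^4) is an assumption made for illustration
POLY = 0b10011   # x^4 + x + 1, assumed field polynomial for the standard basis

def gf_mul(a, b):
    """Shift-and-add product of a and b in GF(2^M), standard basis."""
    r = 0
    for _ in range(M):
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> M:          # reduce when the degree reaches M
            a ^= POLY
    return r

def gf_inv(b):
    """B^-1 = B^2 * B^4 * ... * B^(2^(M-1)), built from repeated squarings."""
    if b == 0:
        raise ZeroDivisionError("zero has no multiplicative inverse")
    result, sq = 1, b
    for _ in range(M - 1):
        sq = gf_mul(sq, sq)              # next power B^(2^k)
        result = gf_mul(result, sq)      # accumulate the product
    return result

def gf_div(a, b):
    """A / B = A * B^-1."""
    return gf_mul(a, gf_inv(b))

def gf_pow(a, n):
    """a**n by square-and-multiply (n >= 0)."""
    result = 1
    while n:
        if n & 1:
            result = gf_mul(result, a)
        a = gf_mul(a, a)
        n >>= 1
    return result

print(gf_inv(2))      # 9, since x * (x^3 + 1) = x^4 + x = 1
print(gf_pow(2, 4))   # 3, i.e. x^4 = x + 1
```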