1. Field of the Invention
The present invention relates to electronic circuits, and more specifically to arithmetic circuits having built-in self testing for use with the residue number system.
2. Description of Related Art
Power consumption is now a very important consideration in integrated circuit design. This has compelled circuit designers to consider reducing power consumption through changes in many different levels of the design process, such as the system, technology, algorithm, physical, and circuit levels. For example, system level approaches for reducing power consumption include power supply voltage scaling, clock gating, and subsystem sleep (or power down) modes. Technology level techniques include using dynamic threshold MOSFETs, and algorithm level techniques include using alternate number systems and state encoding. Further, physical level methods include transistor reordering, and circuit level methods include self-timed asynchronous approaches and glitch reduction. The ultra-low power circuits of the future will have to employ several of these approaches because none alone can achieve the power reduction goals for the next decade.
While all of the techniques described above advantageously reduce power consumption, many of them have a deleterious side effect of reducing the speed of the circuit. For example, supply voltage scaling lengthens the system clock period if other factors such as technology and drive strength are kept the same. For this reason, designers now consider the delay-power (DP) product of a circuit as the crucial factor in low power circuit design. One system level design approach that is currently being investigated because of its potential for significantly reducing the DP product is the One-Hot Residue Number System (OHRNS). For example, the OHRNS is being considered for use in the adaptive FIR (finite impulse response) filters and Viterbi detectors of hard disk drive read channels, in the endecs of wireless telecommunication integrated circuits, and in the adaptive filters of image processing integrated circuits.
The Residue Number System (RNS) is an integer number system in which the basic operations of addition, subtraction, and multiplication can be performed quickly because there are no carries, borrows, or partial products. This allows the basic operations to be performed in a single combinational step, digit-on-digit, using simple arithmetic units operating in parallel. However, other operations such as magnitude comparison, scaling (the RNS equivalent of right shifting), base extension (the RNS equivalent of increasing the bit width), and division are slower and more complicated to implement. Thus, RNS is most widely used in applications in which the basic operations predominate, such as digital signal processing.
The RNS representation of an integer X is a number of digits, with each digit being the residue of X modulo a specially chosen integer modulus. In other words, X is represented as the vector of its residues modulo a fixed set of integer moduli. In order to make the RNS representation of each integer unique for all non-negative values less than the product M of the moduli, the moduli are chosen to be pairwise relatively prime (i.e., the smallest single number into which all divide evenly is equal to the product of the moduli). Letting mi denote the ith modulus, the RNS representation of X is given by X ~ (x1, x2, . . . , xn), where xi=X modulo mi and is known as the ith residue digit of the RNS representation of X. Table 1 shows the representation of the integers 0 to 2430 in an RNS in which m1=11, m2=13, and m3=17 (“an 11, 13, 17 RNS representation”).
TABLE 1

Integer X    RNS digit x11    RNS digit x13    RNS digit x17
2430         10               12               16
2429         9                11               15
. . .
19           8                6                2
18           7                5                1
17           6                4                0
16           5                3                16
15           4                2                15
14           3                1                14
13           2                0                13
12           1                12               12
11           0                11               11
10           10               10               10
9            9                9                9
8            8                8                8
7            7                7                7
6            6                6                6
5            5                5                5
4            4                4                4
3            3                3                3
2            2                2                2
1            1                1                1
0            0                0                0
As an example, for the natural number 19, the x11 digit is 19 mod(11)=8 (i.e., 19÷11=1 remainder 8), the x13 digit is 19 mod(13)=6, and the x17 digit is 19 mod(17)=2. Each RNS digit is determined without reference to any other RNS digit, and no RNS representation repeats in the range from 0 to 2430. Negative integers can be represented by limiting the represented range to an equal (or substantially equal) number of positive and negative numbers. The representation of the range from −1215 to 1215 in the 11, 13, 17 RNS representation is shown in Table 2. No separate sign is associated with the RNS representation, and the sign of the represented integer cannot be determined from any less than all of its RNS digits.
TABLE 2

Integer X    RNS digit x11    RNS digit x13    RNS digit x17
1215         5                6                8
1214         4                5                7
. . .
4            4                4                4
3            3                3                3
2            2                2                2
1            1                1                1
0            0                0                0
−1           10               12               16
−2           9                11               15
−3           8                10               14
−4           7                9                13
. . .
−1214        7                8                10
−1215        6                7                9
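The conversions behind Tables 1 and 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `to_rns` and `from_rns` are hypothetical, and the reverse conversion uses the Chinese Remainder Theorem, a standard technique that the text above does not itself describe.

```python
MODULI = (11, 13, 17)   # moduli of the 11, 13, 17 RNS representation
M = 11 * 13 * 17        # 2431 distinct representations

def to_rns(x, moduli=MODULI):
    """Residue digits of x. Python's % yields a non-negative result for a
    positive modulus, so negative integers map as in Table 2."""
    return tuple(x % m for m in moduli)

def from_rns(digits, moduli=MODULI, signed=False):
    """Recover the integer via the Chinese Remainder Theorem (assumed method).
    Requires Python 3.8+ for the modular inverse pow(a, -1, m)."""
    total = 0
    for d, m in zip(digits, moduli):
        partial = M // m                        # product of the other moduli
        total += d * partial * pow(partial, -1, m)
    x = total % M
    if signed and x > M // 2:                   # upper half of range is negative
        x -= M
    return x

print(to_rns(19))                          # (8, 6, 2)
print(to_rns(-1215))                       # (6, 7, 9)
print(from_rns((6, 7, 9), signed=True))    # -1215
```

Note that the signed interpretation is applied only after all three digits have been combined, consistent with the observation that the sign cannot be determined from fewer than all of the RNS digits.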
In the RNS, the basic operations of addition, subtraction, and multiplication are performed in digit-parallel fashion, modulo mi. Thus, if operands X and Y have RNS representations of X ~ (x1, x2, . . . , xn) and Y ~ (y1, y2, . . . , yn), the result Z has an RNS representation of Z ~ (x1∘y1, x2∘y2, . . . , xn∘yn), where “xi∘yi” represents any of the basic operations performed on the two RNS digits modulo mi. More specifically, the corresponding RNS digits of the two numbers are added, subtracted, or multiplied, and then the proper modulo operation is performed on each to produce the RNS digits of the result.
For example, in the 11, 13, 17 RNS representation of Table 1, 4+15 gives (4, 4, 4)+(4, 2, 15) or (4+4 mod(11), 4+2 mod(13), 4+15 mod(17)), which equals (8, 6, 2) or 19. Similarly, 19−15 gives (8−4 mod(11), 6−2 mod(13), 2−15 mod(17)), which equals (4, 4, 4) or 4. Further, 6×3 gives (6×3 mod(11), 6×3 mod(13), 6×3 mod(17)), which equals (7, 5, 1) or 18. Because all individual operations are performed on each RNS digit independently and without reference to any other RNS digit, the operations can be performed completely in parallel. Thus, each of the basic operations can be performed quickly and efficiently, especially when all of the moduli are relatively small integers.
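The three worked examples above can be reproduced with a short Python sketch. The helper names `to_rns` and `rns_op` are hypothetical and chosen only for illustration; each output digit is computed from the corresponding input digits alone, which is what allows the digits to be processed in parallel.

```python
import operator

MODULI = (11, 13, 17)

def to_rns(x, moduli=MODULI):
    """Residue digits of x in the 11, 13, 17 RNS representation."""
    return tuple(x % m for m in moduli)

def rns_op(op, xs, ys, moduli=MODULI):
    """Apply op digit by digit, reducing each result modulo its own modulus.
    No digit depends on any other digit."""
    return tuple(op(x, y) % m for x, y, m in zip(xs, ys, moduli))

print(rns_op(operator.add, to_rns(4), to_rns(15)))   # (8, 6, 2), i.e. 19
print(rns_op(operator.sub, to_rns(19), to_rns(15)))  # (4, 4, 4), i.e. 4
print(rns_op(operator.mul, to_rns(6), to_rns(3)))    # (7, 5, 1), i.e. 18
```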
In electronic circuit implementations, addition is the fundamental RNS operation and subtraction is performed by adding the additive inverse of the subtrahend. Multiplication is also performed using addition by using the following properties. Any prime modulus p has at least one primitive root, which is an integer α of order p−1 under multiplication modulo p. In other words, the primitive root is an integer α whose successive powers, taken modulo p, are the nonzero integers modulo p (i.e., for any 0<X<p, X=α^k modulo p for some 0≦k≦p−2). In such a case, X is said to have an index of k, modulo p.
Given the primitive root, multiplication modulo p can be performed by adding the indices modulo p−1. This is analogous to using logarithms in the binary number system. For example, α=2 is a primitive root modulo 13 because the integers 2^0, 2^1, 2^2, 2^3, 2^4, 2^5, 2^6, 2^7, 2^8, 2^9, 2^10, and 2^11 modulo 13 are equal to 1, 2, 4, 8, 3, 6, 12, 11, 9, 5, 10, and 7, respectively. Thus, if X=5 (2^9 modulo 13) and Y=7 (2^11 modulo 13), X×Y=35, which is congruent to 9 modulo 13 (2^8 modulo 13). Thus, the index of the product modulo p (8) of two RNS digits can be determined by adding the indices of the two RNS digits (9 and 11), modulo p−1 (i.e., (9+11) mod(12)=8).
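The index method for modulus 13 with primitive root 2 can be sketched as follows. The table and function names are hypothetical; the zero check is needed because zero is not a power of the primitive root and therefore has no index.

```python
P = 13       # prime modulus
ALPHA = 2    # primitive root modulo 13

# ANTI_INDEX[k] = ALPHA**k mod P for k = 0 .. P-2
ANTI_INDEX = [pow(ALPHA, k, P) for k in range(P - 1)]
# INDEX[x] = k such that ALPHA**k mod P == x, for nonzero x
INDEX = {x: k for k, x in enumerate(ANTI_INDEX)}

def mul_by_index(x, y):
    """Multiply modulo P by adding indices modulo P-1."""
    if x == 0 or y == 0:          # zero has no index; handle it separately
        return 0
    return ANTI_INDEX[(INDEX[x] + INDEX[y]) % (P - 1)]

print(ANTI_INDEX)            # [1, 2, 4, 8, 3, 6, 12, 11, 9, 5, 10, 7]
print(INDEX[5], INDEX[7])    # 9 11
print(mul_by_index(5, 7))    # 9, since 35 mod 13 = 9 = 2**8 mod 13
```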
In electronic circuit implementations, the RNS digits can be encoded in various ways. In conventional binary encoding, each RNS digit is converted to a binary number that is represented by the states of one or more lines, each of which is in one of two states to represent a binary digit of “0” or “1”. There is also the “one-hot” encoding scheme in which each possible value of an RNS digit is associated with a separate two-state line. For example, in the 11, 13, 17 RNS representation, 11 lines are used to represent the first RNS digit, 13 lines are used to represent the second RNS digit, and 17 lines are used to represent the third RNS digit. When an RNS digit has a given value, the line associated with that value is high and all of the other lines are low. Thus, only one line of a digit is high (or hot) at any given time.
The use of the one-hot encoding scheme with the RNS produces such compelling advantages in electronic circuit implementations that such a system is identified as the “One-Hot Residue Number System” (OHRNS). While the OHRNS is really the same RNS with the same arithmetic properties, the advantages of using one-hot encoding include basic operation implementation using barrel shifters with their superior delay-power products and operand-independent delays, simple and regular layout of arithmetic circuits, and zero-cost implementation through signal transposition of inverse calculation, index calculation, and residue conversion. When any RNS digit changes in value, at most two lines change state. This is the minimal possible activity factor and yields low power dissipation. Because in OHRNS implementations signal activity factors are near minimal and fewer critical path transistors are present, such systems have very low delay-power products. (A detailed explanation of OHRNS circuits can be found in W. A. Chren, Jr., “One-Hot Residue Coding for Low Delay-Power Product CMOS Design,” IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing, v. 45, no. 3 (March 1998), pp. 303-313, which is herein incorporated by reference.)
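A minimal Python model of one-hot encoding, with each line represented as a 0 or 1 in a list, illustrates the activity claim. The function names `one_hot` and `toggled_lines` are hypothetical, and the model ignores all electrical detail.

```python
def one_hot(value, modulus):
    """One line per possible digit value; only the line for `value` is high."""
    if not 0 <= value < modulus:
        raise ValueError("digit out of range for this modulus")
    return [1 if i == value else 0 for i in range(modulus)]

def toggled_lines(old, new, modulus):
    """Number of lines that change state when a digit goes from old to new."""
    return sum(a != b for a, b in zip(one_hot(old, modulus), one_hot(new, modulus)))

print(one_hot(3, 11))            # [0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0]
print(toggled_lines(3, 7, 11))   # 2: one line falls, one line rises
print(toggled_lines(5, 5, 13))   # 0: the digit did not change
```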
With one-hot encoding of the RNS digits, addition can be performed through a cyclic shift (i.e., rotation). In particular, one of the operands is rotated by an amount equal to the value of the other operand. While such a rotation can be implemented using several different types of circuits, barrel shifters allow all possible rotations of the first operand to be computed in parallel. The second operand determines which of the rotations is output from the barrel shifter as the result. A conventional OHRNS modulo mi adder is shown in FIG. 1(a). The adder 10 includes a modulo mi barrel shifter 12 that performs the addition, and a static pipeline register 14 that stores the result for downstream processing. FIG. 1(b) shows the internal structure of the barrel shifter. As shown, NMOS pass transistors 16 are used instead of transmission gates to yield higher speed and lower power dissipation due to smaller input and output capacitive loadings (i.e., because there are half as many NMOS sources/drains per input/output line as when transmission gates are used).
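The rotation-based addition can be modeled behaviorally in Python. This sketch is not a description of the pass-transistor circuit of FIG. 1(b); the function names are hypothetical, and the list of rotations merely mimics the way a barrel shifter makes every rotation available while the second operand selects one.

```python
def one_hot(value, modulus):
    """One-hot lines for a digit: line `value` is high, all others low."""
    return [1 if i == value else 0 for i in range(modulus)]

def rotate(lines, k):
    """Cyclic shift: output line i takes the state of input line (i - k) mod m."""
    m = len(lines)
    return [lines[(i - k) % m] for i in range(m)]

def one_hot_add(a_lines, b_lines):
    """Model of a barrel-shifter adder: form every rotation of the first
    operand, then let the hot line of the second operand pick one."""
    m = len(a_lines)
    rotations = [rotate(a_lines, k) for k in range(m)]
    return rotations[b_lines.index(1)]

result = one_hot_add(one_hot(8, 11), one_hot(7, 11))
print(result.index(1))   # 4, since (8 + 7) mod 11 = 4
```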
Further, in the OHRNS, subtraction can be performed by adding the additive inverse of the subtrahend, and the additive inverse can be computed by a simple one-to-one mapping using signal transposition. FIG. 2 shows a conventional OHRNS modulo mi subtractor. As shown, the subtractor 20 is identical to the adder 10 of FIG. 1(a) except for the use of signal transposition 22 on the subtrahend input to the barrel shifter 12. The signal transposition 22 computes the additive inverse quickly and simply through a one-to-one mapping of inputs to outputs.
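In a behavioral Python model, the signal transposition amounts to reordering the one-hot lines so that the line for value b is routed to the position for value −b modulo the modulus. The sketch below uses hypothetical function names and is an illustration of the mapping only, not of the circuit of FIG. 2.

```python
def one_hot(value, modulus):
    """One-hot lines for a digit: line `value` is high, all others low."""
    return [1 if i == value else 0 for i in range(modulus)]

def rotate(lines, k):
    """Cyclic shift used for one-hot addition."""
    m = len(lines)
    return [lines[(i - k) % m] for i in range(m)]

def transpose_additive_inverse(lines):
    """Pure rewiring: output line j is connected to input line (-j) mod m."""
    m = len(lines)
    return [lines[(-j) % m] for j in range(m)]

def one_hot_subtract(a_lines, b_lines):
    """a - b computed as a + (-b), with -b obtained by transposition."""
    neg_b = transpose_additive_inverse(b_lines)
    return rotate(a_lines, neg_b.index(1))

result = one_hot_subtract(one_hot(2, 17), one_hot(15, 17))
print(result.index(1))   # 4, since (2 - 15) mod 17 = 4
```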
Multiplication in the OHRNS can also be performed with barrel shifters by using indices. Indices and their additive inverses, which are known as anti-indices, are the RNS equivalents of logarithms and antilogarithms, as explained above. The computation of indices and anti-indices in any modulus can be performed quickly and simply through a one-to-one mapping. In particular, such mappings in the OHRNS are implemented by merely permuting the signal lines of the RNS digit. In other words, indices and anti-indices can be computed through signal transpositions or wire permutations that require no active circuitry and introduce little or no delay.
FIG. 3 shows a conventional OHRNS modulo mi multiplier that uses wire transpositions to compute indices and anti-indices. More specifically, the multiplier 30 uses signal transpositions 34, 36, and 38 on the input and output lines to compute the indices and anti-indices, and a barrel shifter 32 to add the indices. A small amount of combinational logic 39 is provided to handle the special case in which at least one of the operands is zero. The separate handling of this special case allows the barrel shifter 32 to perform addition modulo mi−1, rather than modulo mi. As in the adder 10 of FIG. 1(a), a static pipeline register 14 stores the resulting product for downstream processing.
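The data flow of such a multiplier can be sketched behaviorally in Python for modulus 13 with primitive root 2. All names are hypothetical, and the early return for a zero operand stands in for the combinational zero-handling logic, whose actual gate-level form is not specified here.

```python
P, ALPHA = 13, 2
POWERS = [pow(ALPHA, k, P) for k in range(P - 1)]   # POWERS[k] = 2**k mod 13

def one_hot(value, n):
    """One-hot lines for a digit: line `value` is high, all others low."""
    return [1 if i == value else 0 for i in range(n)]

def to_index(lines):
    """Input transposition: index line k is wired to digit line 2**k mod 13.
    The result has P-1 lines because zero has no index."""
    return [lines[POWERS[k]] for k in range(P - 1)]

def from_index(index_lines):
    """Output transposition (anti-index): digit line 2**k mod 13 is wired
    to index line k; digit line 0 stays low."""
    out = [0] * P
    for k, bit in enumerate(index_lines):
        out[POWERS[k]] = bit
    return out

def rotate(lines, k):
    """Cyclic shift modulo len(lines), here modulo P-1."""
    m = len(lines)
    return [lines[(i - k) % m] for i in range(m)]

def one_hot_multiply(x_lines, y_lines):
    if x_lines[0] or y_lines[0]:          # zero operand: product is zero
        return one_hot(0, P)
    x_idx, y_idx = to_index(x_lines), to_index(y_lines)
    return from_index(rotate(x_idx, y_idx.index(1)))

print(one_hot_multiply(one_hot(5, P), one_hot(7, P)).index(1))   # 9
```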
While offering such advantageous characteristics and a very low delay-power product, conventional RNS arithmetic circuits are not testable. In particular, conventional RNS arithmetic circuits do not include simple test circuitry to allow verification of circuit functionality and timing. The input-to-output delay is one of the critical timing values of an RNS arithmetic circuit that must be verified to be within specification. Such timing verification must be provided before RNS arithmetic circuits can be practically used for digital signal processing in actual products such as hard disk drive read channels, wireless telecommunication integrated circuits, and image processing integrated circuits.