1. Field of the Invention
The present invention relates to a distributed arithmetic digital processing circuit. It is used in the construction of digital filters, which can in particular be employed in telecommunications.
2. Description of the Prior Art
The digital processing performed by the circuit according to the invention is a weighted sum which can be expressed by the relation: ##EQU1## in which u.sub.k designates different digital signals of rank k and in which a.sub.k designates weighting coefficients.
A special case of such processing is recursive digital filtering, conventionally characterized by an expression of the form: ##EQU2## in which x.sub.n-k designates the input signals of rank n-k, y.sub.n-k the output signals of rank n-k reinjected at the input of the filter, a.sub.k the coefficients of the non-recursive part, whose order is p, and b.sub.k the coefficients of the recursive part, whose order is q; n is an integer which characterizes the rank of the output signal and can assume any value.
A recursive digital filter able to carry out such processing comprises p+1 inputs receiving the signals x.sub.k and q inputs receiving the signals y.sub.k, i.e. in total p+q+1 inputs. It also comprises an output supplying the sequence of signals y.sub.n.
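By way of illustration only, the processing performed by such a filter can be sketched in software. The sketch assumes that the reinjected output terms are added with a positive sign and that the filter starts from zero initial conditions; neither point is stated in the text, and the function name recursive_filter is hypothetical.

```python
from typing import List, Sequence


def recursive_filter(x: Sequence[float],
                     a: Sequence[float],
                     b: Sequence[float]) -> List[float]:
    """Direct evaluation of a recursive filter of orders p and q.

    a holds the p+1 coefficients a_0..a_p of the non-recursive part,
    b holds the q coefficients b_1..b_q of the recursive part.
    ASSUMPTION: y[n] = sum a_k x[n-k] + sum b_k y[n-k] (positive feedback sign),
    with all samples before n = 0 taken as zero.
    """
    p = len(a) - 1
    q = len(b)
    y: List[float] = []
    for n in range(len(x)):
        acc = 0.0
        # Non-recursive part: p+1 input samples x[n], x[n-1], ..., x[n-p].
        for k in range(p + 1):
            if n - k >= 0:
                acc += a[k] * x[n - k]
        # Recursive part: q previous outputs y[n-1], ..., y[n-q].
        for k in range(1, q + 1):
            if n - k >= 0:
                acc += b[k - 1] * y[n - k]
        y.append(acc)
    return y
```

Each output sample thus depends on p+q+1 quantities, which corresponds to the p+q+1 inputs mentioned above.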
The transfer function H(z) of such a filter, expressed by means of the complex variable z, is: ##EQU3##
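The equations referred to as (2) and (3) are not reproduced in this text. A plausible reading, given the definitions of p, q, a.sub.k and b.sub.k above and assuming that the reinjected outputs enter the sum with a positive sign, would be the following; with the opposite sign convention the sign in the denominator would change accordingly.

```latex
% Assumed form of relation (2): p+1 input terms and q reinjected output terms
y_n = \sum_{k=0}^{p} a_k \, x_{n-k} + \sum_{k=1}^{q} b_k \, y_{n-k}

% Corresponding transfer function under the same sign assumption
H(z) = \frac{\displaystyle\sum_{k=0}^{p} a_k \, z^{-k}}
            {\displaystyle 1 - \sum_{k=1}^{q} b_k \, z^{-k}}
```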
The invention relates to a special technique for digital processing called "distributed arithmetic". The latter is described in U.S. Pat. No. 3,777,130 granted on Dec. 4th 1973 to A. Croisier, D. J. Esteban, M. E. Levilion and V. Riso and entitled "Digital filter for PCM encoded signals", as well as the article of A. Peled and B. Liu entitled "A new hardware realization of digital filters" published in the journal "IEEE Trans. on ASSP", vol. ASSP 22, pp. 456-462, December 1974. In addition, the article by C. S. Burrus entitled "Digital filter structures described by distributed arithmetic" published in the journal "IEEE Trans. on Circuits and Systems", vol. CAS-24, no. 12, December 1977, pp. 674-680 gives a general approach to this question and describes several algorithms and structures which can be used. Reference can be made to these documents for details concerning distributed arithmetic, which is only briefly described hereinafter in order to facilitate the understanding of the invention.
The aim is to form the weighted sum of the signals expressed by relation (1). It is assumed that the u.sub.k are coded on r bits in two's complement code, but other codes are naturally possible. The sign bit is designated u.sub.k (0) and the bits representing the absolute value of u.sub.k are designated u.sub.k (j), j being between 0 (exclusive) and r-1 (inclusive). Thus, it is possible to write: ##EQU4## if -1.ltoreq.u.sub.k < 1
Then, after inversion of the summations it is possible to write v in the form: ##EQU5##
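The two expressions referred to here are not reproduced in this text. Under the stated two's complement coding with -1.ltoreq.u.sub.k < 1, and assuming that the index k runs from 1 to p, they would take the following standard form, given here as a reconstruction rather than as the original equations.

```latex
% Two's complement expansion of u_k on r bits, u_k(0) being the sign bit
u_k = -u_k(0) + \sum_{j=1}^{r-1} u_k(j) \, 2^{-j}

% Weighted sum after inversion of the summations over k and j
v = \sum_{k=1}^{p} a_k u_k
  = \sum_{j=1}^{r-1} 2^{-j} \left[ \sum_{k=1}^{p} a_k \, u_k(j) \right]
    - \sum_{k=1}^{p} a_k \, u_k(0)
```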
In this relation the term u.sub.k (j) appearing in the bracketed expression is the bit of rank j of u.sub.k, and this bit can only assume two values: 0 or 1. For the first product u.sub.1 (j)a.sub.1 there are consequently only two possible values: 0 or a.sub.1. In the same way, the product u.sub.2 (j)a.sub.2 can only assume the two values 0 and a.sub.2, so that there are only four possible values for the sum of these two products, namely 0, a.sub.1, a.sub.2, or a.sub.1 +a.sub.2. Step by step it can be seen that there are finally 2.sup.p possible values for the sum: ##EQU6## This sum, designated w.sub.j (because it is dependent on the rank j in question), assumes any one of 2.sup.p values determined solely by the coefficients a.sub.k. Thus, the calculation of the weighted sum reduces to the calculation of the expression: ##EQU7## in which w.sub.j can be predetermined as soon as a set of coefficients a.sub.k has been fixed.
Which of these 2.sup.p values is retained for forming w.sub.j is defined by the p bits of rank j of the input signals, namely u.sub.1 (j), u.sub.2 (j) . . . u.sub.p (j).
In a distributed arithmetic processing circuit the possible values of w.sub.j are entered in a memory of 2.sup.p words of m bits, m being the number of bits necessary for expressing these partial sums. The word corresponding to a particular w.sub.j is located, in this memory, at the address formed by the p bits of rank j of the input signals u.sub.k.
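The principle of the table and of its addressing can be sketched as follows. This is a minimal software model and not a description of the circuit: the function names, the use of floating-point words and the choice of letting bit k of the address select coefficient a.sub.k+1 are assumptions made for the example.

```python
from typing import List, Sequence


def build_table(a: Sequence[float]) -> List[float]:
    """Return the 2**p partial sums; address bit k selects coefficient a[k].

    ASSUMPTION: the mapping of address bits to coefficients is arbitrary here.
    """
    p = len(a)
    return [sum(a[k] for k in range(p) if (addr >> k) & 1)
            for addr in range(1 << p)]


def to_bits(u: float, r: int) -> List[int]:
    """Two's complement bits [u(0), ..., u(r-1)] of u in [-1, 1); u(0) is the sign."""
    code = int(round(u * (1 << (r - 1)))) & ((1 << r) - 1)
    return [(code >> (r - 1 - j)) & 1 for j in range(r)]


def da_weighted_sum(u: Sequence[float], a: Sequence[float], r: int) -> float:
    """Weighted sum obtained from r table look-ups instead of p multiplications."""
    table = build_table(a)
    bits = [to_bits(uk, r) for uk in u]
    v = 0.0
    for j in range(r):
        # Address formed by the p bits of rank j of the input signals.
        addr = sum(bits[k][j] << k for k in range(len(a)))
        w_j = table[addr]
        # The word read for the sign bits (j = 0) is subtracted.
        v += -w_j if j == 0 else w_j * 2.0 ** (-j)
    return v
```

With coefficients (0.5, -0.25, 0.125), inputs (0.375, -0.5, 0.25) and r = 4, the table holds 8 words and the four look-ups give the same value as the direct sum of products.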
The number m is determined on the basis of the constraints of the problem to be dealt with and can be deduced in two different ways: by truncation of the coefficients a.sub.k as in a conventional realization or by direct truncation of quantities w.sub.j (but then processing is no longer strictly linear). The latter method is described in the article of R. Lagadec and D. Pelloni entitled "A model for distributed arithmetic for filters with post-quantized look-up table" published in "National Telecommunication Conference" 1977, pp. 29: 3-1 to 29: 3-6.
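The difference between the two methods can be illustrated by a small sketch. The helper names and the parameter frac_bits (number of fractional bits retained) are hypothetical, and truncation is assumed to be toward minus infinity; the cited article may use a different quantization rule.

```python
import math
from typing import List, Sequence


def truncate(value: float, frac_bits: int) -> float:
    """Truncate toward minus infinity to a multiple of 2**-frac_bits."""
    scale = 1 << frac_bits
    return math.floor(value * scale) / scale


def table_from_truncated_coeffs(a: Sequence[float], frac_bits: int) -> List[float]:
    """First method: truncate each coefficient, then form the partial sums."""
    aq = [truncate(c, frac_bits) for c in a]
    return [sum(aq[k] for k in range(len(a)) if (addr >> k) & 1)
            for addr in range(1 << len(a))]


def table_post_truncated(a: Sequence[float], frac_bits: int) -> List[float]:
    """Second method: form the exact partial sums, then truncate each word."""
    return [truncate(sum(a[k] for k in range(len(a)) if (addr >> k) & 1), frac_bits)
            for addr in range(1 << len(a))]
```

For a.sub.1 = a.sub.2 = 0.2 and two fractional bits, the first method gives a table of zeros, whereas with the second method the word stored for a.sub.1 +a.sub.2 is 0.25 although the words stored for a.sub.1 and a.sub.2 alone are both 0. The stored word is no longer the sum of its parts, which is the sense in which the processing ceases to be strictly linear.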
If the input signals u.sub.k are introduced into series registers with the least significant bit at the head, and if the output of the memory containing the precalculated w.sub.j is followed by an arithmetic unit constituted by an adder-subtracter and a shiftable accumulator register with parallel inputs and parallel outputs, it is possible to obtain the weighted sum v in r elementary operations, each comprising:
reading the memory at the address given by the p bits u.sub.k (j),
addition of the content of the parallel-parallel accumulator register and the word contained in the memory at the indicated address (for j=0 it is necessary to subtract the content of the memory from the content of the register),
shift of the parallel-parallel accumulator register and of the input series registers (which corresponds to a division by two of the intermediate result).
The hardware organization of such a circuit is illustrated in FIG. 1. The represented circuit comprises p series registers 10/1, 10/2, . . . 10/p of r bits receiving the p input signals, a read-only memory 20 of 2.sup.p words of m bits, an adder-subtracter 30, a shiftable register 40 with parallel inputs and parallel outputs, and a timing circuit 50.
The memory has p inputs connected to the outputs of the p input registers and contains the 2.sup.p following quantities: ##EQU8## in which .epsilon..sub.k assumes the values 0 or 1, the address of each quantity being constituted by the p bits applied to the p inputs.
Using such means the equivalent of p multiplications and (p-1) additions can be performed in r clock pulses, the operation being performed in the following manner.
The quantities u.sub.k are presented with the least significant bit at the head in the registers 10/1, . . . 10/p. The first quantity calculated is: ##EQU9## then, register 40 having been shifted to give 2.sup.-1 w.sub.r-1, w.sub.r-2 is calculated and then the sum w.sub.r-2 +2.sup.-1 w.sub.r-1, and then after a further shift
2.sup.-1 (w.sub.r-2 +2.sup.-1 w.sub.r-1)=2.sup.-1 w.sub.r-2 +2.sup.-2 w.sub.r-1
is calculated and so on up to the rth clock pulse, where a subtraction corresponding to the sign bits u.sub.k (0) is performed. Following this final pulse the series registers 10/1 . . . 10/p must be shifted. It is also possible to truncate or round off the result v. To effect truncation it is merely necessary to disregard the least significant bits from a certain point onwards. For rounding off, it is necessary to add one half of the lowest retained weight before truncating the result.
These principles can be applied to the realization of a recursive digital filter. The operation to be performed is then that of relation (2) and the corresponding circuit is that of FIG. 2. It comprises a non-recursive part constituted by p+1 series registers 10/0, 10/1 . . . 10/p receiving the signals x.sub.n, x.sub.n-1, . . . , x.sub.n-p and a recursive part constituted by q registers 11/1, 11/2 . . . 11/q receiving the signals y.sub.n-1, y.sub.n-2, . . . y.sub.n-q, register 11/1 having parallel loading and the other registers series loading. The output of each of the registers 10/0 to 10/p-1 is connected to the input of the following register, as is the case for registers 11/1 to 11/q-1. The circuit also comprises a memory 20 with p+q+1 inputs connected to the outputs of the registers, said memory containing 2.sup.p+q+1 words of m bits, said words being the quantities: ##EQU10## in which .epsilon..sub.j are equal to 0 or 1, j passing from 1 to p+q+1.
Finally the circuit comprises the aforementioned arithmetic processing means, namely an adder-subtracter 30, an accumulator register 40 with parallel inputs and parallel outputs, as well as a timing circuit or clock 50 ensuring the appropriate performance of the operations (shifting the registers, loading the parallel registers, resetting the accumulator at the end of the cycle).
There are two modifications compared with the circuit of FIG. 1:
register 11/1 corresponding to y.sub.n-1 has parallel loading: the content of the parallel-parallel register 40 must be limited to r bits (by truncation or rounding off) and is then, at the end of each calculation cycle (r elementary operations), loaded into register y.sub.n-1 in parallel manner,
the different series registers are interconnected.
The reading frequency of the memory and the shift frequency of the registers are equal to rf.sub.E, f.sub.E being the sampling frequency of the digital filter. If the data x.sub.k enter the device in series form with the least significant bit at the head, at the correct speed and with the correct number r of bits, the first register 10/0 is not indispensable. If it is retained, to some extent as a buffer register, the result is delayed by one cycle 1/f.sub.E. In certain applications the data are in parallel form, so that the first register must then have parallel loading.
In such a device the size of the memory increases significantly as soon as the orders p and q of the filter reach values of approximately 6, corresponding to the "templates" necessary for the transmission of data by pulse code modulation (PCM).
One solution for reducing the size of the memory consists of using one or more multiplexers, but this increases the working frequency. M multiplexers can be located at the output of the memory, as described in the aforementioned article by C. S. Burrus. These multiplexers have M inputs and one output. Such a multiplexer is symbolized hereinafter by the notation M.fwdarw.1. It is controlled by a modulo M counter with .beta. outputs receiving the pulses from the clock. The memory then comprises M different memories, each of 2.sup.n words of m bits, said memories having n addressing inputs and an output. In all, the circuit has nM input registers distributed into n groups, each associated with a memory. The size of the memory system is then M.2.sup.n words of m bits, which is smaller than the 2.sup.nM words which a single memory without multiplexers would require in order to process nM input signals.
The elementary operations then consist of:
reading the memory at the address given by the n bits supplied by the multiplexers in the block selected by the content of the counter,
addition of the word read in the memory at this address and of the content of the parallel-parallel register (subtraction for the sign bits),
incrementing the content C of the counter by one unit by a clock pulse,
shifting the series registers and the parallel-parallel register only when the content C of the counter reaches M when counting from 1 to M, or M-1 when counting from 0 to M-1.
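The saving in memory and the corresponding increase in the number of operations can be illustrated by the following sketch. It is a software model under stated assumptions and not the circuit itself: the function names are hypothetical, the words are held as floating-point numbers, the counter runs from 0 to M-1, and block c is assumed to serve the inputs of index cn to cn+n-1.

```python
from typing import List, Sequence, Tuple


def memory_words(n: int, M: int) -> Tuple[int, int]:
    """Words needed with M memories of 2**n words, and with a single memory."""
    return M * 2 ** n, 2 ** (n * M)


def build_table(a: Sequence[float]) -> List[float]:
    """Partial sums of the coefficients; address bit k selects a[k]."""
    return [sum(c for k, c in enumerate(a) if (addr >> k) & 1)
            for addr in range(1 << len(a))]


def to_bits(u: float, r: int) -> List[int]:
    """Two's complement bits [u(0), ..., u(r-1)] of u in [-1, 1); u(0) is the sign."""
    code = int(round(u * (1 << (r - 1)))) & ((1 << r) - 1)
    return [(code >> (r - 1 - j)) & 1 for j in range(r)]


def multiplexed_weighted_sum(u: Sequence[float], a: Sequence[float],
                             n: int, M: int, r: int) -> float:
    """Weighted sum of n*M inputs using M tables of 2**n words (r*M look-ups)."""
    assert len(u) == len(a) == n * M
    # ASSUMPTION: block c holds the partial sums of coefficients c*n .. c*n+n-1.
    blocks = [build_table(a[c * n:(c + 1) * n]) for c in range(M)]
    bits = [to_bits(x, r) for x in u]
    acc = 0.0
    for j in range(r - 1, -1, -1):          # least significant bit first
        for c in range(M):                  # content of the modulo M counter
            addr = sum(bits[c * n + k][j] << k for k in range(n))
            word = blocks[c][addr]
            # Subtraction for the sign bits (j = 0), addition otherwise.
            acc = acc - word if j == 0 else acc + word
        if j > 0:
            acc /= 2.0                      # one shift per bit rank, after M look-ups
    return acc
```

For n = 4 and M = 3, memory_words gives 48 words against 4096 for a single memory, at the cost of rM look-ups instead of r. With M = 1 the model reduces to the single-memory arrangement.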