The instant invention relates to integrated circuits for digital signal processing, and more specifically to circuits that perform a double weighted summation, first by line and then by column, of the digital values x(i,j) of an n×n matrix of digital values.
From coefficients x(i,j), where i is a line index of the matrix and j a column index, the aim is to produce a matrix of coefficients C(u,v), where u is a line index and v a column index, with
$$C(u,v) = \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} x(i,j)\, f(j,v)\, g(i,u)$$
From the input electrical signals representing the digital values x(i,j), n×n signals representing coefficients C^i(v) are produced. Each coefficient C^i(v) is a weighted summation of the values x(i,j) of line i, multiplied by coefficients f(j,v); v is a column index varying from 0 to n-1, and there are n coefficients C^i(v) for each line of index i. This operation is called the line transformation.
From the n×n signals representing the coefficients C^i(v), n×n signals representing the coefficients C(u,v) are produced. Each coefficient C(u,v) is a weighted summation of the values C^i(v) of column v, multiplied by coefficients g(i,u); u is a line index varying from 0 to n-1, and n coefficients are produced for each column of index v. This operation is called the column transformation.
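The two steps described above can be sketched in software as follows. This is only an illustrative model, not the circuit itself: the function names and the small integer coefficient tables are hypothetical, chosen so that the two-pass result can be compared with a direct evaluation of the double summation.

```python
def line_transform(x, f):
    """C^i(v) = sum over j of x(i,j) * f(j,v), computed for every line i."""
    n = len(x)
    return [[sum(x[i][j] * f[j][v] for j in range(n)) for v in range(n)]
            for i in range(n)]


def column_transform(ci, g):
    """C(u,v) = sum over i of C^i(v) * g(i,u), computed for every column v."""
    n = len(ci)
    return [[sum(ci[i][v] * g[i][u] for i in range(n)) for v in range(n)]
            for u in range(n)]


def direct_double_sum(x, f, g):
    """Reference: the double weighted summation evaluated in one step."""
    n = len(x)
    return [[sum(x[i][j] * f[j][v] * g[i][u]
                 for i in range(n) for j in range(n))
             for v in range(n)]
            for u in range(n)]


# Arbitrary illustrative values for n = 2 (not taken from the text).
x = [[1, 2], [3, 4]]
f = [[1, 1], [1, -1]]   # f[j][v]
g = [[1, 2], [1, -2]]   # g[i][u]

two_pass = column_transform(line_transform(x, f), g)
print(two_pass)  # [[10, -2], [-8, 0]]
```

Splitting the computation this way replaces one summation over n×n terms per output coefficient by two successive summations over n terms each.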
This type of digital processing is used in particular for carrying out so-called cosine transforms, wherein the coefficients f(j,v) and g(i,u) are of the form cos((2i+1)uπ/2n). These transforms are useful for compressing information in the digital transmission of signals, and more specifically in the digital transmission of pictures.
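As an illustration, a table of such cosine coefficients might be generated as in the sketch below. The function name is hypothetical, and the normalisation factors that a complete cosine transform usually applies are deliberately left out, since the text gives only the cosine form. It is also assumed here that f(j,v) follows the same pattern as g(i,u), with j and v in place of i and u.

```python
import math


def cosine_coefficients(n):
    """Return a table t[k][w] = cos((2k+1) * w * pi / (2n)), unnormalised."""
    return [[math.cos((2 * k + 1) * w * math.pi / (2 * n)) for w in range(n)]
            for k in range(n)]


n = 16
f = cosine_coefficients(n)  # read as f[j][v]
g = cosine_coefficients(n)  # read as g[i][u]
```

For w = 0 every entry equals cos(0) = 1, so the first output of each line or column transformation is simply the unweighted sum of its inputs.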
The integrated circuit architectures used to carry out this kind of transformation are relatively complex since they must allow real-time processing: the flow rate of the digital data to be processed is imposed at the input of the circuit, and the processed data must leave the output as fast as data arrive at the input. This flow rate is high. As an example, for the digital transmission of pictures, one wishes to process a block of 16×16 digital values (256 pixels) in less than 20 microseconds, with successive blocks of 256 values arriving at the input of the circuit with a period of about 20 microseconds.
FIG. 1 shows a block diagram of a relatively simple integrated circuit architecture that may be devised for carrying out, on one integrated circuit chip, the entire transformation of a block of n×n digital values x(i,j) into a block of n×n coefficients C(u,v).
This diagram uses a first operator, the line transformer circuit 1 (CTL), performing the line summation; a second operator, the column transformer circuit 2 (CTC), performing the column summation; two memories 3 and 4 for storing values representing the coefficients C^i(v); and two mixing circuits 5 and 6 for establishing connection paths, on the one hand between the line transformer circuit 1 and memories 3 and 4, and on the other hand between those memories and the column transformer circuit 2. The whole set is controlled by a sequencer 7.
One block of n×n data x(i,j) to be processed is fed through an input bus E to the line transformer circuit CTL, which provides n×n digital data representing the n×n coefficients C^i(v). These data are stored at the n×n addresses of memory 3 (a memory of n×n words). The blocks of n×n data are processed at a rate of, for example, one block every 20 microseconds; a value x(i,j) then arrives, for example, every 74 nanoseconds (for n×n=256).
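The timing figures quoted above can be checked with a few lines of arithmetic. The sketch below also estimates the multiply-accumulate rate of each transformer under the assumption, not stated in the text, that every output coefficient is evaluated directly as a sum of n products without any fast algorithm; the variable names are illustrative.

```python
n = 16
values_per_block = n * n      # 256 values x(i,j) per block
sample_period_ns = 74         # arrival period quoted in the text
block_period_us = 20          # block period quoted in the text

# Time needed to receive one complete block, in microseconds (about 18.9).
input_time_us = values_per_block * sample_period_ns / 1000

# Assumption: direct evaluation, n products per output, n*n outputs per pass.
macs_per_pass = n ** 3
macs_per_second_per_pass = macs_per_pass / (block_period_us * 1e-6)

print(input_time_us, macs_per_pass, macs_per_second_per_pass)
```

Under that assumption each of the two transformers would have to sustain roughly 200 million multiply-accumulate operations per second, which gives an idea of why the data flow is described as high.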
To process the following block of n×n values x(i,j), circuit 1 receives the successive values x(i,j) and carries out the line transformation; this time, the sequencer 7 controls the mixing circuit 5 so as to store the results C^i(v) in the second memory 4. Meanwhile, the data previously recorded in memory 3 are applied through the mixing circuit 6, as input digital values to be processed, to the column transformer circuit 2, which provides the coefficients C(u,v) at its output.
Thereafter, alternately, one block of n×n coefficients C^i(v) is stored in one of the memories while the block of coefficients C^i(v) recorded in the other memory during the previous time period is processed.
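The alternating use of the two memories can be modelled in software as in the sketch below. It is a sequential simplification: in the circuit the writing of one memory and the reading of the other take place during the same time period, whereas here they are simply performed one after the other within each loop iteration. The function name and the stand-in transformations in the usage lines are hypothetical.

```python
def ping_pong(blocks, line_transform, column_transform):
    """Yield fully transformed blocks using two alternately used memories."""
    memories = [None, None]   # stand-ins for memories 3 and 4
    write_sel = 0             # memory currently written by circuit 1
    for block in blocks:
        read_sel = 1 - write_sel
        # Circuit 2 processes the block stored during the previous period.
        if memories[read_sel] is not None:
            yield column_transform(memories[read_sel])
        # Circuit 1 fills the other memory during the same period.
        memories[write_sel] = line_transform(block)
        # The sequencer swaps the roles of the two memories.
        write_sel = read_sel
    # Flush the block stored during the last period.
    last = 1 - write_sel
    if memories[last] is not None:
        yield column_transform(memories[last])


# Usage with trivial stand-in transformations on plain numbers.
results = list(ping_pong([1, 2, 3], lambda b: b * 10, lambda m: m + 1))
print(results)  # [11, 21, 31]
```

Each block leaves the model one period after it entered, which is the latency introduced by the intermediate storage.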
This architecture is elegant but requires two memories, each capable of storing n×n data C^i(v). Indeed, to carry out a column transformation on the coefficients C^i(v), all the coefficients C^i(v) of column v must be available in storage. However, since these coefficients arrive from the circuit CTL line after line, and not column after column, this practically means that the column transformation can start only when all the coefficients C^i(v) of the matrix have arrived from circuit 1. This is the reason why the architecture shown in FIG. 1 uses two memories operating alternately. Moreover, it must be appreciated that if data C^i(v) are recorded in one memory line after line (i being the line index), they must be read during the following time period column after column (v being the column index).
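The difference between the writing order and the reading order can be made concrete with an address sketch. It assumes, purely for illustration, that C^i(v) is stored at address i·n + v; the text does not specify the memory layout, and the function names are hypothetical.

```python
def write_addresses(n):
    """Addresses in arrival order: line after line (i outer, v inner)."""
    return [i * n + v for i in range(n) for v in range(n)]


def read_addresses(n):
    """Addresses in reading order: column after column (v outer, i inner)."""
    return [i * n + v for v in range(n) for i in range(n)]


def writes_before_column_complete(n, v):
    """Number of writes needed before the last value of column v is stored."""
    return (n - 1) * n + v + 1


print(write_addresses(3))  # [0, 1, 2, 3, 4, 5, 6, 7, 8]
print(read_addresses(3))   # [0, 3, 6, 1, 4, 7, 2, 5, 8]
print(writes_before_column_complete(16, 0))  # 241
```

With n = 16, even the first column is complete only after 241 of the 256 values have been written, which illustrates why the column transformation cannot usefully start before practically the whole matrix has arrived.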
Moreover, this architecture may be used with operators 1 and 2 that process digital data whose bits are transmitted in series (or partially in series) rather than in parallel. In that case, series/parallel and parallel/series converters must be provided between the operators and the memories, because conventional SRAM or DRAM memories can only handle data bits presented in parallel form. With data words wider than 4 bits, operators 1 and 2 operating on serial or series/parallel bits should be provided.
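The role of the converters can be illustrated by a minimal software model. The bit order (least significant bit first), the restriction to unsigned words and the function names are all assumptions made for the sake of the example; the text does not specify them.

```python
def parallel_to_serial(word, width):
    """Return the bits of an unsigned word, least significant bit first."""
    return [(word >> k) & 1 for k in range(width)]


def serial_to_parallel(bits):
    """Rebuild an unsigned word from bits given least significant bit first."""
    word = 0
    for k, bit in enumerate(bits):
        word |= bit << k
    return word


bits = parallel_to_serial(0b1011, 4)
print(bits)                      # [1, 1, 0, 1]
print(serial_to_parallel(bits))  # 11
```

A serial operator would consume or produce one such bit per clock cycle, whereas the memory expects all the bits of the word at once, hence the need for a converter on each side of the memories.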