1. Field of the Invention
The present invention relates to a discrete cosine transform apparatus and an inverse discrete cosine transform apparatus for use in digital image processing or the like.
2. Description of the Prior Art
There are known various discrete orthogonal transforms which are suitable for digital image processing. Of such discrete orthogonal transforms, the discrete cosine transform (DCT) is suitable for bandwidth compression, and its processing system is relatively simple.
In the DCT of degree N, the transform and the inverse transform (IDCT) are defined by utilizing a matrix [N] whose first row is formed of elements $1/\sqrt{2}$ and whose second and subsequent rows are formed of elements expressed as

$$\cos\frac{(2x+1)\,k\,\pi}{2N} \qquad (x = 0, 1, \ldots, N-1;\ k = 1, \ldots, N-1)$$

In the two-dimensional case, the following equations are established:

$$[Y] = [N]\,[X]\,{}^{t}[N] \tag{1}$$

$$[X] = {}^{t}[N]\,[Y]\,[N] \tag{2}$$
Although equation (1) is multiplied by a coefficient 1/2.sup.N+1 when the matrix has 2.sup.N rows and 2.sup.N columns, this multiplication is equivalent to a data shift of N+1 bits, and the coefficient is therefore omitted from the description.
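The definitions above can be illustrated with a minimal Python sketch. It builds the matrix [N] from the stated elements, applies equation (1), and recovers the input with equation (2). The function names (`dct_matrix`, `dct2`, `idct2`) and the test block are hypothetical, and placing the whole scale factor on the inverse transform is an assumption made here for convenience; the text only states that the factor amounts to a bit shift.

```python
import math

def dct_matrix(n):
    """Matrix [N]: first row 1/sqrt(2), row k holds cos((2x+1)*k*pi/(2n))."""
    first = [1.0 / math.sqrt(2.0)] * n
    rest = [[math.cos((2 * x + 1) * k * math.pi / (2 * n)) for x in range(n)]
            for k in range(1, n)]
    return [first] + rest

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(p * q for p, q in zip(row, col)) for col in bt] for row in a]

def dct2(x):
    """Equation (1): [Y] = [N][X]t[N], with the scale factor left out."""
    n = dct_matrix(len(x))
    return matmul(matmul(n, x), transpose(n))

def idct2(y):
    """Equation (2): [X] = t[N][Y][N], with the omitted scale restored here."""
    size = len(y)
    n = dct_matrix(size)
    # Each row of [N] has squared length size/2, so the round trip gains
    # a factor (size/2)^2; for size = 8 the correction is 1/16 (a 4-bit shift).
    scale = (2.0 / size) ** 2
    raw = matmul(matmul(transpose(n), y), n)
    return [[scale * v for v in row] for row in raw]

# Hypothetical 8 x 8 test block, used only to check the round trip.
block = [[float((3 * r + 5 * c) % 7) for c in range(8)] for r in range(8)]
restored = idct2(dct2(block))
max_error = max(abs(p - q)
                for row_p, row_q in zip(block, restored)
                for p, q in zip(row_p, row_q))
print(max_error < 1e-9)
```

For an 8 x 8 block the restored scale is 1/16, which agrees with the N+1 = 4 bit shift mentioned above for a matrix of 2.sup.3 rows and columns.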
The matrix data shown in equations (1) and (2) are multiplied with each other by using a multiplying apparatus formed of inner product calculating circuits and a rearranging circuit, as shown in FIG. 1. In FIG. 1, reference numerals 10 and 20 depict inner product calculating circuits, each of which has a four-degree arrangement corresponding to a matrix of 4 rows and 4 columns. The inner product calculating circuits 10 and 20 are connected to each other by means of a rearranging circuit 30.
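Data of the matrix [X] shown in equation (3) are input from a terminal IN, and an inner product calculation with the coefficient matrix [A] shown in equation (4) is carried out by the first inner product calculating circuit 10.

$$[X] = \begin{bmatrix} x_{11} & x_{12} & x_{13} & x_{14} \\ x_{21} & x_{22} & x_{23} & x_{24} \\ x_{31} & x_{32} & x_{33} & x_{34} \\ x_{41} & x_{42} & x_{43} & x_{44} \end{bmatrix} \tag{3}$$

$$[A] = \begin{bmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \\ a_{41} & a_{42} & a_{43} & a_{44} \end{bmatrix} \tag{4}$$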
In the inner product calculating circuit 10, three unit delay elements 11.sub.1, 11.sub.2 and 11.sub.3 are connected in series in the inverse order, and four latches 12.sub.1, 12.sub.2, 12.sub.3 and 12.sub.4 are respectively connected to the output end, the two junctions and the input end of the three unit delay elements 11.sub.1, 11.sub.2 and 11.sub.3. Multipliers 13.sub.1 to 13.sub.4 are connected to the latches 12.sub.1 to 12.sub.4, and coefficient ROMs (read only memories) 14.sub.1 to 14.sub.4 are respectively connected to the multipliers 13.sub.1 to 13.sub.4. Outputs of the respective multipliers 13.sub.1 to 13.sub.4 are connected to an adder 15. Thus, the inner product calculating circuit 10 is arranged as a finite impulse response (FIR) type transversal filter.
Similarly, the inner product calculating circuit 20 is arranged as an FIR type transversal filter. Its elements are denoted by the reference numerals of the corresponding elements of the circuit 10 with the tens digit replaced with 2, and therefore need not be described in detail. A coefficient b.sub.iL stored in ROMs 24.sub.1 to 24.sub.4 is different from a coefficient a.sub.1j of the ROMs 14.sub.1 to 14.sub.4.
The rearranging circuit 30 comprises a pair of RAMs 31 and 32 and change-over switches 33 and 34 on the input and output sides. The two switches 33 and 34 are switched in a ganged relation so that, while data is written into one RAM, data is read out from the other RAM. The capacity of the RAMs 31 and 32 is selected to be 16 words, corresponding to the matrix of 4 rows and 4 columns.
A matrix data multiplication according to the example of the prior art shown in FIG. 1 will be described with reference to FIGS. 2A to 2J.
From the input terminal IN, data of the input matrix [X] are supplied in units of 16 words, as shown in FIG. 2A, in the sequential order of the first column (x.sub.11, x.sub.21, x.sub.31, x.sub.41) to the fourth column (x.sub.14, x.sub.24, x.sub.34, x.sub.44).
At a timing point t.sub.1, at which a time 3T of 3 cycles has elapsed from an input start timing point t.sub.0 of the unit data, the first column data x.sub.11, x.sub.21 and x.sub.31 exist at the respective output ends of the unit delay elements 11.sub.1, 11.sub.2 and 11.sub.3, and the fourth data x.sub.41 exists at the input end of the delay element 11.sub.3.
Under this condition, a common enable pulse is supplied to the respective latches. The four data x.sub.11, x.sub.21, x.sub.31 and x.sub.41 of the first column are latched in the four latches 12.sub.1, 12.sub.2, 12.sub.3 and 12.sub.4 and held therein over a period of 4T from a timing point t.sub.2, which is later than the input start timing point t.sub.0 by a time 4T.
Coefficients a.sub.i1, a.sub.i2, a.sub.i3 and a.sub.i4 (i=1, 2, 3, 4) of the respective columns of the coefficient matrix [A] are stored in the ROMs 14.sub.1, 14.sub.2, 14.sub.3 and 14.sub.4. As shown in FIGS. 2C, 2E, 2G and 2J, these coefficients are sequentially supplied to the corresponding multipliers 13.sub.1, 13.sub.2, 13.sub.3 and 13.sub.4 at every cycle after the timing point t.sub.2, and are respectively multiplied by the data x.sub.i1 (i=1, 2, 3, 4) of the first column held in the corresponding latches 12.sub.1, 12.sub.2, 12.sub.3 and 12.sub.4.
That is, at the first, second, third and fourth cycles after the timing point t.sub.2, the coefficients a.sub.1j, a.sub.2j, a.sub.3j and a.sub.4j (j=1, 2, 3, 4) of the first, second, third and fourth rows of the coefficient matrix are multiplied by the first column data x.sub.11, x.sub.21, x.sub.31 and x.sub.41.
The adder 15 adds the outputs of the respective multipliers 13.sub.1 to 13.sub.4 and produces the first column data u.sub.11, u.sub.21, u.sub.31 and u.sub.41 of the product matrix [U] shown in the following equation (5) over the four cycles after the timing point t.sub.2.

$$[U] = [A]\,[X] \tag{5}$$
On the other hand, as shown in FIG. 2A, the input of the second column data x.sub.12, x.sub.22, x.sub.32 and x.sub.42 of the matrix [X] is started at the timing point t.sub.2. Similarly as described before, at a timing point t.sub.3, which is later than the timing point t.sub.2 by a time 4T, the second column data x.sub.12, x.sub.22, x.sub.32 and x.sub.42 are respectively latched in the latches 12.sub.1, 12.sub.2, 12.sub.3 and 12.sub.4. At every cycle after the timing point t.sub.3, the coefficients a.sub.i1, a.sub.i2, a.sub.i3 and a.sub.i4 (i=1, 2, 3, 4) of the respective columns of the matrix [A] are sequentially output from the ROMs 14.sub.1, 14.sub.2, 14.sub.3 and 14.sub.4 similarly as described above.
Similarly as described above, at four cycles after the timing point t.sub.3, second column data u.sub.12, u.sub.22, u.sub.32 and u.sub.42 of product matrix [U] shown in the equation (5) are obtained.
Similarly as described above, at four cycles after the timing point t.sub.4, third column data u.sub.13 to u.sub.43 of product matrix [U] are obtained and at four cycles after the next timing point t.sub.5, fourth column data u.sub.14 to u.sub.44 of product matrix [U] are obtained.
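The timing described above can be followed in a small cycle-level model. The Python sketch below is a simplified illustration, not the circuit itself: the function name `first_stage`, the integer test matrices and the cycle numbering (cycle 0 corresponds to t.sub.0, one cycle per period T) are assumptions introduced here. It shifts the column-ordered input through three unit delays, latches four words on every fourth cycle, and sums the four multiplier outputs while the ROM address advances.

```python
def first_stage(a, x_stream, size=4):
    """Cycle-level model of inner product calculating circuit 10.

    a[i][j] is the word that ROM 14_(j+1) outputs on the (i+1)-th cycle
    after a latch enable, i.e. each ROM holds one column of [A].
    Returns a list of (cycle, adder_output) pairs.
    """
    delay = [0] * (size - 1)   # three unit delay elements, oldest word first
    latches = None             # contents of latches 12_1..12_4, None until enabled
    rom_address = 0
    outputs = []
    for t in range(len(x_stream) + size):      # extra cycles flush the last column
        if latches is not None:
            # Four multipliers feed the adder 15 on every cycle.
            total = sum(a[rom_address][j] * latches[j] for j in range(size))
            outputs.append((t, total))
            rom_address += 1
            if rom_address == size:
                latches = None
        word = x_stream[t] if t < len(x_stream) else 0
        if t % size == size - 1 and t < len(x_stream):
            # Common enable pulse: three delayed words plus the current input.
            latches = delay + [word]
            rom_address = 0
        delay = delay[1:] + [word]
    return outputs

# Hypothetical 4 x 4 integer matrices, chosen only for the check below.
a = [[1, 2, 0, 1], [0, 1, 3, 0], [2, 0, 1, 1], [1, 1, 1, 1]]
x = [[1, 0, 2, 1], [3, 1, 0, 2], [0, 2, 1, 1], [1, 1, 1, 0]]
x_stream = [x[r][c] for c in range(4) for r in range(4)]   # column order
results = first_stage(a, x_stream)
cycles = [t for t, _ in results]
u_stream = [v for _, v in results]
expected = [sum(a[i][j] * x[j][c] for j in range(4))
            for c in range(4) for i in range(4)]            # [U] in column order
print(cycles[0], u_stream == expected)
```

In this model the first output word appears four cycles after the start of input, and a new column of [U] follows every four cycles, which mirrors the sequence t.sub.2, t.sub.3, t.sub.4, t.sub.5 in the description.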
Data of 16 words of the matrix [U] thus obtained, arranged in the order of columns, are alternately written in the RAMs 31 and 32 of the rearranging circuit 30. By changing the write address and the read address, the data of the matrix [U] are alternately read out from the RAMs 31 and 32 in the order of rows and supplied to the second inner product calculating circuit 20, in which they are multiplied by the second coefficient matrix [B] in exactly the same way as described above. Therefore, data of the product matrix [Y] expressed by the following equation (6) are developed at an output terminal OUT.

$$[Y] = [U]\,[B] = [A]\,[X]\,[B] \tag{6}$$
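The complete data flow of FIG. 1 can be checked with a short data-flow sketch in Python. It ignores cycle timing and the alternation between the two RAMs, and models only the ordering of the words. All names (`inner_product_stage`, `rearrange`, `multiply_pipeline`) are hypothetical, and the assumption that each ROM of the second circuit holds one row of [B] is inferred from the statement that the second multiplication proceeds in the same way as the first.

```python
def inner_product_stage(roms, stream, size=4):
    """Latch `size` words at a time and emit one inner product per cycle.

    roms[j][cycle] is the coefficient fed to multiplier j on that cycle.
    """
    out = []
    for start in range(0, len(stream), size):
        latched = stream[start:start + size]
        for cycle in range(size):
            out.append(sum(roms[j][cycle] * latched[j] for j in range(size)))
    return out

def rearrange(words, size=4):
    """Written in column order, read back in row order (address exchange)."""
    return [words[col * size + row] for row in range(size) for col in range(size)]

def multiply_pipeline(a, x, b):
    size = len(x)
    x_stream = [x[r][c] for c in range(size) for r in range(size)]  # column order
    roms_a = [[a[i][j] for i in range(size)] for j in range(size)]  # ROM j: column j of [A]
    u_columns = inner_product_stage(roms_a, x_stream, size)         # [U], column order
    u_rows = rearrange(u_columns, size)                             # [U], row order
    roms_b = [list(row) for row in b]                               # ROM l: row l of [B] (assumed)
    y_rows = inner_product_stage(roms_b, u_rows, size)              # [Y], row order
    return [y_rows[r * size:(r + 1) * size] for r in range(size)]

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

# Hypothetical integer matrices for a direct comparison with [A][X][B].
a = [[1, 2, 0, 1], [0, 1, 3, 0], [2, 0, 1, 1], [1, 1, 1, 1]]
x = [[1, 0, 2, 1], [3, 1, 0, 2], [0, 2, 1, 1], [1, 1, 1, 0]]
b = [[2, 1, 0, 0], [0, 1, 1, 0], [1, 0, 2, 1], [0, 3, 0, 1]]
y = multiply_pipeline(a, x, b)
print(y == matmul(matmul(a, x), b))
```

The rearranging step is what lets the second circuit reuse the structure of the first: it receives [U] one row at a time, so each latched group of four words forms one row of [Y] over the following four cycles.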
When the matrix is formed of 8 rows and 8 columns, constant matrix [N] of the equation (1) is expressed by the following equation (7): ##EQU2##
As shown in FIG. 3, each of elements a to n represents cosine of a predetermined angle where angle .pi./16 is taken as the unit.
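The statement that every element is the cosine of a multiple of .pi./16 can be verified numerically. The Python sketch below rebuilds the 8-degree matrix from the definition given earlier and, for each element, searches for the integer m such that the element equals cos(m.pi./16). The helper names are hypothetical, and the sketch does not attempt to reproduce the letter assignment a to n of FIG. 3, which is not recoverable from the text.

```python
import math

def dct_matrix(n):
    """Matrix [N]: first row 1/sqrt(2), row k holds cos((2x+1)*k*pi/(2n))."""
    first = [1.0 / math.sqrt(2.0)] * n
    rest = [[math.cos((2 * x + 1) * k * math.pi / (2 * n)) for x in range(n)]
            for k in range(1, n)]
    return [first] + rest

def angle_units(value, units=16):
    """Return m in 0..units with value == cos(m*pi/units), or None if absent."""
    for m in range(units + 1):
        if abs(value - math.cos(m * math.pi / units)) < 1e-12:
            return m
    return None

matrix = dct_matrix(8)
units = [[angle_units(v) for v in row] for row in matrix]

# Angles m and 16 - m give the same magnitude with opposite sign.
magnitudes = sorted({min(m, 16 - m) for row in units for m in row})
print(all(m is not None for row in units for m in row), magnitudes)
```

Note that the first row also fits this pattern, since 1/.sqroot.2 equals cos(4.pi./16).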
As is clear from the equation (1) which defines the DCT and IDCT, each element y.sub.ij of the matrix [Y] is expressed as a linear combination of the elements x.sub.ij of the matrix [X].
Accordingly, as shown in FIG. 4, a relation expressed by the following equation (8) is established between a 64-degree vector [Xc], formed by arranging the elements x.sub.11 to x.sub.88 of 8 rows and 8 columns in the order of columns, and a 64-degree vector [Yc], formed by arranging the elements y.sub.11 to y.sub.88 of 8 rows and 8 columns in the order of columns.

$$[Yc] = [M]\,[Xc] \tag{8}$$
where [M] is the constant matrix of 64 rows and 64 columns.
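One way to see where [M] comes from is to build it explicitly. The Python sketch below is a hedged illustration: it assumes column stacking for [Xc] and [Yc] as described above, in which case [M] is the Kronecker product of [N] with itself. The names `kron` and `stack_columns` and the test block are hypothetical. The sketch compares the direct 64-degree product of equation (8) with the two-pass form of equation (1).

```python
import math

def dct_matrix(n):
    """Matrix [N]: first row 1/sqrt(2), row k holds cos((2x+1)*k*pi/(2n))."""
    first = [1.0 / math.sqrt(2.0)] * n
    rest = [[math.cos((2 * x + 1) * k * math.pi / (2 * n)) for x in range(n)]
            for k in range(1, n)]
    return [first] + rest

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(p * q for p, q in zip(row, col)) for col in bt] for row in a]

def kron(p, q):
    """Kronecker product: entry (i*len(q)+k, j*len(q)+l) is p[i][j]*q[k][l]."""
    return [[p[i][j] * q[k][l]
             for j in range(len(p[0])) for l in range(len(q[0]))]
            for i in range(len(p)) for k in range(len(q))]

def stack_columns(m):
    """Arrange the elements of a matrix in the order of columns."""
    return [m[r][c] for c in range(len(m[0])) for r in range(len(m))]

n8 = dct_matrix(8)
big_m = kron(n8, n8)                       # constant matrix of 64 rows and 64 columns

# Hypothetical 8 x 8 block used only for the comparison.
block = [[float((r * 8 + c) % 11) for c in range(8)] for r in range(8)]
xc = stack_columns(block)

yc_direct = [sum(m * v for m, v in zip(row, xc)) for row in big_m]       # equation (8)
yc_two_pass = stack_columns(matmul(matmul(n8, block), transpose(n8)))    # equation (1)

max_diff = max(abs(p - q) for p, q in zip(yc_direct, yc_two_pass))
print(len(big_m), len(big_m[0]), max_diff < 1e-9)
```

Evaluated directly, the 64-degree product takes 64 x 64 = 4096 multiplications per block, whereas the two successive 8-degree passes take 2 x 8 x 8 x 8 = 1024.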
When the conventional matrix data multiplying apparatus performs the calculation shown in equation (8), the data are calculated simultaneously by using a 64-degree inner product calculating circuit. Consequently, the circuit scale is increased considerably and the circuit arrangement becomes complicated. Also, the number of calculations is increased, and hence the calculation speed is limited.