Generally, in data communication, data to be transmitted is compressed at a transmitter, and the compressed data is restored to its original format at a receiver. Such compression typically relies on a transform, and the most widely known transform for this purpose is the two-dimensional discrete cosine transform (2-D DCT). For instance, for a given 2-D data sequence [X.sub.ij : i, j=0, 1, 2, . . . , N-1], the 2-D DCT sequence [Y.sub.mn : m, n=0, 1, 2, . . . , N-1] is given by ##EQU1## where a scale factor ##EQU2## may be neglected for convenience. Then, a denormalized 2-D DCT form y.sub.mn of the 2-D DCT sequence is defined as ##EQU3##
Accordingly, it follows from Formula (2') above that the denormalized 2-D DCT sequence y.sub.mn can be expressed in terms of N 1-D DCTs by performing ##EQU4## in the row direction, and thereafter performing ##EQU5## in the column direction.
That is, from Formula (2'), the 2-D DCT can be expressed by Formula (3') as shown below. ##EQU6##
Therefore, in order to implement a DCT for N.times.N 2-D digital data based on Formulas (2') and (3'), a system constituted as illustrated in FIG. 1 may be used. That is, the 1-D DCT is performed N times in the row direction on the N.times.N 2-D digital data input, and the resulting outputs are then transposed as a matrix by a matrix transposer 2. The outputs of the matrix transposer 2 are then subjected to N 1-D DCTs in the column direction, thereby obtaining the 2-D DCT output Y.sub.mn.
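The row-column arrangement of FIG. 1 can be sketched in software as follows. This is a minimal illustration rather than the circuit itself: it assumes the usual unnormalized DCT-II kernel cos((2i+1)m.pi./2N), since the exact formulas are not reproduced here, and the function names (`dct_1d`, `transpose`, `dct_2d_row_column`, `dct_2d_direct`) are hypothetical. The direct four-index sum is included only as a reference to check the row-column result against.

```python
import math


def dct_1d(x):
    """Unnormalized 1-D DCT (assumed DCT-II kernel, scale factor neglected)."""
    n = len(x)
    return [
        sum(x[i] * math.cos((2 * i + 1) * m * math.pi / (2 * n)) for i in range(n))
        for m in range(n)
    ]


def transpose(matrix):
    """Software stand-in for the matrix transposer of FIG. 1."""
    return [list(column) for column in zip(*matrix)]


def dct_2d_row_column(block):
    """N row-direction 1-D DCTs, a transpose, then N column-direction 1-D DCTs."""
    row_pass = [dct_1d(row) for row in block]            # indexed [i][n]
    col_pass = [dct_1d(row) for row in transpose(row_pass)]  # indexed [n][m]
    return transpose(col_pass)                           # indexed [m][n]


def dct_2d_direct(block):
    """Reference: direct double sum over the same assumed kernel."""
    size = len(block)
    c = [
        [math.cos((2 * i + 1) * m * math.pi / (2 * size)) for i in range(size)]
        for m in range(size)
    ]
    return [
        [
            sum(block[i][j] * c[m][i] * c[n][j]
                for i in range(size) for j in range(size))
            for n in range(size)
        ]
        for m in range(size)
    ]


if __name__ == "__main__":
    sample = [[float((3 * i + 5 * j) % 7) for j in range(4)] for i in range(4)]
    a = dct_2d_row_column(sample)
    b = dct_2d_direct(sample)
    # Largest difference between the two methods (expected to be rounding noise).
    print(max(abs(a[m][n] - b[m][n]) for m in range(4) for n in range(4)))
```

Note that the sketch calls `dct_1d` 2N times and needs the full intermediate matrix before the second pass can start, which mirrors the role of the matrix transposer 2.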
However, when a 2-D DCT is implemented for N.times.N 2-D digital data by the method described above, the DCT computing time is long because 2N 1-D DCTs must be performed, and the hardware becomes complicated, making it difficult to realize a high-density VLSI. That is, a 1-D DCT circuit generally comprises a number of adders and multipliers, so a circuit that includes a large number of 1-D DCTs requires a great number of multipliers. It follows that if a 2-D DCT circuit includes a large number of 1-D DCTs, the computing time increases and the hardware becomes complicated. Furthermore, since the input digital data is subjected to the 1-D DCT in the row direction, and the output is transposed as a matrix so that the 1-D DCT can be performed on the transposed data in the column direction, it is very difficult to implement the matrix transposer in hardware.
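To make the cost argument concrete, the short sketch below counts the 1-D DCTs and the multiplications used by the row-column method. The 2N count comes from the text; the per-transform multiplication count is an assumption (a direct N-point matrix-vector product using N.times.N multiplications, chosen only for illustration), and the function names are hypothetical. A faster 1-D algorithm can be modeled by passing its own multiplication count.

```python
def row_column_1d_dct_count(n):
    """Number of 1-D DCTs in the row-column method: N for rows plus N for columns."""
    return 2 * n


def row_column_multiplications(n, mults_per_1d_dct=None):
    """Total multiplications for the row-column method.

    mults_per_1d_dct is an assumption supplied by the caller; if omitted,
    a direct N-point matrix-vector product (N * N multiplications) is assumed.
    """
    if mults_per_1d_dct is None:
        mults_per_1d_dct = n * n
    return row_column_1d_dct_count(n) * mults_per_1d_dct


if __name__ == "__main__":
    for n in (4, 8, 16):
        print(n, row_column_1d_dct_count(n), row_column_multiplications(n))
```

Under the direct-product assumption the total grows as 2N.sup.3, which gives a rough sense of why the multiplier count dominates the hardware cost as N increases.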
To perform real-time data compression on a large amount of data, a fast DCT scheme is required, preferably one with a parallel structure, a fast transform speed, and low complexity. Various methods have been proposed in attempts to overcome the above-described problems inherent in the method of FIG. 1. Examples are disclosed in "A TWO-DIMENSIONAL FAST COSINE TRANSFORM" by M. A. Haque [IEEE Trans. Acoust., Speech, Signal Processing, Vol. ASSP-33, pp. 1532-1539, Dec. 1985], and "A FAST RECURSIVE TWO-DIMENSIONAL COSINE TRANSFORM" by C. Ma [Intelligent Robots and Computer Vision: Seventh in a Series, David P. Casasent, Editor, Proc. SPIE 1002, pp. 541-548, 1988].
The above publications propose several ways to carry out the 2-D DCT without using the 1-D DCT. Consequently, performing the 2-D DCT by these methods requires dedicated hardware that is separate from any 1-D DCT circuit. These methods reduce the number of multiplications by about 25 percent compared with the conventional methods. Nevertheless, the DCT computing time is still long, which poses a problem in implementing a fast DCT.