1. Field of the Invention
The present invention relates to a two-dimensional array transposition circuit reading a two-dimensional array in an order different from that for writing. The invention more specifically relates to a two-dimensional array transposition circuit having a small circuit scale and small power consumption.
2. Description of the Background Art
A high-efficiency coding system for image data such as MPEG (Moving Picture Experts Group) 2, which is currently under development, is based on a transform coding technique using the two-dimensional discrete cosine transform (DCT). Transform coding is a coding system that reduces the spatial redundancy of image data. The spatial redundancy is reduced by transforming the image data onto a spatial-frequency axis by means of an orthogonal transformation and coding only the components in which energy is concentrated.
Examples of the orthogonal transformation include, in addition to the two-dimensional DCT, the two-dimensional fast Fourier transform (FFT) employed for filtering images, the Hadamard transform, which allows simpler hardware, and the Karhunen-Loeve transform (K-L transform), which offers highly efficient coding but requires a more complex operation process than the two-dimensional DCT.
The two-dimensional DCT is employed below as a representative orthogonal transformation and a method of implementing the two-dimensional DCT for image data is described. A similar configuration can be used for implementing the two-dimensional FFT or the like.
Image data is divided into blocks each formed of N×N pixels. The two-dimensional DCT is performed on each block. The two-dimensional DCT is given by equation (1). ##EQU1##
In equation (1), f(i, j) (i, j = 0, 1, ..., N-1) is the original image signal, and F(u, v) (u, v = 0, 1, ..., N-1) is a coefficient obtained by the transformation. As equation (1) shows, the operation performed for the two-dimensional DCT is essentially a product-sum operation. Implementing a two-dimensional DCT circuit as a large-scale integrated circuit (LSI) was long considered difficult because a multiplier requires a significant amount of hardware; however, improvements in microlithography and the study of fast algorithms have made LSI implementation possible.
However, executing equation (1) exactly as written is still difficult in terms of circuit scale. A generally employed method is to divide the two-dimensional DCT into a row-direction one-dimensional DCT and a column-direction one-dimensional DCT.
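For reference, a commonly used form of the N×N two-dimensional DCT is sketched below using the symbols defined above. The normalization factor 2/N and the scaling terms C(u) and C(v) follow the usual DCT-II convention and are assumptions here; the scaling used in equation (1) may differ.

```latex
F(u,v) = \frac{2}{N}\, C(u)\, C(v)
         \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} f(i,j)\,
         \cos\frac{(2i+1)u\pi}{2N}\,
         \cos\frac{(2j+1)v\pi}{2N},
\qquad
C(w) = \begin{cases} 1/\sqrt{2}, & w = 0 \\ 1, & w \neq 0 \end{cases}
```

In this form each of the N² coefficients F(u, v) is a sum of N² products, which is why the computation maps onto multiply-accumulate hardware.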
Referring to FIG. 1, a conventional two-dimensional DCT device includes: a row-direction one-dimensional DCT circuit 100 receiving input data and performing the one-dimensional DCT in the row direction; a two-dimensional array transposition circuit (hereinafter referred to as "transposition memory circuit") 101 receiving a two-dimensional array formed of the one-dimensional arrays supplied from row-direction one-dimensional DCT circuit 100, and transposing the two-dimensional array for output; and a column-direction one-dimensional DCT circuit 102 successively receiving the one-dimensional arrays constituting the two-dimensional array supplied from transposition memory circuit 101 and performing the one-dimensional DCT in the column direction to output the transformation coefficients obtained by the two-dimensional DCT.
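The three-stage arrangement of FIG. 1 can be illustrated with a minimal software sketch. The function names below are hypothetical, and the orthonormal DCT-II scaling is an assumption rather than something stated in the text. The sketch applies a one-dimensional DCT to each row, transposes the result, applies the same one-dimensional DCT again, and compares the outcome with a direct double-sum evaluation.

```python
import math

def dct_1d(x):
    """One-dimensional DCT-II of a sequence (orthonormal scaling, assumed)."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        acc = sum(x[i] * math.cos((2 * i + 1) * k * math.pi / (2 * n))
                  for i in range(n))
        out.append(scale * acc)
    return out

def transpose(block):
    """Exchange the rows and columns of a square block."""
    return [list(col) for col in zip(*block)]

def dct_2d_row_column(block):
    """Row-direction DCT, transposition, then the DCT on the former columns."""
    stage1 = [dct_1d(row) for row in block]              # role of circuit 100
    stage2 = [dct_1d(row) for row in transpose(stage1)]  # roles of circuits 101 and 102
    # stage2[v][u] holds F(u, v); transpose once more to index it as F[u][v].
    return transpose(stage2)

def dct_2d_direct(block):
    """Direct double-sum evaluation, used only as a reference."""
    n = len(block)

    def c(w):
        return 1.0 / math.sqrt(2.0) if w == 0 else 1.0

    result = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            acc = 0.0
            for i in range(n):
                for j in range(n):
                    acc += (block[i][j]
                            * math.cos((2 * i + 1) * u * math.pi / (2 * n))
                            * math.cos((2 * j + 1) * v * math.pi / (2 * n)))
            result[u][v] = (2.0 / n) * c(u) * c(v) * acc
    return result
```

The final transposition in `dct_2d_row_column` exists only so that the result can be compared element by element with the direct form; in the hardware pipeline the coefficients are simply output in the order produced by circuit 102.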
Referring to FIG. 2, transposition memory circuit 101 includes: two memory cell arrays 106 and 107 each formed of N×N pixels; a switch 108 for switching input/output to/from memory cell arrays 106 and 107 by an external control signal; and address translation circuits 109 and 110 for translating an address signal into a read/write address for memory cell arrays 106 and 107.
Methods of implementing the one-dimensional DCT are briefly described first. They can be roughly divided into two types. The first is to perform the operation according to the defining equation of the one-dimensional DCT. According to this method, if image data is divided into blocks each formed of 8×8 pixels, for example, either a parallel operation using eight multipliers or a serial operation using one multiplier is performed.
The second method uses a fast algorithm. One example is the algorithm by Chen, in which the number of multiplications is reduced by 50% by exploiting the symmetry of the coefficient matrix of the product-sum operation. One example of an implementation of the one-dimensional DCT using Chen's algorithm is disclosed in S. Uramoto et al., "A 100-MHz 2-D Discrete Cosine Transform Core Processor," IEEE Journal of Solid-State Circuits, vol. 27, No. 4, April 1992.
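The two approaches can be contrasted with a short sketch. It is not Chen's complete flow graph; it shows only the symmetry step, under the assumption that the coefficient matrix is that of the DCT-II, for which the coefficient at input index N-1-n equals (-1)^k times the coefficient at index n. Folding the inputs into sums and differences first halves the number of multiplications. The function names and the multiplication counters are hypothetical and serve illustration only.

```python
import math

def dct_coeff(k, n, size):
    """DCT-II coefficient for output k and input n (orthonormal scaling, assumed)."""
    scale = math.sqrt(1.0 / size) if k == 0 else math.sqrt(2.0 / size)
    return scale * math.cos((2 * n + 1) * k * math.pi / (2 * size))

def dct_1d_direct(x):
    """Defining equation: N multiplications per output, N*N in total."""
    size = len(x)
    mults = 0
    out = []
    for k in range(size):
        acc = 0.0
        for n in range(size):
            acc += dct_coeff(k, n, size) * x[n]
            mults += 1
        out.append(acc)
    return out, mults

def dct_1d_folded(x):
    """Symmetry step only: fold the inputs, then N/2 multiplications per output.

    Requires an even block length.
    """
    size = len(x)
    half = size // 2
    sums = [x[n] + x[size - 1 - n] for n in range(half)]   # feeds even-indexed outputs
    diffs = [x[n] - x[size - 1 - n] for n in range(half)]  # feeds odd-indexed outputs
    mults = 0
    out = []
    for k in range(size):
        folded = sums if k % 2 == 0 else diffs
        acc = 0.0
        for n in range(half):
            acc += dct_coeff(k, n, size) * folded[n]
            mults += 1
        out.append(acc)
    return out, mults
```

For a block length of 8, the direct form performs 64 multiplications and the folded form performs 32, at the cost of 8 extra additions or subtractions for the folding.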
The operation of transposition memory circuit 101 is described next. Transposition memory circuit 101 receives the output array produced by the row-direction one-dimensional DCT in circuit 100 and outputs it as the input array for the column-direction one-dimensional DCT in circuit 102. The input array required by circuit 102 is obtained by writing the output array from circuit 100 into memory cell arrays 106 and 107 as a two-dimensional array, transposing the written two-dimensional array, and reading the result. Transposition memory circuit 101 performs this write, transposition, and read, and supplies the resulting array to circuit 102.
Suppose that N=4 for the N×N data block, and that the output array from circuit 100 is {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15}, with the elements output successively starting from the leftmost element. As shown in FIG. 3A, data are supplied to memory cell arrays 106/107 in the order shown by the arrow 104. FIG. 3B shows the two-dimensional array of the data in memory cell arrays 106/107 immediately after the data are supplied.
On the other hand, the input array required when circuit 102 performs the column-direction DCT is obtained by exchanging the rows and columns of the two-dimensional array and reading the result, that is, {0, 4, 8, 12, 1, 5, 9, 13, 2, 6, 10, 14, 3, 7, 11, 15}.
The array above is obtained by reading data from memory cell arrays 106/107 in the order of the arrow 105 as shown in FIG. 3C.
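The N=4 example can be reproduced with a few lines of code. This is a minimal sketch that assumes the write order of arrow 104 fills the array row by row and the read order of arrow 105 scans it column by column; the function names are hypothetical.

```python
def write_row_priority(stream, n):
    """Fill an n-by-n array row by row (assumed order of arrow 104)."""
    return [list(stream[r * n:(r + 1) * n]) for r in range(n)]

def read_column_priority(array):
    """Read the array column by column (assumed order of arrow 105)."""
    n = len(array)
    return [array[r][c] for c in range(n) for r in range(n)]

stored = write_row_priority(list(range(16)), 4)
transposed_stream = read_column_priority(stored)
print(transposed_stream)
# prints [0, 4, 8, 12, 1, 5, 9, 13, 2, 6, 10, 14, 3, 7, 11, 15]
```

The printed sequence matches the input array required by circuit 102 in the example above.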
If switch 108 is in the state shown in FIG. 2, transposition memory circuit 101 receives an output from row-direction one-dimensional DCT circuit 100 (pre-stage DCT output) and writes it into memory cell array 106. At the same time, transposition memory circuit 101 reads data from memory cell array 107 as an input to column-direction one-dimensional DCT circuit 102 (post-stage DCT input). In this case, if the address of each memory cell is as shown in FIG. 4, the address sequence supplied by address translation circuit 109 to memory cell array 106 is the following, in which priority is given to the row direction (row direction priority address): {0, 1, 2, ..., N, N+1, N+2, ..., N²-2, N²-1}
On the other hand, the address sequence supplied by address translation circuit 110 to memory cell array 107 is the following, in which priority is given to the column direction (column direction priority address): {0, N, 2N, ..., 1, N+1, 2N+1, ..., N²-N-1, N²-1}
In transposition memory circuit 101, when writing of the data corresponding to one block formed of N×N pixels into one memory cell array (memory cell array 106 in FIG. 2) and reading of data from the other memory cell array (memory cell array 107 in FIG. 2) are both completed, a control signal synchronized with each data block (one data block consists of N² data) causes switch 108 to change its state, and the operations carried out in memory cell arrays 106 and 107 are exchanged.
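The two address sequences can be generated as follows. This sketch assumes that FIG. 4 numbers the memory cells row by row starting from 0, so that the cell in row r and column c has address r*N + c; the function names are hypothetical.

```python
def row_priority_addresses(n):
    """Addresses 0, 1, 2, ..., n*n - 1: consecutive cells along each row."""
    return list(range(n * n))

def column_priority_addresses(n):
    """Addresses 0, n, 2n, ..., 1, n + 1, ...: consecutive cells down each column."""
    return [row * n + col for col in range(n) for row in range(n)]

def transpose_via_addresses(stream, n):
    """Write a block with row priority addresses, read it with column priority ones."""
    memory = [None] * (n * n)
    for address, value in zip(row_priority_addresses(n), stream):
        memory[address] = value
    return [memory[address] for address in column_priority_addresses(n)]
```

For any N, the last two entries of the column priority sequence are N²-N-1 and N²-1, in agreement with the sequence given above.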
Specifically, the pre-stage DCT output is written into memory cell array 107, and the post-stage DCT input is read from memory cell array 106. At this time, an address supplied to each memory cell array is a column direction priority address for memory cell array 106 and a row direction priority address for memory cell array 107.
As described above, the data of the preceding block, written into memory cell array 106 with a row direction priority address, is read with a column direction priority address while the data of the current block is being written into memory cell array 107, so that the transposition operation is achieved.
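The alternating use of the two memory cell arrays can be modeled behaviorally as below. This is a sketch under stated assumptions, not the circuit itself: the class name, the representation of switch 108 as a single index, and the row-by-row cell numbering are all hypothetical choices made for illustration.

```python
class PingPongTransposer:
    """Behavioral model (assumed) of the conventional two-array transposition memory."""

    def __init__(self, n):
        self.n = n
        # Two banks standing in for memory cell arrays 106 and 107.
        self.banks = [[None] * (n * n), [None] * (n * n)]
        self.write_bank = 0  # stands in for the state of switch 108

    def process_block(self, block_stream):
        """Write one block while reading the previous block in transposed order."""
        n = self.n
        write_mem = self.banks[self.write_bank]
        read_mem = self.banks[1 - self.write_bank]
        output = []
        for t in range(n * n):
            row_priority = t                        # 0, 1, 2, ...
            column_priority = (t % n) * n + t // n  # 0, n, 2n, ..., 1, n + 1, ...
            output.append(read_mem[column_priority])
            write_mem[row_priority] = block_stream[t]
        self.write_bank = 1 - self.write_bank       # switch changes state per block
        return output
```

Calling `process_block` for the first block returns only empty entries, since nothing has been stored yet; each later call returns the previous block in transposed order. The model makes the cost visible: two full N×N banks are needed, and both are accessed in every cycle.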
Although such a transposition memory circuit for a square two-dimensional array is disclosed in Japanese Patent Laying-Open No. 6-223099, a specific example of a configuration of the transposition memory circuit is not described in the publication.
However, the configuration of the conventional transposition memory circuit 101 requires two memory cell arrays 106 and 107, resulting in a large circuit scale.
In addition, an important objective of current LSI development is to reduce power consumption. One approach to reducing power consumption is to lower the supply voltage or the like. Another approach is to reduce the number of signal transitions.
In the configuration of the conventional transposition memory circuit 101, data are read from and written into the two memory cell arrays 106 and 107 independently of each other, so the number of signal transitions increases, making it difficult to reduce power consumption.