The two-dimensional discrete cosine transform is a pair of mathematical equations that transforms one N×N array of numbers to or from another N×N array of numbers. The first array typically represents an N×N array of spatially determined pixel values which form the digital image. The second array is an array of discrete cosine transform coefficients which represent the image in the frequency domain. This method of representing the image by the coefficients of its frequency components is a special case of the discrete Fourier transform. The discrete Fourier transform is the discrete version of the classic mathematical Fourier transform wherein any periodic waveform may be expressed as a sum of sine and cosine waves of different frequencies and amplitudes. The discrete cosine transform, like the Fourier transform, is thus a transform which transforms a signal from the time domain, or the spatial domain in the case of an image, into the frequency domain and vice versa.
It is well known in the art to use these discrete cosine transform coefficients for image compression. In combination with other techniques, such as color subsampling, quantization, Huffman coding and run-length coding, the discrete cosine transform can compress a digital color image by a factor of approximately thirty to one with virtually no noticeable image degradation. Because of its usefulness in image compression, the discrete cosine transform is an integral part of several image compression standards used by international standards committees such as the International Standards Organization.
There are two basic discrete cosine transform equations. The first basic equation is the forward discrete cosine transform which transforms the pixel values into the discrete cosine transform coefficients. The second basic equation is the inverse discrete cosine transform which transforms the discrete cosine transform coefficients back into pixel values. Most applications of the discrete cosine transform for images use eight-by-eight arrays wherein N therefore has a value of eight. Assuming then that N has the value of eight when performing the transforms, where p(i,j) are the values of the pixel array and f(u,v) are the values of the discrete cosine transform coefficients, the equations of the discrete cosine transforms are as follows.
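The forward discrete cosine transform may be expressed as:
$$f(u,v) = \frac{1}{4}\,C(u)\,C(v)\sum_{i=0}^{7}\sum_{j=0}^{7} p(i,j)\,\mathrm{cosval}(i,u)\,\mathrm{cosval}(j,v) \tag{1}$$
The inverse discrete cosine transform may be expressed as:
$$p(i,j) = \frac{1}{4}\sum_{u=0}^{7}\sum_{v=0}^{7} C(u)\,C(v)\,f(u,v)\,\mathrm{cosval}(i,u)\,\mathrm{cosval}(j,v) \tag{2}$$
where $\mathrm{cosval}(i,u) = \cos\left((2i+1)u\pi/16\right)$, $C(0) = 1/\sqrt{2}$, and $C(u) = 1$ for $u > 0$.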
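The two equations can be sketched directly in code. The following is a minimal, unoptimized illustration, assuming the usual eight-by-eight normalization (a leading factor of 1/4, C(0) = 1/√2, and C(u) = 1 otherwise). The function names `forward_dct` and `inverse_dct` and the list-of-lists layout `p[i][j]`, `f[u][v]` are illustrative choices and are not taken from the description.

```python
import math

N = 8  # block size assumed throughout this description

def c(k):
    """Normalization factor C(k): 1/sqrt(2) for k == 0, otherwise 1."""
    return 1.0 / math.sqrt(2.0) if k == 0 else 1.0

def cosval(i, u):
    """Basis value cos((2i+1) * u * pi / 16) shared by both transforms."""
    return math.cos((2 * i + 1) * u * math.pi / 16.0)

def forward_dct(p):
    """Equation (1): pixel array p[i][j] -> coefficient array f[u][v]."""
    return [[0.25 * c(u) * c(v) * sum(p[i][j] * cosval(i, u) * cosval(j, v)
                                      for i in range(N) for j in range(N))
             for v in range(N)]
            for u in range(N)]

def inverse_dct(f):
    """Equation (2): coefficient array f[u][v] -> pixel array p[i][j]."""
    return [[0.25 * sum(c(u) * c(v) * f[u][v] * cosval(i, u) * cosval(j, v)
                        for u in range(N) for v in range(N))
             for j in range(N)]
            for i in range(N)]
```

Applying `inverse_dct` to the output of `forward_dct` should return the original block to within floating-point rounding error.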
The forward and reverse discrete cosine transform operations of Equation (1) and Equation (2) may be represented by the diagram shown in prior art FIG. 1. Forward discrete cosine transform 12 of Equation (1) transforms eight-by-eight array 10 of pixel values in the spatial domain to an eight-by-eight array 16 of discrete cosine transform coefficients in the frequency domain. Inverse discrete cosine transform 14 performs the reverse transformation. Discrete cosine transform coefficient array 16 represents the frequency content of the p(i,j) function, separately, in both the horizontal and vertical directions. Since discrete cosine transform 12 operates on two-dimensional array 10, it is referred to as two-dimensional discrete cosine transform 12. In the result of two-dimensional transform 12 the horizontal dimension of coefficient array 16 represents the horizontal frequency and the vertical dimension of coefficient array 16 represents the vertical frequency.
In f(u,v) array 16 or frequency domain array 16, frequencies increase with increasing value of the (u,v) indices. Discrete cosine transform coefficient 17 in the upper left corner of array 16 corresponds to zero frequency in both horizontal and vertical directions, and is referred to as the DC term of frequency domain array 16. From forward discrete cosine transform Equation (1), it can be seen that f(0,0) coefficient 17 is equal to the average of all sixty-four p(i,j) pixel values, multiplied by a scaling factor. It is well known in the art that f(0,0) coefficient 17 or DC term 17 has this characteristic. The other sixty-three f(u,v) coefficients within discrete cosine transform coefficient array 16 represent amplitudes of cosine waves of increasing frequency and are therefore AC coefficients.
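As a worked check of the DC-term statement, assume Equation (1) has the usual eight-by-eight form with a leading factor of 1/4 and C(0) = 1/√2 (an assumption, consistent with the (1/4) factor and the C(u) constants referred to in this description). Setting u = v = 0 makes every cosine factor equal to one, so that:

$$f(0,0) = \frac{1}{4}\cdot\frac{1}{\sqrt{2}}\cdot\frac{1}{\sqrt{2}}\sum_{i=0}^{7}\sum_{j=0}^{7} p(i,j) = \frac{1}{8}\sum_{i=0}^{7}\sum_{j=0}^{7} p(i,j) = 8\,\bar{p}$$

where $\bar{p}$ is the average of the sixty-four pixel values. Under this normalization the scaling factor is therefore eight.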
Referring now to prior art FIGS. 2A-D, there are shown eight-by-eight spatial domain arrays 30, 32, 34, and 36 of pixels and their corresponding frequency domain discrete cosine transform coefficient arrays 38, 40, 42 and 44. Darker squares in spatial domain arrays 30, 32, 34, 36 represent darker pixels. It will be understood that the relative darkness of squares in arrays 30, 32, 34, 36, 38, 40, 42, 44 is represented by the density of stippling. Darker squares in frequency domain arrays 38, 40, 42, 44 represent smaller discrete cosine transform coefficients and lighter squares represent larger coefficients.
The image represented by spatial domain array 30 is thus completely flat since all squares in array 30 are the same level of darkness. Therefore discrete cosine transform array 38, corresponding to spatial domain array 30, contains only a single DC term 38a. All other sixty-three coefficients of discrete cosine transform array 38 are zero and are thus represented as black with the highest density of stippling.
The image represented by spatial domain array 32, however, has a slowly varying horizontal gradient. Therefore, discrete cosine transform array 40 or frequency domain array 40, corresponding to spatial domain array 32, contains not only DC term 40a but also a small number of low-frequency horizontal frequency terms 40b,c. The high-frequency horizontal terms and all vertical frequency terms of discrete cosine transform array 40, however, are zero.
The image represented by spatial domain array 34 contains sharp horizontal edge 34a. Sharp horizontal edge 34a causes all eight vertical frequency bands represented by squares 42a-h of discrete cosine transform coefficient array 42 to indicate energy. This is because horizontal edge 34a of spatial domain array 34 is encountered when tracing vertically through the image represented by array 34, and all eight cosine waves represented by squares 42a-h are needed to produce a sharp edge. Thus, after forward discrete cosine transform 12 is performed, all eight terms 42a-h of corresponding discrete cosine transform coefficient array 42 are non-zero and are indicated as lighter than the remaining squares of array 42, with a lower density of stippling. The remaining fifty-six coefficients of corresponding discrete cosine transform array 42, which represent non-zero horizontal frequencies, are zero and are represented as dark squares because the image of spatial domain array 34 is smooth in the horizontal direction.
The image represented by spatial domain array 36 contains a single isolated pixel 36a. This produces energy in all sixty-four discrete cosine transform coefficients of corresponding discrete cosine transform coefficient array 44. This energy is produced in all sixty-four cosine terms of frequency domain array 44 because isolated pixel 36a of spatial domain array 36 produces a sharp transition in both the horizontal and vertical directions. Thus all sixty-four cosine terms must be provided to produce a single-pixel two-dimensional impulse function as represented by pixel 36a of spatial domain array 36.
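The behavior described for the flat, edge, and isolated-pixel examples can be checked numerically. The sketch below is illustrative only: the test blocks, the helper name `count_nonzero`, and the convention that the first index of `p[i][j]` is the vertical position are all assumptions, and the blocks are not the actual contents of FIGS. 2A-D. The edge is deliberately placed off-center, since an edge exactly in the middle of the block would make the even-numbered AC terms vanish by symmetry.

```python
import math

def forward_dct(p):
    """Forward 8x8 transform, assuming factor 1/4 and C(0) = 1/sqrt(2)."""
    def c(k):
        return 1.0 / math.sqrt(2.0) if k == 0 else 1.0
    def cosval(i, u):
        return math.cos((2 * i + 1) * u * math.pi / 16.0)
    return [[0.25 * c(u) * c(v) * sum(p[i][j] * cosval(i, u) * cosval(j, v)
                                      for i in range(8) for j in range(8))
             for v in range(8)]
            for u in range(8)]

def count_nonzero(f, eps=1e-9):
    """Number of coefficients whose magnitude exceeds a rounding tolerance."""
    return sum(1 for row in f for x in row if abs(x) > eps)

# Completely flat block: only the DC term should survive.
flat = [[50.0] * 8 for _ in range(8)]

# Sharp horizontal edge: rows 0-2 dark, rows 3-7 light, no horizontal change.
edge = [[0.0 if i < 3 else 100.0 for j in range(8)] for i in range(8)]

# Single isolated pixel.
impulse = [[100.0 if (i, j) == (2, 5) else 0.0 for j in range(8)]
           for i in range(8)]

for name, block in (("flat", flat), ("edge", edge), ("impulse", impulse)):
    print(name, count_nonzero(forward_dct(block)))
```

Under these assumptions the flat block yields one non-zero coefficient, the edge yields eight (all with zero horizontal frequency), and the isolated pixel yields sixty-four.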
Another important feature of discrete cosine transforms 12, 14 is that they are reversible, information-preserving transformations. This means that exactly the same image information is present in spatial domain p(i,j) array 10 as in frequency domain f(u,v) array 16. The information in array representations 10, 16 is merely represented in different forms. Both array representations 10, 16 contain exactly the same information, and they can be converted from one representation to the other by applying the appropriate discrete cosine transformation 12, 14.
It is this reversible feature of discrete cosine transforms 12, 14 which permits them to be very useful for image compression. Each eight-by-eight p(i,j) array, such as spatial domain arrays 30, 32, 34, and 36, contains values representing an eight-by-eight portion of an image. For image compression, arrays 30, 32, 34, and 36 may each be transformed into corresponding eight-by-eight f(u,v) arrays 38, 40, 42, and 44 in the frequency domain using forward discrete cosine transform 12 as set forth in Equation (1). Various compression algorithms may then be applied to f(u,v) arrays 38, 40, 42, and 44. To reconstruct the original images represented by spatial domain arrays 30, 32, 34, and 36, the compression algorithm is reversed to yield f(u,v) arrays 38, 40, 42, and 44. Inverse discrete cosine transform 14, set forth in Equation (2), is then applied to f(u,v) arrays 38, 40, 42, and 44 to provide p(i,j) arrays 30, 32, 34, and 36.
Referring to Equation (2), it can be seen that each p(i,j) is the result of a double summation. For a given i, j, u, and v, all the constants, C(u), C(v), cosval(i,u), and cosval(j,v), may be combined into one constant K(i,j,u,v). The K(i,j,u,v) constant may even include the factor of (1/4). Thus Equation (2) becomes:
$$p(i,j) = \sum_{u=0}^{7}\sum_{v=0}^{7} K(i,j,u,v)\,f(u,v) \tag{3}$$
Thus straightforward calculation of the inverse discrete cosine transform of Equation (2) requires sixty-four floating-point multiplications per pixel. This is approximately five million multiplications for a modest size three-hundred twenty by two-hundred forty pixel image.
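The multiplication count can be made concrete with a short sketch. It assumes the K(i,j,u,v) constants fold together the 1/4 factor, C(u), C(v), and the two cosine values, with C(0) = 1/√2; the dictionary `K` and the function `idct_pixel` are hypothetical names used only for illustration.

```python
import math

def c(k):
    return 1.0 / math.sqrt(2.0) if k == 0 else 1.0

def cosval(i, u):
    return math.cos((2 * i + 1) * u * math.pi / 16.0)

# Precompute all 8*8*8*8 = 4096 combined constants K(i, j, u, v).
K = {(i, j, u, v): 0.25 * c(u) * c(v) * cosval(i, u) * cosval(j, v)
     for i in range(8) for j in range(8)
     for u in range(8) for v in range(8)}

def idct_pixel(f, i, j):
    """Compute one pixel from f[u][v]; also return the multiplications used."""
    total, mults = 0.0, 0
    for u in range(8):
        for v in range(8):
            total += K[(i, j, u, v)] * f[u][v]
            mults += 1
    return total, mults

# Total for a 320 x 240 image at 64 multiplications per pixel.
image_mults = 320 * 240 * 64
print(image_mults)  # 4915200
```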
However, there are methods known in the prior art to reduce this computational requirement by more than an order of magnitude. One method known in the prior art reduces the computation required to sixteen integer multiplications per pixel. Another prior art method, based upon further mathematical manipulation, can calculate the discrete cosine transforms 12, 14 using only two and three-quarters multiplications per pixel. However, even these methods still require too much computation to perform discrete cosine transforms 12, 14 for many applications.
These prior art methods must allow for the fact that the dynamic range of the f(u,v) values is eight times the range of the p(i,j) values, ±1024 versus ±128. This is required in order to preserve all the information in the p(i,j) spatial domain values when they are converted to the f(u,v) values of the frequency domain. Additionally, these prior art methods must allow for the fact that both the p(i,j) values of the spatial domain and f(u,v) values of the frequency domain are signed numbers. If forward discrete cosine transform 12 is used on eight-bit pixel values, for example, which have a numerical range of zero to two hundred fifty-five, then one hundred twenty-eight must be subtracted from the pixel values to cause them to have the required range of negative one hundred twenty-eight to positive one hundred twenty-seven. This subtraction must be performed prior to using forward discrete cosine transform 12. Similarly, after performing inverse discrete cosine transform 14, one hundred twenty-eight must be added to return the pixel values to their proper range.
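The level shift and the eight-fold dynamic range can be illustrated as follows. This is a sketch under stated assumptions: the function names are hypothetical, the clamping to the range 0 to 255 after the inverse shift is an added safeguard not described above, and `dc_term` assumes the DC coefficient equals eight times the block average.

```python
def to_signed(pixels):
    """Shift 8-bit pixel values 0..255 down to the signed range -128..127."""
    return [[v - 128 for v in row] for row in pixels]

def to_unsigned(values):
    """Shift signed values back up by 128, rounding and clamping to 0..255.

    Clamping is an assumption added here to guard against rounding overshoot.
    """
    return [[min(255, max(0, int(round(v)) + 128)) for v in row]
            for row in values]

def dc_term(block):
    """DC coefficient of an 8x8 block, assuming f(0,0) = 8 * average."""
    return sum(sum(row) for row in block) / 8.0

darkest = to_signed([[0] * 8 for _ in range(8)])     # every value is -128
lightest = to_signed([[255] * 8 for _ in range(8)])  # every value is +127

print(dc_term(darkest), dc_term(lightest))  # -1024.0 1016.0
```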
The known improvement which reduces the number of multiplications to sixteen per pixel, as previously described, was developed by determining that inverse discrete cosine transform Equation (2) may be rewritten as:
$$p(i,j) = \frac{1}{2}\sum_{u=0}^{7} C(u)\,\mathrm{cosval}(i,u)\left[\frac{1}{2}\sum_{v=0}^{7} C(v)\,f(u,v)\,\mathrm{cosval}(j,v)\right] \tag{4}$$
Both two-dimensional inverse discrete cosine transform 14 and forward discrete cosine transform 12 are separable transforms. Thus, two-dimensional inverse discrete cosine transform 14 can be separated into two one-dimensional inverse discrete cosine transforms. Each of the one-dimensional transforms is of the form:
$$p(i) = \frac{1}{2}\sum_{u=0}^{7} C(u)\,f(u)\,\mathrm{cosval}(i,u) \tag{5}$$
Equation (5) represents a one-dimensional inverse discrete cosine transform between two eight-element arrays p(i) and f(u). The separability of two-dimensional discrete cosine transforms 12, 14 thus means that they can be calculated as sixteen one-dimensional discrete cosine transforms. One-dimensional inverse discrete cosine transforms are first performed on each of the eight rows of f(u,v) array 16 to form a resulting intermediate array (not shown). One-dimensional inverse discrete cosine transforms are then performed on each of the eight columns of the resulting intermediate array. The result of these operations is the same p(i,j) pixel array 10 as computed using inverse discrete cosine transform 14.
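The row-column procedure can be sketched as below. The sketch assumes the one-dimensional transform has a leading factor of 1/2 and C(0) = 1/√2; the function names `idct_1d` and `idct_2d_separable` and the `f[u][v]` list layout are illustrative assumptions.

```python
import math

def c(k):
    return 1.0 / math.sqrt(2.0) if k == 0 else 1.0

def idct_1d(f):
    """One-dimensional inverse transform: eight f[u] -> eight p[i]."""
    return [0.5 * sum(c(u) * f[u] * math.cos((2 * i + 1) * u * math.pi / 16.0)
                      for u in range(8))
            for i in range(8)]

def idct_2d_separable(f):
    """Two-dimensional inverse transform as sixteen one-dimensional passes."""
    # Pass 1: transform each of the eight rows f[u][0..7] -> rows[u][j].
    rows = [idct_1d(f[u]) for u in range(8)]
    # Pass 2: transform each of the eight columns of the intermediate
    # array -> cols[j][i].
    cols = [idct_1d([rows[u][j] for u in range(8)]) for j in range(8)]
    # Transpose so the result is indexed as p[i][j].
    return [[cols[j][i] for j in range(8)] for i in range(8)]
```

Each call to `idct_1d` performs sixty-four multiplications by cosine constants, and sixteen calls are made per block, which is where the figure of sixteen multiplications per pixel comes from.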
According to Equation (5), there are sixty-four multiplications in a one-dimensional discrete cosine transform. Sixteen one-dimensional discrete cosine transforms are required for two-dimensional discrete cosine transforms 12, 14. Thus a total of sixteen times sixty-four multiplications are required for two-dimensional discrete cosine transforms 12, 14 using this prior art method. Since there are sixty-four pixels, this equals sixteen multiplications per pixel. Thus the number of multiplications per pixel is reduced by a factor of four in this prior art method.
Still further improvement in the number of multiplications required may be obtained by considering the sixty-four constants used by the one-dimensional discrete cosine transform as set forth in Equation (5). Ignoring the factor (1/2) in Equation (5), as well as the values of C(u), consider only the cosval(i,u) values. From the original equations:
$$\mathrm{cosval}(i,u) = \cos\left(\frac{(2i+1)u\pi}{16}\right) = \cos\left(\frac{M\pi}{16}\right) \tag{6}$$
where M=(2i+1)u.
Referring now to Table I, the values of M for all sixty-four constants are set forth. Each row corresponds to a value of i and each column to a value of u.
TABLE I
______________________________________
       u=0    1    2    3    4    5    6    7
i=0      0    1    2    3    4    5    6    7
i=1      0    3    6    9   12   15   18   21
i=2      0    5   10   15   20   25   30   35
i=3      0    7   14   21   28   35   42   49
i=4      0    9   18   27   36   45   54   63
i=5      0   11   22   33   44   55   66   77
i=6      0   13   26   39   52   65   78   91
i=7      0   15   30   45   60   75   90  105
______________________________________
Since M=(2i+1)u, the values in row i of Table I are multiples of (2i+1). Each of these corresponds to the cosine of M*π/16. But the cosine function is periodic and symmetrical in various ways as well. Thus, the cosine of M*π/16, for any M appearing in Table I, is equal to either the positive or negative cosine of M*π/16 with M between zero and seven. For example, cos(35*π/16)=cos(3*π/16), since the cosine function repeats every thirty-two values of M. Thirty-two values of M correspond to 2*π. Thus, the array of Table I may be reduced to the cosines set forth in Table II.
TABLE II
______________________________________
       u=0    1    2    3    4    5    6    7
i=0      0    1    2    3    4    5    6    7
i=1      0    3    6   -7   -4   -1   -2   -5
i=2      0    5   -6   -1   -4    7    2    3
i=3      0    7   -2   -5    4    3   -6   -1
i=4      0   -7   -2    5    4   -3   -6    1
i=5      0   -5   -6    1   -4   -7    2   -3
i=6      0   -3    6    7   -4    1   -2    5
i=7      0   -1    2   -3    4   -5    6   -7
______________________________________
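The reduction from Table I to Table II can be reproduced mechanically from the periodicity and symmetry of the cosine. The sketch below is illustrative; the helper name `reduce_m` and the variables `table_1` and `table_2` are hypothetical. It relies on three identities: cos(x) = cos(x - 2π), cos(x) = cos(2π - x), and cos(x) = -cos(π - x).

```python
import math

def reduce_m(m):
    """Return (sign, k) such that cos(m*pi/16) == sign * cos(k*pi/16),
    with 0 <= k <= 8."""
    m %= 32                # cosine repeats every 32 steps of pi/16 (2*pi)
    if m > 16:
        m = 32 - m         # cos(x) == cos(2*pi - x)
    if m > 8:
        return -1, 16 - m  # cos(x) == -cos(pi - x)
    return 1, m

# Table I: M = (2i + 1) * u, row index i, column index u.
table_1 = [[(2 * i + 1) * u for u in range(8)] for i in range(8)]

# Table II: signed reduced values. A reduced value of zero with a negative
# sign would need M congruent to 16 modulo 32, which never occurs for
# u in 0..7, so writing sign * k loses no information here.
table_2 = [[s * k for s, k in (reduce_m(m) for m in row)] for row in table_1]

for row in table_2:
    print(" ".join(f"{v:3d}" for v in row))
```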
Within Table II, a negative sign before a value of M indicates that a constant is -cos(M*π/16) rather than cos(M*π/16). From the contents of Table II, it can be seen that there is a great deal of redundancy when computing the one-dimensional discrete cosine transform using Equation (5). For example, when computing the eight p(i) values, f(4) is multiplied by the same constant (M=4) all eight times, as shown in the fifth column of Table II. These multiplies differ only by their sign. Similarly, f(0), set forth in the first column of Table II, is always multiplied by the constant for M=0 when performing this transform. Additionally, f(2) is multiplied by only two different constants, either that for M=2 or that for M=6.
Thus in this method all eight p(i) values are calculated at the same time in one combined calculation. Only the minimum number of multiplications of f(u) values by constants is performed, and an overall savings is achieved by reusing these intermediate results. Algorithms of this type are called fast cosine transform algorithms, and are analogous to the well-known fast Fourier transform.
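The idea of reusing intermediate results can be illustrated with one simple symmetry. Because cos((2(7-i)+1)uπ/16) equals cos((2i+1)uπ/16) for even u and its negative for odd u, the outputs p(i) and p(7-i) share the same even-u and odd-u partial sums. The sketch below exploits only this one symmetry and is not the algorithm of any particular prior art method; the function name `idct_1d_even_odd` is hypothetical, and C(0) = 1/√2 with a leading factor of 1/2 is assumed.

```python
import math

def c(k):
    return 1.0 / math.sqrt(2.0) if k == 0 else 1.0

def cosval(i, u):
    return math.cos((2 * i + 1) * u * math.pi / 16.0)

def idct_1d_even_odd(f):
    """One-dimensional inverse transform computing outputs in mirrored pairs."""
    p = [0.0] * 8
    for i in range(4):
        # Partial sum over even u: identical for p[i] and p[7 - i].
        even = sum(c(u) * f[u] * cosval(i, u) for u in (0, 2, 4, 6))
        # Partial sum over odd u: changes sign between p[i] and p[7 - i].
        odd = sum(f[u] * cosval(i, u) for u in (1, 3, 5, 7))
        p[i] = 0.5 * (even + odd)
        p[7 - i] = 0.5 * (even - odd)
    return p
```

If the C(u) and cosine factors are folded into precomputed constants, this arrangement multiplies coefficients by constants thirty-two times rather than sixty-four. Fast algorithms carry the same kind of sharing much further.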
Further improvement may be obtained using a method requiring eleven multiplications instead of sixty-four for a one-dimensional discrete cosine transform. Such a method is taught by Loeffler, Ligtenberg, and Moschytz in "Practical Fast 1-D DCT Algorithms With 11 Multiplications," IEEE Transactions, 1989. It can be proven mathematically that eleven multiplications is the minimum possible for a one-dimensional fast cosine transform as taught in the prior art. Using this improvement, a two-dimensional discrete cosine transform requires only 11*16/64=2.75 multiplications per pixel.
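The three per-pixel figures quoted in this description follow from simple arithmetic, summarized in the sketch below. The variable names are illustrative only.

```python
N = 8                      # block dimension
PIXELS = N * N             # sixty-four pixels per block
PASSES = 2 * N             # sixteen one-dimensional transforms per block

# Direct double summation: one multiplication per coefficient per pixel.
direct_per_pixel = PIXELS                              # 64

# Row-column method: sixty-four multiplications per one-dimensional pass.
row_column_per_pixel = PASSES * (N * N) / PIXELS       # 16.0

# Eleven-multiplication one-dimensional transform in each pass.
fast_per_pixel = PASSES * 11 / PIXELS                  # 2.75

print(direct_per_pixel, row_column_per_pixel, fast_per_pixel)
```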
Thus, many improvements upon basic forward discrete cosine transform 12 and inverse discrete cosine transform 14 are known in the prior art. However, in all of these prior art methods, in spite of the benefits of the improved discrete cosine transforms 12, 14 for image compression, many clock cycles are still required to perform the remaining multiplications. Calculations of prior art forward discrete cosine transform 12 and inverse discrete cosine transform 14 which eliminate some multiplications are still very time consuming because of these remaining multiplications. It will be understood that prior art improved discrete cosine transforms are still very expensive because on most processors a multiply instruction is significantly more time consuming than an add operation or a shift operation. On processors that do not have a hardware multiplier, the difference in the number of cycles is very dramatic. This difference may be as much as a factor of ten or twenty. It will be understood by those skilled in the art that the computational time requirements of inverse discrete cosine transform 14 are very similar to the computational time requirements of forward discrete cosine transform 12.