1. Field of the Invention
The present invention relates generally to image and video processing and pertains more particularly to a multiplier-free implementation of an approximation of the discrete cosine transform (DCT).
2. Discussion of the Prior Art
When high-quality images must be compressed to save memory or transmission requirements, it is common practice to first transform the images to another space where the information can be represented more compactly. This is usually done block-by-block with a linear transformation (matrix multiplication). A typical arrangement is to perform eight-point transforms on row segments of eight pixels and then to perform eight-point transforms on the eight element column segments of this row transformed image. Equivalently, a single sixty-four-point transform could be performed on a pixel block of sixty-four pixels arranged in an eight-by-eight block.
One common choice for a one dimensional transform is the eight-point DCT matrix S={s(k,n)}k,n=07, where                               s          ⁡                      (                          k              ,              n                        )                          =                                            c              ⁡                              (                k                )                                      2                    ⁢          cos          ⁢                      xe2x80x83                    ⁢                      (                                                                                                      2                      ⁢                      n                                        +                    1                                    16                                ·                k                            ⁢                              xe2x80x83                            ⁢              π                        )                                              (        1        )            
and c(0)=1/{square root over (2)} and c(k)=1 for k greater than 0. Correspondingly, the eight-by-eight two-dimensional DCT transforms a block in the spatial domain, x={x(n,m)}n,m=07, into a matrix of frequency components, X={X(k,l)}k,l=07, according to the following equation                               X          ⁡                      (                          k              ,              l                        )                          =                                            c              ⁡                              (                k                )                                      2                    ⁢                      xe2x80x83                    ⁢                                    c              ⁡                              (                l                )                                      2                    ⁢                                    ∑                              n                =                0                            7                        ⁢                          xe2x80x83                        ⁢                                          ∑                                  m                  =                  0                                7                            ⁢                                                x                  ⁡                                      (                                          n                      ,                      m                                        )                                                  ⁢                                  xe2x80x83                                ⁢                cos                ⁢                                  xe2x80x83                                ⁢                                  (                                                                                                                                          2                            ⁢                            n                                                    +                          1                                                16                                            ·                      k                                        ⁢                                          xe2x80x83                                        ⁢                    π                                    )                                ⁢                cos                ⁢                                  xe2x80x83                                ⁢                                  (                                                                                                                                          2                            ⁢                            m                                                    +                          1                                                16                                            ·                      l                                        ⁢                                          xe2x80x83                                        ⁢                    π                                    )                                                                                        (        2        )            
The inverse transform is given by the following equation                               x          ⁡                      (                          n              ,              m                        )                          =                              ∑                          k              =              0                        7                    ⁢                      xe2x80x83                    ⁢                                    ∑                              l                =                0                            7                        ⁢                                                            c                  ⁡                                      (                    k                    )                                                  2                            ⁢                              xe2x80x83                            ⁢                                                c                  ⁡                                      (                    l                    )                                                  2                            ⁢                              X                ⁡                                  (                                      k                    ,                    l                                    )                                            ⁢              cos              ⁢                              xe2x80x83                            ⁢                              (                                                                                                                              2                          ⁢                          n                                                +                        1                                            16                                        ·                    k                                    ⁢                                      xe2x80x83                                    ⁢                  π                                )                            ⁢              cos              ⁢                              xe2x80x83                            ⁢                              (                                                                                                                              2                          ⁢                          m                                                +                        1                                            16                                        ·                    l                                    ⁢                                      xe2x80x83                                    ⁢                  π                                )                                                                        (        3        )            
Then, in matrix form,
X=SxSt,xe2x80x83xe2x80x83(4)
where the superscript t denotes matrix transposition. Similarly, let the superscript xe2x88x92t denote transposition of the inverse matrix. Then,
x=Sxe2x88x921XSxe2x88x92t=StXS,xe2x80x83xe2x80x83(5)
where the second equality follows from the unitarity of S.
One of ordinary skill in the art will realize that the DCT is the basis for standard image and video compression/decompression schemes such as JPEG, MPEG-1, MPEG-2, H.261, and H.263. Compression is performed using the DCT while decompression is performed using the inverse DCT (IDCT). If preferred, it is possible to sacrifice accuracy for speed in the computation of the DCT and IDCT. In some applications, it is necessary that both compression and decompression be fast. For example, given the demands of real-time playback and displaying capabilities, the overall compression/decompression scheme needs to be fast. In other applications, it is sufficient if only one or the other of the compression or decompression is fast. For example, in those situations where compression is performed only once, whereas decompression might be carried out many times, it is important that the decompression be fast. For simplicity of presentation, it is this latter asymmetric case that is presented here. However, one of ordinary skill in the art will realize that the principles disclosed apply equally to the symmetric case.
Turning first to FIG. 1, a flow diagram of the steps of MPEG decoding is shown. MPEG decoding is shown strictly as an example. One of ordinary skill in the art will be familiar with the steps in encoding/decoding in each of the standard schemes and realize that the discussion that follows applies to each. The process of decoding operates on a quantity of compressed data and begins with step 10. At step 10, the compressed data undergoes header decoding where general information for decoding is extracted, including identifying which of the available tables is to be used for decoding. Next at step 12, the data undergoes Huffman decoding where a table is used to convert the MPEG coefficients to the DCT coefficients. At step 14 the data undergoes inverse quantizing where a table is used to scale the data. Then at step 16, the data undergoes IDCT where the data is transformed back from the frequency domain to the spatial domain. At step 18, the data undergoes motion compensation where the data is converted from prediction residuals back to a valid video stream. Finally, at step 20, the data is displayed for the viewer.
The computation of the IDCT at step 16 is one of the major bottlenecks in the process of decoding DCT-based compression images and video. In an effort to increase speed, an approximation of the IDCT can be employed. This will introduce some errors, but with careful design the cost of the errors is small in comparison to the benefit of the increased speed. For compression, an approximate DCT can be employed as the situation dictates.
One example of an approximate DCT is presented in U.S. Pat. No. 5,129,015 issued to Allen et al., entitled xe2x80x9cApparatus And Method For Compressing Still Images Without Multiplication.xe2x80x9d This patent discloses an approximate DCT, based on a transform developed by Chen, which requires 224 shifts and 608 additions for an eight-by-eight DCT. The total number of operations is therefore 832.
Another example of an approximate DCT is presented in the book entitled IMAGE AND VIDEO COMPRESSION STANDARDS by Bhaskaran et al. In Chapter 3, Bhaskaran discloses an approximate DCT which requires 192 shifts and 576 additions for an eight-by-eight DCT. The total number of operations is therefore 768.
By comparison, the number of operations required for the exact Winograd DCT is 80 multiplications and 464 additions. If one assumes that a multiplication has equal weight to three basic operations on average, then the total number of operations is therefore 704.
Those in the image and video processing and compression field are driven by the desire for a faster approximate DCT and IDCT requiring fewer operations.
It is the primary object of the present invention to provide an improved multiplier-free implementation of an approximation of the DCT used in image and video processing that further reduces the number of operations in the performance of the DCT.
In accordance with the primary aspect of the present invention, image and video processing is done with no multiplications and a fewer number of operations through the application of a modified Arai, Agui, and Nakajima (AAN) scheme for eight-point DCT.