The present invention relates to a method and an apparatus for digital data compression and, more particularly, but not exclusively, to a method and an apparatus for digital data compression of images, audio, and video using wavelet-based transformations.
Digital multimedia includes video, image, and audio data, and typically involves a large amount of data. For instance, a twenty-second digitized movie has a data size of 650 Mbytes, so two hours' worth of uncompressed video data would occupy 360 compact disks. Similarly, transmitting a two-hour movie in an uncompressed data format at a rate of 128 kbps would take 169 days.
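The figures above can be checked with a short calculation. The sketch below assumes decimal units (1 Mbyte = 10^6 bytes, 1 kbps = 1000 bits per second) and a 650-Mbyte compact disk; these conventions and all variable names are assumptions, since the text does not state them.

```python
# Back-of-the-envelope check of the storage and transmission figures.
# Assumed conventions (not stated in the text): decimal Mbytes and kbps,
# and one compact disk holding 650 Mbytes.
CLIP_SECONDS = 20
CLIP_BYTES = 650 * 10**6        # 650 Mbytes for a twenty-second clip
MOVIE_SECONDS = 2 * 60 * 60     # a two-hour movie
LINK_BITS_PER_SECOND = 128 * 10**3

clips_per_movie = MOVIE_SECONDS // CLIP_SECONDS
disks_needed = clips_per_movie  # one 650-Mbyte disk per clip

movie_bits = clips_per_movie * CLIP_BYTES * 8
transfer_seconds = movie_bits / LINK_BITS_PER_SECOND
transfer_days = transfer_seconds / (24 * 60 * 60)

print(disks_needed)             # number of compact disks
print(round(transfer_days, 1))  # transmission time in days
```

Under these assumptions the calculation gives 360 disks and a little over 169 days, in line with the figures quoted above.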
Video and image compression have been widely adopted since the advent of digital multimedia technology and the popularization of DVD, web images, mp3 and digital cameras. Several image compression standards exist, such as JPEG and JPEG2000. Several video compression standards have been adopted for a variety of different applications. These include the International Organization for Standardization (ISO) video compression formats MPEG-1, MPEG-2 and MPEG-4, developed by the Moving Picture Experts Group, and the ITU H.261 and H.263 video standards. These standards emerged at different times, each addressing the needs of multimedia technology at that stage of its development. For example, the MPEG-1 standard supports a 352×240 resolution and an input frame rate of 30 frames per second (FPS), and produces video quality slightly below the quality of conventional VCR videos, whereas MPEG-2 supports up to a 1280×720 resolution and an input frame rate of 60 FPS and produces video quality sufficient for all major TV standards, including HDTV, with full CD-quality audio. MPEG-2 is also used with DVD-ROM since it has a relatively high compression ratio, defined simply as the ratio of uncompressed to compressed data size. The MPEG-4 standard is based on MPEG-1 and MPEG-2 technology and is designed to transmit video and images over a narrower bandwidth. MPEG-4 further provides for the mixing of video with text, graphics and 2-D and 3-D animation layers. H.261 and H.263 were mainly developed for teleconferencing applications that require both the encoder and decoder to operate in real time.
However, both the JPEG and the MPEG standards have a number of known drawbacks and limitations. The key to the JPEG algorithm is a discrete cosine transform (DCT) of N×N blocks. Each block is transformed using the DCT, and the resulting coefficients are quantized and then entropy coded. Although DCT-based compression can code information efficiently, it has a limitation: while the cosine basis functions of the DCT are global in nature, the JPEG algorithm applies frequency analysis only to small areas of the image, which results in blocking artifacts at low bit rates. The overhead for large images is therefore increased. MPEG, which is based on JPEG images, suffers from the same drawbacks and limitations. Moreover, MPEG has a number of additional limitations. For example, MPEG compression causes motion artifacts, which are blocking artifacts in scenes with high motion. In addition, MPEG compression is prone to signal degradation, as single erroneous bits may affect the visual quality of large areas of the image. Finally, MPEG compression has high complexity due to motion estimation and compensation.
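The block-based DCT step described above can be illustrated with a minimal sketch. It assumes an orthonormal DCT-II on an 8×8 block and a single uniform quantization step of 16; the block size, the step size, and all function names are illustrative assumptions, not the quantization tables or coding rules of the JPEG standard. Entropy coding is omitted.

```python
import math

N = 8  # assumed block size


def dct_matrix(n):
    """Rows are the orthonormal DCT-II basis vectors of length n."""
    rows = []
    for u in range(n):
        scale = math.sqrt(1.0 / n) if u == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                     for x in range(n)])
    return rows


def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def dct2(block):
    """Two-dimensional DCT of one block: C * block * C^T."""
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))


def quantize_block(coeffs, step):
    """Uniform quantization of every coefficient with the same step."""
    return [[int(round(v / step)) for v in row] for row in coeffs]


# A perfectly flat block: all of its energy lands in the DC coefficient.
flat_block = [[100.0] * N for _ in range(N)]
quantized = quantize_block(dct2(flat_block), 16)
print(quantized[0][0])
```

Because each block is transformed and quantized independently of its neighbours, coarse quantization can leave visible discontinuities at block boundaries, which is the blocking artifact mentioned above.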
Furthermore, existing video compression standards, such as MPEG-2, define scalable profiles, which exploit classic DCT-based schemes with motion compensation. Unfortunately, spatial scalability as proposed by the MPEG-2 coding standard is inefficient because the bitrate overhead is too large. Additionally, the solutions defined in MPEG-2 do not allow flexible allocation of the bitrate. There is a great demand for flexible bit allocation to individual layers, for example through fine granularity scalability (FGS), which has already been proposed for MPEG-4 and in which the fine granular enhancement layers are intra-frame encoded.
A known compression method that is devoid of most of the aforementioned limitations is discrete wavelet transform (DWT)-based compression. The DWT is implemented as a bank of finite impulse response filters. Most notably, the DWT is used for signal coding, where the properties of the transform are exploited to represent a discrete signal in a form that is easier to compress, such as one whose coefficients follow a Laplace-like distribution, often as a preconditioning step for data compression. The DWT is widely used in video and image compression to recreate the original images faithfully under high compression ratios. The DWT produces as many coefficients as there are pixels in the image. These coefficients can be compressed more easily because the information is statistically concentrated in just a few coefficients. During the compression process, the coefficients are quantized, and the quantized values are entropy encoded and preferably run-length encoded. Because the DWT itself is invertible, the transform step introduces no data loss or modification on decompression, which supports better image quality under higher compression ratios at low bit rates and allows highly efficient hardware implementation.
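The quantization and run-length stages mentioned above can be sketched as follows. The example assumes a uniform quantizer that truncates toward zero and a simple (value, run length) pair encoding; the coefficient values, the step size, and the function names are hypothetical, and the entropy-coding stage is left out.

```python
def quantize_coefficients(coeffs, step):
    """Uniform quantizer that truncates toward zero (assumed rule)."""
    return [int(c / step) for c in coeffs]


def run_length_encode(values):
    """Collapse consecutive repeats into (value, run_length) pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return [tuple(r) for r in runs]


def run_length_decode(runs):
    """Expand (value, run_length) pairs back into a flat list."""
    out = []
    for value, count in runs:
        out.extend([value] * count)
    return out


# Hypothetical wavelet coefficients: a few large values, many near zero.
coefficients = [52.0, -3.1, 0.4, 0.2, -0.3, 7.9, 0.1, 0.0]
quantized = quantize_coefficients(coefficients, 2)
encoded = run_length_encode(quantized)
print(quantized)  # [26, -1, 0, 0, 0, 3, 0, 0]
print(encoded)    # [(26, 1), (-1, 1), (0, 3), (3, 1), (0, 2)]
```

The run-length step is lossless, so decoding `encoded` returns the quantized list exactly; the only loss in this sketch comes from the quantizer, which maps the small coefficients to zero.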
The principle behind the wavelet transform is to hierarchically decompose the input signal into a series of successively lower-resolution coarse signals and their associated detail signals. At each level, the coarse and detail signals contain the information necessary for reconstruction back to the next higher resolution level. One-dimensional DWT processing can be described in terms of a filter bank: wavelet transforming a signal is like passing the signal through this filter bank, in which the input signal is analyzed in both low and high frequency bands. The outputs of the different filter stages are the wavelet and scaling function transform coefficients. The decompression operation is the inverse of the compression operation: the entropy-coded values are decoded and de-quantized, and finally the inverse wavelet transform is applied to the de-quantized wavelet coefficients. This produces the pixel values that are used to create the image.
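The hierarchical decomposition described above can be illustrated with a minimal sketch. It assumes the simplest possible filter pair, pairwise averages for the coarse signal and pairwise half-differences for the detail signal (an unnormalized Haar transform), and a signal whose length is a power of two; the function names and the sample signal are illustrative assumptions.

```python
def haar_step(signal):
    """Split a signal into a half-length coarse part and detail part."""
    half = len(signal) // 2
    coarse = [(signal[2 * i] + signal[2 * i + 1]) / 2 for i in range(half)]
    detail = [(signal[2 * i] - signal[2 * i + 1]) / 2 for i in range(half)]
    return coarse, detail


def decompose(signal, levels):
    """Repeatedly split the coarse signal, keeping each level's details."""
    coarse = list(signal)
    details = []
    for _ in range(levels):
        coarse, detail = haar_step(coarse)
        details.append(detail)
    return coarse, details


def reconstruct(coarse, details):
    """Rebuild each finer level from the coarse signal plus its details."""
    current = list(coarse)
    for detail in reversed(details):
        finer = []
        for a, d in zip(current, detail):
            finer.extend([a + d, a - d])
        current = finer
    return current


signal = [9, 7, 3, 5, 6, 10, 2, 6]
coarse, details = decompose(signal, 3)
print(coarse)   # [6.0]
print(details)  # [[1.0, -1.0, -2.0, -2.0], [2.0, 2.0], [0.0]]
print(reconstruct(coarse, details) == signal)  # True
```

Each level halves the resolution of the coarse signal, and the coarse signal together with the detail signals of a level is exactly enough to rebuild the next finer level.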
In particular, the discrete wavelet transform is usually related to two pairs of filters. One pair comprises lowpass $\{\tilde{h}_k\}$ and highpass $\{\tilde{g}_k\}$ analysis filters, and the other pair comprises lowpass $\{h_k\}$ and highpass $\{g_k\}$ synthesis filters. The lowpass filters come from a respective pair of biorthogonal functions. One biorthogonal function is an analysis scaling function $\tilde{\phi}(x)$ and the other is a synthesis scaling function $\phi(x)$. $\tilde{\phi}(x)$ and $\phi(x)$ are defined by the following respective refinement equations:
$$\tilde{\phi}(x) = 2 \sum_k \tilde{h}_k \, \tilde{\phi}(2x - k), \qquad \phi(x) = 2 \sum_k h_k \, \phi(2x - k). \tag{3.1}$$
In the same manner, the highpass filters come from another pair of biorthogonal functions. The first biorthogonal function is an analysis wavelet function $\tilde{\psi}(x)$ and the other is a synthesis wavelet function $\psi(x)$. $\tilde{\psi}(x)$ and $\psi(x)$ are defined by the following related two-scale equations:
$$\tilde{\psi}(x) = 2 \sum_k \tilde{g}_k \, \tilde{\phi}(2x - k), \qquad \psi(x) = 2 \sum_k g_k \, \phi(2x - k). \tag{3.2}$$
A more elaborate mathematical discussion on the filters can be found in David F. Walnut, An Introduction to Wavelet Analysis, Birkhauser, 2001, which is herein incorporated in its entirety by reference into the specification.
Biorthogonality then implies the following conditions:
$$\sum_k h_k \, \tilde{h}_{k-2l} = \sum_k g_k \, \tilde{g}_{k-2l} = \delta_l, \qquad \sum_k h_k \, \tilde{g}_{k-2l} = \sum_k g_k \, \tilde{h}_{k-2l} = 0, \qquad \delta_l = \begin{cases} 1, & l = 0 \\ 0, & l \neq 0 \end{cases} \tag{3.3}$$
which are the equivalent of the perfect reconstruction property in signal analysis; see M. Vetterli and J. Kovacevic, Wavelets and Subband Coding, Prentice Hall, 1995, which is herein incorporated in its entirety by reference into the specification.
The DWT and IDWT can now be defined to move from the signal domain to the wavelet domain and vice versa. In particular, based on the input signal $\{c^{(1)}(i) \in \mathbb{R}\}$ and the analysis filters as above, the DWT can be defined as follows:
$$c^{(0)}(i) = \sum_j \tilde{h}_{j-2i} \, c^{(1)}(j), \qquad d^{(0)}(i) = \sum_j \tilde{g}_{j-2i} \, c^{(1)}(j), \tag{3.4}$$
where $c^{(0)}(i)$ denotes the lower resolution representation of the original signal and $d^{(0)}(i)$ denotes the additional detailed parts of the signal. Conversely, given the lower resolution representation $\{c^{(0)}(i) \in \mathbb{R}\}$ and the additional detailed parts $\{d^{(0)}(i) \in \mathbb{R}\}$, we define the IDWT as:
$$c^{(1)}(2i) = \sum_j h_{2(i-j)} \, c^{(0)}(j) + \sum_j g_{2(i-j)} \, d^{(0)}(j), \qquad c^{(1)}(2i+1) = \sum_j h_{2(i-j)+1} \, c^{(0)}(j) + \sum_j g_{2(i-j)+1} \, d^{(0)}(j). \tag{3.5}$$
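The DWT of equation (3.4) and the IDWT of equation (3.5) can be sketched directly from the sums. The example below is an illustration under stated assumptions rather than a definitive implementation: filters are stored as dictionaries mapping the index k to the tap value, the signal is extended periodically at its boundaries, and the Haar filters with 1/√2 taps are used because they satisfy condition (3.3) as written, with analysis and synthesis filters equal. None of these choices comes from the text.

```python
import math


def dwt(c1, h_analysis, g_analysis):
    """One analysis step, eq. (3.4): c0(i) = sum_j h~_{j-2i} c1(j)."""
    n = len(c1)
    half = n // 2
    # Substituting k = j - 2i gives j = k + 2i (taken modulo n: periodic).
    c0 = [sum(tap * c1[(k + 2 * i) % n] for k, tap in h_analysis.items())
          for i in range(half)]
    d0 = [sum(tap * c1[(k + 2 * i) % n] for k, tap in g_analysis.items())
          for i in range(half)]
    return c0, d0


def idwt(c0, d0, h_synthesis, g_synthesis):
    """One synthesis step, eq. (3.5): c1(m) = sum_j h_{m-2j} c0(j) + ..."""
    n = 2 * len(c0)
    c1 = [0.0] * n
    for j in range(len(c0)):
        for k, tap in h_synthesis.items():
            c1[(2 * j + k) % n] += tap * c0[j]
        for k, tap in g_synthesis.items():
            c1[(2 * j + k) % n] += tap * d0[j]
    return c1


def cross_sum(a, b, shift):
    """Left-hand sides of eq. (3.3): sum_k a_k * b_{k - 2*shift}."""
    return sum(tap * b.get(k - 2 * shift, 0.0) for k, tap in a.items())


# Haar filters (assumed for illustration); analysis equals synthesis here.
s = 1 / math.sqrt(2)
h = {0: s, 1: s}
g = {0: s, 1: -s}

signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
c0, d0 = dwt(signal, h, g)
restored = idwt(c0, d0, h, g)
max_error = max(abs(a - b) for a, b in zip(signal, restored))
print(max_error < 1e-12)  # perfect reconstruction up to rounding
```

With these filters `cross_sum(h, h, 0)` and `cross_sum(g, g, 0)` evaluate to 1 up to rounding, and `cross_sum(h, g, 0)` evaluates to 0, matching condition (3.3), which is why the round trip reproduces the input.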
In known applications, such as encoders, decoders, compressors, and decompressors, the input to be compressed comprises highly correlated signals, such as voice samples, audio samples, pixel-based images, and video, whose correlations are captured by the wavelet coefficients computed using the DWT. The transformed signals can be efficiently compressed, usually in a lossy way, yielding an encoded bit-stream that can be transmitted over a computer network such as the Internet or stored on a medium such as a DVD. Applications that are designed to decode such transformed signals reconstruct the signal from the encoded bit-stream using the IDWT. Though the resulting signal is not mathematically identical to the original, it can be used for many purposes.
U.S. Pat. No. 6,570,510, issued on May 27, 2003, illustrates an example of such an application. The patent discloses an apparatus that comprises a DWT engine, a code block manager, and an entropy coder. The code block manager comprises a controller, which losslessly compresses the transform coefficients and stores them in code block storage for buffering. The entropy coder comprises entropy encoders, each comprising a decoder for decoding the losslessly compressed transform coefficients prior to entropy encoding.
One major advantage of DWT-based compression over DCT-based compression is the temporal or spatial locality of the basis functions. Such basis functions require less computational complexity than the fast Fourier transform (FFT) functions used in DCT-based compression.
However, wavelet compression has a number of drawbacks and limitations. For example, the wavelets used in video compression algorithms are typically real-valued, compactly supported, and continuous 2-band wavelets. However, no such wavelets exist that are both orthogonal and symmetric (linear phase). It is important to have a wavelet that provides both qualities, as orthogonality is important for preserving the overall energy of the transformed coefficients, while symmetry is important for preventing visual artifacts. Moreover, while the wavelet representation is optimal for one-dimensional signals, the current separable methods for compressing higher dimensional signals, such as images and video, fail to detect directional contours, such as edges and surfaces, contained in the digital data.
There is thus a widely recognized need for, and it would be highly advantageous to have, a DWT based compression method and system devoid of the above limitations.