The present invention relates to a method and an apparatus for digital data compression and, more particularly, but not exclusively, to a method and an apparatus for compressing images, audio, and video using wavelet-based transformations.
Digital multimedia includes video, image and audio data, and typically involves a large amount of data. For instance, one second of uncompressed HD video (1920×1080, 30 fps) has a data size of approximately 200 megabytes, and 90 minutes of such a movie would occupy approximately one terabyte ($10^{12}$ bytes). Transmitting such a movie over a 3 Mbps broadband connection would take about a month.
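The figures above can be checked with a few lines of arithmetic. The following is a minimal sketch that assumes 3 bytes per pixel (24-bit color), which the text does not state; the variable names are illustrative only.

```python
# Back-of-the-envelope figures for uncompressed HD video.
# Assumption (not stated in the text): 3 bytes per pixel, i.e. 24-bit color.
WIDTH, HEIGHT, FPS = 1920, 1080, 30
BYTES_PER_PIXEL = 3            # assumed color depth
MOVIE_SECONDS = 90 * 60        # 90-minute movie
LINK_BITS_PER_SECOND = 3e6     # 3 Mbps connection

bytes_per_second = WIDTH * HEIGHT * BYTES_PER_PIXEL * FPS
movie_bytes = bytes_per_second * MOVIE_SECONDS
transfer_days = movie_bytes * 8 / LINK_BITS_PER_SECOND / 86400

print(f"one second : {bytes_per_second / 1e6:.1f} MB")
print(f"90 minutes : {movie_bytes / 1e12:.2f} TB")
print(f"transfer   : {transfer_days:.1f} days")
```

Under this assumption one second comes to roughly 187 MB, the movie to roughly 1 TB, and the transfer to roughly 31 days, in line with the rounded figures quoted.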
Video and image compression have been widely adopted since the advent of digital multimedia technology and the popularization of DVD, web images, MP3 and digital cameras. Several image compression standards exist, such as JPEG and JPEG2000. Several video compression standards have been adopted for a variety of different applications. These include the International Organization for Standardization (ISO) video compression formats MPEG-1, MPEG-2 and MPEG-4, developed by the Moving Picture Experts Group, and the ITU H.261 and H.263 video standards. These standards emerged at different times, to meet the needs of different stages of multimedia technology development. For example, the MPEG-1 standard supports a 352×240 resolution and an input frame rate of 30 frames per second (FPS), and produces video quality slightly below that of conventional VCR videos, whereas MPEG-2 supports up to a 1280×720 resolution and an input frame rate of 60 FPS and produces video quality sufficient for all major TV standards, including HDTV, with full CD-quality audio. MPEG-2 is also used with DVD-ROM since it has a relatively high compression ratio, defined simply as the ratio of uncompressed to compressed data size. The MPEG-4 standard is based on MPEG-1 and MPEG-2 technology and is designed to transmit video and images over a narrower bandwidth. MPEG-4 further provides for the mixing of video with text, graphics and 2-D and 3-D animation layers. H.261 and H.263 were developed mainly for teleconferencing applications that require both the encoder and the decoder to operate in real time.
However, both the JPEG and the MPEG standards have a number of known drawbacks and limitations. The key to the JPEG algorithm is a discrete cosine transform (DCT) of N×N blocks. Each block is transformed using the DCT, the results are quantized, and the quantized values are then entropy coded. Though information can be efficiently coded using DCT-based compression, it has a limitation: although the cosine basis functions of the DCT are of a global nature, the JPEG algorithm applies frequency analysis only to small areas of the image, and therefore produces blocking artifacts at low bit rates, that is, at high compression ratios. Thus, the overhead for large image sizes is increased. MPEG, which is based on JPEG images, suffers from the same drawbacks and limitations. Moreover, MPEG has a number of additional limitations. For example, MPEG compression causes motion artifacts, which are blocking artifacts in scenes with high motion. In addition, MPEG compression is prone to signal degradation, as single erroneous bits may affect the visual quality of large areas of the image. Finally, MPEG compression has high complexity due to motion estimation and compensation.
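The per-block transform-and-quantize step described above can be illustrated with a minimal sketch. It assumes an 8×8 block and a single uniform quantization step, which is a simplification (actual JPEG uses a table of step sizes per coefficient); the function names are hypothetical and entropy coding is omitted.

```python
import math

N = 8  # block size assumed for this sketch

def dct_1d(v):
    """Orthonormal DCT-II of a length-N sequence."""
    out = []
    for u in range(N):
        scale = math.sqrt(1 / N) if u == 0 else math.sqrt(2 / N)
        out.append(scale * sum(v[x] * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                               for x in range(N)))
    return out

def dct_2d(block):
    """Separable 2-D DCT: transform every row, then every column."""
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d([rows[y][u] for y in range(N)]) for u in range(N)]
    # cols[u][v] holds horizontal frequency u and vertical frequency v.
    return [[cols[u][v] for u in range(N)] for v in range(N)]

def quantize(coeffs, step):
    """Uniform quantization with one step size (a simplification)."""
    return [[round(c / step) for c in row] for row in coeffs]

# A smooth horizontal ramp: pixel values rise slowly from left to right.
block = [[100 + 2 * x for x in range(N)] for _ in range(N)]
q = quantize(dct_2d(block), step=40)
nonzero = sum(1 for row in q for c in row if c != 0)
print("quantized DC coefficient:", q[0][0])
print("nonzero coefficients out of 64:", nonzero)
```

For a smooth block like this one, almost all quantized coefficients are zero, which is what makes the subsequent entropy coding effective. Because each block is quantized independently, coarse steps also make neighboring blocks disagree at their borders, which is the source of the blocking artifacts mentioned above.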
Furthermore, existing video compression standards, such as MPEG-2, define scalable profiles that exploit classic DCT-based schemes with motion compensation. Unfortunately, spatial scalability as proposed by the MPEG-2 coding standard is inefficient because the bitrate overhead is too large. Additionally, the solutions defined in MPEG-2 do not allow flexible allocation of the bitrate. There is a great demand for flexible bit allocation to individual layers, for example through fine granularity scalability (FGS), which has already been proposed for MPEG-4 and in which the fine granular enhancement layers are intra-frame encoded.
Another known compression method is discrete wavelet transform (DWT)-based compression. The DWT is a form of finite impulse response filtering. Most notably, the DWT is used for signal coding, where the properties of the transform are exploited to represent a discrete signal in a more redundant form, such as a Laplace-like distribution, often as a preconditioning for data compression. The DWT is widely used in video and image compression to faithfully recreate the original images under high compression ratios. The DWT produces as many coefficients as there are samples in the input signal. These coefficients can be compressed more easily because the information is statistically concentrated in just a few of them. During the compression process, the coefficients are quantized and the quantized values are entropy encoded. Because the DWT itself is lossless, the transform introduces no data loss or modification on decompression, which supports better image quality under higher compression ratios at low bit rates, as well as highly efficient hardware implementation.
The principle behind the wavelet transform is to hierarchically decompose the input signal into a series of successively lower resolution coarse signals and their associated detail signals. At each level, the coarse signal and the detail signal contain the information necessary for reconstruction back to the next higher resolution level. One-dimensional DWT processing can be described in terms of a filter bank: wavelet transforming a signal is like passing the signal through a filter bank in which the input signal is analyzed in both low and high frequency bands. The outputs of the different filter stages are the wavelet and scaling function transform coefficients. The decompression operation is the inverse of the compression operation. As its final step, the inverse wavelet transform is applied to the de-quantized wavelet coefficients, which produces the pixel values that are used to create the image.
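The hierarchical decomposition described above can be sketched with the simplest wavelet, the Haar wavelet. This choice, the function names, and the requirement that the signal length be divisible by 2 at every level are assumptions made for illustration only.

```python
import math

S = math.sqrt(2)

def analyze(signal):
    """One filter-bank stage: split into half-length coarse and detail signals."""
    half = len(signal) // 2
    coarse = [(signal[2 * i] + signal[2 * i + 1]) / S for i in range(half)]
    detail = [(signal[2 * i] - signal[2 * i + 1]) / S for i in range(half)]
    return coarse, detail

def synthesize(coarse, detail):
    """Inverse stage: rebuild the next higher resolution level."""
    out = []
    for c, d in zip(coarse, detail):
        out += [(c + d) / S, (c - d) / S]
    return out

def decompose(signal, levels):
    """Repeatedly analyze the coarse signal; keep the detail of each level."""
    coarse, details = list(signal), []
    for _ in range(levels):
        coarse, d = analyze(coarse)
        details.append(d)
    return coarse, details

def reconstruct(coarse, details):
    """Undo the decomposition, coarsest level first."""
    for d in reversed(details):
        coarse = synthesize(coarse, d)
    return coarse

signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
coarse, details = decompose(signal, levels=3)
restored = reconstruct(coarse, details)
print([len(d) for d in details], len(coarse))  # prints [4, 2, 1] 1
```

The total number of coefficients (4 + 2 + 1 detail values plus 1 coarse value) equals the number of input samples, and `restored` matches `signal` up to floating-point rounding.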
In particular, the discrete wavelet transform is usually related to two pairs of filters. One pair comprises the lowpass $\{\tilde{h}_k\}$ and highpass $\{\tilde{g}_k\}$ analysis filters, and the other pair comprises the lowpass $\{h_k\}$ and highpass $\{g_k\}$ synthesis filters. The lowpass filters come from a respective pair of biorthogonal functions. One biorthogonal function is an analysis scaling function $\tilde{\phi}(x)$ and the other is a synthesis scaling function $\phi(x)$. $\tilde{\phi}(x)$ and $\phi(x)$ are defined by the following respective refinement equations:
$$\tilde{\phi}(x) = \sqrt{2}\sum_k \tilde{h}_k\,\tilde{\phi}(2x-k), \qquad \phi(x) = \sqrt{2}\sum_k h_k\,\phi(2x-k). \tag{3.1}$$
In the same manner, the highpass filters come from another pair of biorthogonal functions. The first biorthogonal function is an analysis wavelet function $\tilde{\psi}(x)$ and the other is a synthesis wavelet function $\psi(x)$. $\tilde{\psi}(x)$ and $\psi(x)$ are defined by the following related refinement equations:
$$\tilde{\psi}(x) = \sqrt{2}\sum_k \tilde{g}_k\,\tilde{\phi}(2x-k), \qquad \psi(x) = \sqrt{2}\sum_k g_k\,\phi(2x-k). \tag{3.2}$$
A more elaborate mathematical discussion of the filters can be found in David F. Walnut, An Introduction to Wavelet Analysis, Birkhäuser, 2001, which is herein incorporated in its entirety by reference into the specification.
Biorthogonality then implies the following conditions:
$$\sum_k h_k\,\tilde{h}_{k-2l} = \sum_k g_k\,\tilde{g}_{k-2l} = \delta_l, \qquad \sum_k h_k\,\tilde{g}_{k-2l} = \sum_k g_k\,\tilde{h}_{k-2l} = 0, \qquad \delta_l = \begin{cases} 1, & l = 0 \\ 0, & l \neq 0 \end{cases} \tag{3.3}$$
which are the equivalent of the perfect reconstruction property in signal analysis; see M. Vetterli and J. Kovacevic, Wavelets and Sub-band Coding, Prentice Hall, 1995, which is herein incorporated in its entirety by reference into the specification. The DWT and IDWT can now be defined to move from the signal domain to the wavelet domain and vice versa. In particular, based on the input signal $\{c^{(1)}(i) \in \mathbb{R}\}$ and the analysis filters as above, the DWT can be defined as follows:
$$c^{(0)}(i) = \sum_j \tilde{h}_{j-2i}\,c^{(1)}(j), \qquad d^{(0)}(i) = \sum_j \tilde{g}_{j-2i}\,c^{(1)}(j), \tag{3.4}$$
where $c^{(0)}(i)$ denotes the lower resolution representation of the original signal and $d^{(0)}(i)$ denotes the additional detailed parts of the signal. Conversely, given the lower resolution representation $\{c^{(0)}(i) \in \mathbb{R}\}$ and the additional detailed parts $\{d^{(0)}(i) \in \mathbb{R}\}$, the IDWT is defined as:
$$\begin{aligned} c^{(1)}(2i) &= \sum_j h_{2(i-j)}\,c^{(0)}(j) + \sum_j g_{2(i-j)}\,d^{(0)}(j), \\ c^{(1)}(2i+1) &= \sum_j h_{2(i-j)+1}\,c^{(0)}(j) + \sum_j g_{2(i-j)+1}\,d^{(0)}(j). \end{aligned} \tag{3.5}$$
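As an illustration of equations (3.3) to (3.5), the following minimal sketch checks the biorthogonality sums for one concrete filter pair and then runs a DWT followed by an IDWT. Several things here are assumptions not taken from the text: the LeGall 5/3 filter coefficients, the alternating-sign relations $\tilde{g}_k = (-1)^k h_{1-k}$ and $g_k = (-1)^k \tilde{h}_{1-k}$ used to derive the highpass filters, the periodic extension of the finite signal, and all function names.

```python
import math

A = math.sqrt(2)

# Assumed example filters (LeGall 5/3 pair), stored as {tap index k: value}.
h_syn = {-1: 0.5 / A, 0: 1.0 / A, 1: 0.5 / A}                       # h_k
h_ana = {-2: -A / 8, -1: A / 4, 0: 3 * A / 4, 1: A / 4, 2: -A / 8}  # h~_k

def sign(k):
    """(-1)^k for any integer k, including negative k."""
    return 1 if k % 2 == 0 else -1

# Highpass filters from the alternating-sign relations (assumed, see lead-in).
g_ana = {k: sign(k) * h_syn[1 - k] for k in (0, 1, 2)}              # g~_k
g_syn = {k: sign(k) * h_ana[1 - k] for k in (-1, 0, 1, 2, 3)}       # g_k

def cross(a, b, l):
    """Sum over k of a_k * b_{k-2l}; taps outside the support count as zero."""
    return sum(ak * b.get(k - 2 * l, 0.0) for k, ak in a.items())

def is_biorthogonal(tol=1e-12):
    """Check the four sums of (3.3) for a range of shifts l."""
    for l in range(-3, 4):
        delta = 1.0 if l == 0 else 0.0
        sums = (cross(h_syn, h_ana, l) - delta, cross(g_syn, g_ana, l) - delta,
                cross(h_syn, g_ana, l), cross(g_syn, h_ana, l))
        if any(abs(s) > tol for s in sums):
            return False
    return True

def dwt(c1):
    """Equation (3.4), with j = 2i + k and periodic signal extension."""
    n = len(c1)
    c0 = [sum(w * c1[(2 * i + k) % n] for k, w in h_ana.items()) for i in range(n // 2)]
    d0 = [sum(w * c1[(2 * i + k) % n] for k, w in g_ana.items()) for i in range(n // 2)]
    return c0, d0

def idwt(c0, d0):
    """Equation (3.5): output index 2j + k covers even and odd samples alike."""
    n = 2 * len(c0)
    c1 = [0.0] * n
    for j in range(len(c0)):
        for k, w in h_syn.items():
            c1[(2 * j + k) % n] += w * c0[j]
        for k, w in g_syn.items():
            c1[(2 * j + k) % n] += w * d0[j]
    return c1

signal = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]
c0, d0 = dwt(signal)
restored = idwt(c0, d0)
max_error = max(abs(a - b) for a, b in zip(signal, restored))
print("biorthogonal:", is_biorthogonal(), "max reconstruction error:", max_error)
```

Storing each filter as a dictionary keyed by tap index keeps the code close to the subscripts in the equations, including negative indices. The two cases of (3.5) collapse into one loop because writing the output index as $2j + k$ yields even samples for even $k$ and odd samples for odd $k$.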
In known applications, such as encoders, decoders, compressors, and decompressors, the input to be compressed comprises highly correlated signals, such as voice samples, audio samples, pixel-based images, and video. The DWT is used to compute wavelet coefficients that capture these correlations. The transformed signals can be efficiently compressed, usually in a lossy way, yielding an encoded bit-stream that can be transmitted over a computer network such as the Internet or stored on a medium such as a DVD. Applications designed to decode such bit-streams reconstruct the signal using the IDWT. Though the resulting signal is not mathematically identical to the original, it can be used for many purposes.
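The lossy round trip described above can be sketched as follows. The sketch assumes a single-level Haar transform and uniform quantization with one step size; both are illustrative choices rather than the method of any particular codec, the entropy coding stage is omitted, and the function names are hypothetical.

```python
import math

S = math.sqrt(2)

def haar_forward(x):
    """Single-level Haar DWT: coarse and detail coefficients."""
    half = len(x) // 2
    c = [(x[2 * i] + x[2 * i + 1]) / S for i in range(half)]
    d = [(x[2 * i] - x[2 * i + 1]) / S for i in range(half)]
    return c, d

def haar_inverse(c, d):
    """Single-level Haar IDWT."""
    out = []
    for ci, di in zip(c, d):
        out += [(ci + di) / S, (ci - di) / S]
    return out

def quantize(values, step):
    """Map each coefficient to an integer index (this is the lossy step)."""
    return [round(v / step) for v in values]

def dequantize(indices, step):
    """Map integer indices back to approximate coefficient values."""
    return [q * step for q in indices]

# Neighboring samples are highly correlated, so the detail values are small.
signal = [100, 102, 101, 103, 150, 152, 151, 149]
step = 4.0

c, d = haar_forward(signal)
qc, qd = quantize(c, step), quantize(d, step)
restored = haar_inverse(dequantize(qc, step), dequantize(qd, step))
max_error = max(abs(a - b) for a, b in zip(signal, restored))

print("quantized detail indices:", qd)
print("max reconstruction error:", round(max_error, 2))
```

With this input every detail coefficient quantizes to zero, so only the coarse indices carry information, while the reconstruction differs from the original by a small, bounded amount.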
International Pat. App. Pub. No. WO2007/083312, published in July 2007, which is incorporated herein by reference, discloses methods and apparatuses for compressing and decompressing digital data. The method for compressing digital data comprises a number of steps: a) generating a vector-valued dataset according to the digital data, b) transforming the vector-valued dataset into multiwavelet coefficients, and c) entropically coding the multiwavelet coefficients. The method for decompressing digital data is substantially made up of the same steps as the method for compressing digital data, performed in reverse.