Compact image representation is a well-known problem. Techniques proposed over the years include non-linear techniques, such as vector quantization, in which an image is represented by its index in a vector dictionary, and linear representations (e.g., wavelet transform based representations, Fourier transform based representations, Discrete Cosine Transform (DCT) based representations, etc.), in which an image is linearly transformed and represented in terms of its linear transform coefficients. Linear representations are often augmented with simple non-linear processing to further extend their effectiveness.
One of the most important properties of compact representations is their ability to approximate an image using few parameters. The approximation rate of a representation can be obtained as the reduction of representation error as more parameters are used in the representation. For example, this rate can be obtained by calculating the reduction of the mean squared error between the original image and its approximation using the given representation as more parameters are added to the representation. With some exceptions, representations with a high approximation rate (smaller error with a given number of parameters) are expected to yield better performance in compression, denoising, and various other applications.
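The approximation rate described above can be illustrated with a minimal sketch. It assumes a 1-D signal, an orthonormal DCT as the linear representation, and a "keep the k largest coefficients" rule for selecting parameters. The function names (`dct_matrix`, `approximation_errors`) and the test signal are hypothetical choices made for illustration only.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix; each row is one basis vector."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] /= np.sqrt(2.0)  # rescale the constant row so all rows have unit norm
    return m

def approximation_errors(signal, transform, counts):
    """Mean squared error when only the k largest coefficients are kept."""
    coeffs = transform @ signal
    order = np.argsort(-np.abs(coeffs))  # largest magnitude first
    errors = []
    for k in counts:
        kept = np.zeros_like(coeffs)
        kept[order[:k]] = coeffs[order[:k]]
        approx = transform.T @ kept  # inverse of an orthonormal transform
        errors.append(float(np.mean((signal - approx) ** 2)))
    return errors

n = 64
x = np.linspace(0.0, 1.0, n)
signal = np.where(x < 0.4, x, x + 1.0)  # smooth ramp with one isolated jump
counts = [1, 2, 4, 8, 16, 32, 64]
errors = approximation_errors(signal, dct_matrix(n), counts)

for k, e in zip(counts, errors):
    print(f"{k:3d} coefficients -> MSE {e:.6f}")
```

How quickly the printed error falls as the number of coefficients grows is the approximation rate of this particular representation on this particular signal.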
Solutions for linear representations achieving an optimal or near optimal approximation rate for one dimensional (1-D) signals containing isolated singularities are known. For example, it is known that linear transforms based on compact wavelets with vanishing moments can achieve near optimal approximation rates. However, straightforward generalizations of these representations to two dimensions (e.g., two dimensional (2-D) wavelet transforms) for use with two dimensional images are known to be suboptimal. For purposes herein, these straightforward generalizations are referred to as first generation linear representations.
There are many first generation linear representations and compression algorithms based on first generation linear representations. However, these solutions are known to be suboptimal on images and video that manifest singularities along curves. That is, first generation representations and techniques based on them result in too many coefficients or parameters around singularities. While some compression techniques are very good at encoding coefficients, they result in suboptimal performance since the first generation representations they use produce too many coefficients to encode.
In two dimensional images, singularities lie along curves, whereas the first generation representations can only handle point singularities and are exponentially suboptimal in two dimensions. FIGS. 1A-C illustrate the use of compact wavelets for signals of various dimensions. Referring to FIG. 1A, compact wavelets are shown leading to near optimality for 1-D signals, and FIG. 1B illustrates compact wavelets leading to near optimality for 2-D signals with point singularities. However, as indicated in FIG. 1C, compact wavelets are suboptimal for 2-D signals with singularities over curves. That is, the signal in FIG. 1C manifests a singularity along a curve, and over such signals the two dimensional wavelet transform does not produce near optimal approximation rates. Interestingly, current state-of-the-art image compression techniques are based on these first generation representations. Hence, it is well-known in the research community that current state-of-the-art image compression techniques are suboptimal.
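The contrast between point singularities and singularities along curves can be made concrete with a small sketch. It assumes a separable orthonormal 2-D Haar transform as a stand-in for a first generation representation, a single bright pixel as the point singularity, and the boundary of a disk as the curve. All names, the image size, and the threshold are hypothetical and chosen only for illustration. The sketch shows the qualitative trend on one pair of images, not the asymptotic rates.

```python
import numpy as np

def haar_1d_step(a):
    """One orthonormal Haar level along the last axis: [averages | details]."""
    even, odd = a[..., 0::2], a[..., 1::2]
    return np.concatenate([even + odd, even - odd], axis=-1) / np.sqrt(2.0)

def haar_2d(image):
    """Multi-level separable 2-D Haar transform of a square power-of-two image."""
    out = image.astype(float).copy()
    size = out.shape[0]
    while size > 1:
        block = haar_1d_step(out[:size, :size])  # transform along rows
        block = haar_1d_step(block.T).T          # transform along columns
        out[:size, :size] = block
        size //= 2                               # recurse on the low-pass quadrant
    return out

def count_significant(image, threshold=1e-3):
    """Number of transform coefficients whose magnitude exceeds the threshold."""
    return int(np.sum(np.abs(haar_2d(image)) > threshold))

n = 64
point_image = np.zeros((n, n))
point_image[20, 37] = 1.0  # isolated point singularity

yy, xx = np.mgrid[0:n, 0:n]
curve_image = ((xx - 32) ** 2 + (yy - 32) ** 2 < 20 ** 2).astype(float)  # edge along a circle

point_count = count_significant(point_image)
curve_count = count_significant(curve_image)
print("significant coefficients, point singularity:", point_count)
print("significant coefficients, curve singularity:", curve_count)
```

For the point singularity, only a few coefficients per scale are affected, whereas every wavelet whose support straddles the curve produces a significant coefficient, so the count for the curve is much larger.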
Recently, second generation representations that are aimed at overcoming the suboptimality of the first generation representations have been introduced. These techniques are typically designed using idealized mathematical models of images defined over continuous domains. Digital images, on the other hand, are defined on a discrete grid and fail to satisfy many of the core assumptions of these methods. Hence, these techniques currently cannot go beyond state-of-the-art first generation techniques even though they should be exponentially better than first generation techniques.
Some of the best second generation representations, such as complex wavelets, are expansive/overcomplete, meaning they result in more parameters than image pixels. While many of these extra parameters are small, compression techniques that effectively (in a rate-distortion sense) take advantage of compaction in such an expansive domain are yet to be developed.
Other representations that are more in tune with the properties of digital images, and compression algorithms based on these representations, exist. However, their performance improvement over first generation techniques is still lacking.
Some compression algorithms also try to improve performance around singularities by using directional prediction (see, for example, the INTRA frame coding method used in Joint Video Team of ITU-T and ISO/IEC JTC 1, “Draft ITU-T Recommendation and Final Draft International Standard of Joint Video Specification (ITU-T Rec. H.264 | ISO/IEC 14496-10 AVC),” Joint Video Team (JVT) of ISO/IEC MPEG and ITU-T VCEG, JVT-G050, March 2003). Such solutions are only applicable over piecewise smooth image models with linear or line-like singularities. Furthermore, as they try to predict large regions using a limited class of predictors, pixels away from the boundary of available data are predicted incorrectly. Similarly, when singularities lie along curves rather than being line-like, or when image statistics are not locally smooth, these methods fail.
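The limitation of directional prediction noted above can be sketched with a deliberately simplified predictor. The sketch assumes a single "vertical" mode that copies the row of already available pixels above a block straight down. It is not the INTRA coding method of the cited standard, and the block size, edge shapes, and function names are hypothetical.

```python
import numpy as np

def predict_vertical(top_row, height):
    """Copy the available row above the block down into every row of the block."""
    return np.tile(top_row, (height, 1))

def row_errors(block, top_row):
    """Mean squared prediction error of each row, ordered by distance from the top."""
    prediction = predict_vertical(top_row, block.shape[0])
    return np.mean((block - prediction) ** 2, axis=1)

size = 8
cols = np.arange(size)
rows = np.arange(1, size + 1)

# Row of available pixels above the block: an edge between columns 3 and 4.
top_row = (cols >= 4).astype(float)

# Line-like singularity: the edge stays at the same column in every row.
straight_block = np.tile(top_row, (size, 1))

# Curved singularity: the edge position drifts quadratically with the row index.
edge_pos = 4.0 - rows.astype(float) ** 2 / 16.0
curved_block = (cols[None, :] >= edge_pos[:, None]).astype(float)

straight_err = row_errors(straight_block, top_row)
curved_err = row_errors(curved_block, top_row)
print("straight edge, error per row:", straight_err)
print("curved edge,   error per row:", curved_err)
```

In this toy setting the straight edge is predicted exactly, while for the curved edge the error grows with the distance from the available row.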
Methods that generalize directional predictors by deploying transforms over directional lines are also limited to line-like singularities. Furthermore, they need to design their compression algorithms over blocks of varying sizes, which results in inefficiencies when the resulting coefficients are encoded with entropy coders.