This invention relates to image compression, and more particularly, to image and video compression methods and devices.
Recently, Digital Still Cameras (DSCs) have become a very popular consumer appliance, appealing to a wide variety of users ranging from photo hobbyists, web developers, real estate agents, insurance adjusters, and photo-journalists to everyday photography enthusiasts. Recent advances in high-resolution CCD arrays, coupled with the availability of low-power digital signal processors (DSPs), have led to the development of DSCs that have the resolution and quality offered by traditional film cameras. These DSCs offer several additional advantages compared to traditional film cameras in terms of data storage, manipulation, and transmission. The digital representation of captured images enables the user to easily incorporate the images into any type of electronic media and transmit them over any type of network; see FIG. 10. The ability to instantly view and selectively store captured images provides the flexibility to minimize film waste and to instantly determine whether an image needs to be captured again. With its digital representation, the image can be corrected, altered, or modified after its capture, and stored on memory cards for battery-powered cameras.
Further, DSCs can be extended to capture video clips (short video sequences) and to compress (sequences of) images with methods such as JPEG or JPEG2000. FIGS. 9a-9b depict functions and blocks of a digital camera system with the image compression block providing JPEG, JPEG2000, and/or other compressions. JPEG provides compression by transforming 8×8 blocks of pixels into the frequency domain with an 8×8 DCT (discrete cosine transform) and then quantizing the DCT coefficient blocks, scanning the 8×8 quantized coefficients into a one-dimensional sequence, and variable length coding (VLC) the sequence.
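The JPEG steps described above (8×8 DCT, quantization, and scanning into a one-dimensional sequence) can be sketched as follows. This is a minimal illustration and not a conforming JPEG encoder: the single uniform quantizer step `step`, the function names, and the omission of level shifting and of the final VLC stage are all simplifying assumptions made for this example.

```python
import math

N = 8  # JPEG block size

def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block (direct, unoptimized form)."""
    def scale(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * s
    return out

def quantize(coeffs, step=16):
    """Uniform quantization with one step size (assumption: real JPEG
    uses an 8x8 table of per-frequency step sizes)."""
    return [[int(round(c / step)) for c in row] for row in coeffs]

def zigzag_order(n=N):
    """(row, col) positions in zigzag order: walk the anti-diagonals,
    alternating direction on each one."""
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        key=lambda rc: (rc[0] + rc[1],
                        rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )

def encode_block(block, step=16):
    """DCT -> quantize -> zigzag scan; the VLC stage is left out."""
    q = quantize(dct_2d(block), step)
    return [q[r][c] for r, c in zigzag_order()]

if __name__ == "__main__":
    flat = [[128] * N for _ in range(N)]  # a constant gray block
    seq = encode_block(flat)
    print(seq[:4])  # only the DC term is non-zero: [64, 0, 0, 0]
```

For a smooth block, nearly all of the energy lands in the first few entries of the scanned sequence, leaving long runs of zeros that a variable length coder can represent compactly.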
In contrast to JPEG, JPEG2000 uses wavelet decomposition, supports both lossy and lossless compression, enables progressive transmission by resolution (which can generate a small image from the code for the full-size image), and facilitates scalable video with respect to resolution, bit-rate, color component, or position with transcoding by using Motion JPEG2000. Indeed, FIGS. 11a-11b illustrate JPEG2000 image analysis with a three-level wavelet decomposition indicated in the lower right portion of FIG. 11b. See also Christopoulos et al, The JPEG2000 Still Image Coding System: An Overview, 46 IEEE Trans. Cons. Elect. 1103 (2000).
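A three-level two-dimensional wavelet decomposition, and the way its coarsest lowpass band serves as a small version of the full-size image, can be sketched as follows. For brevity this example assumes an averaging Haar filter pair as a stand-in; JPEG2000 itself specifies other wavelet filters, and the function names here are hypothetical.

```python
def haar_1d(seq):
    """One level of an averaging Haar split: (lowpass, highpass) halves."""
    half = len(seq) // 2
    low = [(seq[2 * i] + seq[2 * i + 1]) / 2.0 for i in range(half)]
    high = [(seq[2 * i] - seq[2 * i + 1]) / 2.0 for i in range(half)]
    return low, high

def _split_columns(mat):
    """Apply haar_1d down every column; return (low, high) matrices."""
    lows, highs = [], []
    for col in zip(*mat):            # iterate over columns
        l, h = haar_1d(list(col))
        lows.append(l)
        highs.append(h)
    # transpose back so the results are lists of rows again
    return [list(r) for r in zip(*lows)], [list(r) for r in zip(*highs)]

def haar_2d(img):
    """One 2-D level: rows first, then columns. Returns LL, HL, LH, HH."""
    row_low, row_high = [], []
    for row in img:
        l, h = haar_1d(row)
        row_low.append(l)
        row_high.append(h)
    ll, lh = _split_columns(row_low)    # horizontal lowpass
    hl, hh = _split_columns(row_high)   # horizontal highpass
    return ll, hl, lh, hh

def decompose(img, levels=3):
    """Recursively split the LL band; details[0] is the finest scale."""
    details = []
    ll = img
    for _ in range(levels):
        ll, hl, lh, hh = haar_2d(ll)
        details.append((hl, lh, hh))
    return ll, details

if __name__ == "__main__":
    img = [[float(r + c) for c in range(16)] for r in range(16)]
    ll, details = decompose(img, levels=3)
    print(len(ll), len(ll[0]))  # 2 2: a 16x16 image reduced 8x per axis
    print(ll[0][0])             # 7.0: the mean of the top-left 8x8 block
```

Decoding only the coarsest LL band and the first one or two sets of detail bands yields a reduced-resolution image without touching the remaining data, which is the basis of progression by resolution.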
However, the real wavelet transforms used in JPEG2000 suffer from three shortcomings: (i) lack of shift invariance, (ii) lack of directionality, and (iii) lack of explicit phase information. Complex wavelet transforms, in which the real and imaginary parts of the transform coefficients are an approximate Hilbert-transform pair, offer solutions to these three shortcomings. This enables efficient statistical models for the coefficients that are also geometrically meaningful. Indeed, there are distinct relationships between complex coefficient magnitudes and edge orientations, and between complex coefficient phases and edge positions. These relationships allow development of an effective hidden Markov tree model for the complex wavelet coefficients; see, for example, Choi et al, Hidden Markov Tree Modeling of Complex Wavelet Transforms, 2000 IEEE ICASSP 133. Unfortunately, the success of geometric modeling in complex wavelet coefficients has been limited to the class of redundant, or over-complete, complex transforms. This redundancy complicates any application to problems such as image/video compression for DSCs and for wireless-linked Internet transmission, where parsimonious signal representations are critical.
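The lack of shift invariance in a real wavelet transform, and the way a quadrature (Hilbert-pair) analysis separates magnitude from phase, can be illustrated with a small numerical sketch. Two simplifications are assumed here: a one-level Haar highpass stands in for a real wavelet transform, and a single complex exponential stands in for a complex wavelet, so the second half of the example shows the principle only and is not a complex wavelet transform.

```python
import cmath
import math

def haar_detail(x):
    """Highpass (detail) coefficients of a one-level real Haar transform."""
    return [(x[2 * i] - x[2 * i + 1]) / 2.0 for i in range(len(x) // 2)]

def step_edge(n, edge):
    """Length-n signal that jumps from 0 to 1 at index `edge`."""
    return [0.0 if i < edge else 1.0 for i in range(n)]

def quadrature_coeff(shift, n=16, cycles=2):
    """Project a shifted cosine onto cos - j*sin at the same frequency.
    The real and imaginary analysis functions are 90 degrees apart,
    i.e. they form a Hilbert-transform pair."""
    w = 2.0 * math.pi * cycles / n
    return sum(math.cos(w * (k - shift)) * cmath.exp(-1j * w * k)
               for k in range(n))

if __name__ == "__main__":
    # Real transform: a one-sample shift of the edge changes the
    # coefficients from all-zero to a single non-zero value.
    print(haar_detail(step_edge(8, 4)))  # [0.0, 0.0, 0.0, 0.0]
    print(haar_detail(step_edge(8, 3)))  # [0.0, -0.5, 0.0, 0.0]

    # Quadrature pair: the magnitude stays at 8 for every shift while
    # the phase tracks the shift; the real part alone swings widely.
    for s in (0, 1, 2):
        c = quadrature_coeff(s)
        print(s, round(abs(c), 6), round(cmath.phase(c), 6), round(c.real, 6))
```

In the second half of the sketch the coefficient magnitude is unchanged by the shift and the position information moves entirely into the phase, which is the kind of magnitude and phase relationship the geometric models mentioned above exploit.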
To address the redundancy problem, Fernandes et al, A New Directional, Low-Redundancy, Complex-Wavelet Transform, 2001 IEEE ICASSP 3653, achieved low redundancy by projection and by discarding negative frequencies. Subsequently, Fernandes et al introduced the Non-Redundant Complex Wavelet Transforms (NCWT); see, for example, Fernandes et al, A New Framework for Complex Wavelet Transforms, 51 IEEE Trans. Signal Proc. 1825 (2003). However, this implementation can be viewed as a combination of a downsampled positive-frequency projection filter with a traditional dual-band real wavelet transform. Therefore, at the finest scale, the complex wavelet transform has a resolution 4x lower than that of the real input signal. These NCWTs do enjoy directionality and explicit phase information because of the approximate Hilbert-transform relationship between the real and imaginary parts of their transform coefficients. To date, however, they have been significantly less amenable to geometric modeling than their redundant counterparts.
T. D. Tran et al, Linear-Phase Perfect Reconstruction Filter Bank: Lattice Structure, Design, and Application in Image Coding, 48 IEEE Trans. Signal Proc. 133 (2000) discloses general methods of filter bank design.