This invention pertains generally to methods and systems for compression of digital images, and more particularly to coding of a segmented image using multiple wavelet transforms.
Conventional digital images represent a visual scene using a relatively large amount of data. Visual scenes are usually digitized in a pixel grid of rows and columns, with each pixel allocated a fixed number of bits to represent gray shade or color. For example, a typical personal computer screen can display an image 1024 pixels wide, 768 pixels high, with 16 bits allocated for each pixel to display color; a single such image requires over 12.5 million bits of storage. If this same screen were used to display digital video at 60 frames per second, the video would require a data rate of 755 million bits per second, roughly the combined data rate of 12,000 conventional telephone conversations. Digital image technology now extends, and will continue to be extended, into applications where data volumes such as those exemplified above are undesirable, and in many instances, unworkable.
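The storage and data-rate figures above can be checked with a few lines of arithmetic. The sketch below uses the display parameters given in the text; the 64,000 bits-per-second telephone channel rate is an assumption introduced here for the comparison and is not stated in the original.

```python
# Figures from the text: a 1024 x 768 display, 16 bits per pixel, 60 frames/s.
WIDTH, HEIGHT, BITS_PER_PIXEL, FRAMES_PER_SECOND = 1024, 768, 16, 60

# Assumption (not in the text): one telephone conversation uses 64,000 bits/s.
PHONE_CHANNEL_BPS = 64_000


def frame_bits(width, height, bits_per_pixel):
    """Bits needed to store one uncompressed frame."""
    return width * height * bits_per_pixel


def video_bits_per_second(width, height, bits_per_pixel, fps):
    """Uncompressed data rate of a video stream, in bits per second."""
    return frame_bits(width, height, bits_per_pixel) * fps


if __name__ == "__main__":
    still = frame_bits(WIDTH, HEIGHT, BITS_PER_PIXEL)
    rate = video_bits_per_second(WIDTH, HEIGHT, BITS_PER_PIXEL, FRAMES_PER_SECOND)
    print(f"one frame:   {still:,} bits")          # 12,582,912 bits
    print(f"video rate:  {rate:,} bits/s")         # 754,974,720 bits/s
    print(f"phone calls: {rate / PHONE_CHANNEL_BPS:,.0f}")  # 11,796
```

Under the assumed channel rate, the video stream corresponds to about 11,800 simultaneous conversations, consistent with the rounded figure of 12,000 given above.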
Most digital images must be compressed in order to meet transmission bandwidth and/or storage requirements. Lossless image coders generally seek out redundancies in image data (e.g., spatial, intensity, or temporal correlation) that can be coded more efficiently without loss of information content. Compression gains with lossless coders are generally modest. Lossy coders throw away part of the full precision image data during compression. Although many lossy image coders can produce images and videos compressed to only a fraction of a bit per pixel, the quality of a reconstructed lossy-compressed image at a given compression rate may vary greatly from coder to coder.
Some lossy coders transform an image before compressing it. The transform step is intended to let the coder better rank the significance of image information content. The transform coder then keeps only what it determines to be the more significant transformed image information, and discards the remainder. An inverse transform later reconstructs the image from the partial transform data.
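The transform, discard, and inverse-transform sequence described above can be illustrated with a minimal one-dimensional sketch. It assumes a one-level orthonormal Haar transform and ranks significance by coefficient magnitude; both choices, and all function names, are illustrative assumptions rather than the coder of any particular system.

```python
import math

_S = 1.0 / math.sqrt(2.0)  # orthonormal Haar scaling factor


def haar_forward(signal):
    """One-level Haar transform of an even-length sequence: [approx | detail]."""
    approx = [(signal[i] + signal[i + 1]) * _S for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * _S for i in range(0, len(signal), 2)]
    return approx + detail


def haar_inverse(coeffs):
    """Invert haar_forward."""
    half = len(coeffs) // 2
    out = []
    for a, d in zip(coeffs[:half], coeffs[half:]):
        out.extend([(a + d) * _S, (a - d) * _S])
    return out


def keep_largest(coeffs, k):
    """Zero all but the k largest-magnitude coefficients (the lossy step)."""
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    kept = set(order[:k])
    return [c if i in kept else 0.0 for i, c in enumerate(coeffs)]


def lossy_roundtrip(signal, k):
    """Transform, keep k coefficients, and reconstruct from the partial data."""
    return haar_inverse(keep_largest(haar_forward(signal), k))


if __name__ == "__main__":
    samples = [4, 4, 4, 4, 10, 10, 2, 3]
    print(lossy_roundtrip(samples, 4))  # last pair is reconstructed as its mean
```

Keeping every coefficient reproduces the input up to floating-point rounding; keeping fewer replaces the discarded detail with zeros, so the affected sample pairs collapse toward their average.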
Different transforms parse image information in different ways. A discrete cosine transform (DCT) represents an image in terms of its sinusoidal spatial frequency. A discrete wavelet transform (DWT) represents an image using coefficients representing a combination of spatial location and spatial frequency. Furthermore, how well a DWT parses location and frequency information on a given image depends on the particular wavelet function employed by the DWT. For instance, the Haar wavelet function efficiently codes text and graphics regions, while the 9-7 tap Daubechies wavelet function performs well for coding natural images.
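To make the notion of coefficients that carry both spatial location and spatial frequency concrete, the following sketch computes a two-level Haar decomposition of a short one-dimensional signal containing a single edge. The unnormalized average/difference form of the Haar filter and the function names are assumptions chosen for readability.

```python
def haar_step(signal):
    """Averages and half-differences of adjacent sample pairs (even length)."""
    averages = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    details = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return averages, details


def haar_pyramid(signal, levels):
    """Return (coarsest averages, detail bands ordered from fine to coarse)."""
    bands = []
    current = list(signal)
    for _ in range(levels):
        current, details = haar_step(current)
        bands.append(details)
    return current, bands


if __name__ == "__main__":
    # A step edge between sample 2 and sample 3.
    edge = [0, 0, 0, 8, 8, 8, 8, 8]
    coarse, bands = haar_pyramid(edge, 2)
    print(bands[0])  # fine scale:   [0.0, -4.0, 0.0, 0.0]
    print(bands[1])  # coarse scale: [-2.0, 0.0]
    print(coarse)    # [2.0, 8.0]
```

Each detail band corresponds to a scale (frequency range), and the index of a nonzero coefficient within the band indicates where the edge lies: here the only nonzero entries are those whose support covers samples 2 and 3.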
A "best" wavelet transform coder can generally be selected from a set of coders for any image, given some measurable quality criteria. It has now been found that this concept can be extended to subregions of an image. The present invention is directed to transform coders capable of processing multiple image subregions, each with a different transform function. Preferably, such transform coders have the capability to process arbitrary-shaped image subregions. For purposes of this disclosure, an image subregion is synonymous with an image "segment".
Prior subregion coders are limited by several constraints that the present invention seeks to overcome. First, existing subregion coders require that each image subregion form a rectangle. Second, prior subregion coders code each subregion separately. Third, prior subregion coders do not lend themselves well to embedded coding techniques.
The present invention utilizes the arbitrary shape wavelet transform (ASWT), which allows an image to be divided into arbitrarily-shaped subregions. The subregions are wavelet-transformed separately with a "best" wavelet filter from a finite set of filters. Then, the wavelet transforms of the image segments are combined in a coherent manner prior to coding. This combination step allows the coder to "optimally" allocate bits between subregions, each having been "optimally" transformed.
In one aspect of the present invention, a method for wavelet transform coding of a segmented digital image is disclosed. The method comprises applying a first wavelet transform filter at a given wavelet decomposition level to a first segment of an image, thereby obtaining a first set of transform coefficients. A second wavelet transform filter is applied at the same wavelet decomposition level to a second segment of the image to obtain a second set of transform coefficients. The first and second sets of transform coefficients are then merged to form a composite wavelet coefficient image. The composite wavelet coefficient image may then be coded with any conventional wavelet transform coder; implicitly, the coder will jointly allocate bits to each segment optimally, through bit allocation on the composite wavelet coefficient image. This method may be extended to include additional wavelet transform filters and finer image segmentation.
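The method can be sketched in one dimension as follows. This is a simplified illustration under stated assumptions, not the ASWT itself: segments are contiguous runs with even start and stop indices, the two filters are a Haar filter and a 5/3 lifting filter with mirrored boundaries (chosen here only because they are short to write), and the placement rule puts each segment's low-pass and high-pass coefficients at the positions a single one-level transform of the whole signal would have used. All function names are hypothetical.

```python
def haar_analysis(x):
    """One-level Haar analysis: pair averages (low) and half-differences (high)."""
    low = [(x[2 * i] + x[2 * i + 1]) / 2 for i in range(len(x) // 2)]
    high = [(x[2 * i + 1] - x[2 * i]) / 2 for i in range(len(x) // 2)]
    return low, high


def lifting53_analysis(x):
    """One-level 5/3 lifting analysis with mirrored segment boundaries."""
    n = len(x) // 2
    high = []
    for i in range(n):
        # Predict each odd sample from the mean of its even neighbours.
        right = x[2 * i + 2] if 2 * i + 2 < len(x) else x[2 * i]
        high.append(x[2 * i + 1] - (x[2 * i] + right) / 2)
    low = []
    for i in range(n):
        # Update each even sample with a quarter of the adjacent details.
        prev = high[i - 1] if i > 0 else high[0]
        low.append(x[2 * i] + (prev + high[i]) / 4)
    return low, high


def composite_transform(signal, segments):
    """Transform each (start, stop, analysis_fn) segment and merge the results.

    The composite is laid out as [low band | high band], as if produced by a
    single one-level transform of the whole signal.
    """
    half = len(signal) // 2
    composite = [0.0] * len(signal)
    for start, stop, analysis in segments:
        if start % 2 or stop % 2:
            raise ValueError("this sketch assumes even segment boundaries")
        low, high = analysis(signal[start:stop])
        composite[start // 2:stop // 2] = low
        composite[half + start // 2:half + stop // 2] = high
    return composite


if __name__ == "__main__":
    signal = [10, 10, 10, 10, 1, 2, 3, 4]
    segments = [(0, 4, haar_analysis), (4, 8, lifting53_analysis)]
    print(composite_transform(signal, segments))
    # [10.0, 10.0, 1.0, 3.25, 0.0, 0.0, 0.0, 1.0]
```

Because the merged output has the same layout as an ordinary single-filter transform, a downstream coder can operate on it without knowing which filter produced which coefficient, which is what permits bit allocation across segments to happen implicitly.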
Preferably, the wavelet transform set utilizes the arbitrary shape wavelet transform (ASWT), which can process image segments in almost any shape. The present invention also allows joint bit allocation with embedded coding.
The present invention performs two types of image segmentation. In the first type, segmentation decisions are input to the process externally. In the second type, segmentation is coupled with filter assignment, such that segmentation in at least some sense tracks optimal spatial filter assignment.
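The second type of segmentation, in which segment boundaries follow filter assignment, might be sketched as follows. The selection criterion used here (the smallest sum of absolute detail coefficients per fixed-size block) is an assumption made for illustration; the text does not specify the criterion. The two candidate filters, the block size, and all names are likewise hypothetical.

```python
def haar_details(block):
    """Haar detail coefficients: half-differences of adjacent pairs."""
    return [(block[i + 1] - block[i]) / 2 for i in range(0, len(block), 2)]


def linear_details(block):
    """Details from predicting each odd sample by the mean of its even
    neighbours, mirroring at the right edge of the block."""
    out = []
    for i in range(0, len(block), 2):
        right = block[i + 2] if i + 2 < len(block) else block[i]
        out.append(block[i + 1] - (block[i] + right) / 2)
    return out


FILTER_BANK = {"haar": haar_details, "linear": linear_details}


def detail_cost(details):
    """Assumed cost: total detail magnitude (smaller suggests a better fit)."""
    return sum(abs(d) for d in details)


def assign_filters(signal, block_len, bank=FILTER_BANK):
    """Pick, for each fixed-size block, the filter with the lowest cost.
    Ties go to the filter listed first in the bank."""
    assignment = []
    for start in range(0, len(signal), block_len):
        block = signal[start:start + block_len]
        best = min(bank, key=lambda name: detail_cost(bank[name](block)))
        assignment.append((start, start + len(block), best))
    return assignment


def merge_runs(assignment):
    """Join adjacent blocks that chose the same filter into one segment."""
    segments = []
    for start, stop, name in assignment:
        if segments and segments[-1][2] == name and segments[-1][1] == start:
            segments[-1] = (segments[-1][0], stop, name)
        else:
            segments.append((start, stop, name))
    return segments


if __name__ == "__main__":
    flat_steps = [5, 5, 9, 9, 5, 5, 9, 9, 2, 2, 7, 7, 2, 2, 7, 7]
    ramp = [1, 2, 3, 4, 5, 6, 7, 8]
    print(merge_runs(assign_filters(flat_steps + ramp, 8)))
    # [(0, 16, 'haar'), (16, 24, 'linear')]
```

In this example the piecewise-constant region is assigned to the Haar filter and the ramp to the linear-prediction filter, so the resulting segmentation follows the filter assignment rather than being supplied externally.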
In another aspect of the invention, a digital image coder is disclosed. The coder comprises an image segmentor, a wavelet filter bank, a composite wavelet coefficient mapper, and, preferably, a transform coder. The image segmentor routes segments of an input image to filters from the wavelet filter bank. Each wavelet filter computes wavelet transform coefficients for its image segment. The composite wavelet coefficient mapper gathers the wavelet coefficients produced by each wavelet filter into a composite coefficient image, arranging them as if they were produced from a single wavelet transform. Finally, the transform coder codes the composite coefficient image.