Efficient transmission of a composite digital image may be accomplished by decomposing the image into different layers with different characteristics and then encoding each layer using a different method. As one example, an image may be separated into text and background layers that are then independently transformed and encoded.
One method of decomposing an image involves masking out the unwanted elements associated with a layer using a binary mask and transforming the remaining elements so that the masked elements do not need to be transmitted. Both the Discrete Cosine Transform (DCT) and the Discrete Wavelet Transform (DWT) have been adapted to support the transformation of arbitrarily-shaped images covered by a binary mask. However, the DWT generally results in higher image quality once an image has been decoded.
When wavelets are located entirely under a mask, the coefficients resulting from the DWT may be set to zero or skipped during the coding process, and the unmasked image may still be fully recovered at a remote system. In this case, the shape-adaptive DWT (SA-DWT) method enables the transform of an arbitrarily-shaped contiguous region in a way that preserves both self-similarity across sub-bands and the number of pixels in the original shape.
However, in many image types, especially images comprising text, the image area is only partially masked. This includes image areas associated with the smallest-scale wavelet. In these situations, partially masked wavelet coefficients cannot simply be cancelled or skipped, as doing so would alter the unmasked pixel values in the recovered image. Rather, the input image or the transformed coefficients are adjusted to compensate for the masked pixels.
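The difference between fully and partially masked wavelets can be illustrated with a minimal sketch. It assumes a one-level, one-dimensional Haar transform over pixel pairs, which is chosen here only for brevity; the function names and the sample values are hypothetical and do not come from any particular codec.

```python
def haar_forward(x):
    """One-level Haar transform over consecutive pixel pairs."""
    approx = [(x[i] + x[i + 1]) / 2 for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / 2 for i in range(0, len(x), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Exact inverse of haar_forward."""
    x = []
    for a, d in zip(approx, detail):
        x.extend([a + d, a - d])
    return x


def recover(pixels, masked, cancel_partial):
    """Transform, drop coefficients under the mask, and reconstruct.

    Pairs lying entirely under the mask always have both coefficients
    set to zero. If cancel_partial is True, the detail coefficient of a
    pair with exactly one masked pixel is cancelled as well.
    """
    approx, detail = haar_forward(pixels)
    for k in range(len(detail)):
        m0, m1 = masked[2 * k], masked[2 * k + 1]
        if m0 and m1:
            approx[k] = 0.0
            detail[k] = 0.0
        elif cancel_partial and (m0 or m1):
            detail[k] = 0.0
    return haar_inverse(approx, detail)


pixels = [50, 52, 200, 10, 90, 30]
masked = [False, False, True, True, False, True]

# Only the fully masked pair is dropped: visible pixels survive.
print(recover(pixels, masked, cancel_partial=False))
# -> [50.0, 52.0, 0.0, 0.0, 90.0, 30.0]

# Cancelling the partially masked pair corrupts the visible pixel 90.
print(recover(pixels, masked, cancel_partial=True))
# -> [50.0, 52.0, 0.0, 0.0, 60.0, 60.0]
```

In the second call the visible pixel at index 4 is reconstructed as 60 instead of 90, which is the distortion the paragraph above describes.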
One approach involves filling masked areas of the image with hallucinated data such as the average value of adjacent pixels. However, hallucination methods increase the image size by adding more data. They also introduce discontinuities at the mask boundaries that blur the output image.
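A hallucination fill of this kind can be sketched as follows. The sketch assumes a grayscale image stored as a list of rows, a 4-connected neighborhood, and a fallback value of 0.0 for masked pixels with no visible neighbor; all of these are illustrative choices, and the function name fill_masked is hypothetical.

```python
def fill_masked(image, mask):
    """Replace each masked pixel with the mean of its visible 4-neighbors.

    image: list of rows of numbers; mask: same shape, True where masked.
    Masked pixels with no visible neighbor are set to 0.0 (an assumption
    made for this sketch).
    """
    height, width = len(image), len(image[0])
    out = [[float(v) for v in row] for row in image]
    for y in range(height):
        for x in range(width):
            if not mask[y][x]:
                continue
            neighbors = [
                image[ny][nx]
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                if 0 <= ny < height and 0 <= nx < width and not mask[ny][nx]
            ]
            out[y][x] = sum(neighbors) / len(neighbors) if neighbors else 0.0
    return out


image = [[10, 20, 30],
         [40, 0, 60],
         [70, 80, 90]]
mask = [[False, False, False],
        [False, True, False],
        [False, False, False]]

print(fill_masked(image, mask)[1][1])  # -> 50.0
```

The filled values are then transformed and coded along with the visible pixels, which is the source of the extra data noted above.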
Another method involves adjusting the masked coefficients by calculating a code-efficient interpolation of visible pixels using methods such as Projection Onto Convex Sets (POCS). POCS generates a set of transform solutions and uses a sequence of data comparisons to determine the optimum transform. This method involves high computational effort. For example, the Déjà vu system was documented to take 15 seconds on an SGI workstation. This high computational effort is inappropriate for real-time image processing applications such as compression of image frames comprising rapidly-changing content.
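The iterative character of POCS can be illustrated with a generic alternating-projection sketch. It is not the algorithm of the system cited above. It assumes a one-level Haar transform on a one-dimensional signal and two convex sets chosen for illustration: signals whose detail coefficients are zero wherever a pixel pair touches the mask, and signals that match the visible pixels. The function names and the iteration count are hypothetical.

```python
def haar_forward(x):
    """One-level Haar transform over consecutive pixel pairs."""
    approx = [(x[i] + x[i + 1]) / 2 for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / 2 for i in range(0, len(x), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Exact inverse of haar_forward."""
    x = []
    for a, d in zip(approx, detail):
        x.extend([a + d, a - d])
    return x


def pocs_fill(pixels, masked, iterations=30):
    """Alternate between a coefficient constraint and a pixel constraint."""
    # Start with masked pixels at zero; their original values are ignored.
    x = [0.0 if m else float(p) for p, m in zip(pixels, masked)]
    for _ in range(iterations):
        # Projection 1: zero the detail coefficient of every pair that
        # contains at least one masked pixel (cheap to encode).
        approx, detail = haar_forward(x)
        for k in range(len(detail)):
            if masked[2 * k] or masked[2 * k + 1]:
                detail[k] = 0.0
        x = haar_inverse(approx, detail)
        # Projection 2: restore the visible pixels exactly.
        x = [xi if m else float(p) for xi, p, m in zip(x, pixels, masked)]
    return x


pixels = [100, 999, 40, 60, 7, 7, 80, 0]
masked = [False, True, False, False, True, True, False, True]
filled = pocs_fill(pixels, masked)
print([round(v, 3) for v in filled])
```

In this toy case each masked pixel in a partially masked pair converges toward its visible partner, halving the error on every pass. The repeated forward and inverse transforms over the whole signal indicate why the approach is costly for larger images and longer filters.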
Another approach is mask-dependent lifting. In this method, the availability of pixels in an input image, as limited by the mask, is evaluated. An appropriate polynomial wavelet function is then derived on a case-by-case basis, and a linear combination of the available unmasked neighboring pixels is generated for both the prediction and update steps of the lifting process. When the lifting method is used for masked sets, the update coefficients are derived from modified predicted coefficients. A problem with this approach is that modified update values are used for each subsequent pass of the wavelet transform, which increases the uncertainty in the low-frequency components as the transform progresses.
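A simplified sketch of mask-dependent lifting is given below for one level of a one-dimensional transform. It assumes the linear (5/3-style) predictor when both neighbors are visible and falls back to a constant predictor when only one is visible; the weights 0.5, 1.0, 0.25, and 0.5 and all function names are illustrative assumptions. The sketch only adapts the filter weights to the mask and does not attempt to omit masked samples from the output.

```python
def _weights(left_ok, right_ok, both, single):
    """Pick neighbor weights according to which neighbors are visible."""
    if left_ok and right_ok:
        return both, both
    if left_ok:
        return single, 0.0
    if right_ok:
        return 0.0, single
    return 0.0, 0.0


def _get(seq, k):
    return seq[k] if 0 <= k < len(seq) else 0.0


def _visible(mask, k):
    return 0 <= k < len(mask) and not mask[k]


def masked_lifting_forward(x, masked):
    even = [float(v) for v in x[0::2]]
    odd = [float(v) for v in x[1::2]]
    even_mask, odd_mask = masked[0::2], masked[1::2]
    # Predict: estimate each odd sample from its visible even neighbors.
    detail = []
    for k in range(len(odd)):
        wl, wr = _weights(_visible(even_mask, k), _visible(even_mask, k + 1), 0.5, 1.0)
        detail.append(odd[k] - (wl * even[k] + wr * _get(even, k + 1)))
    # Update: adjust each even sample using details at visible odd positions.
    approx = []
    for k in range(len(even)):
        wl, wr = _weights(_visible(odd_mask, k - 1), _visible(odd_mask, k), 0.25, 0.5)
        approx.append(even[k] + wl * _get(detail, k - 1) + wr * _get(detail, k))
    return approx, detail


def masked_lifting_inverse(approx, detail, masked):
    even_mask, odd_mask = masked[0::2], masked[1::2]
    even = []
    for k in range(len(approx)):
        wl, wr = _weights(_visible(odd_mask, k - 1), _visible(odd_mask, k), 0.25, 0.5)
        even.append(approx[k] - wl * _get(detail, k - 1) - wr * _get(detail, k))
    odd = []
    for k in range(len(detail)):
        wl, wr = _weights(_visible(even_mask, k), _visible(even_mask, k + 1), 0.5, 1.0)
        odd.append(detail[k] + wl * even[k] + wr * _get(even, k + 1))
    x = []
    for e, o in zip(even, odd):
        x.extend([e, o])
    return x


ramp = [10, 20, 30, 40, 50, 60]
no_mask = [False] * 6
one_masked = [False, False, True, False, False, False]

print(masked_lifting_forward(ramp, no_mask)[1])     # -> [0.0, 0.0, 10.0]
print(masked_lifting_forward(ramp, one_masked)[1])  # -> [10.0, -10.0, 10.0]
```

With no mask, the linear predictor removes the ramp entirely except at the right boundary. Masking a single even sample forces the lower-order predictor on both of its neighbors, so their detail coefficients become nonzero, and the update step then carries those modified values into the low-frequency output that feeds the next pass.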
In summary, existing methods for transforming partially masked images are not optimized to meet the requirements for efficient, high-accuracy transformation of images in real time.