The present invention relates generally to the processing, compression, communication and storage of images in computer systems, personal digital assistants, digital cameras and other devices, and particularly to an image management system and method in which digitally encoded images can be viewed in any specified window size and at a number of resolutions, and can be printed, cropped, and otherwise manipulated.
An image may be stored at a number of resolution levels. The encoded image data for a lower resolution level is smaller, and thus takes less bandwidth to communicate and less memory to store than the data for a higher resolution level. When an image is stored for multi-resolution use, it would be desirable for the image data to be segregated into an ordered group of sets or subfiles, where each additional subfile provides the additional data needed to increase the resolution of the image from one level to the next. Further, it would be desirable for the quantity of image data in each subfile, for increasing the image resolution by a particular factor (such as 4), to be approximately proportional to the associated increase in resolution. For instance, if each resolution level differs from its neighboring resolution levels by a factor of 4 (e.g., level 0: 32×32, level 1: 64×64, level 2: 128×128, and so on), then the quantity of encoded image data for each resolution level should be approximately 25% as much as the quantity of encoded image data for the next higher resolution level. From another viewpoint, the quantity of data in the subfile(s) used to increase the image resolution from a first level to the next should, ideally, be approximately three times as much as the quantity of data in the subfile(s) for the first level.
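The proportionality goal stated above can be illustrated with a short arithmetic sketch. The function name and the 1000-byte Level 0 size are hypothetical and chosen only for illustration; the sketch assumes that the data for each level is the sum of all subfiles up to that level and that each level carries exactly 4 times the data of the level below it.

```python
def ideal_subfile_sizes(level0_bytes, num_levels, factor=4):
    """Return the ideal size of each subfile under the proportionality goal.

    Subfile 0 holds the Level 0 data; subfile k (k >= 1) holds only the
    additional data needed to go from Level k-1 to Level k.
    All sizes are illustrative, not measured values.
    """
    sizes = [level0_bytes]
    cumulative = level0_bytes
    for _ in range(1, num_levels):
        next_cumulative = cumulative * factor        # total data at the next level
        sizes.append(next_cumulative - cumulative)   # the increment only
        cumulative = next_cumulative
    return sizes


sizes = ideal_subfile_sizes(1000, 4)
print(sizes)  # [1000, 3000, 12000, 48000]
```

Each increment is three times the total data of the level it builds on (3000 = 3 × 1000, 12000 = 3 × 4000), so each level holds 25% as much data as the next higher level.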
It is well known that wavelet compression of images automatically generates several resolution levels. In particular, if N “layers” of wavelet transforms are applied to an image, then N+1 resolution levels of data are generated, with the last LL subband of data comprising the lowest resolution level and all the subbands of data together forming the highest resolution level. For convenience, the “layers” of wavelet transforms will sometimes be called “levels”. Each of these resolution levels differs from its neighbors by a factor of two in each spatial dimension. We may label these resolution levels as Level 0 for the lowest, thumbnail level to Level N for the highest resolution level, which is the resolution of the final or base image.
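The relationship between N transform layers and the N+1 resolution levels can be sketched as follows. The function name is hypothetical, and the sketch assumes that the base image dimensions are divisible by 2^N so that each halving is exact.

```python
def resolution_levels(width, height, num_layers):
    """List (level, width, height) from Level 0 (thumbnail) to Level N (base image).

    Assumes width and height are divisible by 2**num_layers.
    """
    levels = []
    for level in range(num_layers + 1):
        shift = num_layers - level  # number of halvings relative to the base image
        levels.append((level, width >> shift, height >> shift))
    return levels


for level, w, h in resolution_levels(256, 256, 3):
    print(level, w, h)
# 0 32 32
# 1 64 64
# 2 128 128
# 3 256 256
```

Three layers applied to a 256×256 image yield four levels, each differing from its neighbors by a factor of two in each dimension, and therefore by a factor of 4 in pixel count.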
A first aspect of the present invention is based on two observations. The first such observation is that, when using conventional as well as most proprietary data compression and encoding methods, the quantity of data in the N levels generated by wavelet compression tends to decrease in a geometric progression. For instance, the quantity of data for resolution Level 0 is typically about 80% of the quantity of data for resolution Level 1, whereas ideally it should be about 25% of the quantity of data for resolution Level 1. As a result, the data for Level 0 contains significantly more data than is needed to display the Level 0 image. Alternately stated, the data for Level 0 gives unnecessarily high quality for the low resolution display at Level 0, and therefore gives less compression than could potentially be obtained by providing only the information needed for displaying the image at the Level 0 resolution level.
The second observation is that the low resolution image data coefficients are quantized for full resolution display, not for low resolution display, because these data coefficients are used not only for generating a low resolution representation of the image, but are also used when generating the higher resolution representations of the image.
In accordance with this first aspect of the present invention, as already indicated above, it would be desirable for the quantity of image data in the subarray or subfile for each resolution level to be approximately proportional to the increase in resolution associated with that resolution level.
A second aspect of the present invention is based on the observation that wavelet transforms are conventionally applied across tile or block boundaries of an image to avoid tile or block boundary artifacts in the regenerated image. A wavelet transform may be implemented as a FIR (finite impulse response) filter having an associated length. The “length” indicates the number of data samples that are used to generate each coefficient. Wavelet transforms are generally symmetric about their center, and when the filter that implements the wavelet transform is at the edge of a tile or block, typically half or almost half of the filter will extend into a neighboring block or tile. As a result, it is usually necessary not only to keep part of the neighboring tiles in memory while wavelet transforming a tile of an image, but also to keep in memory the edge coefficients of the neighboring tiles for each level of the wavelet transform. Thus, avoiding tiling effects (also called tile border effects or artifacts or edge artifacts) typically increases the memory requirements of the computer or device performing the wavelet transforms on an image, and may also increase the complexity of the transform procedure because of the need to keep track of the memory locations of edge data and coefficients from the neighboring tiles or blocks. In accordance with the second aspect of the present invention, it would be highly desirable to have a wavelet or wavelet-like transform that can be applied to just the data for the image block being processed, without having to also apply the transform to data from neighboring blocks, and without creating noticeable edge artifacts. Having such a transform would decrease memory requirements and might simplify the wavelet compression of images.
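The overhang described above can be made concrete with a small sketch. The function name, the filter length of 9, and the tile size of 64 are hypothetical; the sketch assumes an odd-length symmetric filter centered on a sample position within a one-dimensional tile.

```python
def samples_outside_tile(filter_length, center, tile_size):
    """Count filter taps that fall outside a 1-D tile on the left and right.

    Assumes an odd-length filter centered on sample index `center`,
    where valid in-tile indices are 0 .. tile_size - 1.
    """
    half = filter_length // 2
    left = max(0, half - center)                       # taps before index 0
    right = max(0, center + half - (tile_size - 1))    # taps past the last index
    return left, right


print(samples_outside_tile(9, 0, 64))    # (4, 0)
print(samples_outside_tile(9, 32, 64))   # (0, 0)
print(samples_outside_tile(9, 63, 64))   # (0, 4)
```

With a 9-tap filter centered on the first or last sample of the tile, 4 of the 9 taps land in the neighboring tile, which is the "almost half" case noted above. A conventional implementation must therefore hold those neighboring samples, and the corresponding edge coefficients at every transform level, in memory.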
It is well known in the prior art that digital images can be processed a portion at a time, instead of all at once, thereby reducing memory requirements. For instance, the DCT transform used for JPEG compression and encoding of images is traditionally used on tiles of 8×8 pixels. However, a well known problem with tiling an image for processing is that the tiling produces undesirable tile border effects. The border effects of DCT tiling in JPEG images are considered to be acceptable because the very small size of the tiles makes the tiling effect relatively unnoticeable to the human eye.
However, using very small tiles such as 8×8 pixels is not practical when using wavelet or wavelet-like transforms in place of the DCT transform. Wavelet-like transforms have been shown to provide significantly better data compression than the DCT transform, and therefore using wavelet-like transforms would be desirable if the tiling effect can be avoided while using a moderate amount of working memory.
It would therefore be desirable to provide an image processing system and method that process images using a moderate amount of working memory, such as 8 to 20 KB, by transforming the image data using a wavelet-like transform with moderately sized tiles, such as tiles of 64×64, or 32×32, or 64×32 pixels, while at the same time avoiding the generation of undesirable tiling (tile border) effects.
A third aspect of the present invention is based on the observation that the optimal quantization level to be applied to wavelet coefficients not only varies from one transform subband to another, but also varies from one region of an image to another. In particular, regions of an image that contain many “features” (typically characterized by horizontal or vertical lines or edges) are harder to compress than regions with fewer features. That is, such densely featured image regions cannot be compressed as much as less densely featured regions without causing degradation in the quality of the image regions regenerated from the compressed data. It would therefore be desirable to provide an image compression and encoding system with a quantization procedure that uses smaller quantization divisors to quantize the wavelet coefficients of heavily featured regions than the quantization divisors used to quantize the wavelet coefficients of regions having fewer features.
In summary, the present invention is an image processing system and method for applying a family of predefined transforms, such as wavelet-like transforms, to the image data for an image so as to generate transform image data and for applying a data compression method to the transform image data so as to generate an image file. The image processing system and method tiles a captured image, processing the tiles in a predefined order. The tiles are nonoverlapping portions of the image data. Each tile of image data is processed by applying a predefined family of transform layers to the tile of image data so as to generate successive sets of transform coefficients. In a preferred embodiment, the transform layers are successive applications of a family of wavelet-like decomposition transforms, including edge filters applied to data at the boundaries of the data arrays being processed and interior filters applied to data in the interior regions of the data arrays.
The set of transform coefficients processed by each transform layer includes edge coefficients positioned at outside boundaries of the set of transform coefficients and non-edge coefficients positioned at interior locations of the set of transform coefficients. The sets of transform coefficients include a last set of transform coefficients, produced by the last transform layer, and one or more earlier sets of transform coefficients.
The transform filters used include one or more edge transform filters applied to image data at boundaries of the tile and to coefficients positioned at and near boundaries of each of the earlier sets of transform coefficients so as to generate the edge coefficients, and one or more interior filters applied to image data at interior locations of the tile and to coefficients at interior locations of the earlier sets of transform coefficients so as to generate the non-edge coefficients. The edge transform filters have shorter filter supports than the interior transform filters, and both the edge transform filters and the longer interior transform filters are applied only to image data within the tile and only to transform coefficients within the earlier sets of transform coefficients.
The edge filters include a short, low spatial frequency filter that weights the image datum closest to the boundary of the tile and the transform coefficient closest to the boundary of each earlier set of transform coefficients so as to enable regeneration of the image from the transform coefficients without tile boundary artifacts.
At least some of the transform filters are preferably asymmetric boundary filters, extending to a first extent toward each tile's boundary, and extending to a second, longer extent in a direction away from the tile's boundary, but not extending over the tile's boundary.
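The general idea of a tile-local transform whose boundary steps use shorter support than its interior steps can be sketched as follows. This is not the filter family of the preferred embodiment: as a stand-in it uses a simple integer lifting scheme on a one-dimensional tile of even length, with a one-sided rule at each tile boundary chosen here purely as an assumption. The function names are hypothetical. The sketch shows only that every step reads in-tile samples exclusively and that the transform remains exactly invertible; it makes no claim about boundary artifacts after quantization.

```python
def forward(tile):
    """One decomposition layer on a 1-D tile of even length (in-tile samples only)."""
    even, odd = tile[0::2], tile[1::2]
    half = len(even)
    high = []
    for i in range(half):
        if i + 1 < half:
            pred = (even[i] + even[i + 1]) // 2   # interior: two-sided support
        else:
            pred = even[i]                        # right edge: shorter, one-sided
        high.append(odd[i] - pred)
    low = []
    for i in range(half):
        if i > 0:
            upd = (high[i - 1] + high[i] + 2) // 4   # interior: two-sided support
        else:
            upd = (high[i] + 1) // 2                 # left edge: shorter, one-sided
        low.append(even[i] + upd)
    return low, high


def inverse(low, high):
    """Undo forward() exactly by reversing the two lifting steps."""
    half = len(low)
    even = []
    for i in range(half):
        if i > 0:
            upd = (high[i - 1] + high[i] + 2) // 4
        else:
            upd = (high[i] + 1) // 2
        even.append(low[i] - upd)
    odd = []
    for i in range(half):
        if i + 1 < half:
            pred = (even[i] + even[i + 1]) // 2
        else:
            pred = even[i]
        odd.append(high[i] + pred)
    out = []
    for e, o in zip(even, odd):
        out.extend([e, o])
    return out


tile = [10, 12, 14, 13, 9, 7, 8, 20]
low, high = forward(tile)
print(inverse(low, high) == tile)  # True
```

Because neither function indexes outside the tile, no samples or coefficients from neighboring tiles need to be kept in memory, which is the property sought for the edge filters described above.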
In a preferred embodiment, the interior transform filters include a center filter, for generating two to four high pass and two to four low pass coefficients at or near the center of the data array being processed. The center filter acts as a filter switch. Two distinct forms of the interior filter are used on alternate sides of the center filter. For instance, the interior filter may be centered on even data positions on one side of the center filter and centered on odd data positions on the other side of the center filter.
The image processing system and method may also include image reconstruction circuitry or procedures for successively applying a data decompression method and an inverse transform to the image file so as to generate a reconstructed image suitable for display on an image viewer.
In a second aspect of the present invention, the sets of transform coefficients correspond to spatial frequency subbands of the image. The subbands are grouped in accordance with the transform layer that generated them. For one or more respective groups of subbands, for each tile of the image, one or more parameters are generated whose values are indicative of the density of image features in the tile. Each tile of the image is classified into one of a predefined set of categories in accordance with the values of the one or more parameters. Based on the classification of the tile, a set of quantization factors for the tile is selected, and then the transform coefficients of the tile are scaled by the selected set of quantization factors so as to generate a set of quantized transform coefficients for the tile.
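The classify-then-quantize flow just described can be sketched as follows. The density measure (mean absolute coefficient value), the three categories, their thresholds, and the divisors are all hypothetical values chosen for illustration; the text above does not specify them. The only property carried over from the description is that more densely featured tiles receive smaller quantization divisors.

```python
# Hypothetical categories: (upper bound on density, quantization divisor).
# Denser tiles get smaller divisors, i.e. finer quantization.
CATEGORIES = [(2.0, 16), (8.0, 8), (float("inf"), 4)]


def feature_density(coeffs):
    """Hypothetical density parameter: mean absolute coefficient value."""
    return sum(abs(c) for c in coeffs) / len(coeffs)


def classify(density):
    """Return the index of the first category whose bound exceeds the density."""
    for idx, (bound, _divisor) in enumerate(CATEGORIES):
        if density < bound:
            return idx
    return len(CATEGORIES) - 1


def quantize_tile(coeffs):
    """Classify a tile, pick its divisor, and quantize (truncating toward zero)."""
    category = classify(feature_density(coeffs))
    divisor = CATEGORIES[category][1]
    return category, [int(c / divisor) for c in coeffs]


print(quantize_tile([1, -1, 0, 2]))        # (0, [0, 0, 0, 0])
print(quantize_tile([40, -32, 16, -24]))   # (2, [10, -8, 4, -6])
```

A real encoder would record the chosen category with the tile so that the decoder can select the matching factors; that bookkeeping is omitted here.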
In a third aspect of the present invention the quantized transform coefficients are encoded. While the coefficients for each group of spatial frequency subbands are being encoded, a most significant set of bit planes of those coefficients is stored in a first bitstream and the remaining least significant set of bit planes of the coefficients is stored in a second bitstream. From another viewpoint, the portions of the encoded coefficients (for a group of subbands) whose value exceeds a predefined threshold are stored in a first bitstream while the remaining portions of the encoded coefficients are stored in a second bitstream. When reconstructing an image from the image file at a specified resolution level, only the bitstreams corresponding to the specified resolution level are decoded and used to reconstruct the image. For some resolution levels, one or more of the bitstreams not used will contain the least significant portions (i.e., bit planes) of subbands whose more significant portions are contained in the bitstreams used to reconstruct the image at that resolution level.
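The division of a coefficient into most significant and least significant bit planes can be sketched as follows. The function names are hypothetical, and the sketch assumes a sign-magnitude representation with a fixed split point (`low_bits`); how the split point is chosen for each group of subbands is not specified in the text above.

```python
def split_bit_planes(coeff, low_bits):
    """Split a coefficient into (sign, most significant part, least significant part)."""
    sign = -1 if coeff < 0 else 1
    magnitude = abs(coeff)
    msb = magnitude >> low_bits                # goes to the first bitstream
    lsb = magnitude & ((1 << low_bits) - 1)    # goes to the second bitstream
    return sign, msb, lsb


def merge(sign, msb, lsb, low_bits):
    """Full-precision reconstruction using both bitstreams."""
    return sign * ((msb << low_bits) | lsb)


def approximate(sign, msb, low_bits):
    """Reduced-precision reconstruction using only the first bitstream."""
    return sign * (msb << low_bits)


sign, msb, lsb = split_bit_planes(-45, 3)
print(sign, msb, lsb)              # -1 5 5
print(merge(sign, msb, lsb, 3))    # -45
print(approximate(sign, msb, 3))   # -40
```

A decoder working at a lower resolution level can skip the second bitstream and use the approximation, while a decoder at a higher level reads both bitstreams and recovers the coefficient exactly.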