Flash Pix format specification version 1.0 has been proposed as an image format for converting natural image data into digital data suitable for computer processing.
This format specification permits a plurality of data sets with different resolutions to be stored together so that the data best suited to an actual display and/or printing device can be selected and retrieved promptly in response to a user's request. Furthermore, each image is divided into tiles, which allows the user to select only the necessary portion of the image and to enlarge or reduce it with a reduced processing load.
Referring to FIGS. 1 and 2, an image coding device for encoding an image according to the Flash Pix format is described as follows. In FIG. 1, images are shown in different reduced scales, each of which is divided into tiles. FIG. 2 is a block diagram of an exemplary image coding device.
The Flash Pix method is characterized in that it first generates images 1 to 4 in sizes 1/1 to ⅛, as shown in FIG. 1, then divides each image into tiles and compresses the data of each tile.
First, the encoding of the image 1 shown in FIG. 1 by the coding device of FIG. 2 is described. In FIG. 1, a dashed line shows the boundary between tiles.
A tile decomposition portion 11 divides an original image into tiles each comprising 64×64 pixels, and the tiles are then compressed one by one by a JPEG compressing portion 12. In a coded-data integration portion 13, the coded data of each tile is combined with tile decomposition information from the tile decomposition portion 11 to form coded data 1 to be output.
Next, the encoding of the image 2 of FIG. 1 is described. The original image 0 is reduced to ½ in length and width by a ½ contraction portion 14, and the ½-size image is then processed through a tile decomposition portion 15, a JPEG compressing portion 16 and a coded-data integration portion 17 to form coded data 2.
Size reduction of the image to generate the group of size-reduced images in FIG. 1 (images 2 to 4) is performed repeatedly until a downsized image that fits within a single tile is obtained. For example, the image 3 is still larger than a tile and is therefore further contracted by a factor of 2 to obtain the image 4, which fits within a single tile as shown in FIG. 1. The size-reduction procedure is then finished.
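The repeated halving and tile decomposition described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `pyramid_levels` is hypothetical, the 512×512 input size is an arbitrary example, and rounding odd sizes up when halving is an assumption not taken from the text. Only the 64×64 tile size and the stopping rule come from the description.

```python
TILE = 64  # tile edge in pixels, as stated in the description


def pyramid_levels(width, height, tile=TILE):
    """Return (width, height, tile_count) for each resolution level.

    Halving stops once the image fits within a single tile.
    Rounding odd sizes up is an assumption of this sketch.
    """
    levels = []
    while True:
        tiles_x = -(-width // tile)   # ceiling division
        tiles_y = -(-height // tile)
        levels.append((width, height, tiles_x * tiles_y))
        if width <= tile and height <= tile:
            break
        width = (width + 1) // 2
        height = (height + 1) // 2
    return levels


levels = pyramid_levels(512, 512)  # hypothetical original image size
for w, h, n in levels:
    print(f"{w}x{h}: {n} tile(s)")
```

For this example size the loop produces four levels, corresponding to the scales 1/1 to ⅛ of images 1 to 4.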
Coded data for the image 3 is produced through a ½ contraction portion 18, a tile decomposition portion 19, a JPEG compressing portion 20 and a coded-data integration portion 21. Coded data for the image 4 is produced through a ½ contraction portion 22, a tile decomposition portion 23, a JPEG compressing portion 24 and a coded-data integration portion 25.
However, the above-described system involves the following problems. Storing coded data for the downsized images with different resolutions in addition to the coded data for the image at the scale 1:1 increases the volume of coded data by a factor of 1.4. Furthermore, compression must be performed for the image at each resolution, which considerably increases the processing load.
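As a rough check on the size increase, one can count the pixels stored across all resolution levels relative to the full-size image. The sketch below assumes four levels, each with one quarter of the pixels of the previous one; the function name `pixel_overhead` is hypothetical. It counts pixels rather than coded bytes, so it is only a rough guide to the factor quoted in the text.

```python
def pixel_overhead(num_levels):
    """Total pixels over all levels, relative to the full-size image.

    Each halving in width and height leaves 1/4 of the pixels.
    """
    return sum(0.25 ** k for k in range(num_levels))


ratio = pixel_overhead(4)  # images 1 to 4 of FIG. 1
print(round(ratio, 3))
```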
On the other hand, apart from the Flash Pix method, image compression can also be accomplished by the wavelet transform technique, whereby image data with different resolutions can be easily decoded from the coded and compressed data of an original-size image. This technique is therefore free from the problem of an increased amount of coded data.
Namely, the wavelet transform method can meet the demand for decoding data with different resolutions without any increase in coded data, whereas the Flash Pix method increases the volume of coded data by a factor of 1.4.
FIG. 3 is a basic block diagram of a wavelet transform coding portion, wherein an original image is converted by a wavelet transform portion 31 into subband data, which is quantized by a quantizing portion 32 and then entropy encoded by an entropy coding portion 33 to produce coded data. The wavelet transform portion 31, quantizing portion 32 and entropy coding portion 33 compose a so-called wavelet coding portion 34.
FIG. 4 is a detailed block diagram of the wavelet transform portion 31 of FIG. 3.
FIG. 5 depicts an example of the wavelet transformation of an image. FIGS. 4 and 5 are shown as an example of conducting two-dimensional subband decomposition three times.
An original image shown in FIG. 5A is filtered through a horizontal low-pass filter 41 and a horizontal high-pass filter 42 to create two horizontal subbands, which are then decimated to ½ by ½-subsampling portions 47 and 48 respectively.
The two horizontally divided subbands are each divided into two subbands through vertical low-pass filters 43, 45 and vertical high-pass filters 44, 46, and these subbands are each decimated to ½ by ½-subsampling portions 49 to 52. Consequently, four subbands are formed.
A high-horizontal and high-vertical frequency subband j (FIG. 4), a high-horizontal and low-vertical frequency subband i (FIG. 4) and a low-horizontal and high-vertical frequency subband h (FIG. 4) correspond to wavelet transform coefficients j, i and h (FIG. 5B) respectively.
After this, only the remaining low-horizontal and low-vertical frequency subband 53 is recursively divided into subbands.
This recursive subband decomposing process is performed by horizontal low-pass filters 54, 66, horizontal high-pass filters 55, 67, vertical low-pass filters 56, 58, 68, 70, vertical high-pass filters 57, 59, 69, 71 and ½-sampling portions 60–65, 72–77.
Sub-bands a–g (FIG. 4) correspond to sub-bands a–g (FIG. 5B) respectively.
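The recursive two-dimensional subband decomposition described above can be sketched as follows. The filters of FIG. 4 are not specified in the text, so this sketch substitutes the two-tap Haar pair (average and difference) purely for illustration; all function names are hypothetical, and the image dimensions are assumed to be divisible by 2 at every stage.

```python
def split_rows(img):
    """Horizontal low-pass/high-pass filtering followed by 1/2 subsampling.

    Haar filters are an assumption of this sketch.
    """
    low = [[(r[i] + r[i + 1]) / 2 for i in range(0, len(r), 2)] for r in img]
    high = [[(r[i] - r[i + 1]) / 2 for i in range(0, len(r), 2)] for r in img]
    return low, high


def transpose(img):
    return [list(col) for col in zip(*img)]


def split_cols(img):
    """Vertical filtering: reuse the row filter on the transposed image."""
    low, high = split_rows(transpose(img))
    return transpose(low), transpose(high)


def decompose_once(img):
    """One two-dimensional stage; returns the four subbands (LL, LH, HL, HH).

    The first letter is the horizontal band, the second the vertical band.
    """
    h_low, h_high = split_rows(img)
    ll, lh = split_cols(h_low)    # low-horizontal: low / high vertical
    hl, hh = split_cols(h_high)   # high-horizontal: low / high vertical
    return ll, lh, hl, hh


def decompose(img, stages=3):
    """Recursively split only the LL band, as in FIG. 4."""
    details = []                  # detail subbands, finest stage first
    ll = img
    for _ in range(stages):
        ll, lh, hl, hh = decompose_once(ll)
        details.append((lh, hl, hh))
    return ll, details


# Hypothetical 16x16 test image
image = [[(x + y) % 16 for x in range(16)] for y in range(16)]
ll, details = decompose(image, stages=3)
print(len(ll), len(ll[0]))
print([len(d[0]) for d in details])
```

With three stages the result is one low-frequency subband plus three detail subbands per stage, ten subbands in total, and the total number of coefficients equals the number of pixels in the input.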
The wavelet transform coefficients shown in FIG. 5B are quantized on a subband-by-subband basis by the quantizing portion 32 (FIG. 3) and then entropy encoded by the entropy coding portion 33 to produce coded data. The entropy coding portion 33 may use Huffman coding or arithmetic coding.
On the other hand, wavelet-coded data is decoded by an entropy decoding portion 81 and inversely quantized by an inverse quantizing portion 82. Subbands are then combined by an inverse wavelet transform portion 83 to produce a decoded image. The entropy decoding portion 81, inverse quantizing portion 82 and inverse wavelet transform portion 83 compose a so-called wavelet decoding portion 84.
Image encoding using the wavelet transform technique is characterized by a hierarchical structure according to resolution levels, as shown in FIG. 5B. This method can easily decode images having different resolution levels from part or all of the coded data.
Namely, an image of a quarter (¼) the original image size can be decoded by decoding subbands a, b, c and d. An image of a half (½) the original image size can be decoded by decoding subbands a to g. A complete (1/1) size image can be produced by decoding all subbands.
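The resolution hierarchy can be illustrated with a one-dimensional simplification: after a three-stage decomposition, undoing only the coarsest stages yields reduced-size versions of the signal, and undoing all stages restores the original. The Haar filters, the function names `analyze` and `synthesize`, and the 16-sample test signal are assumptions of this sketch, not part of the described device; quantization and entropy coding are omitted.

```python
def analyze(signal, stages):
    """Return (coarsest low band, detail bands ordered finest to coarsest)."""
    low, details = list(signal), []
    for _ in range(stages):
        pairs = [(low[i], low[i + 1]) for i in range(0, len(low), 2)]
        details.append([(a - b) / 2 for a, b in pairs])
        low = [(a + b) / 2 for a, b in pairs]
    return low, details


def synthesize(low, details, stages_to_undo):
    """Undo only the coarsest `stages_to_undo` stages of the decomposition."""
    out = list(low)
    needed = details[len(details) - stages_to_undo:]
    for detail in reversed(needed):          # coarsest stage first
        nxt = []
        for s, d in zip(out, detail):
            nxt += [s + d, s - d]
        out = nxt
    return out


signal = [float(v) for v in range(16)]       # hypothetical test signal
low, details = analyze(signal, stages=3)

quarter = synthesize(low, details, 1)        # coarsest detail band only
half = synthesize(low, details, 2)           # two coarsest detail bands
full = synthesize(low, details, 3)           # all bands
print(len(quarter), len(half), len(full))
```

The reduced-size outputs are decoded from a subset of the same coefficients, so no separate coded data per resolution is required.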
Referring to FIG. 7, the operation of the horizontal low-pass (H-LP), horizontal high-pass (H-HP), vertical low-pass (V-LP) and vertical high-pass (V-HP) filters shown in FIG. 4 will be described as follows. FIG. 7B is an enlarged view of an encircled part B′ of FIG. 7A.
When the output of a horizontal 9-tap filter associated with a pixel 91 positioned at the top right of the original image is calculated for wavelet transformation of the original image, the filter must operate on an area 92.
However, a part of the objective area 92 lies outside the boundary of the original image, where no data exists. The vertical filters may encounter a similar problem.
Thus, for operation on the periphery of the image, it is often necessary to use data outside the image boundary, the amount of which depends on the number of taps of the filter used. Iteration of the subband decomposition also enlarges the area into which the filter extends.
In general, the above problems are treated in such a manner that the image is folded at its periphery according to a certain given rule.
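One common folding rule is symmetric reflection about the edge samples. The text only says that a certain given rule is used, so the rule below is an assumption chosen for illustration; the function names are hypothetical, and the 9-tap moving average stands in for the unspecified filter coefficients.

```python
def reflect(i, n):
    """Map any index onto [0, n) by mirroring about the edge samples."""
    if n == 1:
        return 0
    period = 2 * (n - 1)
    i %= period
    return i if i < n else period - i


def filter_row(row, taps):
    """Apply a centred, odd-length filter to one row with folded borders."""
    half = len(taps) // 2
    n = len(row)
    return [
        sum(t * row[reflect(x + k - half, n)] for k, t in enumerate(taps))
        for x in range(n)
    ]


taps = [1 / 9] * 9                 # placeholder 9-tap filter (assumption)
row = [float(v) for v in range(8)]

# Samples read when filtering the rightmost pixel of an 8-pixel row
print([reflect(i, 8) for i in range(3, 12)])
print(filter_row(row, taps))
```

For the rightmost pixel, the four positions that fall outside the row are replaced by mirrored samples from inside the row, so the filter never reads nonexistent data.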
With the Flash Pix method, which uses a plurality of coded data sets separately provided for respective images of different resolution levels, the image processing load for operations such as enlargement or contraction of the image can be reduced, but the data size is increased by a factor of 1.4.
With the wavelet-transform coding method, data with different resolution levels can be easily decoded from a single set of compressed and coded data for the original image size and, therefore, no increase in the data size takes place.
However, when the wavelet-transform coding system adopts the method used in the Flash Pix system, namely decomposing an image into tiles and encoding the image data on a tile-by-tile basis (to reduce the processing load by selectively processing only the necessary tiles when a particular part of the image is processed), the above-described problem arises because the filters may extend beyond the boundary of the respective tiles.
In other words, the Flash Pix system using JPEG coding can easily code each tile because the coding is closed within each tile, whereas the wavelet-transform coding system cannot effectively use the above tile-by-tile coding-and-managing method because the filters extend beyond the respective tiles.
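The extent of the problem can be estimated with a short calculation. The sketch below gives a worst-case count of how many pixels of the original image beyond one edge of a tile influence the tile's coefficients, assuming a centred, odd-length filter and ½ subsampling at every stage. The function name `reach_beyond_edge` is hypothetical, and the exact figure depends on the filter and subsampling phase actually used.

```python
def reach_beyond_edge(taps, stages):
    """Worst-case pixels needed outside one edge of a tile.

    Each stage filters samples spaced twice as far apart as the stage
    before, so the reach of stage k is (taps // 2) * 2**k pixels.
    """
    half = taps // 2
    return sum(half * 2 ** k for k in range(stages))


for stages in (1, 2, 3):
    print(stages, reach_beyond_edge(9, stages))
```

Under these assumptions, a 9-tap filter applied over three stages reaches 28 pixels beyond the tile edge, which is a substantial fraction of a 64×64 tile.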
In addition, the conventional wavelet-transform coding system must have a memory sufficient for storing the output of the wavelet transform portion 31 (FIG. 3), i.e., all the wavelet transform coefficients shown in FIG. 5B. Since these coefficients have the same resolution as the original image, the memory must have a large capacity. This requirement becomes more severe when a higher-resolution image is processed.