1. Field of the Invention
The present invention relates to a code processing device and method which generates, from original codes of an image, output codes having a quality according to a specified quality value.
2. Description of the Related Art
In recent years, adoption of the wavelet transform has been increasing as the frequency transform that replaces the DCT (discrete cosine transform) used in the JPEG (Joint Photographic Experts Group) standard. A typical example is the image compression/decompression method JPEG2000, which became an international standard in 2001 as the successor to the JPEG standard.
FIG. 1 shows a simplified algorithm of the coding processing of JPEG2000.
At step S1, each component of image data that is to be compressed and encoded is divided into rectangular tiles (the number of tiles ≧1), and color space conversion of each tile is carried out to generate components, such as brightness and color difference. Color space conversion from RGB data or CMY data into YCrCb data is performed on the tiles at step S1 in order to improve the compression ratio. However, this color space conversion may be omitted.
At step S2, each of the components after color space conversion (called a tile component) is divided into four subbands (which are called LL, HL, LH, and HH) by carrying out a wavelet transform.
The wavelet transform (decomposition) is applied recursively to the LL subband, so that one LL subband and a plurality of HL, LH and HH subbands are finally generated.
In JPEG2000, the irreversible wavelet transform called the 9×7 transform and the reversible wavelet transform called the 5×3 transform are used. For the color space conversion, the irreversible color transform ICT is specified for the 9×7 transform and the reversible color transform RCT is specified for the 5×3 transform.
At this step S2, a two-dimensional wavelet transform (discrete wavelet transform, DWT) is performed on each tile image of each component after the color space conversion.
The two-dimensional wavelet transform is performed on a tile image having the decomposition level of zero so that the tile image is divided into sub-bands 1LL, 1HL, 1LH and 1HH. The two-dimensional wavelet transform is then applied to the coefficients of the sub-band 1LL so that the sub-band 1LL is divided into sub-bands 2LL, 2HL, 2LH and 2HH. Likewise, the two-dimensional wavelet transform is applied to the coefficients of the sub-band 2LL so that the sub-band 2LL is divided into sub-bands 3LL, 3HL, 3LH and 3HH.
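The octave division described above can be illustrated by computing only the sub-band dimensions, without performing any filtering. The following is a minimal sketch, assuming a tile whose origin is at (0, 0), so that a low-pass band receives the rounded-up half and a high-pass band the rounded-down half of the samples. The function name subband_sizes is hypothetical and is not taken from the JPEG2000 standard.

```python
def subband_sizes(width, height, levels):
    """Return {sub-band name: (width, height)} for a dyadic decomposition.

    Assumption of this sketch: the tile starts at coordinate (0, 0), so the
    low-pass half of n samples has (n + 1) // 2 samples and the high-pass
    half has n // 2 samples.
    """
    sizes = {}
    w, h = width, height  # dimensions of the current LL band (level 0 = tile)
    for level in range(1, levels + 1):
        lo_w, hi_w = (w + 1) // 2, w // 2
        lo_h, hi_h = (h + 1) // 2, h // 2
        sizes[f"{level}HL"] = (hi_w, lo_h)  # horizontally high-pass, vertically low-pass
        sizes[f"{level}LH"] = (lo_w, hi_h)  # horizontally low-pass, vertically high-pass
        sizes[f"{level}HH"] = (hi_w, hi_h)
        w, h = lo_w, lo_h                   # only the LL band is decomposed again
    sizes[f"{levels}LL"] = (w, h)           # a single LL band remains at the end
    return sizes


sizes = subband_sizes(100, 100, 3)
```

For a 100×100 tile and three decomposition levels, this yields ten sub-bands: three at each level plus the single remaining sub-band 3LL.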
At step S3, the wavelet coefficients obtained by such octave division of the low frequency component (the coefficients of the sub-band LL) are quantized for each sub-band. In JPEG2000, either lossless (reversible) compression or lossy (irreversible) compression can be performed. In the case of the lossless compression, the quantizing step width is always “1”, so that the quantizing is effectively not performed.
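The effect of the quantizing step width can be sketched as follows. The sketch assumes a sign-magnitude scalar quantizer in which the magnitude is divided by the step width and rounded down; the function name quantize is hypothetical, and the exact quantizer is not specified in the description above.

```python
import math


def quantize(coefficient, step):
    """Quantize one wavelet coefficient with the given step width.

    Assumption of this sketch: the sign is kept and the magnitude is
    divided by the step width and rounded toward zero.
    """
    sign = -1 if coefficient < 0 else 1
    return sign * math.floor(abs(coefficient) / step)


lossy = [quantize(c, 2.0) for c in (7.9, -7.9, 0.6)]
lossless = [quantize(c, 1) for c in (5, -3, 0)]
```

With a step width of 1 and integer coefficients, as in the reversible case, every coefficient is returned unchanged, which corresponds to the statement that the quantizing is not performed.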
After the quantizing is performed, the coefficients of each sub-band are entropy-encoded at step S4. For this entropy encoding, an encoding method called EBCOT (Embedded Block Coding with Optimized Truncation) is used, which includes block division, coefficient modeling, and binary arithmetic encoding. In the entropy encoding, the bit-planes of the quantized coefficients of each sub-band are encoded from the upper plane to the lower plane in units of blocks called code blocks.
The last two steps S5 and S6 are the code forming processes. At step S5, the codes created at step S4 are collected to form a packet. Next, at step S6, the packets created at step S5 are arranged in accordance with a progression order, and necessary tag information is added to the arranged packets in order to form encoded data in a predetermined format. In JPEG2000, five types of progression orders based on combination of a resolution level, a position, a layer, and a component (color component) are defined for encoding order control.
The JPEG2000 algorithm provides good image quality even at a high compression rate (that is, a low bit rate). The JPEG2000 algorithm also has many other characteristics.
As one characteristic of the JPEG2000 algorithm, it is possible to adjust the entire code amount without performing recompression, by performing post-quantization in which the codes of the encoded data are truncated. This code truncation can be performed in various units such as a tile region, a component, a decomposition level (resolution level), a bit plane, a sub-bit plane, a packet, and a layer (in the case of multi-layers). The relation between the decomposition level and the resolution level is shown in FIG. 3. As another characteristic of the JPEG2000 algorithm, encoded data can be easily divided into two or more sets of encoded data in the encoded state, and these sets of encoded data can be combined to restore the original encoded data. As still another characteristic of the JPEG2000 algorithm, by only rewriting (changing) the tag information of the encoded data, decoding can be performed as if a part of the code line were actually truncated.
Next, a precinct, a code block, a packet and a layer will be briefly described. There is the following size relation: image size≧tile size≧sub-band size≧precinct size≧code block size.
A precinct refers to a rectangular region of the sub-bands. As shown in FIG. 2, a set of three regions that are located at the spatially same positions of the sub-bands HL, LH and HH having the same decomposition level is treated as one precinct. However, as for the sub-band LL, one region is treated as one precinct. The size of the precinct can be made equal to the size of the sub-band. A code block is a rectangular region that is created by dividing the precinct. For simplicity, the precinct and one code block at the decomposition level 1 are shown in FIG. 3.
A packet is created by collecting partial codes of all the code blocks included in the precinct (for example, the codes of three bit planes from the first bit plane to the third bit plane). A packet whose code is empty is allowed. Thus, the codes of the code blocks are collected to create packets, and the packets are arranged in accordance with a desired progression order to form encoded data.
By collecting the packets of all the precincts (i.e., all code blocks, all sub-bands), partial codes of the entire image region (for example, the codes of the uppermost bit plane to the third bit plane of the wavelet coefficients of the entire image region) are created as a layer. However, as shown in the following example, a layer does not necessarily have to include the packets of all precincts. As a larger number of layers are decoded at the time of decompression, the image quality of the reproduced image improves further. In other words, a layer can be considered as a unit of image quality. By collecting all layers, the codes of all bit planes of the entire image region are obtained.
FIG. 4 shows examples of packets and layers in a case where the decomposition level number is 2 (the resolution level number is 3). The upper example of FIG. 4 is an example of layers in the case where the decomposition level number is 2 and the precinct size equals the sub-band size. In the lower example of FIG. 4, some of the packets included in each layer are enclosed by thick lines.
In the following, the operation of arranging packets in a sequence according to the partitions of the packets or layers being generated will be called code formation. A packet is composed of partial codes of the original codes of JPEG2000, and has the following four attributes: component attribute, resolution attribute, precinct attribute, and layer attribute. Arranging the packets is equivalent to selecting the order in which these four attributes are used to determine the sequence of the packets.
JPEG2000 has another characteristic in which the progression order of encoded data can be changed in the encoded state. In JPEG2000, five progression orders LRCP, RLCP, RPCL, PCRL and CPRL are defined, where L denotes a layer, R denotes a resolution level, C denotes a component, and P denotes a precinct (position).
In a case of the LRCP progression, the packet arrangement order (at the time of encoding) or the packet interpretation order (at the time of decoding) is represented as the following for-loop nested in the order of L, R, C, P.
for (layer) {
    for (resolution) {
        for (component) {
            for (precinct) {
                arranging packets when encoding;
                interpreting packet attributes when decoding
            }
        }
    }
}
In a specific example, the image size is 100×100 pixels (without tile dividing), the number of layers is 2, the number of resolution levels is 3 (levels 0 through 2), the number of components is 3, and the precinct size is 32×32. In this example, 36 packets are arranged and interpreted in the manner shown in FIG. 7.
In the cases of the other progression orders as well, the packet arranging order or the packet interpreting order is determined by a similarly nested for-loop.
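The nested for-loop can be sketched for any of the five progression orders as follows. This is an illustration only: the function name packet_order is hypothetical, and the precinct counts of 1, 1 and 4 for resolution levels 0, 1 and 2 are an assumption (one reading of 32×32 precincts applied to sub-bands of 25×25, 25×25 and 50×50 coefficients) that reproduces the 36 packets of the specific example. For the position-based orders, the sketch simply skips precinct indices that do not exist at a given resolution level, which is a simplification.

```python
from itertools import product


def packet_order(progression, n_layers, n_components, precincts_per_resolution):
    """List (layer, resolution, component, precinct) tuples in packet order.

    progression is a string such as "LRCP"; its first letter is the
    outermost loop and its last letter is the innermost loop.
    """
    ranges = {
        "L": range(n_layers),
        "R": range(len(precincts_per_resolution)),
        "C": range(n_components),
        "P": range(max(precincts_per_resolution)),
    }
    packets = []
    # product() varies its last argument fastest, like the innermost loop.
    for values in product(*(ranges[key] for key in progression)):
        attrs = dict(zip(progression, values))
        # Skip precinct indices that do not exist at this resolution level.
        if attrs["P"] < precincts_per_resolution[attrs["R"]]:
            packets.append((attrs["L"], attrs["R"], attrs["C"], attrs["P"]))
    return packets


# Assumed precinct counts for resolution levels 0, 1 and 2.
lrcp = packet_order("LRCP", n_layers=2, n_components=3,
                    precincts_per_resolution=[1, 1, 4])
rlcp = packet_order("RLCP", n_layers=2, n_components=3,
                    precincts_per_resolution=[1, 1, 4])
```

Both lists contain the same 36 packets; only the sequence differs, which is why the progression order can be changed by rearranging packets without re-encoding them.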
Each packet has a packet header, and the following information is contained in a packet header:
information as to whether the content of the packet is empty or not,
information as to which code block is included in the packet,
the number of zero bit planes of each code block included in the packet,
the number of coding passes of the codes of each code block contained in the packet (the number of bit planes),
the code length of each code block included in the packet.
However, a packet header does not contain any attribute value, such as a layer number or a resolution level. At the time of decoding, the above-mentioned for-loop is formed based on the progression order indicated in the COD marker in the main header of the encoded data (codes). The partition of a packet is determined based on the sum of the code lengths of the respective code blocks included in the packet. The specific resolution level and the specific layer to which each packet pertains can be recognized from the position in the for-loop at which the packet is handled.
This means that if the code lengths in the packet header of the current packet are read, the start of the following packet can be located without decoding the entropy codes themselves; that is, arbitrary packets can be accessed.
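Locating packets from the code lengths alone can be sketched as follows. The sketch assumes that each packet header has already been parsed into a header length and a list of code-block code lengths; this representation and the function name packet_offsets are hypothetical, and the bit-level header syntax itself is not reproduced here.

```python
def packet_offsets(packets, start=0):
    """Return (offset, total_length) for each packet in stream order.

    packets is a list of (header_length, [code_block_lengths]) pairs, a
    hypothetical pre-parsed form of the packet headers. No entropy-coded
    data is read or decoded.
    """
    offsets = []
    position = start
    for header_length, block_lengths in packets:
        # The packet ends after its header plus the sum of its code lengths.
        total = header_length + sum(block_lengths)
        offsets.append((position, total))
        position += total  # the next packet starts immediately afterwards
    return offsets


# Three packets; the second one is empty (header only, no code blocks).
offsets = packet_offsets([(2, [10, 5]), (1, []), (3, [7])], start=100)
```

Combining these offsets with the position of each packet in the for-loop gives both where a packet lies in the codes and which layer, resolution level, component and precinct it belongs to.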
FIG. 6 shows an example of the layer progressive codes in which layer is located in the outermost part of the for-loop as in the LRCP progression. In FIG. 6, SOC denotes the start of codestream, SOT denotes the start of tile-part, SOD denotes the start of data, and EOC denotes the end of codestream.
FIG. 7 shows an example of an array of 36 packets in the case of the LRCP progression in which the image size is 100×100 pixels, the number of layers is two, the number of resolution levels is 3 (0-2), the number of components is three, and the precinct size is 32×32.
As mentioned above, the original codes of JPEG2000 can be accessed on a packet basis, and this means that new codes can be generated by extracting the necessary partial codes (i.e., packets) from the original codes. This also means that the original codes can be decoded partially if needed. For example, when a large image stored on a server is displayed on the client side, it is possible for the client to receive from the server, and decode, only the partial codes with a desired quality, the partial codes with a desired resolution, the partial codes of a desired location, or the partial codes of a desired component.
Thus, the protocol JPIP (JPEG2000 Interactive Protocol) for receiving only the necessary partial codes from the original codes of JPEG2000 stored in the server is currently in the process of standardization.
Japanese Laid-Open Patent Application No. 2003-023630 discloses the cache model in JPIP.
Similarly, conventional protocols for hierarchical access to a partial image can be found in FlashPix (registered trademark), which is a multi-resolution representation of an image, and in the protocol IIP (Internet Imaging Protocol) for accessing the same. Japanese Laid-Open Patent Application No. 11-205786 discloses a background technology related to similar protocols for hierarchical access to a partial image.
In addition, according to JPEG2000, a visual weight is defined as an example of a visual frequency characteristic of each component, and three types of visual weights are described for each component, corresponding to observation distances of 1000, 1700, and 3000. Japanese Laid-Open Patent Application No. 2001-298366 discloses that selecting a visual weight according to a compression ratio or a quantization error of original codes allows formation of an image with good quality.
In the case of JPIP, it is proposed that a desired area of an image for image drawing and a desired image quality be specified from a client to a server. When the desired area is specified, the server transmits to the client the packets of the precincts that cover the specified area in the image. Namely, when the desired area is specified, only the packets of the necessary precincts can be transmitted by suitably performing the related part of the for-loop: for (precinct) {interpreting packet attributes when decoding}. Also, when the desired quality of the image is specified simultaneously with the area, only the packets of the necessary layers can be transmitted by suitably performing the related part of the for-loop: for (layer) { . . . for (precinct) {interpreting packet attributes when decoding} . . . }.
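The selection of the precincts that cover a specified area can be sketched as follows. This is a simplified illustration under several assumptions: the area is given in image coordinates as a half-open rectangle, the precinct grid starts at the origin of the sub-band, precincts are numbered in raster order, and the extra margin needed for the support of the wavelet filters is ignored. The function name precincts_for_region and its parameters are hypothetical and are not part of JPIP.

```python
def precincts_for_region(x0, y0, x1, y1, level, precinct_size,
                         band_width, band_height):
    """Return raster-order indices of the precincts overlapping an area.

    (x0, y0)-(x1, y1) is a non-empty half-open rectangle in image
    coordinates; level is the decomposition level of the sub-bands, whose
    size in coefficients is band_width x band_height.
    """
    scale = 1 << level  # one coefficient covers 2**level pixels per axis
    bx0, by0 = x0 // scale, y0 // scale
    bx1 = min(-(-x1 // scale), band_width)   # ceiling division, clipped
    by1 = min(-(-y1 // scale), band_height)
    columns = -(-band_width // precinct_size)  # precincts per row
    first_col, last_col = bx0 // precinct_size, (bx1 - 1) // precinct_size
    first_row, last_row = by0 // precinct_size, (by1 - 1) // precinct_size
    return [row * columns + col
            for row in range(first_row, last_row + 1)
            for col in range(first_col, last_col + 1)]


# 100x100 image, decomposition level 1 (50x50 sub-bands), 32x32 precincts.
top_left = precincts_for_region(0, 0, 40, 40, 1, 32, 50, 50)
right_half = precincts_for_region(60, 0, 100, 100, 1, 32, 50, 50)
bottom_right = precincts_for_region(70, 70, 100, 100, 1, 32, 50, 50)
```

Only the packets whose precinct attribute is in the returned list would then need to be handled in the for (precinct) part of the loop.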
Currently, it is proposed that a quality value in a range of 0 to 100 can be specified as a quality specification (quality request). It can be assumed according to the proposal that, if the specified quality value is set to a larger value, the number of layers of the partial codes that are transmitted is increased. For example, suppose that the original codes stored in the server comprise 50 layers; if the specified quality value is 50, the partial codes for 25 layers can be transmitted. Namely, the related part of the for-loop: for (layer=0 to 24) { . . . } is carried out and the resulting packets can be transmitted. This is because a layer is a unit of quality of an image.
However, there may be a case in which the original codes in the server comprise only one layer. In this case, it is uncertain which packets should be transmitted if the specified quality value is 50, because a single layer provides only one unit of quality.
Similarly, there may be a case in which the original codes in the server comprise three layers. If the specified quality value is 50 in this case, the packets of the highest rank layer should be transmitted. However, it is uncertain whether the packets of the second rank layer should be transmitted or not.
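The arithmetic behind these cases can be sketched as follows, assuming that the quality value is mapped to the number of layers in simple proportion, as in the 50-layer example. The function name proportional_layers is hypothetical, and the proportional mapping is only the assumption discussed above, not a rule defined by the proposal.

```python
from fractions import Fraction


def proportional_layers(total_layers, quality):
    """Return the exact number of layers for a quality value of 0 to 100,
    assuming a simple proportional mapping (may be a non-integer)."""
    return Fraction(total_layers * quality, 100)


fifty_layers = proportional_layers(50, 50)  # 25 layers, a whole number
one_layer = proportional_layers(1, 50)      # 1/2 of a layer
three_layers = proportional_layers(3, 50)   # 3/2 layers
```

For 50 layers the result is a whole number of layers, whereas for one layer and for three layers the result falls between layer boundaries, so the layer partition alone does not determine which packets to transmit.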