1. Field of the Invention
The present invention relates to a compression encoder and compression encoding method for use in image compression and expansion technology.
2. Description of the Background Art
As a next-generation high-efficiency coding standard for image data, the International Organization for Standardization (ISO) and the International Telecommunication Union-Telecommunication Standardization Sector (ITU-T) have been developing the Joint Photographic Experts Group 2000 (JPEG2000) standard. The JPEG2000 standard provides functions superior to the Joint Photographic Experts Group (JPEG) standard which is currently in the mainstream, and features the adoption of discrete wavelet transform (DWT) for orthogonal transformation and of a technique called “Embedded Block Coding with Optimized Truncation” (EBCOT) which performs bit-plane coding, for entropy coding.
FIG. 25 is a functional block diagram showing a general configuration of a compression encoder 100 for image compressing and coding based on the JPEG2000 standard. Hereinbelow, the procedure of compression and coding according to the JPEG2000 standard is generally described with reference to FIG. 25.
An image signal inputted to the compression encoder 100 is DC level shifted in a DC level shift unit 102 as needed, and outputted to a color-space conversion unit 103. The color-space conversion unit 103 converts the color space of a signal inputted from the DC level shift unit 102. For example, an RGB signal inputted to the color-space conversion unit 103 is converted into a YCbCr signal (a signal consisting of a luminance signal Y and color-difference signals Cb and Cr).
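By way of illustration only, the DC level shift and the color-space conversion described above may be sketched as follows. The function names, the 8-bit sample depth, and the conversion weights (the commonly used 0.299/0.587/0.114 luminance weights) are assumptions made for this sketch; the text above does not specify them.

```python
def dc_level_shift(sample, bit_depth=8):
    """Shift an unsigned sample so that its range is centred on zero.

    The default bit depth of 8 is an assumption for this sketch.
    """
    return sample - (1 << (bit_depth - 1))


def rgb_to_ycbcr(r, g, b):
    """Convert one RGB sample to YCbCr (luminance plus two color differences).

    The weights below are the commonly used luminance weights and are an
    assumption; the description above does not state them.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = (b - y) / 1.772  # scaled blue color difference
    cr = (r - y) / 1.402  # scaled red color difference
    return y, cb, cr


# A white 8-bit pixel: after the level shift all three components are 127,
# so the luminance is about 127 and both color differences are about zero.
shifted = tuple(dc_level_shift(c) for c in (255, 255, 255))
y, cb, cr = rgb_to_ycbcr(*shifted)
```

Because the three luminance weights sum to one, a gray input (R = G = B) yields color-difference signals of approximately zero.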
Then, a tiling unit 104 divides an image signal inputted from the color-space conversion unit 103 into a plurality of rectangular regional components called “tiles”, and outputs those components to a DWT unit 105. The DWT unit 105 performs integer or real-number DWT on each tile of an image signal inputted from the tiling unit 104 and outputs resultant transform coefficients. In DWT, a one-dimensional (1-D) filter which divides a two-dimensional (2-D) image signal into high-pass (high-frequency) and low-pass (low-frequency) components, is applied in vertical and horizontal directions in this order. In the fundamentals of the JPEG2000 standard, an octave band splitting method is adopted in which only those bandpass components (subbands) which are divided into the low frequency side in both the vertical and horizontal directions are recursively divided into further subbands. The number of recursive divisions is called the decomposition level.
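As a minimal sketch of one application of the 1-D filter, the following code splits an integer signal into low-pass and high-pass components using a reversible 5/3 lifting scheme. The choice of this particular filter, the function names, the restriction to even-length signals, and the mirror extension at the signal edges are assumptions made for illustration; the text above only states that an integer or real-number DWT is performed.

```python
def dwt53_forward(x):
    """One level of 1-D reversible 5/3 lifting on an even-length integer list.

    Returns (low, high): the low-pass and high-pass components.
    """
    n = len(x)
    assert n >= 2 and n % 2 == 0, "this sketch handles even lengths only"
    half = n // 2
    # Predict step: each odd sample minus the floor-average of its even neighbours.
    high = []
    for i in range(half):
        left = x[2 * i]
        right = x[2 * i + 2] if 2 * i + 2 < n else x[n - 2]  # mirror at the edge
        high.append(x[2 * i + 1] - (left + right) // 2)
    # Update step: each even sample plus a rounded quarter of the neighbouring details.
    low = []
    for i in range(half):
        prev = high[i - 1] if i > 0 else high[0]  # mirror at the edge
        low.append(x[2 * i] + (prev + high[i] + 2) // 4)
    return low, high


def dwt53_inverse(low, high):
    """Undo dwt53_forward exactly by reversing the two lifting steps."""
    half = len(low)
    n = 2 * half
    x = [0] * n
    for i in range(half):
        prev = high[i - 1] if i > 0 else high[0]
        x[2 * i] = low[i] - (prev + high[i] + 2) // 4
    for i in range(half):
        left = x[2 * i]
        right = x[2 * i + 2] if 2 * i + 2 < n else x[n - 2]
        x[2 * i + 1] = high[i] + (left + right) // 2
    return x


signal = [10, 12, 14, 16, 18, 20, 22, 24]
low, high = dwt53_forward(signal)
restored = dwt53_inverse(low, high)
```

Applying such a split to every column and then to every row of a tile yields the four subbands of one decomposition level; because every lifting step uses integer arithmetic and is undone exactly by the inverse, the split is reversible.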
FIG. 26 is a schematic view showing a 2-D image 120 subjected to the DWT with the third decomposition level by the octave band splitting method. At the first decomposition level, the 2-D image 120 is divided into four subbands HH1, HL1, LH1 and LL1 (not shown) by sequential application of the aforementioned 1-D filter in the vertical and horizontal directions. Here, “H” and “L” stand for high- and low-pass components, respectively. For example, HL1 is the subband consisting of a horizontally high-pass component H and a vertically low-pass component L of the first decomposition level. To generalize the notation, “XYn” (X and Y are either H or L; n is an integer of 1 or more) represents a subband consisting of a horizontal component X and a vertical component Y of the n-th decomposition level.
At the second decomposition level, the low-pass component LL1 is divided into subbands HH2, HL2, LH2 and LL2 (not shown). Further, at the third decomposition level, the low-pass component LL2 is divided into further subbands HH3, HL3, LH3 and LL3. An arrangement of the resultant subbands HH1, HL1, LH1, HH2, HL2, LH2, HH3, HL3, LH3 and LL3 is shown in FIG. 26. While FIG. 26 shows an example of third-order decomposition, the JPEG2000 standard generally adopts approximately third- to eighth-order decomposition.
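The octave band splitting described above can be sketched as a short routine that lists the resulting subbands and their dimensions. The function name and the rule that the low-pass half receives the extra sample when a dimension is odd are assumptions made for this illustration.

```python
def octave_subbands(width, height, levels):
    """Return (name, width, height) for each subband of an octave decomposition.

    Subbands are named XYn, where X is the horizontal component, Y is the
    vertical component, and n is the decomposition level.
    """
    bands = []
    w, h = width, height
    for n in range(1, levels + 1):
        low_w, high_w = (w + 1) // 2, w // 2  # assumed split for odd sizes
        low_h, high_h = (h + 1) // 2, h // 2
        bands.append((f"HL{n}", high_w, low_h))   # horizontally high, vertically low
        bands.append((f"LH{n}", low_w, high_h))   # horizontally low, vertically high
        bands.append((f"HH{n}", high_w, high_h))  # high in both directions
        w, h = low_w, low_h  # only LLn is divided again at the next level
    bands.append((f"LL{levels}", w, h))
    return bands


bands = octave_subbands(512, 512, 3)
```

For a 512×512 image and three decomposition levels, this yields ten subbands, the last of which is the 64×64 subband LL3; the subband areas sum to the area of the original image.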
A quantization unit 106 has the function of performing scalar quantization on transform coefficients outputted from the DWT unit 105 as needed. The quantization unit 106 also has the function of performing a bit-shift operation in which higher priority is given to the image quality of a region of interest (ROI) which is specified by a ROI unit 107. Now, in reversible (lossless) transformation, scalar quantization is not performed in the quantization unit 106. The JPEG2000 standard provides two kinds of quantization means: the scalar quantization in the quantization unit 106 and post-quantization (truncation) which will be described later.
A representative method of utilizing ROI is the Max-shift method which is specified as an optional function of JPEG2000.
The Max-shift method arbitrarily specifies a ROI and compresses the ROI with high image quality while compressing the non-ROI with low image quality. More specifically, an original image is first subjected to wavelet transform to obtain distributions of wavelet coefficients, and a value Vm of the highest wavelet coefficient in the coefficient distribution corresponding to the non-ROI is obtained from among those distributions. Then, a number of bits “S” sufficient to represent the value Vm (i.e., satisfying 2^S > Vm) is obtained, and the wavelet coefficients for only the ROI are shifted up by S bits. For instance, when the value Vm is “255” in decimal notation (i.e., “11111111” in binary notation), S is 8. When the value Vm is “128” in decimal notation (i.e., “10000000” in binary notation), S is also 8. Accordingly, in either case, the wavelet coefficients for the ROI are shifted up by S=8 bits. It is therefore possible to set the ROI to have a lower compression ratio than the non-ROI, allowing high-quality compressed data to be acquired for the ROI.
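The computation of S and the shifting of the ROI coefficients can be illustrated with the following sketch. The function names and the representation of the ROI as a list of Boolean flags are hypothetical; the shift is applied to the magnitude of each ROI coefficient while its sign is preserved.

```python
def shift_amount(vm):
    """Number of bits S needed to represent the non-negative value vm."""
    return vm.bit_length()


def max_shift(coeffs, roi_mask):
    """Shift up the coefficients flagged as ROI by S bits.

    coeffs   : list of integer wavelet coefficients
    roi_mask : list of booleans, True where the coefficient belongs to the ROI
    Returns (S, shifted coefficients).
    """
    # Vm: the largest coefficient magnitude found outside the ROI.
    vm = max((abs(c) for c, in_roi in zip(coeffs, roi_mask) if not in_roi),
             default=0)
    s = shift_amount(vm)
    # Multiplying by 2**S shifts the magnitude up by S bits and keeps the sign.
    shifted = [c * (1 << s) if in_roi else c
               for c, in_roi in zip(coeffs, roi_mask)]
    return s, shifted


s, shifted = max_shift([3, -2, 255, 7], [True, True, False, False])
# s == 8 and shifted == [768, -512, 255, 7]
```

After the shift, every non-zero ROI coefficient has a magnitude of at least 2^S, which exceeds Vm, so the ROI coefficients occupy higher bit planes than any non-ROI coefficient.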
Then, transform coefficients outputted from the quantization unit 106 are, according to the aforementioned EBCOT, entropy coded on a block-by-block basis in a coefficient bit modeling unit 108 and an arithmetic coding unit 109, and are rate controlled in a rate control unit 110. More specifically, the coefficient bit modeling unit 108 divides each subband of input transform coefficients into regions called “code blocks” of a size of, for example, approximately 16×16, 32×32, or 64×64, and further decomposes each code block into a plurality of bit planes, each constituting a two-dimensional array of one bit from each of the transform coefficients.
FIG. 27 is a schematic view showing the 2-D image 120 decomposed into a plurality of code blocks 121. FIG. 28 is a schematic view showing n bit planes 122_0 through 122_(n−1) (n is a natural number) constituting these code blocks 121. As shown in FIG. 28, decomposition is performed such that, where a binary value 123 representing one transform coefficient in a code block 121 is “011 . . . 0”, then bits constituting this binary value 123 belong respectively to the bit planes 122_(n−1), 122_(n−2), 122_(n−3), . . . , and 122_0. In the figure, the bit plane 122_(n−1) represents the most-significant bit plane consisting only of the most-significant bits (MSB) of the transform coefficients, and the bit plane 122_0 represents the least-significant bit plane consisting only of the least-significant bits (LSB) of the transform coefficients.
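The decomposition of coefficients into bit planes can be sketched as follows. For brevity the code block is represented as a flat list of coefficient magnitudes rather than a two-dimensional array, and the function names are hypothetical.

```python
def to_bit_planes(magnitudes, n_planes):
    """Split coefficient magnitudes into n_planes bit planes.

    planes[k][i] is bit k of magnitudes[i]; planes[0] is the LSB plane and
    planes[n_planes - 1] is the MSB plane.
    """
    return [[(m >> k) & 1 for m in magnitudes] for k in range(n_planes)]


def from_bit_planes(planes):
    """Rebuild the magnitudes by weighting plane k with 2**k."""
    count = len(planes[0])
    return [sum(planes[k][i] << k for k in range(len(planes)))
            for i in range(count)]


planes = to_bit_planes([5, 2, 7, 0], 3)
# planes[2] (MSB plane) == [1, 0, 1, 0]
# planes[0] (LSB plane) == [1, 0, 1, 0]
```

Each plane holds exactly one bit of every coefficient, so the original magnitudes are recovered when all planes are recombined.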
Then, the coefficient bit modeling unit 108 judges the context of each bit in each bit plane 122_k (k=0 to n−1), and as shown in FIG. 29, decomposes the bit plane 122_k, according to the result of judging the significance of each bit, into three types of coding passes: a significance propagation (SIG) pass, a magnitude refinement (MR) pass, and a cleanup (CL) pass. The context judgment algorithm for each coding pass is determined by the EBCOT. According to the algorithm, the state of being “significant” means that the coefficient concerned has already been found to be non-zero in previous coding, and the state of being “not significant” means that the value of the coefficient concerned is zero or may possibly be zero.
The coefficient bit modeling unit 108 performs bit-plane coding with three types of coding passes: the SIG pass (coding pass for insignificant coefficients with significant neighbors), the MR pass (coding pass for significant coefficients), and the CL pass (coding pass for the remaining coefficients which belong to neither the SIG nor the MR pass). The bit-plane coding is performed, starting from the most-significant to the least-significant bit plane, by scanning each bit plane four bits at a time and determining whether there exist significant coefficients. The number of bit planes consisting only of insignificant coefficients (0 bits) is recorded in a packet header, and actual coding starts from a bit plane where a significant coefficient first appears. The bit plane from which coding starts is coded in only the CL pass, and lower-order bit planes than that bit plane are sequentially coded in the above three types of coding passes.
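The ordering of coding passes described above can be sketched as follows. The sketch only counts the all-zero upper bit planes and lists which passes are applied to which bit plane; it does not perform the context judgment or the coding itself. The function name and the flat-list representation of a code block are hypothetical.

```python
def coding_pass_schedule(magnitudes, n_planes):
    """Return (zero_planes, schedule) for one code block.

    zero_planes : number of upper bit planes that contain only 0 bits
    schedule    : list of (bit plane index, pass name) in coding order,
                  where index 0 is the LSB plane
    Assumes magnitudes is non-empty and fits in n_planes bits.
    """
    top = max(magnitudes).bit_length()  # planes at index >= top are all zero
    zero_planes = n_planes - top
    schedule = []
    for k in range(top - 1, -1, -1):    # from the first non-zero plane down to the LSB
        if k == top - 1:
            schedule.append((k, "CL"))  # the first coded plane uses only the CL pass
        else:
            schedule.extend([(k, "SIG"), (k, "MR"), (k, "CL")])
    return zero_planes, schedule


zero_planes, schedule = coding_pass_schedule([5, 2, 7, 0], 5)
# zero_planes == 2; schedule has 7 entries, beginning with (2, "CL")
```

A code block with p non-zero bit planes is therefore coded in 3p − 2 coding passes.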
FIG. 30 shows the rate-distortion (R-D) curve representing the relationship between rate (R) and distortion (D). In this R-D curve, R1 represents the rate before bit-plane coding, R2 the rate after bit-plane coding, D1 the distortion before bit-plane coding, and D2 the distortion after bit-plane coding. In the figure, A, B and C are labels representing the above coding passes. For efficient coding, as a route from the starting point P1 (R1, D1) to the end point P2 (R2, D2), the route A-B-C of a concave curve is more desirable than the route C-B-A of a convex curve. In order to achieve such a concave curve, it is known that coding should proceed from the MSB plane to the LSB plane.
Then, the arithmetic coding unit 109, using an MQ coder and according to the result of context judgment, performs arithmetic coding of a coefficient sequence provided from the coefficient bit modeling unit 108 on a coding-pass-by-coding-pass basis. This arithmetic coding unit 109 also has a mode of performing bypass processing in which a part of the coefficient sequence inputted from the coefficient bit modeling unit 108 is not arithmetically coded.
Then, the rate control unit 110 performs post-quantization for truncation of lower-order bit planes of a code sequence outputted from the arithmetic coding unit 109, thereby controlling the final rate. A bit-stream generation unit 111 generates a bit stream by multiplexing a code sequence outputted from the rate control unit 110 and attached information (header information, layer structure, scalability information, quantization table, etc.) and outputs it as a compressed image.
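As a simplified illustration of truncation for rate control, the following sketch keeps the leading coding passes of a code sequence until a target size would be exceeded and discards the remainder, which corresponds to the lower-order bit planes. The function name, the per-pass sizes, and the byte budget are hypothetical, and this plain prefix truncation is not the rate-distortion optimization discussed below.

```python
def truncate_passes(pass_sizes, budget):
    """Keep the longest prefix of coding passes whose total size fits the budget.

    pass_sizes : sizes of the coded passes in coding order (MSB plane first)
    budget     : maximum total size allowed
    Returns (number of passes kept, total size kept).
    """
    kept, total = 0, 0
    for size in pass_sizes:
        if total + size > budget:
            break  # this pass and all later (lower-order) passes are discarded
        total += size
        kept += 1
    return kept, total


kept, total = truncate_passes([40, 25, 30, 20, 10], 100)
# kept == 3 and total == 95
```

Because the passes are ordered from the most-significant bit plane downward, discarding the tail of the sequence removes the least significant information first.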
The compression encoder with the aforementioned configuration adopts, as a method for compressing the amount of image data, for example a technique called rate-distortion (R-D) optimization utilizing the rate control method employed in the rate control unit 110 (cf. David S. Taubman and Michael W. Marcellin, “JPEG2000 Image Compression Fundamentals, Standards and Practice,” Kluwer Academic Publishers, which is hereinafter referred to as the “first non-patent literature”).
This method, however, gives rise to the following problems: (1) the amount of distortion must be calculated for each rate in each coding pass and the optimal solution for a given coding rate must be estimated, resulting in a great number of operations and poor real-time performance; and (2) a memory must be provided for storing the amount of distortion calculated in each coding pass.
Particularly for quantization and rate control, which directly affect the performance characteristics of a compression encoder, there is a demand for a method for achieving efficient processing at high speeds while maintaining high image quality.