In recent years, as the technologies for image input apparatuses such as digital cameras, scanners, and the like have improved, the resolution of image data captured by such input apparatuses has been increasing. A low-resolution image has a small image data volume and poses no problem for coding, transfer, and storage processes. However, the image data volume becomes huge with increasing resolution, so that a long transfer time is often required and the coding and storage processes require a large storage size.
As a method of efficiently transferring and displaying such large-size image data, scalable transfer of image data has received a lot of attention. In this method, data are transferred in order starting from an image having low image quality, so that the receiving side can recognize an outline of the image at an early stage of image data transfer, and data required to reconstruct an image with higher image quality are then transferred in turn. As a result, the quality of the image reconstructed at the data receiving side improves gradually.
As a coding method suitable for such scalable transfer, a coding method that implements spatial scalability using wavelet transformation as sequence transformation, and implements SNR scalability using bit plane coding as entropy coding has been studied.
FIG. 2 shows an example of an image encoding apparatus using the aforementioned coding method. Referring to FIG. 2, reference numeral 201 denotes an image input unit; 202, a discrete wavelet transformer; 203, a coefficient quantizer; 204, a bit plane encoder; 205, a code sequence forming unit; and 206, a code output unit.
The operation of the image encoding apparatus shown in FIG. 2 will be explained below. Pixel data P(x, y) that form an image to be encoded are input to the image input unit 201 in the raster scan order. x and y indicate the horizontal and vertical positions of a pixel. The image input unit 201 comprises a storage device such as a hard disk, magneto-optical disk, memory, or the like, which stores image data, an image sensing device such as a scanner or the like, an interface for a network line, or the like.
The discrete wavelet transformer 202 computes the two-dimensional discrete wavelet transforms of pixel data P(x, y) input from the image input unit 201 while storing them in its internal buffer (not shown) as needed, and decomposes them into seven subbands LL, LH1, HL1, HH1, LH2, HL2, and HH2. The transformer 202 then outputs coefficients of respective subbands. Let C(S, x, y) be the coefficient of each subband. Note that S represents one of subbands LL, LH1, HL1, HH1, LH2, HL2, and HH2. Also, x and y indicate the horizontal and vertical coefficient positions if (0, 0) represents the position of a coefficient at the upper left corner in each subband.
Two-dimensional discrete wavelet transformation is implemented by applying one-dimensional transformation in the horizontal and vertical directions. FIGS. 4A to 4C show processes in which an image to be encoded (FIG. 4A) undergoes one-dimensional discrete wavelet transformation in the vertical direction so as to be decomposed into low- and high-frequency subbands L and H (FIG. 4B), and these subbands each further undergo one-dimensional discrete wavelet transformation in the horizontal direction to be decomposed into four subbands LL, HL, LH, and HH (FIG. 4C). In this image encoding apparatus, one-dimensional discrete wavelet transformation of N one-dimensional signals x(n) (n=0 to N−1) is described by:
h(n)=x(2n+1)−floor{(x(2n)+x(2n+2))/2}
l(n)=x(2n)+floor{(h(n−1)+h(n)+2)/4}
where h(n) is a coefficient of a high-frequency subband, l(n) is a coefficient of a low-frequency subband, and floor{R} is the maximum integer not exceeding real number R. Note that the coefficients h(n) are computed within the range of n=0 to floor{N/2} and the coefficients l(n) are computed within the range of n=0 to floor{(N+1)/2}. Also, the values x(n) beyond the two ends (n<0 and n≧N) of the one-dimensional signals x(n) required upon computing the above equations are calculated in advance from the values of the one-dimensional signals x(n) (0≦n<N) by a known method.
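The two lifting equations above can be sketched in Python as follows. This is only an illustrative sketch, not the apparatus's actual implementation: the function name dwt53_1d is hypothetical, and the values of x(n) beyond the two ends are assumed to be obtained by mirroring the signal about its first and last samples, which is one commonly used "known method". The sketch also assumes that floor{N/2} high-frequency and floor{(N+1)/2} low-frequency coefficients are kept, so that the total number of coefficients equals N.

```python
def dwt53_1d(x):
    """One level of the integer lifting transform; returns (low, high)."""
    n_samples = len(x)
    if n_samples == 1:
        return [x[0]], []

    def ext(i):
        # Assumption: mirror the signal about its first and last samples.
        if i < 0:
            i = -i
        if i >= n_samples:
            i = 2 * (n_samples - 1) - i
        return x[i]

    def h(n):
        # h(n) = x(2n+1) - floor{(x(2n) + x(2n+2)) / 2}
        # Python's // operator floors toward negative infinity.
        return ext(2 * n + 1) - (ext(2 * n) + ext(2 * n + 2)) // 2

    # l(n) = x(2n) + floor{(h(n-1) + h(n) + 2) / 4}
    low = [x[2 * n] + (h(n - 1) + h(n) + 2) // 4
           for n in range((n_samples + 1) // 2)]
    high = [h(n) for n in range(n_samples // 2)]
    return low, high


if __name__ == "__main__":
    low, high = dwt53_1d([1, 2, 3, 4, 5, 6, 7, 8])
    print(low, high)  # [1, 3, 5, 7] [0, 0, 0, 1]
```

For a linear ramp, every high-frequency coefficient is zero except at the right end, where the mirrored sample breaks the ramp.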
By again computing the two-dimensional discrete wavelet transform of the subband LL obtained by the aforementioned two-dimensional discrete wavelet transformation, the image is decomposed into seven subbands LL, LH1, HL1, HH1, LH2, HL2, and HH2, as shown in FIG. 5. Note that LL in FIG. 5 is obtained by re-decomposing LL in FIG. 4C, and is not the same as LL in FIG. 4C.
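The seven-subband decomposition can be sketched as two passes of a two-dimensional transform, the second applied only to LL. The function names below are hypothetical. The sketch makes three assumptions that the text does not state: mirror extension at the signal ends; the first letter of a subband name refers to the horizontal direction (so HL is high-frequency horizontally and low-frequency vertically); and index 2 denotes the finest subbands from the first decomposition while index 1 denotes those obtained by re-decomposing LL.

```python
def dwt_1d(x):
    """Integer lifting transform of one row or column; returns (low, high)."""
    n = len(x)
    if n == 1:
        return list(x), []

    def ext(i):  # assumed mirror extension at both ends
        i = -i if i < 0 else i
        return x[2 * (n - 1) - i] if i >= n else x[i]

    def h(k):
        return ext(2 * k + 1) - (ext(2 * k) + ext(2 * k + 2)) // 2

    low = [x[2 * k] + (h(k - 1) + h(k) + 2) // 4 for k in range((n + 1) // 2)]
    high = [h(k) for k in range(n // 2)]
    return low, high


def transpose(m):
    return [list(r) for r in zip(*m)]


def dwt_2d(image):
    """Vertical pass first (FIG. 4B), then horizontal pass (FIG. 4C)."""
    pairs = [dwt_1d(col) for col in transpose(image)]
    band_l = transpose([p[0] for p in pairs])  # vertically low-frequency
    band_h = transpose([p[1] for p in pairs])  # vertically high-frequency

    def split_rows(band):
        row_pairs = [dwt_1d(row) for row in band]
        return [p[0] for p in row_pairs], [p[1] for p in row_pairs]

    ll, hl = split_rows(band_l)
    lh, hh = split_rows(band_h)
    return ll, hl, lh, hh


def decompose(image):
    """Two-level decomposition into the seven subbands of FIG. 5."""
    ll, hl2, lh2, hh2 = dwt_2d(image)
    ll, hl1, lh1, hh1 = dwt_2d(ll)  # re-decompose LL only
    return {"LL": ll, "LH1": lh1, "HL1": hl1, "HH1": hh1,
            "LH2": lh2, "HL2": hl2, "HH2": hh2}


if __name__ == "__main__":
    bands = decompose([[10] * 8 for _ in range(8)])
    for name, band in bands.items():
        print(name, len(band), "x", len(band[0]))
```

For an 8×8 input, the level-2 subbands are 4×4 and LL and the level-1 subbands are 2×2, so the total number of coefficients stays at 64.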
The coefficient quantizer 203 quantizes coefficients C(S, x, y) of the respective subbands generated by the discrete wavelet transformer 202 using quantization steps delta(S) determined for the respective subbands. If Q(S, x, y) represents the quantized coefficient value in subband S, the quantization process done by the coefficient quantizer 203 is described by:
Q(S, x, y)=sign{C(S, x, y)}×floor{|C(S, x, y)|/delta(S)}
where sign{I} is a function which returns the positive/negative sign of integer I, i.e., 1 if I is positive or −1 if I is negative. Also, floor{R} is the maximum integer not exceeding real number R.
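The quantization rule can be sketched as below; the function names are hypothetical. Because floor{} is applied to the absolute value and the sign is restored afterwards, negative coefficients are rounded toward zero rather than toward negative infinity. The sketch treats a zero coefficient as positive, which is harmless because its quantized value is 0 either way.

```python
import math


def quantize(c, delta):
    """Q = sign{C} * floor{|C| / delta} for one coefficient."""
    sign = -1 if c < 0 else 1
    return sign * math.floor(abs(c) / delta)


def quantize_subband(coeffs, delta):
    """Apply the same quantization step to every coefficient of a subband."""
    return [[quantize(c, delta) for c in row] for row in coeffs]


if __name__ == "__main__":
    print(quantize(7, 2), quantize(-7, 2))  # 3 -3
    print(quantize_subband([[9, -4], [0, -1]], 2))  # [[4, -2], [0, 0]]
```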
The bit plane encoder 204 encodes the coefficient values Q(S, x, y) quantized by the coefficient quantizer 203 to generate a code sequence. A method of breaking up the coefficients of each subband into blocks and encoding them individually to facilitate random access is known. However, encoding is done herein in units of subbands for the sake of simplicity. The quantized coefficients Q(S, x, y) (to be simply referred to as coefficient values hereinafter) of the respective subbands are encoded by expressing the absolute values of the coefficient values Q(S, x, y) in each subband by natural binary values, and applying binary arithmetic coding to them from the upper to the lower bits, giving priority to the bit plane direction. When the coefficient Q(S, x, y) of each subband is expressed by a natural binary value, the n-th bit from the lowest bit of the coefficient is expressed as Qn(x, y). Note that the variable n indicating a bit of a binary value is called a bit plane number, and bit plane number n represents the LSB as the 0th bit.
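As a small illustration of the notation Qn(x, y), the following sketch extracts bit plane n from the absolute values of the coefficient values of one subband. The function name bit_plane and the nested-list layout (rows indexed by y, columns by x) are assumptions made for this example only.

```python
def bit_plane(q, n):
    """Return Qn(x, y): bit n (LSB is bit 0) of |Q(S, x, y)| for a subband."""
    return [[(abs(v) >> n) & 1 for v in row] for row in q]


if __name__ == "__main__":
    q = [[5, -3],
         [0, 6]]
    # Planes are visited from the upper to the lower bits.
    for n in (2, 1, 0):
        print(n, bit_plane(q, n))
    # 2 [[1, 0], [0, 1]]
    # 1 [[0, 1], [0, 1]]
    # 0 [[1, 1], [0, 0]]
```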
FIG. 6 shows the flow of a process for encoding subband S by the bit plane encoder 204.
Referring to FIG. 6, step S601 is a step of computing a maximum value Mabs(S) of the absolute values of the coefficients in subband S, step S602 is a step of computing the number NBP(S) of effective bits required to express the maximum value Mabs(S), step S603 is a step of substituting the number of effective bits in the variable n, step S604 is a step of computing (n−1) and substituting it in n, step S605 is a step of encoding an n-th bit plane, and step S606 is a step of checking if n=0. The processes in the respective steps will be described in detail below.
In step S601, the absolute values of the coefficients in subband S to be encoded are checked to obtain their maximum value Mabs(S).
In step S602, the number NBP(S) of bits required to express Mabs(S) by a binary value is computed by:
NBP(S)=ceil{log2(Mabs(S)+1)}
where ceil{R} is a minimum integer equal to or larger than real number R. In step S603, the number NBP(S) of effective bits is substituted in bit plane number n. In step S604, 1 is subtracted from bit plane number n. In step S605, bit plane n is encoded using binary arithmetic coding. Note that the QM-Coder is used for arithmetic coding. Since the sequence for encoding binary symbols generated in given state (context) S using this QM-Coder, and the initialization and termination sequences for arithmetic coding, have been explained in detail in ITU-T Recommendation T.81 | ISO/IEC 10918-1 and the like as the international standards for still images, a description thereof will be omitted. At the beginning of encoding of each bit plane, the internal arithmetic encoder (not shown) of the bit plane encoder 204 is initialized, and a termination process of the arithmetic encoder is done upon completion of the encoding. Immediately after the first ‘1’ to be encoded of each coefficient, the positive/negative sign of that coefficient is expressed by 0 or 1 and undergoes arithmetic coding. If the coefficient is positive, 0 is output; if the coefficient is negative, 1 is output. For example, if a coefficient is −5, and the number NBP(S) of effective bits of subband S to which this coefficient belongs is 6, the absolute value of this coefficient is expressed by a binary value 000101, and is encoded from the MSB to the LSB upon encoding the respective bit planes. Upon encoding the second bit plane (the fourth bit from the MSB), the first ‘1’ is encoded, and the positive/negative sign ‘1’ is encoded by arithmetic coding immediately thereafter.
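The −5 example can be traced with the short sketch below. It takes the number of effective bits to be the count of binary digits needed to hold Mabs(S) (for instance, 3 digits for Mabs(S)=4), computed with Python's int.bit_length() to avoid floating-point logarithms. The function names and the (bit plane, kind, value) tuples are hypothetical and only list the binary symbols that would be handed to the arithmetic coder; no arithmetic coding is performed.

```python
def effective_bits(coeffs):
    """Number of binary digits needed for the largest absolute value."""
    mabs = max(abs(v) for row in coeffs for v in row)
    return mabs.bit_length()


def symbols_for_coefficient(value, nbp):
    """Symbols emitted for one coefficient, from the MSB plane down to the LSB."""
    out = []
    seen_one = False
    for n in range(nbp - 1, -1, -1):
        bit = (abs(value) >> n) & 1
        out.append((n, "bit", bit))
        if bit == 1 and not seen_one:
            # The sign follows the first '1': 0 for positive, 1 for negative.
            seen_one = True
            out.append((n, "sign", 1 if value < 0 else 0))
    return out


if __name__ == "__main__":
    print(effective_bits([[4, -1]]))  # 3
    for symbol in symbols_for_coefficient(-5, 6):
        print(symbol)
```

With value −5 and six effective bits, the sign symbol appears immediately after the bit of bit plane 2, matching the description above.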
In step S606, bit plane number n is compared with 0. If n=0, i.e., if the LSB plane is encoded in step S605, the encoding process of the subband ends; otherwise, the flow returns to step S604.
With the aforementioned process, all coefficients of subband S are encoded to generate code sequences CS(S, n) corresponding to bit planes n. The generated code sequences are sent to the code sequence forming unit 205 and are temporarily stored in the internal buffer (not shown) of the code sequence forming unit 205.
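The flow of steps S601 to S606 can be sketched as the loop below. This is a simplified stand-in, not the encoder itself: instead of QM-Coder output, each CS(S, n) is represented as a plain list of the binary symbols of bit plane n in raster scan order, with a sign symbol inserted after the first ‘1’ of each coefficient. The function name encode_subband is hypothetical, and the loop is written so that a subband whose coefficients are all zero yields no bit planes, a case the flow chart does not address.

```python
def encode_subband(q):
    """Return (NBP, {n: symbols of bit plane n}) for one subband."""
    flat = [v for row in q for v in row]           # raster scan order
    mabs = max((abs(v) for v in flat), default=0)  # S601
    nbp = mabs.bit_length()                        # S602
    significant = [False] * len(flat)              # first '1' already coded?
    planes = {}
    n = nbp                                        # S603
    while n > 0:
        n -= 1                                     # S604
        symbols = []                               # S605: code bit plane n
        for i, v in enumerate(flat):
            bit = (abs(v) >> n) & 1
            symbols.append(bit)
            if bit and not significant[i]:
                significant[i] = True
                symbols.append(1 if v < 0 else 0)  # sign: 0 plus, 1 minus
        planes[n] = symbols
        # S606: the loop ends once bit plane 0 has been coded
    return nbp, planes


if __name__ == "__main__":
    nbp, planes = encode_subband([[-5, 2], [0, 1]])
    print(nbp)  # 3
    for n in sorted(planes, reverse=True):
        print(n, planes[n])
    # 2 [1, 1, 0, 0, 0]
    # 1 [0, 1, 0, 0, 0]
    # 0 [1, 0, 0, 1, 0]
```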
When encoding of the coefficients of all the subbands by the bit plane encoder 204 is complete, and all the code sequences are stored in the internal buffer, the code sequence forming unit 205 reads out the code sequences stored in the internal buffer in a predetermined order, inserts required additional information, and forms a final code sequence as the output of this encoding apparatus. The unit 205 then outputs the code sequence to the code output unit 206.
The final code sequence generated by the code sequence forming unit 205 consists of a header, and encoded data stratified into three levels, i.e., levels 0, 1, and 2. The encoded data of level 0 is comprised of code sequences CS(LL, NBP(LL)−1) to CS(LL, 0) obtained by encoding the coefficients of the subband LL. The encoded data of level 1 is comprised of code sequences CS(LH1, NBP(LH1)−1) to CS(LH1, 0), CS(HL1, NBP(HL1)−1) to CS(HL1, 0), and CS(HH1, NBP(HH1)−1) to CS(HH1, 0) obtained by encoding the coefficients of the subbands LH1, HL1, and HH1. The encoded data of level 2 is comprised of code sequences CS(LH2, NBP(LH2)−1) to CS(LH2, 0), CS(HL2, NBP(HL2)−1) to CS(HL2, 0), and CS(HH2, NBP(HH2)−1) to CS(HH2, 0) obtained by encoding the coefficients of the subbands LH2, HL2, and HH2.
FIG. 3 shows the structure of the code sequence generated by the code sequence forming unit 205. Note that information is inserted into the header or as markers in this encoded data so that given partial data can be accessed, e.g., the number NBP(LH1) of effective bits of the subband LH1 can be read out from the encoded data shown in FIG. 3.
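The ordering of the code sequences into levels 0, 1, and 2 can be sketched as follows. The function name, the dictionary inputs, and the returned tuples are hypothetical; the header and marker layout of FIG. 3 is not reproduced here, since its byte-level format is not given in the text.

```python
LEVELS = [
    ["LL"],                 # level 0
    ["LH1", "HL1", "HH1"],  # level 1
    ["LH2", "HL2", "HH2"],  # level 2
]


def form_code_sequence(cs, nbp):
    """Order code sequences by level, subband, then bit plane (upper first).

    cs maps (subband, bit plane number) to an encoded byte string;
    nbp maps each subband to its number of effective bits NBP(S).
    """
    ordered = []
    for level, subbands in enumerate(LEVELS):
        for s in subbands:
            for n in range(nbp[s] - 1, -1, -1):
                ordered.append((level, s, n, cs[(s, n)]))
    return ordered


if __name__ == "__main__":
    nbp = {"LL": 2, "LH1": 1, "HL1": 1, "HH1": 0,
           "LH2": 1, "HL2": 0, "HH2": 1}
    # Placeholder payloads standing in for real arithmetic-coded data.
    cs = {(s, n): f"{s}:{n}".encode() for s in nbp for n in range(nbp[s])}
    for entry in form_code_sequence(cs, nbp):
        print(entry)
```

A receiver that stops after level 0 obtains only the LL data, which corresponds to the lowest-resolution image in the scalable transfer described earlier.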
The code output unit 206 externally outputs the code sequence generated by the code sequence forming unit 205. The code output unit 206 comprises, e.g., a storage device such as a hard disk, memory, or the like, an interface for a network line, or the like.
However, some problems are experienced in an image decoding apparatus that decodes encoded data generated by the conventional scalable encoding method mentioned above.
Since bit plane coding is used as entropy coding, the coefficients must be decoded in units of subbands, or in units of blocks obtained by breaking up subbands into a given size, and a large memory for storing the coefficients is therefore required.
Furthermore, when arithmetic coding is used upon encoding binary information of each bit plane as in the above prior art, a complicated arithmetic process is required to decode arithmetic codes, resulting in a demand for higher CPU power, a long processing time, a large circuit scale, and the like.
In order to decode images in the raster scan order from the code sequences generated by the conventional scalable coding method, many code sequences must be temporarily stored, thus requiring a larger memory size.
Since recent personal computers have gained higher performance and functions, the aforementioned problems rarely matter on such computers. However, the aforementioned problems are serious in apparatuses with limited arithmetic performance and memory size such as a printer, portable terminal, and the like.
The present invention has been made in consideration of the aforementioned problems, and has as its object to achieve efficient encoding/decoding even when the processing time, memory, arithmetic cost, and the like of an apparatus are limited.