1. Field of the Invention
The present invention generally relates to a method for compressing image data such as data of a natural image and, more particularly, to a method for encoding image data such as a static image or a dynamic image by using a wavelet transform.
2. Description of the Related Art
Conventionally, a discrete cosine transform (DCT) is popular for compressing natural image data such as static image data or dynamic image data. However, when encoding image data by using the DCT, a problem arises in achieving a high compression ratio since there is an inherent deterioration in subjective image quality caused by generation of a block distortion or a mosquito noise. In order to eliminate such a problem, various image encoding methods using a wavelet transform have been studied.
A brief description will now be given of the wavelet transform. A sub-band division shown in FIG. 1 can be performed on an analyzing side of a filter bank shown in FIG. 2. On the analyzing side of the filter bank, image data is filtered by a low-pass filter H0 or a high-pass filter H1 in a horizontal direction and a 2:1 sub-sampling (↓2) is performed. The processed image data is then filtered by the low-pass filter H0 or the high-pass filter H1 in a vertical direction and a 2:1 sub-sampling (↓2) is performed. An image is restored by processing the obtained frequency-band signals on a synthesizing side of the filter bank. On the synthesizing side of the filter bank, a 2:1 up-sampling (↑2) and filtering by a low-pass filter F0 or a high-pass filter F1 are performed on each of the frequency-band signals in a vertical direction, and a 2:1 up-sampling (↑2) and filtering by the low-pass filter F0 or the high-pass filter F1 are then performed in a horizontal direction. An encoding method in which a signal is divided into frequency bands is generally referred to as a sub-band encoding. The encoding using a wavelet transform is regarded as a sub-band encoding as described below.
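The analyzing and synthesizing sides described above may be illustrated by the following minimal sketch. It assumes the two-tap orthonormal Haar filter pair in place of H0, H1, F0 and F1, which are not specified here, and the function names are hypothetical. With Haar filters, filtering followed by 2:1 sub-sampling reduces to scaled sums and differences of adjacent sample pairs.

```python
import math

S = 1.0 / math.sqrt(2.0)  # orthonormal Haar scaling (assumed filter choice)

def analyze_1d(x):
    """Low-pass/high-pass filtering followed by 2:1 sub-sampling (Haar)."""
    low = [(x[i] + x[i + 1]) * S for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) * S for i in range(0, len(x), 2)]
    return low, high

def synthesize_1d(low, high):
    """2:1 up-sampling followed by synthesis filtering (Haar)."""
    out = []
    for l, h in zip(low, high):
        out.append((l + h) * S)
        out.append((l - h) * S)
    return out

def transpose(m):
    return [list(r) for r in zip(*m)]

def analyze_2d(img):
    """Horizontal split first, then vertical split; returns LL, LH, HL, HH."""
    lo_rows, hi_rows = [], []
    for row in img:
        l, h = analyze_1d(row)
        lo_rows.append(l)
        hi_rows.append(h)

    def split_cols(m):
        lo_cols, hi_cols = [], []
        for col in transpose(m):
            l, h = analyze_1d(col)
            lo_cols.append(l)
            hi_cols.append(h)
        return transpose(lo_cols), transpose(hi_cols)

    LL, LH = split_cols(lo_rows)  # horizontal low; vertical low / high
    HL, HH = split_cols(hi_rows)  # horizontal high; vertical low / high
    return LL, LH, HL, HH

def synthesize_2d(LL, LH, HL, HH):
    """Vertical synthesis first, then horizontal synthesis."""
    def merge_cols(lo, hi):
        cols = [synthesize_1d(l, h)
                for l, h in zip(transpose(lo), transpose(hi))]
        return transpose(cols)

    lo_rows = merge_cols(LL, LH)
    hi_rows = merge_cols(HL, HH)
    return [synthesize_1d(l, h) for l, h in zip(lo_rows, hi_rows)]
```

For an image whose height and width are even, `synthesize_2d(*analyze_2d(img))` returns the original samples up to floating-point rounding. Longer filters would require explicit convolution and boundary handling, which this sketch omits.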
In the wavelet transform, a multilayer band division is performed by repeating the above-mentioned sub-band division on a low-band signal (a signal LL shown in FIG. 1). Such a band division is referred to as an octave division. If a division is performed for three layers, ten sub-bands are obtained as shown in FIG. 3. In FIG. 3, reference numerals 8 to 10 indicate sub-bands in the lowest layer, 5 to 7 indicate sub-bands in the middle layer, and 1 to 4 indicate sub-bands in the highest layer.
The frequency-band signals obtained by the above-mentioned division include signals (wavelet coefficients) LL, LH, HL and HH as shown in FIG. 1 or FIG. 3. In FIG. 3, the signal LL 1 is a signal on a low-band side in both the horizontal and vertical directions. The signal LL 1 is obtained by filtering image data by a two-dimensional low-pass filter (LPF). Hereinafter, the signal LL may be referred to as a "low-band component." The signals HL 3, 6 and 9 are obtained by filtering by a high-pass filter (HPF) in the horizontal direction and filtering by a low-pass filter in the vertical direction. An edge component in the vertical direction appears in the signal HL. Hereinafter, this edge component may be referred to as a "vertical component." The signals LH 2, 5 and 8 are obtained by filtering by a low-pass filter in the horizontal direction and filtering by a high-pass filter in the vertical direction. An edge component in the horizontal direction appears in the signals LH 2, 5 and 8. Hereinafter, this edge component may be referred to as a "horizontal component." The signals HH 4, 7 and 10 are obtained by filtering by a high-pass filter in both the horizontal direction and the vertical direction. An edge component in a diagonal direction appears in the signals HH 4, 7 and 10. Hereinafter, this edge component may be referred to as a "diagonal component".
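The sub-band numbering described above may be enumerated by the following minimal sketch. The function names and the convention that layer k denotes a band produced by the k-th division are assumptions made for illustration only.

```python
def octave_subbands(levels):
    """List (number, name, layer) for an octave division of `levels` layers.

    Layer `levels` is the highest layer (most divisions); layer 1 is the
    lowest layer. Numbering follows the order 1 = LL, then LH, HL, HH per
    layer from the highest layer down to the lowest layer.
    """
    bands = [(1, "LL", levels)]
    number = 2
    for layer in range(levels, 0, -1):
        for name in ("LH", "HL", "HH"):
            bands.append((number, name, layer))
            number += 1
    return bands

def band_side(image_side, layer):
    """Side length of a sub-band in `layer` for a square image (assumed)."""
    return image_side // (2 ** layer)

if __name__ == "__main__":
    for number, name, layer in octave_subbands(3):
        print(number, name, "layer", layer, "side", band_side(8, layer))
```

A division of n layers yields 3n + 1 sub-bands, so three layers give the ten sub-bands numbered 1 to 10.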
It should be noted that in the wavelet transform, the above-mentioned filters are designed so as to satisfy a condition for completely restoring a signal, that is, to satisfy an orthogonal condition and a normal condition. Additionally, when a filter having a number of taps greater than the dividing number 2 in each band division is used, a block distortion may be prevented since overlapping of basic waveforms occurs. Further, as only the low band is repeatedly subjected to the band division, a shorter basic waveform is used as the frequency increases. Thus, it can be expected that a mosquito noise caused by a quantization distortion of a high-frequency component does not spread spatially.
A description will now be given of a conventional method for encoding wavelet coefficients. First, each wavelet coefficient is subjected to a scalar quantization. The quantization process uses a linear quantization for each band. Herein, an adaptive quantization in which the quantization process is switched for each band is not considered. The coefficients are encoded subsequent to the quantization. As for the method for encoding, there is a method suggested in "9.6-kb/s Picture Coding Using Wavelet Transform", a thesis of the 1993 Institute of Electronics, Information and Communication Engineers (IEICE) Spring Conference, D-262, 7-23. With respect to the quantized coefficients, one coefficient is taken from each of the bands LL, LH, HL and HH in the highest layer (in a case of three-times division). Four coefficients (2×2) are taken from each of the bands LH, HL and HH in the middle layer. Sixteen coefficients (4×4) are taken from each of the bands LH, HL and HH in the lowest layer. These coefficients are arranged in the same positional relationship so as to form an 8×8 block shown in FIG. 4. The sixty-four coefficients in the 8×8 block are scanned in an order shown in FIG. 5 so as to obtain a one-dimensional string of coefficients. Then, the coefficients are subjected to a two-dimensional Huffman encoding with [run length of zero run, significant coefficient]. However, only the DC component (LL coefficient of the highest layer) is subjected to a DPCM. As is apparent from FIG. 5, the scanning is performed from the highest layer toward the lowest layer. In the same layer, the scanning is performed in the order of the vertical component (HL), the horizontal component (LH) and the diagonal component (HH). Additionally, the vertical components are scanned in the vertical direction, the horizontal components are scanned in the horizontal direction and the diagonal components are zigzag scanned.
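The scanning and the formation of [run length of zero run, significant coefficient] pairs described above may be illustrated by the following minimal sketch. The figures are not reproduced here, so the block layout (HL to the right of, LH below, and HH diagonal to the lower-band quadrant) and the exact zigzag pattern are assumptions, and the function names are hypothetical. The Huffman encoding and the DPCM of the DC component are omitted.

```python
def zigzag(n):
    """(row, col) positions of an n x n square in zigzag order (assumed)."""
    order = []
    for s in range(2 * n - 1):
        diag = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        order.extend(diag if s % 2 else diag[::-1])
    return order

def scan_order():
    """Scan positions of the 8x8 block, highest layer to lowest layer."""
    order = [(0, 0)]  # LL coefficient (DC component)
    size = 1          # side of each band in the current layer: 1, 2, 4
    while size < 8:
        # HL (vertical component): scanned in the vertical direction
        order += [(r, size + c) for c in range(size) for r in range(size)]
        # LH (horizontal component): scanned in the horizontal direction
        order += [(size + r, c) for r in range(size) for c in range(size)]
        # HH (diagonal component): zigzag scanned
        order += [(size + r, size + c) for r, c in zigzag(size)]
        size *= 2
    return order

def run_level_pairs(block):
    """Return (dc, [(zero_run, coefficient), ...], trailing_zero_count)."""
    order = scan_order()
    dc = block[0][0]  # handled separately (DPCM in the described method)
    pairs, run = [], 0
    for r, c in order[1:]:
        value = block[r][c]
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    return dc, pairs, run

if __name__ == "__main__":
    block = [[0] * 8 for _ in range(8)]
    block[0][0], block[0][1], block[2][0] = 5, 3, -2
    print(run_level_pairs(block))
```

Each pair would then be mapped to a single code word by the two-dimensional Huffman table, which is why the distribution of zero-run lengths directly affects the amount of encoded data.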
Additionally, "Wavelet Transform Coding for Picture (2) --Scanning Method suitable for Wavelet Basis--", a thesis of the 1993 Institute of Electronics, Information and Communication Engineers (IEICE) Spring Conference, D-336, 7-46, discloses another conventional method. In this method, 8×8 coefficients in a block are formed in a quad-tree structure by forming a single node by three coefficients (LH, HL, HH) located in the same spatial position in each layer, and scanning for a sub-tree is stopped at an EOS (end of sub-tree) code. The nodes in the quad tree are scanned by a depth-first method, and are encoded as a one-dimensional string of the EOS codes or coefficients.
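The quad-tree scanning described above may be illustrated by the following minimal sketch. The rule for emitting the EOS code is not detailed here, so the sketch assumes that a single EOS symbol replaces any sub-tree whose coefficients are all zero. The 8×8 layout (HL to the right of, LH below, and HH diagonal to the lower-band quadrant) and the function names are likewise assumptions. The LL coefficient is not part of the tree and is omitted.

```python
EOS = "EOS"  # end-of-sub-tree symbol (representation assumed)

def node_coeffs(block, s, r, c):
    """(LH, HL, HH) at position (r, c) of the layer whose bands are s x s."""
    return (block[s + r][c], block[r][s + c], block[s + r][s + c])

def subtree_is_zero(block, s, r, c):
    """True if the node and all of its descendants hold only zeros."""
    if any(node_coeffs(block, s, r, c)):
        return False
    if s == 4:  # lowest layer: no children
        return True
    return all(subtree_is_zero(block, 2 * s, 2 * r + dr, 2 * c + dc)
               for dr in (0, 1) for dc in (0, 1))

def depth_first(block, s=1, r=0, c=0, out=None):
    """Depth-first scan from the highest layer; returns the symbol string."""
    if out is None:
        out = []
    if subtree_is_zero(block, s, r, c):
        out.append(EOS)  # scanning of this sub-tree stops here
        return out
    out.extend(node_coeffs(block, s, r, c))
    if s < 4:
        # The four children cover the same spatial area in the next layer
        for dr in (0, 1):
            for dc in (0, 1):
                depth_first(block, 2 * s, 2 * r + dr, 2 * c + dc, out)
    return out

if __name__ == "__main__":
    block = [[0] * 8 for _ in range(8)]
    block[4][0] = 7  # one LH coefficient in the lowest layer
    print(depth_first(block))
```

In this sketch every node outputs its LH, HL and HH coefficients consecutively, so the three directional components are interleaved in the resulting one-dimensional string.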
The conventional method described with reference to FIG. 5 is well adapted to the method in which the DCT coefficients in an 8×8 pixel block are zigzag scanned to form a one-dimensional coefficient string and the coefficient string is encoded by the two-dimensional Huffman encoding with [run length of zero run, significant coefficient]. However, this method is not always suitable for a characteristic of the wavelet transform coefficient. In a case in which the DCT coefficients are zigzag scanned, coefficients corresponding to frequency components in the same area (8×8) are scanned in a direction from a lower frequency to a higher frequency. Accordingly, energy is concentrated into a low-frequency component, and a large significant coefficient appears in the low-frequency component. On the other hand, a frequency of appearance of zero is increased on the high-frequency side, and a number of small significant coefficients is increased. Thus, a large number of short zero runs tend to appear and a small number of long zero runs tend to appear on the high-frequency side. According to such local concentration in appearance of zero run lengths, a number of objects to be encoded is decreased, and an amount of data can be decreased.
Also in the 8×8 block coefficients by the wavelet transform, the same tendency can be expected. However, since the 2×2 or 4×4 coefficients in each of the second and third layers are spatially adjacent to each other, an expected value of the coefficient is not decreased as the scanning progresses. Accordingly, as compared to the case of DCT coefficients, the significant coefficients do not always collectively appear in the beginning of the scanning. Rather, the significant coefficients are unevenly distributed in space. That is, in each of the second and third layers, the zero coefficient tends to appear in a portion lacking energy of a directional component. On the contrary, in each of the second and third layers, the significant coefficient tends to appear in a portion having energy of a directional component. Accordingly, when a scan is performed from a higher-order position to a lower-order position as in the conventional method, an area in which significant coefficients are present is traversed in the second and third layers. Accordingly, a relatively long zero run tends to frequently appear in the second and third layers. Thus, an effect of data compression due to zero runs appearing in a particular area is reduced.
Additionally, in the wavelet transform, each of the vertical and horizontal edges in an image has a large significant coefficient in the corresponding direction thereof, and a directional component of a direction other than the corresponding directions of the edge rarely appears. In such a case, if scanning is performed in a direction from a higher position to a lower position without separating in each directional component, that is, if the coefficients in the same layer are scanned, for example, in the horizontal direction, the vertical direction and a diagonal direction, in that order, the scanning is performed by traversing a directional component having a small number of significant coefficients and a directional component having a large number of significant coefficients. Thus, a small number of long zero runs and a large number of short zero runs do not appear but a relatively long zero run frequently appears. This may decrease an encoding efficiency.
In the conventional method described with reference to FIG. 6, the 8×8 coefficients are formed in a quad-tree structure, and scanning is performed by a depth-first method. Accordingly, although spatially localized significant coefficients are not traversed, the scanning is performed by traversing significant coefficients that localize in different directional components.
Additionally, in both conventional methods, the coefficients in the second and third layers are 2×2 and 4×4 coefficients spatially adjacent to each other, respectively. Thus, a pattern having a strong correlation appears in a direction of each directional component. However, if a one-dimensional scan is performed, the above-mentioned nature may not be efficiently utilized.