1. Field of the Invention
The present invention relates to a method and an apparatus for encoding an image, a medium for recording an image signal, and a method and an apparatus for decoding an image. More particularly, it relates to a system arranged to convert an image signal of a moving picture into storage codes and record those codes on an image recording medium such as an optical disk, a magnetic disk or a magnetic tape, and to a method and an apparatus for encoding an image, an image signal recording medium, and a method and an apparatus for decoding an image which can all be used in a system for transmitting an image signal of a moving picture through a transmission path.
2. Description of the Related Art
As one of the encoding/decoding systems for compressing a digital signal, a method based on band division by a wavelet filter or a subband filter has been proposed. This method applies a plurality of filters having respective pass bands to an input signal and sub-samples each filtered signal at an interval corresponding to its bandwidth. The uneven distribution of energy among the output signals of the filters is used for compressing an image.
FIG. 1 shows a basic arrangement for band division and synthesis based on the wavelet filter or the subband filter. Here, the input signal is a one-dimensional signal x[i].
In FIG. 1, numeral 300 denotes a divider, which includes an analytic lowpass filter 301 and an analytic highpass filter 302 for band division. These two filters 301 and 302 divide the input signal into a low frequency band signal XL[i] and a high frequency band signal XH[i]. Numerals 330A and 330B denote subsamplers, which thin out the band-divided signals XL[i] and XH[i] sample by sample as indicated in the following expressions (1) and (2).
XL[j]=XL[i], j=i/2 (1)
XH[j]=XH[i], j=i/2 (2)
Numeral 400 denotes a synthesizer, which includes up-samplers 431A and 431B, a synthesizing lowpass filter 411, a synthesizing highpass filter 412, and an adder 436. In operation, the up-samplers 431A and 431B double the length of the sampling interval and insert a zero-valued sample at the center of each sampling interval as indicated in the following expressions (3) and (4). ##EQU1##
The synthesizing lowpass filter 411 and the synthesizing highpass filter 412 perform an interpolation with respect to the band signals XL[i] and XH[i]. The interpolated signals are added through the adder 436 for restoring the input signal x[i].
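The arrangement of FIG. 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two-tap Haar filters, the function names, and the convention of treating samples outside the signal as zero are not taken from the source, which does not fix a particular filter.

```python
# Two-band division and synthesis in the manner of FIG. 1.
# Assumption: Haar filters; the source does not name a specific filter pair.
H0, H1 = [0.5, 0.5], [0.5, -0.5]   # analytic lowpass / highpass taps
F0, F1 = [1.0, 1.0], [-1.0, 1.0]   # synthesizing lowpass / highpass taps

def fir(signal, taps):
    """Full convolution of signal with taps (samples outside are zero)."""
    out = [0.0] * (len(signal) + len(taps) - 1)
    for n, s in enumerate(signal):
        for k, t in enumerate(taps):
            out[n + k] += s * t
    return out

def subsample(signal):
    """Keep every second sample, as in expressions (1) and (2)."""
    return signal[0::2]

def upsample(signal):
    """Insert a zero-valued sample after every sample."""
    out = []
    for s in signal:
        out.extend([s, 0.0])
    return out

def analyze(x):
    """Divider 300 followed by the subsamplers."""
    return subsample(fir(x, H0)), subsample(fir(x, H1))

def synthesize(xl, xh):
    """Up-samplers, synthesizing filters and adder of synthesizer 400."""
    low = fir(upsample(xl), F0)
    high = fir(upsample(xh), F1)
    return [a + b for a, b in zip(low, high)]

def roundtrip(x):
    xl, xh = analyze(x)
    y = synthesize(xl, xh)
    return y[1:1 + len(x)]   # these filters delay the signal by one sample

print(roundtrip([3, 7, 1, 8, 2, 5]))
```

With these assumed filters the restored signal equals the input apart from a delay of one sample, which `roundtrip` removes before returning.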
The analytic lowpass filter 301 and the analytic highpass filter 302 located in the divider 300 and the synthesizing lowpass filter 411 and the synthesizing highpass filter 412 located in the synthesizer 400 are all composed to fully or approximately meet the relations of the following expressions (5) and (6).
$$H_0(-z)F_0(z) + H_1(-z)F_1(z) = 0 \qquad (5)$$
$$H_0(z)F_0(z) + H_1(z)F_1(z) = 2z^{-L} \qquad (6)$$
$H_0(z)$, $H_1(z)$, $F_0(z)$, and $F_1(z)$ are transfer functions for the analytic lowpass filter 301, the analytic highpass filter 302, the synthesizing lowpass filter 411, and the synthesizing highpass filter 412, respectively. L is any integer. This constraint condition guarantees that the output signal x"[i] from the adder 436 located in the synthesizer 400 completely or approximately coincides with the input signal x[i].
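As a worked check of expressions (5) and (6), consider the two-tap Haar filter pair. The choice of this pair is an assumption made for illustration only; the source does not specify the filters. With these filters both conditions hold exactly, with L = 1.

```latex
\begin{align*}
H_0(z) &= \tfrac12\,(1+z^{-1}), & H_1(z) &= \tfrac12\,(1-z^{-1}),\\
F_0(z) &= 1+z^{-1}, & F_1(z) &= -1+z^{-1}.\\[4pt]
H_0(-z)F_0(z)+H_1(-z)F_1(z)
  &= \tfrac12\,(1-z^{-1})(1+z^{-1})+\tfrac12\,(1+z^{-1})(-1+z^{-1})\\
  &= \tfrac12\,(1-z^{-2})-\tfrac12\,(1-z^{-2}) = 0,\\
H_0(z)F_0(z)+H_1(z)F_1(z)
  &= \tfrac12\,(1+z^{-1})^{2}-\tfrac12\,(1-z^{-1})^{2} = 2z^{-1}.
\end{align*}
```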
When the foregoing band division and synthesis based on the wavelet filter or the subband filter is applied to an encoding operation, the encode/decode process is executed between the subsamplers 330A and 330B and the up-samplers 431A and 431B. The arrangement shown in FIG. 1 divides the input signal into two bands. In practice, when encoding for the purpose of data compression, each band may be recursively divided two or three times to achieve more efficient compression.
FIGS. 2 and 3 show the band division and synthesis based on the wavelet filter and the band division and synthesis based on the subband filter.
In FIG. 2, numeral 500 denotes an encoder for dividing a band through the wavelet filter. In the encoder 500, an analytic lowpass filter 501A and an analytic highpass filter 502A divide the input signal x[i] into a low frequency band signal XL0[i] and a high frequency band signal XH0[i]. Then, the low frequency band signal XL0[j], which is subsampled by a subsampler 530A in the manner of the expression (1), is further band-divided through a second analytic lowpass filter 501B and a second analytic highpass filter 502B and then is subsampled by subsamplers 530C and 530D.
On the other hand, the high frequency band signal XH0[i] passed through the first-stage analytic highpass filter 502A is subsampled by the subsampler 530B and then is applied to a delaying unit 537 for synchronization with the low frequency band signal. The first-stage high frequency band signal XH0[j] delayed by the delaying unit 537 and the high frequency band signal XH1[k] and the low frequency band signal XL1[k] subsampled by the second-stage subsamplers 530C and 530D are applied to the corresponding quantizers 532A, 532B and 532C. Those signals are quantized at the quantizing steps QH0, QH1 and QL1 according to the following expressions (7), (8) and (9). ##EQU2##
The quantized data XL1'[k], XH1'[k], and XH0'[j] are then applied to a reversible encoder/multiplexer 534, in which ordinary reversible encoding such as Huffman coding or arithmetic coding and multiplexing are executed with respect to the quantized data. Then, the results are sent to a decoder 600 shown in FIG. 3 through a storage medium 535 and a transmission path 536.
In FIG. 3, the decoder generally indicated at 600 performs a wavelet synthesis. In the decoder 600, a de-multiplexer/reversible decoder 635 reverses the multiplexing and the reversible encoding executed in the encoder 500 to restore the data XL1'[k], XH1'[k], and XH0'[j]. The restored data XL1'[k], XH1'[k], and XH0'[j] are applied to the corresponding de-quantizers 633A, 633B and 633C. Those de-quantizers 633A, 633B and 633C perform the reverse conversion to the quantizers 532A, 532B and 532C. The reverse conversions of the de-quantizers correspond to the expressions (10), (11) and (12), respectively.
XL1"[k]=XL1'[k]×QL1 (10)
XH1"[k]=XH1'[k]×QH1 (11)
XH0"[j]=XH0'[j]×QH0 (12)
The low frequency band signal XL1"[k] and the high frequency band signal XH1"[k] derived at the second stage division are applied into the up-samplers 631A and 631B. The high frequency band signal XH0"[j] at the first stage division is applied into a delaying unit 637 for delaying the signal by a time required for re-composing the low frequency band signal XL0"[j] at the first stage division.
The low frequency band signal XL1"[j] and the high frequency band signal XH1"[j], which are subjected by the up-samplers 631A and 631B to the same up-sampling process as the expressions (3) and (4), are applied to a synthesizing lowpass filter 611A and a synthesizing highpass filter 612A that have the relations of the expressions (5) and (6) with the analytic lowpass filter 501B and the analytic highpass filter 502B, respectively. The outputs of the filters 611A and 612A are added by an adder 636A. The sum is the signal XL0"[j] corresponding to the low frequency band signal XL0[j] obtained by the first-stage division in the encoder 500.
The first-stage low frequency band signal XL0"[j] and the first-stage high frequency band signal XH0"[j] delayed by the delaying unit 637 are up-sampled by the up-samplers 631C and 631D, respectively. The up-sampled signals are interpolated by the synthesizing lowpass filter 611B and the synthesizing highpass filter 612B and then are added into one signal by an adder 636B. The sum is a regenerated signal x"[i] corresponding to the input signal x[i].
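The two-stage division of FIG. 2 and the synthesis of FIG. 3 can be illustrated with the following sketch. Several points are assumptions: the Haar average/difference filters, the function names, and the quantizer form floor(x/Q), since the expressions (7) to (9) are not reproduced in this text. The delaying units 537 and 637 have no counterpart because whole lists are processed at once, and the input length is assumed to be a multiple of four.

```python
import math

def split(x):
    """One analysis stage (assumed Haar): filter and subsample in one step."""
    half = len(x) // 2
    low = [(x[2 * j] + x[2 * j + 1]) / 2 for j in range(half)]
    high = [(x[2 * j] - x[2 * j + 1]) / 2 for j in range(half)]
    return low, high

def merge(low, high):
    """One synthesis stage: up-sample, interpolate and add in one step."""
    out = []
    for l, h in zip(low, high):
        out.extend([l + h, l - h])
    return out

def quantize(band, step):
    # Assumed quantizer form; expressions (7)-(9) are not shown in the text.
    return [math.floor(v / step) for v in band]

def encode(x, q_l1, q_h1, q_h0):
    xl0, xh0 = split(x)        # first stage
    xl1, xh1 = split(xl0)      # second stage on the low band only
    return quantize(xl1, q_l1), quantize(xh1, q_h1), quantize(xh0, q_h0)

def decode(xl1_q, xh1_q, xh0_q, q_l1, q_h1, q_h0):
    xl1 = [v * q_l1 for v in xl1_q]    # expression (10)
    xh1 = [v * q_h1 for v in xh1_q]    # expression (11)
    xh0 = [v * q_h0 for v in xh0_q]    # expression (12)
    xl0 = merge(xl1, xh1)              # rebuild the first-stage low band
    return merge(xl0, xh0)

x = [8, 4, 6, 2, 10, 10, 0, 4]
print(decode(*encode(x, 1, 1, 1), 1, 1, 1))   # fine steps
print(decode(*encode(x, 1, 1, 4), 1, 1, 4))   # coarse step on XH0 only
```

With unit steps this particular input is restored exactly; with a coarse step on the first-stage high band the restored samples differ from the input, which is the quantizing error discussed in this section.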
Next, an apparatus for encoding a moving image through the wavelet conversion and an apparatus for decoding such a moving image will be described with reference to FIGS. 4 and 5.
In an encoder generally indicated at 700 in FIG. 4, a motion vector detector 711 operates to detect a motion vector v from the input image stored in a frame memory 712. The motion vector is normally detected by block matching in units of blocks of 16×16 pixels (vertical by horizontal). Alternatively, for higher precision, matching in half-pixel units may be employed.
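Block matching can be sketched as an exhaustive search that minimizes the sum of absolute differences (SAD). The SAD criterion, the full search, the sign convention of the vector, and all names below are assumptions; the source states only that block matching on 16×16 blocks is normally used. Half-pixel matching is not covered.

```python
def sad(cur, ref, bi, bj, vi, vj, size):
    """Sum of absolute differences between a block of cur and a displaced block of ref."""
    total = 0
    for di in range(size):
        for dj in range(size):
            total += abs(cur[bi + di][bj + dj] - ref[bi + di + vi][bj + dj + vj])
    return total

def find_motion_vector(cur, ref, bi, bj, size, search):
    """Return the displacement (vi, vj) within +/-search that best matches the block.

    The block has its top-left corner at (bi, bj) in cur; the vector points
    from that position to the matching position in ref (assumed convention).
    """
    height, width = len(ref), len(ref[0])
    best = None
    for vi in range(-search, search + 1):
        for vj in range(-search, search + 1):
            # Skip candidates that would read outside the reference image.
            if bi + vi < 0 or bi + vi + size > height:
                continue
            if bj + vj < 0 or bj + vj + size > width:
                continue
            cost = sad(cur, ref, bi, bj, vi, vj, size)
            if best is None or cost < best[0]:
                best = (cost, vi, vj)
    return best[1], best[2]
```

For the block size described above, `size` would be 16; the search range is a free parameter that the source does not specify.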
Numeral 703 denotes a motion compensator provided with a frame memory (not shown). The motion compensator 703 operates to predict a pixel value at each location of an image to be encoded based on the images which have already been encoded and decoded and then stored in the frame memory. The predicted value I'[i, j, t] of the pixel value I[i, j, t] at the location (i, j) on the image inputted at a time point t is determined by the following expression (13) using the motion vector v=(vx(i, j, t), vy(i, j, t)) at that location. ##EQU3## wherein T denotes the difference between the time point when the image I currently being predicted is inputted and the time point when the image on the frame memory was inputted. The terms on the right side of the expression (13), I[i', j', t-T], I[i'+1, j', t-T], I[i', j'+1, t-T], and I[i'+1, j'+1, t-T], represent pixel values on the frame memory (not shown). int(x) represents the maximum integer value that does not exceed x.
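Expression (13) itself is not reproduced in this text. The following sketch therefore assumes ordinary bilinear interpolation among the four frame-memory pixels named above, with int(x) realized as the floor function. The function name, the index order `ref[i][j]`, the clamping at the image border, and the omission of any scaling by T are all assumptions, so this should be read as one plausible form rather than as the expression of the source.

```python
import math

def predict_pixel(ref, i, j, vx, vy):
    """Predict the pixel at (i, j) from the reference image ref displaced by (vx, vy).

    Assumed form: bilinear interpolation, so fractional (e.g. half-pixel)
    vectors are supported. The caller must keep the displaced position
    at or above index 0 in both directions.
    """
    p, q = i + vx, j + vy
    i0, j0 = math.floor(p), math.floor(q)   # int(x): largest integer not exceeding x
    a, b = p - i0, q - j0                   # fractional parts used as weights
    i1 = min(i0 + 1, len(ref) - 1)          # clamp at the far border
    j1 = min(j0 + 1, len(ref[0]) - 1)
    return ((1 - a) * (1 - b) * ref[i0][j0] + a * (1 - b) * ref[i1][j0]
            + (1 - a) * b * ref[i0][j1] + a * b * ref[i1][j1])

ref = [[0, 10], [20, 30]]
print(predict_pixel(ref, 0, 0, 0.5, 0.5))   # midpoint of the four pixels
```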
Numeral 790 denotes a subtracter, which operates to calculate a difference between a value of a pixel to be encoded and a predicted value calculated by the motion compensator 703. A wavelet converter 714 performs a wavelet conversion with respect to the difference calculated by the subtracter 790. A quantizer 715 performs a quantizing process as indicated in the following expression (14) using a proper step size Q with respect to a wavelet coefficient c obtained by the wavelet converter 714.
c'=int(c/Q) (14)
The wavelet coefficient quantized by the quantizer 715 is supplied to a variable-length encoder 716 and a de-quantizer 725. The de-quantizer 725 performs a de-quantizing process as indicated in the following expression (15) using the same step size as that used in the quantizer 715.
c"=c'×Q (15)
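The quantizing and de-quantizing processes of expressions (14) and (15) can be written directly, as in the sketch below (function names are illustrative). One detail is worth noting: int(x) is defined in this description as the maximum integer not exceeding x, which is the floor function, whereas Python's built-in `int()` truncates toward zero, so the two differ for negative coefficients.

```python
import math

def quantize(c, q):
    """Expression (14): c' = int(c / Q), with int() meaning floor."""
    return math.floor(c / q)

def dequantize(c_q, q):
    """Expression (15): c'' = c' x Q."""
    return c_q * q

for c in (7.5, -7.5):
    c_q = quantize(c, 2)
    print(c, c_q, dequantize(c_q, 2), int(c / 2))
```

With floor quantization the restored value never exceeds the original, and the error c − c" lies in the range from 0 (inclusive) to Q (exclusive).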
The de-quantized data is applied to a wavelet inverter 724, where the data is subjected to the wavelet inversion for restoring the difference between the pixel values. The difference value is added by an adder 791 to the predicted value outputted from the motion compensator 703 for composing the data of the pixel value. Then, the pixel value data is sent to the motion compensator 703, in which the data is stored in the frame memory (not shown). The variable-length encoder 716 performs a variable-length encoding operation with respect to the wavelet coefficient quantized by the quantizer 715 and the motion vector v detected by the motion vector detector 711. The encoded data is outputted as a bit stream. Then, the bit stream is transmitted to the decoder 800 shown in FIG. 5 through a storage medium 726 and a transmission path 727.
On the other hand, FIG. 5 shows a decoder generally indicated at 800. The decoder 800 receives the bit stream generated by the encoder 700. First, the bit stream is applied to a variable-length decoder 826, which performs the reverse of the process of the variable-length encoder 716 included in the encoder 700. This inversion restores the quantized wavelet coefficient and the motion vector v from the bit stream. The wavelet coefficient is applied to a de-quantizer 825 and the motion vector v is applied to a motion compensator 803. The de-quantizer 825 and a wavelet inverter 824 are the same as those included in the encoder 700. The de-quantizer 825 and the wavelet inverter 824 perform the de-quantizing process and the wavelet inversion indicated in the expression (3) with respect to the inputted data for restoring each difference between the pixel values.
The difference is added by an adder 891 to the predicted value generated by the motion compensator 803 for composing pixel value data, which leads to re-composition of the image corresponding to the image inputted to the encoder 700. Then, the re-composed image is outputted as a restored image. Each pixel value of the restored image is stored in a frame memory (not shown) provided in the motion compensator 803 for use in generating the predicted image.
The motion compensator 803 is the same as that included in the encoder 700. The motion compensator 803 operates to predict each pixel value of an image to be currently decoded by using the motion vector v obtained by the variable-length decoder 826 and the image stored in the frame memory (not shown) provided in the motion compensator 803. Then, each predicted pixel value is supplied to the adder 891.
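The reason the encoder 700 contains its own de-quantizer and wavelet inverter is that both sides must predict from the same restored image. The sketch below illustrates this with heavy simplifications that are assumptions, not features of the source: frames are short one-dimensional lists, the motion vector is always zero, the wavelet conversion is omitted so the difference is quantized directly, and all names are illustrative.

```python
import math

def encode_sequence(frames, q):
    """Return the quantized differences and the encoder's final reference frame."""
    ref = [0] * len(frames[0])           # frame memory, initially empty
    stream = []
    for frame in frames:
        diff_q = [math.floor((p - r) / q) for p, r in zip(frame, ref)]
        stream.append(diff_q)
        # Local decoding: update the reference exactly as the decoder will.
        ref = [r + d * q for r, d in zip(ref, diff_q)]
    return stream, ref

def decode_sequence(stream, q):
    """Rebuild every frame from the quantized differences."""
    ref = [0] * len(stream[0])
    restored = []
    for diff_q in stream:
        ref = [r + d * q for r, d in zip(ref, diff_q)]
        restored.append(ref)
    return restored

frames = [[10, 20, 30], [12, 19, 33], [15, 18, 35]]
stream, encoder_ref = encode_sequence(frames, 4)
restored = decode_sequence(stream, 4)
print(restored[-1] == encoder_ref)   # both frame memories hold the same image
```

Because the encoder predicts from the de-quantized data rather than from the original input, the quantizing error of one frame does not accumulate in later frames.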
The foregoing description concerns the arrangement for the inter-coding operation. If a large difference arises between the value of a pixel to be currently encoded and the predicted value given by the motion compensator 703, the following intra-coding operation may be executed to prevent an increase in the quantity of encoded bits. That is, the average value (average offset value) of the luminance values in one block is subtracted from each pixel in the block, the result is sent to the wavelet converter as a difference image, and the average offset value is sent to the variable-length encoder 716 in place of the motion vector.
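The intra-coding step amounts to removing the block average before the wavelet conversion, as in this minimal sketch. The function names are illustrative, and the criterion for choosing intra-coding over inter-coding is not shown because the source does not specify it.

```python
def intra_block(block):
    """Return (average offset value, difference block) for one block of luminance values."""
    count = sum(len(row) for row in block)
    offset = sum(sum(row) for row in block) / count
    diff = [[p - offset for p in row] for row in block]
    return offset, diff

def restore_intra_block(offset, diff):
    """Decoder side: add the average offset value back to every pixel."""
    return [[d + offset for d in row] for row in diff]

offset, diff = intra_block([[100, 102], [104, 106]])
print(offset, diff)
```

The difference block has zero mean, so the offset carries the dc level of the block while the wavelet coefficients carry only the variation around it.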
The wavelet encoding system generally uses the syntax shown in FIGS. 6A and 6B for generating an encoded bit train (see the system diagram of the multiplexing process). FIGS. 6A and 6B illustrate part of the process executed by the variable-length encoder 716. FIG. 6A shows the process for generating the encoded bit train, while FIG. 6B shows the process for outputting the motion vector or the average offset value. Whereas the discrete cosine transform (DCT) used in MPEG1 or MPEG2 calculates transform coefficients block by block, an encoding system based on the wavelet conversion generally calculates quantized wavelet coefficients over the whole image rather than for each block. Hence, as shown in FIG. 6A, for the image to be encoded, all the encoded vectors (for the inter-coding) or the average offset values (for the intra-coding) are outputted first, and then all the encoded wavelet coefficients are outputted. According to the syntax shown in FIG. 6B, for each unit block at which the motion vector is detected and the motion is compensated, a flag indicating the number of vectors is outputted first. Since an intra-block has no vector, the average offset value corresponding to the dc component of the wavelet coefficient is outputted as its motion information. If an inter-block has one or more vectors (two or more vectors are allowed in one block in some cases), the encoded vector derived by encoding the motion vector detected for the area is outputted as motion information.
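The output order described for FIGS. 6A and 6B can be sketched as a list of symbols. This shows only the ordering: the symbol names, the dictionary layout of a block, and the function name are hypothetical, and no actual variable-length code is generated.

```python
def multiplex(blocks, coefficients):
    """Order the symbols of one image: motion information first, then coefficients.

    Each block is a dict holding either "vectors" (inter-block) or
    "offset" (intra-block); this layout is assumed for illustration.
    """
    symbols = []
    for block in blocks:
        vectors = block.get("vectors", [])
        symbols.append(("num_vectors", len(vectors)))   # flag for the vector count
        if not vectors:
            symbols.append(("offset", block["offset"])) # intra-block: average offset
        else:
            for v in vectors:                           # inter-block: one or more vectors
                symbols.append(("vector", v))
    # All wavelet coefficients of the whole image follow the motion information.
    symbols.extend(("coef", c) for c in coefficients)
    return symbols

print(multiplex([{"vectors": [(1, -2)]}, {"offset": 103}], [5, 0, -1]))
```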
In the encoding method based on band division by the wavelet filter or the subband filter, the quantization after the wavelet conversion or the subband conversion introduces quantizing noise mainly in the high frequency components, so that the image formed by the wavelet inversion or the subband inversion after the de-quantization may disadvantageously exhibit a ringing phenomenon. In particular, when performing an intra-image prediction, although the many still areas included in the difference image carry no meaningful values, the subsampling associated with two or more divisions makes the effective number of filter taps grow relatively as the band division is repeated. Hence, the meaningful values around the moving area are diffused into the meaningless values of the still area, so that the ringing disadvantageously appears over a wide area of the resulting image.
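The growth of the effective tap count with repeated division can be made concrete by synthesizing a single nonzero highpass coefficient placed at different depths and counting how many output samples it touches. The sketch assumes the two-tap Haar synthesis, which is the shortest possible filter; the names are illustrative, and longer filters would spread the coefficient more widely.

```python
def merge(low, high):
    """One Haar synthesis stage (assumed filter): up-sample, interpolate, add."""
    out = []
    for l, h in zip(low, high):
        out.extend([l + h, l - h])
    return out

def spread(levels):
    """Count output samples affected by one highpass coefficient `levels` stages deep."""
    signal = merge([0.0], [1.0])                     # deepest stage: a lone coefficient
    for _ in range(levels - 1):
        # Remaining stages: the partial result is the low band, the high band is zero.
        signal = merge(signal, [0.0] * len(signal))
    return sum(1 for s in signal if s != 0)

for levels in (1, 2, 3):
    print(levels, spread(levels))
```

Under this assumption the affected span doubles with every additional division stage, so an error in a deeply divided band is spread over a correspondingly wider area of the restored image.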