(1) Field of the Invention
The present invention relates to a moving image coding apparatus and method which are applied to moving images in a videophone, a tele-conference, and the like.
(2) Description of the Related Art
The videophone is subject to transmission constraints such as a low bit rate and an error environment (environment under which data transmissions are highly susceptible to errors). Thus, for improving the image quality, moving image coding typically employed by the videophone involves an appropriate selection of intra-frame coding which utilizes the correlation of adjacent pixel levels within an image for coding, and inter-frame prediction coding which utilizes pixel correlation of a past frame to a current frame for coding, such that suitable coding is applied to each area.
The intra-frame coding is characterized by a high error immunity but a large amount of codes generated therein. On the other hand, the inter-frame prediction coding is characterized by a low error immunity but a reduced amount of codes generated therein. As such, there is a trade-off relationship between a degraded image quality due to transmission errors and a coding efficiency. To cope with this situation, moving images are coded for the videophone on the assumption that:
(1) the intra-frame coding is basically not performed except for the first frame;
(2) the intra-frame coding is performed on a periodic basis for eliminating accumulated errors; and
(3) the intra-frame coding is forcibly performed if a certain quantitative condition is satisfied.
FIG. 1 generally illustrates the configuration of a conventional moving image coding apparatus which is employed in the videophone. The illustrated moving image coding apparatus comprises moving image input unit 1; blocking unit 2; discrete cosine transform unit 3; quantization unit 4; code conversion unit 5; de-quantization unit 6; inverse discrete cosine transform unit 7; frame memory 8; predicted image generator unit 9; motion detection unit 10; mode selection unit 11; subtractor 12; two-input, one-output switch 13; adder 14; and refresh map creation unit 800.
Moving image input unit 1 comprises a well known imager such as a CCD camera for capturing a desired image through the imager. Blocking unit 2 divides image data applied from moving image input unit 1 into blocks of m×n pixels (m, n are natural numbers) which are units for coding, and delivers image data (block data) for each block. The output of blocking unit 2 is supplied to one input of subtractor 12, one input of switch 13, and motion detection unit 10, respectively.
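The division performed by blocking unit 2 can be sketched as follows. This is only an illustrative sketch: the function name `split_into_blocks` is hypothetical, m is assumed to be the block width and n the block height, and the frame dimensions are assumed to be exact multiples of the block size (no padding is attempted).

```python
def split_into_blocks(frame, m, n):
    """Split a frame (list of pixel rows) into blocks of m x n pixels.

    Assumption: m is the block width and n the block height, and the
    frame dimensions are exact multiples of the block size.
    Blocks are returned in raster order (left to right, top to bottom).
    """
    height, width = len(frame), len(frame[0])
    if height % n or width % m:
        raise ValueError("frame size must be a multiple of the block size")
    blocks = []
    for top in range(0, height, n):
        for left in range(0, width, m):
            # Cut n rows of m pixels each, starting at (top, left).
            blocks.append([row[left:left + m] for row in frame[top:top + n]])
    return blocks
```

For example, a 4×4 frame split with m = n = 2 yields four blocks, each of which would then be supplied to the subtractor, the switch, and the motion detection unit.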
Subtractor 12 is also supplied with an output of predicted image generator unit 9 at the other input to subtract the output of predicted image generator unit 9 from the output of blocking unit 2. The output of subtractor 12 is supplied to the other input of switch 13. Switch 13 delivers one of the inputs in response to a control signal from mode selection unit 11. The output of switch 13 is supplied to discrete cosine transform unit 3.
Discrete cosine transform unit 3 applies a known discrete cosine transform (DCT) to the output data of switch 13. The output of discrete cosine transform unit 3 is supplied to quantization unit 4. Quantization unit 4 quantizes the DCT coefficients which are the output of discrete cosine transform unit 3. The output of quantization unit 4 is supplied to code conversion unit 5 and de-quantization unit 6.
Code conversion unit 5 applies known variable length coding to the output data of quantization unit 4. Data delivered from code conversion unit 5, which is the output (coded data) of the moving image coding apparatus, is transmitted to a moving image decoding apparatus which is provided in the destination of the image data. De-quantization unit 6 de-quantizes the output data of quantization unit 4. The output of de-quantization unit 6 is supplied to inverse discrete cosine transform unit 7.
Inverse discrete cosine transform unit 7 applies known inverse discrete cosine transform (IDCT) to the output data of de-quantization unit 6. The output of inverse discrete cosine transform unit 7 is supplied to one input of adder 14. Adder 14 is supplied with the output of predicted image generator unit 9 at the other input to deliver the sum of the output of predicted image generator unit 9 and the output of inverse discrete cosine transform unit 7. The output of adder 14 is supplied to frame memory 8.
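The path through discrete cosine transform unit 3, quantization unit 4, de-quantization unit 6, and inverse discrete cosine transform unit 7 can be illustrated with the following sketch. It rests on assumptions not stated in the description: an orthonormal DCT-II, a uniform quantizer with a single step size `step`, and hypothetical function names. Variable length coding by code conversion unit 5 is omitted.

```python
import math

def dct_1d(v):
    """Orthonormal 1-D DCT-II of a list of samples."""
    n = len(v)
    out = []
    for k in range(n):
        s = sum(v[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for i in range(n))
        out.append((math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)) * s)
    return out

def idct_1d(c):
    """Inverse of dct_1d (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        s = 0.0
        for k in range(n):
            scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
            s += scale * c[k] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
        out.append(s)
    return out

def transform_2d(block, fn):
    """Apply a 1-D transform to every row, then to every column."""
    rows = [fn(r) for r in block]
    cols = [fn(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

def quantize(coeffs, step):
    """Uniform quantizer (assumed): coefficient -> integer level."""
    return [[round(c / step) for c in row] for row in coeffs]

def dequantize(levels, step):
    """Inverse of the assumed uniform quantizer."""
    return [[level * step for level in row] for row in levels]

def encode_and_reconstruct(block, step):
    """Return (quantized levels, locally decoded block)."""
    levels = quantize(transform_2d(block, dct_1d), step)
    recon = transform_2d(dequantize(levels, step), idct_1d)
    return levels, recon
```

The quantized levels correspond to what is passed on for code conversion, while the reconstructed block corresponds to the output of inverse discrete cosine transform unit 7. In the inter-frame case, the predicted image would still have to be added by adder 14 before storage in the frame memory.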
Frame memory 8, which sequentially stores output data from adder 14, can store one frame of image data. Image data stored in frame memory 8 is supplied to predicted image generator unit 9 and motion detection unit 10. Motion detection unit 10 detects motions from one frame to the next from the block data supplied from blocking unit 2 and the image data supplied from frame memory 8, and supplies the detected result (motion vectors) to predicted image generator unit 9. Motion detection unit 10 further calculates error power between corresponding blocks from the block data from blocking unit 2 and the image data from frame memory 8, and supplies the calculation result to refresh map creation unit 800.
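One way in which motion detection unit 10 might obtain a motion vector and the error power for a block is sketched below. The description does not specify the search method or the exact definition of the error power; this sketch assumes a full search over a small window and defines the error power as the mean squared difference. The names `error_power`, `detect_motion`, and `search_range` are hypothetical.

```python
def error_power(block, ref_frame, top, left):
    """Mean squared difference (assumed definition of error power)
    between a block and the same-sized region of the reference frame
    whose top-left corner is at (top, left)."""
    total = 0
    count = 0
    for dy, row in enumerate(block):
        for dx, pixel in enumerate(row):
            diff = pixel - ref_frame[top + dy][left + dx]
            total += diff * diff
            count += 1
    return total / count

def detect_motion(block, ref_frame, top, left, search_range=2):
    """Full-search block matching around (top, left).

    Returns ((dy, dx), power): the displacement into the reference
    frame with the smallest error power, and that error power.
    """
    n, m = len(block), len(block[0])
    height, width = len(ref_frame), len(ref_frame[0])
    best_vector, best_power = None, None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = top + dy, left + dx
            # Skip candidates that fall outside the reference frame.
            if y < 0 or x < 0 or y + n > height or x + m > width:
                continue
            power = error_power(block, ref_frame, y, x)
            if best_power is None or power < best_power:
                best_vector, best_power = (dy, dx), power
    return best_vector, best_power
```

The returned vector would be supplied to the predicted image generator unit, and the returned error power to the refresh map creation unit.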
Predicted image generator unit 9 generates a predicted image from the frame data supplied from frame memory 8 and the motion vectors supplied from motion detection unit 10. Refresh map creation unit 800 creates a refresh map, which indicates whether or not a forced refresh (forcibly performed intra-frame coding) should be applied to data in each of the blocks produced by blocking unit 2, based on the error power supplied from motion detection unit 10. A forced refresh flag is set only for a block whose error power is higher than a predefined threshold. The refresh map created by refresh map creation unit 800 is supplied to mode selection unit 11.
Mode selection unit 11 controls a switching operation of switch 13 in accordance with the refresh map supplied from refresh map creation unit 800. Specifically, mode selection unit 11 causes switch 13 to select the output of blocking unit 2 for a block for which the forced refresh flag is set, and causes switch 13 to select the output of subtractor 12 for a block for which the forced refresh flag is not set. When the output of blocking unit 2 is selected, the intra-frame coding is performed, whereas when the output of subtractor 12 is selected, the inter-frame prediction coding is performed.
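The creation of the refresh map and the resulting control of switch 13 can be sketched as follows. The function names and the representation of the refresh map as a list of Boolean flags (one per block) are assumptions made for illustration only.

```python
def create_refresh_map(error_powers, threshold):
    """Set the forced refresh flag only for blocks whose error power
    is higher than the predefined threshold."""
    return [power > threshold for power in error_powers]

def select_switch_output(block, predicted, forced_refresh):
    """Model of switch 13 under control of the mode selection unit.

    Flag set     -> output of the blocking unit (intra-frame coding).
    Flag not set -> output of the subtractor, i.e. block minus
                    predicted image (inter-frame prediction coding).
    """
    if forced_refresh:
        return "intra", [row[:] for row in block]
    residual = [[b - p for b, p in zip(block_row, pred_row)]
                for block_row, pred_row in zip(block, predicted)]
    return "inter", residual
```

With a threshold of 100, for instance, error powers of 20, 250, and 100 would set the flag only for the second block, since the flag requires an error power strictly higher than the threshold.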
Next, the operation of the moving image coding apparatus described above will be described specifically. The following description of the operation is made on the assumption that image data is applied from moving image input unit 1 in time series in the order of frame A, frame B, frame C, . . . Assume also that the refresh map is initialized (the forced refresh flag is not set for any block) at the start of coding (at the time image data of frame A is applied).
As image data of first frame A is applied, blocking unit 2 divides the input image data into a plurality of blocks, and sequentially delivers data of the respective blocks. Since the refresh map is initialized at the time the image data of frame A is applied, switch 13 is controlled by mode selection unit 11 to select the output of blocking unit 2 as it is for all block data. It should be noted that although the output of blocking unit 2 is also supplied to motion detection unit 10, motion detection unit 10 does not detect motion vectors or calculate the error power because frame memory 8 does not store any image data of corresponding past frames.
Each block data delivered from switch 13 is discrete-cosine-transformed by discrete cosine transform unit 3, quantized by quantization unit 4, and then supplied to code conversion unit 5 and de-quantization unit 6, respectively. Code conversion unit 5 performs a code conversion for the quantized data of each block supplied from quantization unit 4. De-quantization unit 6 in turn de-quantizes the quantized data of each block supplied from quantization unit 4. The de-quantized data is inversely discrete-cosine-transformed by inverse discrete cosine transform unit 7 to thereby restore an original image. Then, this restored image is stored in frame memory 8 as a reference frame for use in the coding of image data of the next frame B.
Next, as image data of frame B is applied, blocking unit 2 divides the input image data into a plurality of blocks, and sequentially delivers data of the respective blocks. Subsequently, motion detection unit 10 detects motion vectors and calculates the error power for each block from each block data of the current frame B delivered from blocking unit 2 and each block data of the past frame A stored in frame memory 8. Then, predicted image generator unit 9 generates a predicted image associated with each block from each block data of frame A supplied from frame memory 8 and the motion vectors of each block supplied from motion detection unit 10, while refresh map creation unit 800 creates a refresh map related to each block data of frame B divided by blocking unit 2 based on the error power supplied from motion detection unit 10.
As refresh map creation unit 800 creates the refresh map, mode selection unit 11 controls switch 13 to select one of the inputs in accordance with the created refresh map. Switch 13 selects the output of blocking unit 2 for a block for which the forced refresh flag is set, and selects the output of subtractor 12 (which is generated by subtracting the predicted image generated by predicted image generator unit 9 from the output of blocking unit 2) for a block for which the forced refresh flag is not set.
Each block data delivered from switch 13 is discrete-cosine-transformed by discrete cosine transform unit 3, quantized by quantization unit 4, and then supplied to code conversion unit 5 and de-quantization unit 6, respectively. Code conversion unit 5 performs a code conversion for the quantized data of each block supplied from quantization unit 4. De-quantization unit 6 in turn de-quantizes the quantized data of each block supplied from quantization unit 4. The de-quantized data is inversely discrete-cosine-transformed by inverse discrete cosine transform unit 7, and added to the predicted image generated by predicted image generator unit 9, thereby restoring an original image (frame B). Then, this restored image is stored in frame memory 8 as a reference frame for use in the coding of image data of the next frame C.
Likewise, for image data of frame C, a refresh map is created in a procedure similar to that for frame B as mentioned above, and switch 13 is caused to switch the inputs for each block in accordance with the created refresh map.
Other than the moving image coding apparatus described above, there is a moving image coding apparatus as described in JP-A-2000-201354. This moving image coding apparatus creates a refresh map which gives a priority for refreshing to each block of input image data. Intra-frame coding and inter-frame prediction coding are switched in accordance with the refresh map. Specifically, the moving image coding apparatus involves refreshing a block with a higher priority at a shorter period (at which the intra-frame coding is performed), and refreshing a block with a lower priority at a longer period.
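The priority-dependent refresh period can be illustrated with the following sketch. The priority labels, the concrete periods, and the function name `blocks_to_refresh` are hypothetical; the cited publication is described here only as refreshing higher-priority blocks at a shorter period than lower-priority blocks.

```python
def blocks_to_refresh(frame_index, priorities, period_by_priority):
    """Return the indices of the blocks to be intra-coded in this frame.

    priorities         -- one priority label per block (hypothetical labels)
    period_by_priority -- refresh period, in frames, for each label;
                          a higher priority is given a shorter period
    """
    return [index for index, priority in enumerate(priorities)
            if frame_index % period_by_priority[priority] == 0]
```

For example, with periods of 2 frames for high-priority blocks and 6 frames for low-priority blocks, a high-priority block is refreshed three times as often as a low-priority block.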
The priority for refreshing is determined by calculating a block feature amount (importance in improvement on image quality), which represents the proportion in which each block of input image data includes visually important information such as contours, or the degree at which a degraded image quality is subjectively perceivable (subjective evaluation importance level), and comparing the block feature amount with a preset threshold. The block feature amount may be represented by an amount indicative of the power of edge components produced by image processing which uses, for example, a high pass filter and another edge extraction filter.
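A block feature amount of this kind might be computed as in the sketch below. The choice of a 3×3 Laplacian kernel as the high pass filter, the definition of the power as the sum of squared filter responses over the interior pixels of the block, and the two priority labels are all assumptions for illustration; the cited publication may use different filters and definitions.

```python
# 3x3 Laplacian kernel, used here as an assumed high pass filter.
LAPLACIAN = [[0, -1, 0],
             [-1, 4, -1],
             [0, -1, 0]]

def edge_power(block):
    """Sum of squared high-pass responses over the interior pixels
    of the block (border pixels are skipped for simplicity)."""
    n, m = len(block), len(block[0])
    total = 0
    for y in range(1, n - 1):
        for x in range(1, m - 1):
            response = sum(LAPLACIAN[j][i] * block[y + j - 1][x + i - 1]
                           for j in range(3) for i in range(3))
            total += response * response
    return total

def refresh_priority(block, threshold):
    """Compare the block feature amount with a preset threshold."""
    return "high" if edge_power(block) > threshold else "low"
```

A flat block yields an edge power of zero, whereas a block containing a sharp contour yields a large value. Even this simple filter requires several multiplications and additions per pixel.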
However, the conventional moving image coding apparatuses described above have the following problems, respectively.
In a part having a high subjective evaluation importance level, for example, in a part which forms expressions such as eyes, nose, mouth and the like, a degraded image quality is subjectively more perceivable. To provide an image having a high subjective evaluation, it is necessary to refresh a part having a high subjective evaluation importance level at a shorter period to prevent the characteristics of coding from degrading. In the moving image coding apparatus illustrated in FIG. 1, the determination as to whether or not refreshing is performed is collectively made for all blocks, so that this coding apparatus fails to provide an image quality having a high subjective evaluation, though a quantitative improvement can be expected on the image quality.
The moving image coding apparatus described in JP-A-2000-201354 must detect edge components with the aid of a high pass filter and another edge extraction filter to calculate the power. This moving image coding apparatus is disadvantageous in that the detection of edge components involves complicated image processing, and in that the filters used for detecting edges increase the cost.