The present invention relates to an image processing apparatus and method preferably applicable to a video communication apparatus in a video conference system or the like using video and audio data.
As image compression methods used in video communication apparatuses, high-performance coding methods based on the DCT (Discrete Cosine Transform), as in ITU-T Recommendations H.261, H.263 and the like, are widely used. However, when these methods are applied to a narrow-bandwidth communication environment such as the Internet, the coding amount must be greatly reduced by a high compression rate, so image quality is degraded even with these high-performance compression coding methods.
Accordingly, in a video conference system or the like, a method for satisfying subjective image quality has been developed. That is, in an obtained image of a person, a large coding amount is allotted to a face area which is the most important part of the image, and the coding amounts in the other areas are greatly reduced, so as to reduce the total coding amount. For example, Japanese Published Unexamined Patent Application No. Hei 7-203436 proposes a DCT-based image compression device which improves subjective image quality while suppressing the entire coding amount by recognizing a face area, selecting a plurality of quantization tables based on the result of recognition, and allotting a large amount of code data to the most important face area.
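The region-dependent allotment of coding amount described above can be illustrated with a minimal sketch. It is not the implementation of the cited application: a single scalar quantization step stands in for each full quantization table, the face area is assumed to be an already recognized rectangle, and the names `select_quant_steps`, `quantize`, `fine_step` and `coarse_step` are hypothetical.

```python
BLOCK = 8  # DCT block size in pixels (8x8 is the usual choice in DCT-based codecs)


def block_overlaps_face(bx, by, face_rect):
    """Return True if block (bx, by) intersects the face rectangle.

    face_rect is (left, top, right, bottom) in pixels, right/bottom exclusive.
    """
    left, top, right, bottom = face_rect
    x0, y0 = bx * BLOCK, by * BLOCK
    return x0 < right and x0 + BLOCK > left and y0 < bottom and y0 + BLOCK > top


def select_quant_steps(width, height, face_rect, fine_step=4, coarse_step=32):
    """Choose a quantization step for every block of the image.

    Blocks touching the face area get the fine step (more code data),
    all other blocks get the coarse step (less code data).
    """
    cols = (width + BLOCK - 1) // BLOCK
    rows = (height + BLOCK - 1) // BLOCK
    return [
        [fine_step if block_overlaps_face(bx, by, face_rect) else coarse_step
         for bx in range(cols)]
        for by in range(rows)
    ]


def quantize(coeffs, step):
    """Quantize a block of DCT coefficients with a uniform step (simplified)."""
    return [[int(round(c / step)) for c in row] for row in coeffs]
```

With a coarse step, most small coefficients quantize to zero, which is what reduces the coding amount outside the face area and also what produces the distortion discussed next.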
However, when the coding amount is controlled for each image area using the conventional DCT-based coding method, remarkable block distortion and/or mosquito noise occurs in an area determined as a part other than the important part. Accordingly, the subjective image quality is seriously degraded, and the decoded image looks unnatural. Further, a pseudo outline occurs at the border between an area determined as an important part and an area determined as a part other than the important part, which makes the obtained image look even more unnatural.
To solve the above problems, low-pass filtering processing can be performed on an area determined as a non-face area (unimportant part). That is, prefiltering processing is performed to attenuate the high-frequency components in the unimportant part in advance, which suppresses the coding amount in compression processing and reduces the mosquito noise caused by quantization of the high-frequency components.
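As a rough illustration of such prefiltering, the sketch below applies a 3x3 averaging filter only to pixels outside the face area. The choice of a box filter, the per-pixel boolean `face_mask`, and the function name `prefilter_non_face` are assumptions made for this example; the text does not specify the filter actually used.

```python
def prefilter_non_face(image, face_mask):
    """Low-pass filter the non-face pixels of a grayscale image.

    image     : list of rows of numeric pixel values
    face_mask : list of rows of booleans, True where the pixel is in the face area
    Face pixels are copied unchanged; every other pixel is replaced by the
    mean of its 3x3 neighbourhood (clipped at the image border).
    """
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(height):
        for x in range(width):
            if face_mask[y][x]:
                continue  # keep the important area at full detail
            total, count = 0, 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < height and 0 <= xx < width:
                        total += image[yy][xx]
                        count += 1
            out[y][x] = total / count
    return out
```

Averaging removes fine texture before the DCT is taken, so fewer high-frequency coefficients survive in the unimportant part and less code data is spent on them.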
On the other hand, the receiving side performs decompression processing to convert the bitmapped code data into image data. Then, the image data obtained by the decompression processing is color-space filtered by postfiltering processing. In the postfiltering processing, removal processing is performed to remove the block distortion that is remarkable in a highly compressed image area determined as a non-face area, and adaptive filtering processing is performed to remove the pseudo outline that occurs at the border between the face and non-face areas.
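The block distortion removal step of the postfiltering can be sketched as follows. This is only an assumed, simplified form: it smooths the two pixels on either side of each vertical 8-pixel block boundary with fixed 3:1 weights, skips boundaries that touch the face area, and does not cover horizontal boundaries or the adaptive filtering of the pseudo outline. The name `deblock_vertical_edges` and the weights are hypothetical.

```python
def deblock_vertical_edges(image, face_mask, block=8):
    """Smooth vertical block boundaries in the non-face area.

    image     : list of rows of numeric pixel values (decoded image)
    face_mask : list of rows of booleans, True where the pixel is in the face area
    For each boundary between columns x-1 and x (x a multiple of `block`),
    the two neighbouring pixels are pulled toward each other.
    """
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(height):
        for x in range(block, width, block):
            if face_mask[y][x - 1] or face_mask[y][x]:
                continue  # leave the face area untouched
            a, b = image[y][x - 1], image[y][x]
            out[y][x - 1] = (3 * a + b) / 4
            out[y][x] = (a + 3 * b) / 4
    return out
```

For a row whose left block decodes to 100 and whose right block decodes to 200, the step at the boundary is reduced from 100 to 50, which makes the block edge less visible.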
As described above, when a quantization control function based on recognition of an important area is added to a DCT-based image coding method, various additional correction processing, such as adaptive filtering processing, is required to suppress subjective image degradation. Accordingly, if such processing is realized by software, the processing time increases. Further, if the processing is realized by hardware, the circuit scale increases.