1. Field of the Invention
The present invention relates to a moving image processing device for compressing video images.
2. Description of Related Art
In recent years, highly efficient video compression techniques, such as the MPEG-2 (ITU-T H.262) and H.264/MPEG-4 AVC standards, have made significant progress. These video compression techniques are used in the fields of video cameras and recorders. In these fields, various techniques have been developed to achieve better image quality at lower bit rates. In these techniques, a low compression ratio is set for a viewer's attention region in an input image to improve the image quality, while a higher compression ratio is set for the other region to reduce the bit rate. In particular, various ways to compress images while protecting a person's face as an attention region have been actively studied and proposed. In these techniques, a person's face is detected based on its characteristic color information and average luminance. These techniques, however, have a problem in that a region other than the face portion is misidentified as an attention region and protected if it has the same characteristics as those of the face portion.
On the other hand, face object detection techniques have been developed as new approaches. In these techniques, a person's face is regarded as an object, and the face is detected based on the components of the face such as eyes, a nose, and a mouth and the positional relationship between these components. These techniques have been used for auto-focusing and exposure control of optical systems in the fields of video cameras and still cameras. In these face object detection techniques, faces are detected with a high degree of accuracy. These techniques also are expected to be extended to face authentication techniques to identify a particular person. In view of this, proposals are beginning to be made to combine these face object detection techniques with the existing video compression techniques to obtain moving image processing devices for compressing persons' face portions with a high degree of accuracy while maintaining high image quality. A typical example of these devices is disclosed in JP 2005-109606 A. The basic configuration of this device as a conventional one is described below.
FIG. 13 shows a configuration of a conventional moving image processing device. Hereinafter, the operation of the conventional moving image processing device performed in the case where input image data includes a person's face is described. The conventional moving image processing device includes a video compressor 100, an image memory 101, and a face object detector 102.
Input image data to be compressed is given to the video compressor 100 and the image memory 101. Upon receiving the input image data from the image memory 101, the face object detector 102 starts a face object detection processing. Specifically, the face object detector 102 regards the portion of a person's face included in the input image data as an object, and detects the person's face based on the characteristic components of the face such as eyes, a nose, and a mouth and the positional relationship between these components. Then, it removes the background to extract and identify the face region, and outputs face detection information. FIG. 14 is a diagram showing conventional face detection information, and indicates that a face portion in input image data is detected properly.
A quantizer 103 receives the face detection information from the face object detector 102 at the point of time when a DCT (discrete cosine transformer) 108 starts processing one frame of data. In the quantization processing following the DCT, the face detection information is used to reduce the compression ratio of the face portion to a lower level than that of the other portion. JP 2005-109606 A, however, proposes neither a specific method nor a detailed technique therefor.
Generally, the quantization processing to be performed by the quantizer 103 proceeds on a macroblock-by-macroblock basis. To start the quantization of one macroblock, the quantizer 103 judges whether or not the rectangular macroblock consisting of 16×16 pixels includes the region indicated by the face detection information. If the macroblock includes the face region, the macroblock is identified as a face protection macroblock. If the macroblock does not include the face region, the macroblock is identified as the other macroblock. When a current macroblock to be quantized is a face protection macroblock, the quantizer 103 subtracts a predetermined value for quantization index adjustment from a reference quantization index Q0. For example, if the predetermined value for quantization index adjustment is “3”, the value of Q0−3 is adopted as a quantization index Q. On the other hand, when a current macroblock to be quantized is not a face protection macroblock, the value of Q0+3 obtained by adding the predetermined value for quantization index adjustment to the reference quantization index Q0 is adopted as a quantization index Q.
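By way of illustration only, the macroblock judgment and the quantization index adjustment described above can be sketched in Python as follows. The sketch assumes that the face detection information is a single rectangle given in pixel coordinates and that a macroblock "includes" the face region when it overlaps that rectangle; neither assumption is stated in JP 2005-109606 A. The names overlaps_face, quantization_index, and face_rect are hypothetical.

```python
MB_SIZE = 16  # a macroblock consists of 16x16 pixels
ADJUST = 3    # predetermined value for quantization index adjustment (example value from the text)


def overlaps_face(mb_x, mb_y, face_rect):
    """Return True if macroblock (mb_x, mb_y) overlaps the face rectangle.

    face_rect is (left, top, right, bottom) in pixels, with right and bottom
    exclusive. Representing the face region as a rectangle is an assumption.
    """
    left, top, right, bottom = face_rect
    x0, y0 = mb_x * MB_SIZE, mb_y * MB_SIZE
    return (x0 < right and x0 + MB_SIZE > left and
            y0 < bottom and y0 + MB_SIZE > top)


def quantization_index(mb_x, mb_y, face_rect, q0, adjust=ADJUST):
    """Return the quantization index Q for one macroblock.

    Face protection macroblocks use Q0 - adjust (lower compression ratio);
    all other macroblocks use Q0 + adjust (higher compression ratio).
    """
    if overlaps_face(mb_x, mb_y, face_rect):
        return q0 - adjust
    return q0 + adjust


if __name__ == "__main__":
    face = (20, 20, 40, 40)  # hypothetical face rectangle
    q0 = 10                  # hypothetical reference quantization index
    for mb in [(0, 0), (1, 1), (3, 3)]:
        print(mb, quantization_index(mb[0], mb[1], face, q0))
```

With these example values, the macroblock at (1, 1) overlaps the face rectangle and is quantized with Q0−3, whereas the macroblocks at (0, 0) and (3, 3) do not overlap it and are quantized with Q0+3.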
FIG. 15 is a diagram showing a conventional face protection region. When the compression processing of one frame of data is completed, the region indicated by the face detection information is filled with face protection macroblocks. As a result, a face protection region as shown in FIG. 15 is formed. Since the quantization index Q (=Q0−3), which is smaller by “6” than that (=Q0+3) of the adjacent region, is used in this face protection region, the compression ratio of the region is lower accordingly. As a result, the compression distortion is reduced, and the obtained compressed image of the person's face as an attention region is high in quality.
Next, the correlation among processings performed on a frame-by-frame basis is described.
FIG. 16 shows a timing chart of conventional frame-based processings. As shown in FIG. 16, the face object detector 102 starts a face object detection processing at the time indicated by an arrow A when the first frame of input image data is completely stored in the image memory 101. The face object detection processing intended for the first frame of input image data takes a longer processing time than the time required to store the second frame of input image data in the image memory 101, and ends during the storage of the third frame of input image data in the image memory 101. The quantizer 103 receives the face detection information of the first frame at the time indicated by an arrow B in FIG. 16 when the compression processing of the fourth frame of input image data starts, and uses this face detection information for the quantization of frequency domain data prepared in the DCT section 108. That is, since the face object detection processing takes a long time, it cannot be performed for every frame but only intermittently. Nevertheless, the compression processing using the face detection information can be achieved, as shown in FIG. 16.
Finally, bit rate control performed in the case where a face protection region has a large area is described. In that case, a larger region is quantized by using the quantization index Q0−3, which is smaller than the reference quantization index Q0, and a smaller region is quantized by using the quantization index Q0+3. As a result, the amount of compressed data increases, the bit rate increases, and the amount of data that is stored temporarily in a buffer memory 111 also increases. When the bit rate of the compressed data is excessively high or the amount of data temporarily stored in the buffer memory 111 increases excessively, a bit rate controller 115 recalculates the reference quantization index Q0 and updates it to an appropriate value. Specifically, the bit rate controller 115 changes the reference quantization index Q0 to a larger value. Thereby, an appropriately high compression ratio can be achieved for the entire frame while maintaining the feature of the lower compression ratio of the face protection region than that of the other region, and consequently, an appropriate bit rate can be achieved. MPEG-2 and H.264/AVC also incorporate, as a standardized technique, a technique for adjusting the compression ratio with a reference quantization index Q0 to obtain an appropriate stream of data that does not cause a failure of a buffer of a decoder in a reproduction apparatus.
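The bit rate control described above can be sketched, purely as an illustration, as follows. The text states only that the reference quantization index Q0 is changed to a larger value when the bit rate or the buffer occupancy becomes excessive; the occupancy threshold high_mark, the increment step, the upper bound q_max, and the function names are hypothetical values and names chosen for this sketch and are not taken from the conventional device.

```python
ADJUST = 3  # predetermined value for quantization index adjustment (example value from the text)


def update_reference_index(q0, buffer_bits, buffer_capacity,
                           high_mark=0.8, step=1, q_max=31):
    """Return an updated reference quantization index Q0.

    If the buffer occupancy exceeds high_mark (a hypothetical threshold),
    Q0 is raised by step, up to the hypothetical upper bound q_max.
    Otherwise Q0 is returned unchanged.
    """
    if buffer_bits > high_mark * buffer_capacity:
        return min(q0 + step, q_max)
    return q0


def region_indices(q0, adjust=ADJUST):
    """Return (face index, non-face index) derived from Q0."""
    return q0 - adjust, q0 + adjust


if __name__ == "__main__":
    q0 = 10
    # Buffer is 90 % full: Q0 is raised, both region indices rise with it.
    q0 = update_reference_index(q0, buffer_bits=900, buffer_capacity=1000)
    print(q0, region_indices(q0))
```

Because both region indices are derived from the same Q0, raising Q0 increases the compression ratio of the entire frame while the face protection region keeps an index that is lower than that of the other region by twice the adjustment value.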