This invention relates to mobile communication systems. More particularly, and not by way of limitation, the invention is directed to an apparatus and method for increasing coding efficiency with an adaptive pre-filter.
In mobile communication systems utilizing Mobile Equipment (ME), video recording and playback features are quite common. It is now possible to record a video clip or make a video telephony call over an ME. However, to accomplish these tasks, it is necessary to compress the captured frame sequences. Currently, most existing video encoders are designed as block-based motion-compensated hybrid difference/transform coders utilizing the MPEG-4 or H.263 formats, where the transformation is accomplished by a Discrete Cosine Transform (DCT) on blocks of 8×8 pixels. To meet the demands for low bit-rates that exist in the mobile world today, these kinds of encoders mainly control the number of bits allocated to each frame by changing the strength of the quantization. The quantization step divides the DCT coefficients by a fixed Quantization Parameter (QP). The quotient is then rounded to the nearest integer level and multiplied by the QP to form a quantized coefficient. This quantization step gives rise to two main artifacts: blocking and ringing. Blocking artifacts are also caused by Motion Compensation (MC), where they are the consequence of poor MC prediction and of a combination of a relatively smooth prediction and a coarsely quantized prediction error. The blocking artifact is perceived as an unnatural discontinuity between pixel values of neighboring blocks. The ringing artifact is perceived as high frequency irregularities around the edges in an image. Thus, the blocking artifacts arise because the blocks are processed independently, whereas the ringing artifacts are caused by the coarse quantization of the high frequency components.
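By way of illustration only, the quantization step described above can be sketched as follows. The sketch follows the simplified description given here (divide by the QP, round to the nearest integer level, multiply by the QP) and is not the exact quantizer defined in MPEG-4 or H.263. The function name `quantize` and the sample coefficient values are assumptions made for the example.

```python
import math


def quantize(coeff, qp):
    """Quantize one DCT coefficient as described in the text.

    Returns (level, reconstructed), where level is the rounded quotient
    and reconstructed is level * qp. Rounding is to the nearest integer,
    with ties rounded away from zero (an assumption; the text does not
    say how ties are handled).
    """
    magnitude = math.floor(abs(coeff) / qp + 0.5)
    level = int(math.copysign(magnitude, coeff))
    return level, level * qp


# Hypothetical coefficients from one block, from low to high frequency.
coefficients = [120, 37, -9, 3]

fine = [quantize(c, 4)[1] for c in coefficients]     # small QP, many bits
coarse = [quantize(c, 16)[1] for c in coefficients]  # large QP, few bits
print(fine)
print(coarse)
```

With the larger QP, the smallest coefficient is quantized to zero and the remaining coefficients move further from their original values, which is the kind of loss that becomes visible as blocking and ringing at low bit-rates.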
If the target bit-rate is fixed, the QP value chosen depends on the coding efficiency. A good coding efficiency results in a lower QP value. The main causes of decreased coding efficiency are noise generated by, e.g., the camera sensor and highly complex content in the captured sequence. The noise distortion from the sensor may have different characteristics, affecting the luminance or the color components, and usually increases in weaker light conditions. The complexity of the captured sequence depends on the amount of high frequency information and fine detail in the image, which are more difficult for the encoder to predict and thereby require more bits to encode.
A pre-processing algorithm may be applied to a video signal before it is passed to an encoder in order to reduce the amount of camera disturbance and the complexity of the sequence, thereby increasing the coding efficiency. This may be performed, for example, by applying a low-pass filter to the input sequence. However, this smooths the entire frame, so that visually significant information, such as object edges, is lost. A pre-filter is required that preserves the visually significant information while removing or attenuating insignificant information, which results in an improved perceived video quality. Existing systems that utilize pre-filtering are limited compared with those that utilize post-filtering. It has been suggested that a combined pre-post filter be utilized, where the algorithm preserves the edges and low-pass filters the non-edge regions. To achieve the proper threshold in the post-filtering step, it is necessary to calculate the threshold on the encoder side and send it, as metadata, with the video data. Although this results in good video quality, this proposed system is not applicable to an ME in the cellular networks of today because it is not possible to send this kind of metadata with the video data.
In another approach, it has been proposed to utilize pre-processing in the rate-distortion framework. This is performed to increase the peak signal-to-noise ratio (PSNR) and reduce compression artifacts. However, this solution is far too complex for an ME. Additionally, in most MEs, it is not possible to use the rate-distortion framework since this involves an iteration of the encoding process. This proposed system also utilizes a Region-Of-Interest (ROI) to improve the perceived quality. In this pre-filter, the background outside the ROI is filtered with several Gaussian low-pass filters of different variance. By using several filters with their strengths based on the distance to the border of the ROI, the impact of border effects is decreased. For example, the ROI is the face of a person in the sequence used and is detected by searching for skin color. This proposed process is limited because a sequence may contain many ROIs and because of differences in, e.g., skin color, which results in an incomplete solution.
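The idea of filtering the background with several Gaussian filters whose strength depends on the distance to the ROI border can be sketched, purely as an illustration, on a single row of pixels. This is not the implementation of the proposal discussed above: the sigma values, the width of each distance band, the one-dimensional simplification, and the names `gaussian_kernel`, `sigma_for_distance` and `filter_row` are all assumptions.

```python
import math


def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def sigma_for_distance(distance, sigmas=(0.5, 1.0, 2.0), band=2):
    """Pick one of several filters; farther from the ROI means a larger sigma."""
    index = min((distance - 1) // band, len(sigmas) - 1)
    return sigmas[index]


def filter_row(row, roi_start, roi_end):
    """Low-pass filter the pixels outside the inclusive range [roi_start, roi_end]."""
    n = len(row)
    out = []
    for x, value in enumerate(row):
        if roi_start <= x <= roi_end:
            out.append(value)  # pixels inside the ROI are left untouched
            continue
        distance = roi_start - x if x < roi_start else x - roi_end
        kernel = gaussian_kernel(sigma_for_distance(distance))
        radius = len(kernel) // 2
        acc = 0.0
        for k, w in enumerate(kernel):
            xi = min(max(x + k - radius, 0), n - 1)  # clamp at the row ends
            acc += w * row[xi]
        out.append(acc)
    return out


row = [10] * 5 + [200] * 5 + [10] * 5
filtered = filter_row(row, roi_start=5, roi_end=9)
```

Because the weakest filter is applied next to the ROI and the strongest far from it, the transition between unfiltered and filtered areas is gradual, which is the stated purpose of using several filters.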
To meet the requirements of an ME for low complexity and increased coding efficiency, a new approach is needed which utilizes the local variations in chrominance to determine the strength of low-pass filtering of the luminance. This approach decreases the complexity of the image because the number of processed pixels is reduced. The coding efficiency is thereby also increased because high frequency components in textures with little variation in the chrominance are attenuated by the low-pass filtering.
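A minimal sketch of how local chrominance variation might control the strength of luminance low-pass filtering is given below. The text above does not specify the variation measure, the mapping from variation to filter strength, or the filter kernel, so everything in the sketch is an assumption: a 3×3 neighborhood, the range (maximum minus minimum) of Cb and Cr as the variation measure, a linear mapping with a hypothetical `max_variation` parameter, a 3×3 mean as the low-pass filter, and chrominance planes of the same resolution as the luminance plane.

```python
def neighborhood(plane, y, x):
    """Return the 3x3 neighborhood of (y, x), clamping at the frame border."""
    h, w = len(plane), len(plane[0])
    return [plane[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
            for dy in (-1, 0, 1) for dx in (-1, 0, 1)]


def chroma_variation(cb, cr, y, x):
    """Local chrominance variation: the larger of the Cb and Cr value ranges."""
    nb_cb = neighborhood(cb, y, x)
    nb_cr = neighborhood(cr, y, x)
    return max(max(nb_cb) - min(nb_cb), max(nb_cr) - min(nb_cr))


def prefilter_luma(luma, cb, cr, max_variation=16.0):
    """Blend each luminance pixel with its 3x3 mean.

    The blend weight is 1 where the chrominance is locally flat and falls
    linearly to 0 as the variation approaches max_variation, so luminance
    detail is kept where the chrominance changes (e.g. at object edges).
    """
    out = [list(row) for row in luma]
    for y in range(len(luma)):
        for x in range(len(luma[0])):
            variation = chroma_variation(cb, cr, y, x)
            strength = max(0.0, 1.0 - variation / max_variation)
            nb = neighborhood(luma, y, x)
            smoothed = sum(nb) / len(nb)
            out[y][x] = (1.0 - strength) * luma[y][x] + strength * smoothed
    return out
```

In this sketch, a luminance texture lying on uniformly colored background is smoothed, while the same texture is left unchanged wherever the chrominance planes contain a strong local transition.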
It would be advantageous to have an apparatus and method that utilize chrominance-controlled video for increasing coding efficiency. The present invention provides such an apparatus and method.