This application claims the priority of Korean Patent Application No. 2003-90032, filed on Dec. 11, 2003, in the Korean Intellectual Property Office, the disclosure of which is incorporated herein in its entirety by reference.
1. Field of the Invention
The present invention relates to digitally encoding a moving picture, and more particularly, to a method of removing noise in digital moving picture data.
2. Description of the Related Art
Since digital moving picture data takes up a large amount of data, the data should be compressed in order to be transmitted and stored efficiently. Thus, technologies for improving the reproduced image quality of a video signal and improving the data compression rate are essential for enhancing the competitiveness of related products. Compression algorithms such as Moving Picture Experts Group (MPEG)-2 have been adopted as compression methods in digital multimedia products requiring high definition, such as high definition TV (HDTV) and digital versatile discs (DVDs).
Due to the technical limits of practical image input systems, of transmission channels, and of transmitters and receivers that handle the moving picture data, noise is inevitably generated during the obtaining, transmitting and receiving of the moving picture data.
The noise can be defined as a set of pixels whose brightness changes abruptly in time and which are spatially distributed randomly (they do not form a set of meaningful geometrical shapes). Mosquito noise is a type of random noise that occurs along edges in images that have been compressed using DCT.
If the image brightness at a given spatial position is plotted as an intensity curve over time, a noise pixel can be observed as an abrupt transition in the curve. A straightforward approach to reducing noise is to use some kind of temporal averaging filter to remove these abrupt transitions. However, a fast moving pixel (e.g., part of a moving object in a scene) behaves similarly to noise, i.e., the intensity of a fast moving pixel changes very sharply during a short time period. Simple averaging techniques will therefore blur, or even lose, the moving objects in the restored scene. Thus, separating motion effects from noise effects becomes a challenge.
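The difficulty described above can be illustrated with a minimal sketch. The function name `temporal_average`, the window length of 3, and the sample intensity values are assumptions chosen only for illustration; they are not taken from any particular prior-art filter.

```python
def temporal_average(series, window=3):
    """Causal moving average over the last `window` samples of one pixel."""
    out = []
    for t in range(len(series)):
        start = max(0, t - window + 1)
        chunk = series[start:t + 1]
        out.append(sum(chunk) / len(chunk))
    return out


# Static pixel with a single noise spike at t = 3 (hypothetical values).
noisy_pixel = [100, 100, 100, 160, 100, 100, 100]
# Pixel onto which a brighter object moves at t = 3 (a genuine step).
moving_pixel = [100, 100, 100, 160, 160, 160, 160]

filtered_noise = temporal_average(noisy_pixel)
filtered_motion = temporal_average(moving_pixel)

# At t = 3 the filter sees the same history in both cases and produces
# the same output, so it cannot tell the spike from the moving edge.
print(filtered_noise[3], filtered_motion[3])
# The genuine step only reaches its true level two frames late (blur).
print(filtered_motion[3:6])
```

In this sketch the spike is attenuated, but the real brightness step of the moving pixel is smeared over several frames, which corresponds to the blur described above.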
In principle, three categories of pixels need to be dealt with: static pixels, moving pixels, and noisy pixels. A basic difficulty is to avoid treating moving pixels as noisy pixels. If motion estimation fails for fast motion pixels, such as those of a fast moving object in a scene, then the filter will incorrectly process these pixels as if they were noise. If pixels could be correctly classified into the three categories, some expensive computation could be avoided (e.g., motion estimation for static pixels) and the filtering could be improved.
The noise included in (e.g., compressed along with) an image (frame) sequence of the digital moving picture data degrades the reproduced image quality and the compression rate because the noise is recognized as a high-frequency component of the signal. Noise increases the bandwidth required for digital representation of images. Since the noise is also recognized as data, the valid picture data must be compressed more in order to satisfy a given bit rate; however, further compression causes increased coding artifacts such as blocking artifacts.
The compression rate of the image sequence can be improved greatly and the image quality can be improved by compressing the data (according to the MPEG-2 method) after the noise is removed in advance using an effective noise reduction algorithm. There are conventional noise reduction algorithms such as spatial filtering, temporal filtering, and spatio-temporal filtering.
Because spatial filtering is based on a stationary model, a contour of the reproduced image cannot be preserved after noise is removed. Although an additional contour-adaptive filter may be used, such filters are not very effective in a case where a threshold for discriminating a contour line is fixed as a constant without regard to the degree of noise, or in a case where a color spot is generated. A color spot is generated because the channels that make up a color filter array (CFA) on a charge-coupled device (CCD) sensor have characteristics that differ with brightness, yet are processed in the same way regardless of the brightness.
Although spatial filtering is an effective way of removing noise when a still image is processed, if a video image sequence is filtered using the spatial filtering method, the degree of noise removal can differ for each frame. The differences in the degree of noise removal can appear as a flickering phenomenon when the video data is reproduced as a video image sequence. Thus, in order to improve the performance of the conventional spatial filtering process, the threshold value for determining the contour line should change adaptively according to noise energy (more specifically, noise energy variance), and in order to remove the color spots and the flickering phenomenon, a temporal filtering method is used in addition to the spatial filtering operation.
In temporal filtering, a motion compensation method is used. However, in order to take motion compensation into consideration, the motion of a subject should be estimated for each frame and the filtering should be performed along the motion trajectory, so additional calculation operations are required to estimate the motion.
A temporal filtering method based on a motion detection algorithm has been used to reduce the error in motion compensation and to reduce calculation load. The efficiency of this temporal filtering based on a motion detection algorithm is largely dependent upon the robustness of the motion detection algorithm. However, since the motion in the color image is determined by brightness difference according to the typical motion detection algorithm, the algorithm cannot recognize motion indicated by color difference. Thus, when the brightness difference between the subject and background is not large enough, errors in detecting the motion may occur.
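The limitation described above can be seen in a small sketch. The function names, the threshold of 10, and the pixel values are hypothetical and serve only to illustrate the point; the chroma-aware variant is shown purely for contrast and is not presented as any specific prior-art or claimed method.

```python
def luma_motion(prev, curr, threshold=10):
    """Flag motion when only the Y (brightness) difference exceeds the threshold."""
    return abs(curr[0] - prev[0]) > threshold


def luma_chroma_motion(prev, curr, threshold=10):
    """Illustrative contrast: flag motion if any of Y, Cb, Cr differs enough."""
    return any(abs(c - p) > threshold for p, c in zip(prev, curr))


# (Y, Cb, Cr) samples at the same position in two consecutive frames.
# Subject and background have nearly equal brightness but different color.
background = (128, 90, 200)
subject = (130, 200, 90)

missed = luma_motion(background, subject)            # False: motion not detected
detected = luma_chroma_motion(background, subject)   # True: color change seen
print(missed, detected)
```

With these assumed values the brightness-only test reports no motion even though the pixel has changed from background to subject, so a temporal filter driven by it would average two unrelated colors.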
In order to solve the above problems, there is a method of detecting motion by detecting differences in magnitudes (brightness) and angles (color) of RGB vectors by consideration of vector characteristics of the color image. This method can be used regardless of the calculation load and memory capacity when an input signal of the system is an RGB signal or the algorithm is processed by software; however, there is a limit to this method's practicality when the algorithm is processed using hardware.
The YCbCr domain (hereinafter referred to as the YCbCr color space) is the most commonly used color coordinate system, as it is useful for compatibility with monochrome video and for interoperability with professional video processing equipment. Luminance (Y) contains most of the spatial information to which the human eye accords importance due to its sensitivity to detail. The chrominance channels (Cb and Cr, blue-difference and red-difference, respectively) add color information. Because most input signals are in such YCbCr form, they have to be converted into RGB color space signals in order for the RGB vector calculation operation to be performed. In addition, since a non-linear calculation is required to evaluate an inverse cosine or inverse sine function when calculating the difference of RGB vector angles (colors), the hardware complexity increases.
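The two costs noted above (a color space conversion followed by an inverse trigonometric function) can be sketched as follows. The conversion uses the commonly published full-range ITU-R BT.601 (JFIF-style) coefficients as an assumption, since the text does not specify a conversion matrix; the function names are hypothetical, and no clipping to the 0 to 255 range is performed.

```python
import math


def ycbcr_to_rgb(y, cb, cr):
    """Full-range BT.601 YCbCr -> RGB (assumed coefficients, no clipping)."""
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return (r, g, b)


def rgb_angle(a, b):
    """Angle in radians between two RGB vectors, via an inverse cosine."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    if norm == 0.0:
        return 0.0  # angle is undefined for a black pixel; treated as 0 here
    cos_theta = max(-1.0, min(1.0, dot / norm))  # guard against rounding
    return math.acos(cos_theta)


# Two hypothetical YCbCr pixels from consecutive frames.
prev_rgb = ycbcr_to_rgb(128, 90, 200)
curr_rgb = ycbcr_to_rgb(130, 200, 90)
print(rgb_angle(prev_rgb, curr_rgb))
```

Each pixel comparison in this sketch needs several multiplications, two square roots, a division, and an `acos` call, which gives a sense of why such an approach is costly to realize in hardware.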
When motion is detected using the differences in angles (color) of RGB vectors, it is difficult to set a threshold value to be used as a reference to discriminate the detected signals. The threshold value greatly affects the performance of filtering: if the threshold value is too large, an artifact such as a motion residual image can be generated, and if the threshold value is too small, the noise is not removed. For example, in the motion detection methods using the differences in angles (color) of the RGB vectors, when an input value of a cosine function is nearly 0 (zero), a difference between output values is small, and thus the threshold value should be set precisely, down to several decimal places, to detect the movements effectively. In order to overcome the difficulties in determining the threshold value and to remove the noise effectively, relatively large numbers of frames must be processed. Thus, a large memory capacity is required, and the hardware complexity increases.
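The sensitivity problem mentioned above can be checked numerically. The sketch below assumes that the "input value" of the cosine function is the angle between two RGB vectors, expressed in degrees; the helper name and the sample angles are hypothetical.

```python
import math


def cos_of_degrees(deg):
    """Cosine of an angle given in degrees."""
    return math.cos(math.radians(deg))


# Change in cosine output for a 1-degree step near 0 degrees...
gap_near_zero = cos_of_degrees(1) - cos_of_degrees(2)
# ...compared with the same 1-degree step near 90 degrees.
gap_near_ninety = cos_of_degrees(89) - cos_of_degrees(90)

print(gap_near_zero)
print(gap_near_ninety)
```

Near 0 degrees, a one-degree change in angle moves the cosine by well under 0.001, so a threshold applied to the cosine value has to resolve several decimal places, whereas the same angular step near 90 degrees changes the output by more than 0.01.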
Besides the above problems, in temporal filtering based on motion detection, the threshold value of motion detection should change adaptively according to scene detection. Otherwise, if a general sequence including scene changes is processed using only motion detection, then in a case where the brightness or the colors of the pixels included in adjacent frames of different scenes are similar to each other, the different scenes may be filtered together and mixed in a filtered frame. Therefore, to improve the performance of conventional temporal filtering based on motion detection, a scene change detection algorithm should additionally be used.
The conventional spatio-temporal filtering method, which combines the above two filtering methods, extends spatial filtering to the temporal domain. Although noise can be removed effectively, the method has the limitations of both temporal filtering and spatial filtering.