The present invention relates generally to noise reduction in digitized pictures and specifically to the reduction of noise in a video signal.
Noise reduction of a video signal is used to enhance the quality of the images that make up the video signal and to prepare the signal for efficient compression. Noise reduction is important in connection with compression of image information, because noise may significantly reduce the effectiveness of compression schemes, particularly frequency-domain compression schemes such as MPEG-2. In image compression technology there is typically a trade-off between compression and image quality; increased compression tends to reduce image quality. It is not always easy to reconcile these competing goals so as to achieve high-quality, highly compressed images. Effective noise reduction in connection with compression of a video signal may well serve both purposes and produce enhanced images in addition to a well-compressed video signal.
The technical background of compression and various state-of-the-art preprocessing and compression techniques are described, for example, in John Watkinson, Compression in Video and Audio, Focal Press 1995, ISBN 0 240 51394, which is incorporated by reference.
A previously known video signal noise reduction system is described, for example, in U.S. Pat. No. 5,361,105 to Siu-Leong Iu, the content of which is hereby incorporated by reference. In this system, image pixels are tracked across multiple frames and are averaged to produce respective noise-reduced pixel values. In video signals representing a sequence of images that may change from frame to frame, e.g. in motion pictures and television, this system seeks to reduce noise by estimating the magnitude and direction of inter-frame motion in a sequence of image frames. The movement of an image is estimated by first calculating a trajectory vector for a block of picture elements (pixels) by comparing preceding and succeeding frames. The trajectory vector is then used to reduce noise in the video signal by averaging each of a plurality of pixels corresponding to a moving block along an estimated trajectory. The described method, however, does not eliminate noise to a satisfactory degree. One drawback, for example, is that the trajectory estimation, which is a part of this noise reduction method, is itself sensitive to noise.
Another system is disclosed in U.S. Pat. No. 4,987,481 to Spears et al., which is hereby incorporated by reference. Spears shows an apparatus for selectively reducing noise by non-recursively averaging video information contained in a sequence of video frames when the video information is found to be impaired by noise. A drawback of this method is the consequence of the averaging between video frames. The video information of a noise-reduced frame portion is calculated, and original video information is lost even if it is correct and unimpaired by noise. Furthermore, with the above procedure there is a high probability that different pixel values appear or are introduced in sequence, which in turn leads to poorer compression of the images in the video stream.
The object of and the problem to be solved by the present invention is to reduce the noise in a video signal representing a sequence of video frames, in particular a video signal representing moving pictures. An aspect of this problem is to achieve an enhanced compression factor for a compression scheme applied to the video signal, or stated differently, to reduce the bit rate of the video signal. A further aspect is to achieve an enhanced perceived image quality, preferably combined with an enhanced (or at least not reduced) compression factor.
Accordingly, the result achieved by the invention can be considered a form of noise reduction, or alternatively, video signal optimization for compression. The method of the invention is operative to remove or “smooth out” minor differences in a video signal, ordinarily imperceptible or only marginally perceptible to the human eye, so that certain types of data compression can be performed on the video data more efficiently. In particular, the well-known MPEG compression scheme uses coefficients in the frequency domain to represent small eight-by-eight pixel regions of an image. Given a constant target bit rate, eliminating many of the high-frequency components devoted primarily to “noise” will leave more bandwidth for the low-frequency components of the compressed MPEG video stream, thereby leading to a possible increase in perceived image quality.
The invention is based on the discovery that there is a relationship or correlation between video information of frame portions within an observed video frame (i.e. a spatial correlation) on one hand, and between video information of frame portions from a sequence of adjacent observed video frames (i.e. a correlation along the time axis of the frame sequence) on the other hand. The correlation is strong in sequences of frame portions where there is no local scene change or movement in the image. A local scene change in this context means that the video information changes significantly between sequential corresponding frame portions, i.e. over time.
For a two-dimensional video signal, the correlation analysis is carried out in three dimensions, i.e. with respect to the surrounding pixels that are adjacent to the current pixel in the two-dimensional spatial domain and in the time domain, respectively. In order to distinguish between a pixel that represents a local scene change (e.g. an edge of a moving object within the image or a cut to another scene) and a random pixel or noise spike, the correlation between the current pixel under consideration and its surrounding pixels is analyzed. Noise on the current pixel, if any, is suppressed on the basis of the correlation analysis. If there is a weak correlation, a local scene change is assumed to take place between compared frame portions, and no attempt is made to reduce noise on this pixel. If, on the other hand, a strong correlation is found between compared frame portions, typically two frame portions, those frame portions qualify for further processing in a selecting step, with possible subsequent noise reduction.
According to the invention, the noise on a pixel is preferably replaced by a maximum likelihood signal based on correlated pixels. Stated differently, from a subset of the values of correlated pixels, the pixel that has a value which is most likely to have a similar predecessor or successor is selected to replace the current pixel. Each video frame is processed under consideration of a sequence of video frames temporally adjacent to the current video frame. The temporally adjacent frames may in various embodiments be ahead of, come after, or surround the current video frame. In a preferred embodiment, the processing is performed for each pixel of a current video frame by observing a frame portion consisting of the current pixel and a number of surrounding, spatially adjacent pixels. Such a frame portion is called a slice S. Each current slice, or more specifically the video information of the slice, is compared to equal sized spatially corresponding slices of the adjacent frames. Such a set of temporally consecutive slices can be called a tube T. The tube T is analyzed in order to select the slices adjacent to the current slice that have a particularly strong correlation, i.e. the consecutive slices wherein there is no local scene change.
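The slice and tube analysis described above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the function names, the use of the mean absolute difference as the correlation measure, the border clamping, and the fixed `threshold` parameter are all assumptions made for illustration. A frame is modeled as a list of rows of luminance values.

```python
def extract_slice(frame, row, col, radius=1):
    """Return the slice S: the current pixel and its spatial neighbours.

    Coordinates outside the frame are clamped to the nearest border pixel
    (an assumption; the text does not specify border handling).
    """
    height, width = len(frame), len(frame[0])
    values = []
    for r in range(row - radius, row + radius + 1):
        for c in range(col - radius, col + radius + 1):
            rr = min(max(r, 0), height - 1)
            cc = min(max(c, 0), width - 1)
            values.append(frame[rr][cc])
    return values


def slice_difference(a, b):
    """Mean absolute difference between two equal-sized slices.

    A small value stands in for a 'strong correlation' (assumed measure).
    """
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def select_correlated_slices(frames, current, row, col, threshold, radius=1):
    """Return the indices of the consecutive frames around `current` whose
    slices show no local scene change, i.e. whose difference to the
    neighbouring slice stays at or below `threshold`."""
    # The tube T: spatially corresponding slices of all observed frames.
    tube = [extract_slice(f, row, col, radius) for f in frames]
    selected = [current]
    # Walk backward in time until a local scene change is found.
    i = current
    while i > 0 and slice_difference(tube[i], tube[i - 1]) <= threshold:
        i -= 1
        selected.insert(0, i)
    # Walk forward in time until a local scene change is found.
    i = current
    while i < len(tube) - 1 and slice_difference(tube[i], tube[i + 1]) <= threshold:
        i += 1
        selected.append(i)
    return selected
```

In this sketch a single noisy pixel changes the slice difference only slightly, because it is averaged over the whole slice, whereas a scene change alters most pixels of the slice and therefore exceeds the threshold.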
Having selected a set of temporally consecutive slices that do not have a local scene change, the current pixel is compared to the spatially corresponding but temporally adjacent pixels of the selected slices (a subset of the original slices), and the pixel that is most likely to have a preceding or subsequent pixel having the same value or video information content is selected as a new current pixel. In other words, an extreme pixel value is sorted out and is replaced by a pixel that is selected from a set of adjacent pixels and is judged to have a better correlation to the surrounding pixels. The selected pixel is then assigned to the current pixel.
In one embodiment, the pixel judged to have a desired correlation, and which is assigned to the current pixel, is a pixel having the median luminance value of the pixels in the selected slices forming a tube of slices that is defined by two local scene changes (i.e. “before” and “after” the current scene).
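The median embodiment can be sketched as follows, under stated assumptions. The function name and the `selected` argument (a list of frame indices already judged to lie between two local scene changes) are hypothetical. The sketch uses `statistics.median_low` so that the result is always the value of an actual pixel rather than an average of two; for an even number of candidates this tie-breaking choice is an assumption, not something the text specifies.

```python
from statistics import median_low


def replace_with_temporal_median(frames, selected, row, col):
    """Return the new value for the pixel at (row, col).

    `frames` is a list of frames (each a list of rows of luminance values)
    and `selected` lists the indices of the frames whose slices were found
    to contain no local scene change.
    """
    # Spatially corresponding, temporally adjacent pixels of the selected slices.
    candidates = [frames[i][row][col] for i in selected]
    # median_low picks an existing pixel value, so no new value is invented.
    return median_low(candidates)
```

For candidate values 100, 180 and 101 the outlier 180 is replaced by 101, a value that already occurs in a neighbouring frame, which favours runs of identical values along the time axis.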
In one embodiment, an entirely new sequence of video frames is produced in the noise reduction process, whereas in another embodiment, the current video frame is replaced by the new frame which thus may contribute in the analysis and processing of the next current video frame.
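The difference between the two embodiments can be illustrated on the temporal sequence of a single pixel. The sketch below is a simplification under stated assumptions: it applies a plain sliding median with a hypothetical `radius` and omits the scene change analysis entirely; the function name and the `recursive` flag are illustrative only.

```python
from statistics import median_low


def filter_sequence(values, radius=1, recursive=False):
    """Filter the values of one pixel position over consecutive frames.

    With recursive=False the output is an entirely new sequence computed
    from the unmodified input. With recursive=True each filtered value
    replaces the original and so contributes to the processing of the
    following frames.
    """
    source = list(values)
    output = list(values)
    for t in range(len(source)):
        lo = max(0, t - radius)
        hi = min(len(source), t + radius + 1)
        output[t] = median_low(source[lo:hi])
        if recursive:
            # Feed the filtered value back for use by later frames.
            source[t] = output[t]
    return output
```

For the input `[10, 50, 12, 60, 12]` the non-recursive variant yields `[10, 12, 50, 12, 12]`, while the recursive variant yields `[10, 12, 12, 12, 12]`: feeding filtered values back produces a longer run of identical values.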
When the invention is applied as a pre-processing tool for compression of an image signal, an object of the invention is to produce sequences of pixels having the same value, but without loss of significant image information. In advantageous cases, this leads to a number of subsequent similar or identical blocks of (e.g. 8×8) pixels which, when encoded in a subsequent coding stage, are highly compressed.
In a further embodiment, account is also taken of the fact that the human eye has differing sensitivity to light and dark and to the different colors red, green and blue. Noise-impaired pixels are then discriminated with regard to a dynamically computed noise threshold value, such that the precision in noise and scene change detection is adapted to the characteristics of human vision. For example, detailed changes are hard to detect or are even undetectable in dark and in very bright areas. By taking the sensitivity of the human eye into consideration while computing the noise threshold value, the bit rate may be reduced further in regions of the frequency spectrum or color space where the human eye is less sensitive to small differences. An image can be achieved which is subjectively perceived by the eye as enhanced, even though video information may actually be missing or noise is actually left in parts of the picture.
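A dynamically computed threshold of this kind might look like the following sketch. The piecewise-linear shape and every numeric value in it (`base`, `extra`, `dark`, `bright`, and the 8-bit luminance range) are hypothetical and chosen only to show the principle that the threshold rises where the eye is less sensitive; the text does not give a formula.

```python
def noise_threshold(luma, base=4.0, extra=8.0, dark=48, bright=208):
    """Return a noise threshold for an 8-bit luminance value (0..255).

    The threshold equals `base` in mid-tones and rises linearly by up to
    `extra` toward black and toward white, where small differences are
    assumed to be harder to perceive. All constants are illustrative.
    """
    if luma < dark:
        return base + extra * (dark - luma) / dark
    if luma > bright:
        return base + extra * (luma - bright) / (255 - bright)
    return base
```

With these assumed constants a mid-tone pixel (luminance 128) uses a threshold of 4.0, whereas a fully black or fully white pixel uses 12.0, so larger deviations are treated as noise in those regions.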
In other embodiments, a predetermined or fixed threshold value is selected and used for a selected range or for the whole frequency range of the picture.
Further advantages and details of the invention will be seen from the following description of an embodiment of the invention with the aid of the accompanying drawings and in connection with the independent and dependent claims.