1. Field of the Invention
The present invention relates generally to video and image processing. In particular, the system described herein is designed to reduce additive noise in video sequences using three-dimensional non-linear filtering.
2. Description of the Related Technology
In the past decade, applications of digital video have increased dramatically. These applications range from cinematographic archiving and medical imaging to video storage and playback on DVDs. In addition, digital video forms the basis for more efficient transmission of television via cable, over the air, and over the Internet.
The last application is especially important. It relies on video compression, a set of algorithms that operate on digital video. The algorithms achieve higher compression ratios than would be possible with analog techniques, thereby reducing the bandwidth required for transmission of video. Where formerly a cable channel's bandwidth would support the transmission of a single analog video channel, with digital compression cable operators can operate at various points on the resolution/bandwidth trade-off curve, allowing 12 video channels of average quality or 7-8 channels of superior quality to be transmitted in a bandwidth that formerly carried one analog channel of video. Video compression has also made HDTV possible: without it, the bandwidth required for transmission could not be supported within the present bandwidth allocations. Digital video is fundamental to the transmission of video using the Internet's packetized techniques. It allows the use of buffers to eliminate variations in packet arrival times, and the application of even more powerful compression algorithms that further reduce the share of the channel's capacity used by the video signal (capacity which, in the Internet, is shared with other users).
The pervasive use of digital video has spawned increased interest in and demand for noise filtering algorithms. Noise reduction can be critical to overall system operation, since the presence of noise in video not only degrades its visual quality but affects subsequent signal processing tasks as well. Noise is especially deleterious to digital video that will be compressed and decompressed. The effect is inherent in compression algorithms. These algorithms are designed to recreate a sequence of images that will be perceived by the eye as being virtually identical to the images created from the uncompressed data. Since they do not reject noise, the compression algorithms treat it as signal, and attempt to create data that represents the components of the noise that will be most visible to the eye. Worse yet, in most instances the output of the video compression unit is limited in data rate to match the rated capacity of the channel through which the data is transmitted. When noise captures some of the bits that are output by the video compressor, fewer bits are left to represent the real signal. Therefore noise reduction (the elimination, as far as possible, of noise contaminating the video) is a desirable adjunct to video compression.
Noise is a catch-all term for an unwanted signal that interferes with the desired signal. It is noticeably present in television receivers situated in areas having marginal signal conditions for receiving a conventional amplitude modulated vestigial sideband television signal. This noise is commonly modeled as being additive, white and Gaussian. In the case of analog video delivered by satellite, the video signal is frequency modulated onto a carrier. The baseband signal delivered by the ground receiver is accompanied by noise that is additive and Gaussian when the receiver is operating above threshold (i.e., when the vector representing the noise in signal space is usually much smaller than the vector representing the modulated signal). When the system is close to threshold, the character of the noise becomes impulsive, leading, for example, to the clicks that are heard on an automobile radio as the FM station being received goes out of range. For video transmitted by satellite, the impulses appear in the picture as short white or dark streaks. A satellite or terrestrial television receiver may also be affected by man-made noise such as impulsive noise originating from motor vehicles.
Applying noise reduction to video is the process of identifying the desired video signal and using that information to discriminate against the noise. The best performance requires a broad range of processing options, which is available only through the use of digital techniques. The input video is sampled into numerical pixel values indexed by horizontal and vertical spatial coordinates and a time coordinate that indicates the frame number. A filtering operation is modeled as a sequence of arithmetic operations performed on the input samples to form an output pixel.
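The sampling and filtering model described above can be illustrated with a minimal sketch. It assumes the video is held as nested lists indexed as video[t][y][x] (frame, row, column) and that the filter is a plain weighted sum; the function name `filter_pixel` and the `taps` layout are hypothetical and chosen only for illustration. This is a linear example that shows the indexing and the arithmetic, not the non-linear filtering to which the invention relates.

```python
def filter_pixel(video, t, y, x, taps):
    """Form one output pixel as a weighted sum of input samples.

    video: nested lists indexed as video[t][y][x] (frame, row, column).
    taps:  dict mapping (dt, dy, dx) offsets to weights (hypothetical layout).
    Offsets that fall outside the sequence are skipped and the remaining
    weights are renormalized so that they still sum to one.
    """
    acc = 0.0
    total = 0.0
    for (dt, dy, dx), w in taps.items():
        tt, yy, xx = t + dt, y + dy, x + dx
        inside = (
            0 <= tt < len(video)
            and 0 <= yy < len(video[tt])
            and 0 <= xx < len(video[tt][yy])
        )
        if inside:
            acc += w * video[tt][yy][xx]
            total += w
    # Fall back to the unfiltered sample if no usable weight remains.
    return acc / total if total else video[t][y][x]


# Three 2x2 frames with constant brightness 0.0, 0.6 and 0.3.
video = [
    [[0.0, 0.0], [0.0, 0.0]],
    [[0.6, 0.6], [0.6, 0.6]],
    [[0.3, 0.3], [0.3, 0.3]],
]
# Purely temporal three-tap average centred on the current frame.
temporal_taps = {(-1, 0, 0): 1.0, (0, 0, 0): 1.0, (1, 0, 0): 1.0}
middle = filter_pixel(video, 1, 0, 0, temporal_taps)
```

Adding offsets with non-zero dy or dx to `taps` turns the same routine into a spatial or a combined spatio-temporal filter, which is the sense in which a single output pixel is formed from samples indexed in all three coordinates.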
Noise reduction inherently implies averaging together elements of the signal that are almost identical. Suppose a given pixel has a noise-free value of 0.5, meaning its brightness is half-way between peak white and black. The pixel is contaminated by noise n1, so the pixel value that is actually available is P1=0.5+n1. With additional knowledge, a second pixel may be found in another position with value P2=0.5+n2, where n1 and n2 are both noise values that are uncorrelated and have the same variance. The weighted average 0.5 P1+0.5 P2 is equal to 0.5+½(n1+n2). Because n1 and n2 are uncorrelated, the power in ½(n1+n2) is one-quarter of the sum of their powers, that is, one-half the power in n1 or n2 alone. Thus, averaging together the values of the two pixels improves the signal/noise ratio of the estimated pixel value by a factor of 2. However, if P2=0.3+n2, meaning that the brightness of the second pixel is closer to black, then 0.5 P1+0.5 P2=0.4+½(n1+n2). The net effect of weighting P1 and P2 equally before averaging in the second case is to introduce an error of 0.1 into the estimate for the brightness of the pixel the weighted average is supposed to represent. This example illustrates the basic principle of this invention: to reduce the noise level associated with a particular pixel, form a weighted average of its value with a second pixel value whose noise-free brightness is close to that of the pixel in question. When the confidence level in the equality of the noise-free brightness levels is high, the weights assigned to the two pixel values should be approximately equal; if the confidence level is low, the second pixel value is effectively disregarded by making its weight close to zero, with the first pixel value weighted by (1−weight used for the second pixel).
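The arithmetic of the two-pixel example can be checked with a short sketch. It assumes, as in the text, that the two noise values are uncorrelated and have equal variance; the function names `blend`, `noise_power_ratio` and `bias` are hypothetical and introduced only for illustration.

```python
def blend(p1, p2, w2):
    """Weighted average of two pixel values; the first pixel gets weight 1 - w2."""
    if not 0.0 <= w2 <= 1.0:
        raise ValueError("w2 must lie in [0, 1]")
    return (1.0 - w2) * p1 + w2 * p2


def noise_power_ratio(w2):
    """Noise power of the blend relative to that of a single pixel.

    Assumes uncorrelated noise of equal variance on both pixels, so the
    powers add after scaling by the squared weights.
    """
    return (1.0 - w2) ** 2 + w2 ** 2


def bias(s1, s2, w2):
    """Error in the estimate of s1 caused by a differing noise-free value s2."""
    return blend(s1, s2, w2) - s1


# Equal weights, equal noise-free brightness: noise power halves, no error.
equal_case = (noise_power_ratio(0.5), bias(0.5, 0.5, 0.5))
# Equal weights, second pixel darker (0.3): the estimate is pulled to 0.4.
mismatch_estimate = blend(0.5, 0.3, 0.5)
# Low confidence: a small weight on the second pixel keeps the error small,
# at the cost of less noise reduction.
cautious_case = (noise_power_ratio(0.1), bias(0.5, 0.3, 0.1))
```

With a weight of 0.1 on the second pixel, the error falls from 0.1 to 0.02 in magnitude while the noise power is reduced only to 0.82 of its original value, which is the trade-off that the confidence-dependent weighting is meant to manage.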
In the current state of the art, the best noise rejection performance is achieved with three-dimensional filters that combine two-dimensional spatial filters and one-dimensional temporal filters to obtain the benefits of each. However, the potential of unrestricted three-dimensional filtering goes beyond what has been achieved to date. While separating the three-dimensional sampling grid into a two-dimensional spatial grid combined with a one-dimensional temporal grid may make the processing of the signal more convenient, because fewer processing operations per input data point may be required, separating the filtering of video in time and space unduly restricts the available design options. Allowing unrestricted three-dimensional filtering would expand the design options to include more complex algorithms for video processing. A generalized approach is likely to offer superior performance for noise reduction with fewer deleterious side effects, such as blurring and apparent loss of resolution. Therefore, there is a need for a refinement of three-dimensional digital filtering into classes of algorithms that simplify the application of the algorithm while preserving the likelihood of optimal performance that three-dimensional filtering promises.