Video and image processing in a low-light environment involves a tradeoff between noise and motion blur. Because few photons reach the sensor in low light, the captured images are noisy. Increasing the exposure time of the camera collects more photons and reduces noise, but a long exposure time can increase motion blur because objects may move within the exposure period. Motion blur represents poor temporal resolution. That is, in images or videos with motion blur, scene changes over a small time interval are not resolved. Several solutions have been proposed to reduce the noise of video images in a low-light environment. For example, 3-dimensional (3D) filtering methods filter the video in both space (2D) and time (1D). These methods attempt to decrease noise without increasing motion blur.
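The exposure tradeoff described above can be illustrated with a minimal sketch. It assumes the pixel is limited purely by photon shot noise (Poisson statistics, so the signal-to-noise ratio equals the square root of the mean photon count) and that an object moves at a constant speed across the sensor. The function names, the photon rate, and the object speed are illustrative assumptions, not values taken from any particular camera.

```python
import math


def shot_noise_snr(photon_rate, exposure_s):
    """SNR of a shot-noise-limited pixel: mean / sqrt(mean) = sqrt(mean)."""
    mean_photons = photon_rate * exposure_s
    return math.sqrt(mean_photons)


def motion_blur_px(speed_px_per_s, exposure_s):
    """Distance, in pixels, that an object travels during one exposure."""
    return speed_px_per_s * exposure_s


# Assumed, illustrative operating point: 1000 photons/s per pixel,
# object moving at 200 pixels/s.
PHOTON_RATE = 1000.0
SPEED = 200.0

for exposure_ms in (10, 40, 160):
    t = exposure_ms / 1000.0
    print(f"exposure {exposure_ms:4d} ms: "
          f"SNR {shot_noise_snr(PHOTON_RATE, t):6.2f}, "
          f"blur {motion_blur_px(SPEED, t):6.2f} px")
```

Under these assumptions, quadrupling the exposure time doubles the SNR but quadruples the blur extent, which is why a longer exposure alone is a poor answer to low-light noise when the scene contains motion.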
However, 3D filtering methods are expensive: they are computationally intensive, require large memory space and high bandwidth, and can introduce artifacts. Typically, these methods require patch-based processing to compensate for motion. A patch refers to one or more pixels around a pixel of interest. To filter a pixel in a current frame, a patch around the pixel is selected and processed with similar patches in the current frame or in other frames of the video. Such patch-based processing reduces noise by averaging multiple noisy pixels. Unlike increasing the exposure time, patch-based processing does not increase motion blur, provided that similar patches are selected appropriately. A major drawback of such patch-based processing is that searching for and comparing patches is computationally intensive and expensive. Moreover, since 3D filtering methods compute a denoised image from multiple image frames, they require large memory size and high bandwidth.
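The patch-based processing described above can be sketched as follows. This is a simplified illustration rather than any specific published method: it compares a reference patch with candidate patches inside a small search window in every buffered frame, and averages the center pixels of the candidates whose mean squared difference is below a limit. The function names, the patch radius `r`, the search radius `search`, and the limit `max_dist` are all assumptions chosen for the example.

```python
import random


def patch(frame, y, x, r):
    """Return the (2r+1) x (2r+1) patch centred on (y, x) as a flat list."""
    return [frame[y + dy][x + dx]
            for dy in range(-r, r + 1)
            for dx in range(-r, r + 1)]


def patch_distance(p, q):
    """Mean squared difference between two equally sized patches."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) / len(p)


def denoise_pixel(frames, t, y, x, r=1, search=2, max_dist=200.0):
    """Average the centres of patches similar to the patch at (t, y, x).

    Requires r <= y < height - r and r <= x < width - r.
    Returns (denoised value, number of patch comparisons performed).
    """
    h, w = len(frames[0]), len(frames[0][0])
    ref = patch(frames[t], y, x, r)
    total, count, comparisons = 0.0, 0, 0
    for frame in frames:  # temporal dimension: every buffered frame
        for cy in range(max(r, y - search), min(h - r, y + search + 1)):
            for cx in range(max(r, x - search), min(w - r, x + search + 1)):
                comparisons += 1
                if patch_distance(ref, patch(frame, cy, cx, r)) <= max_dist:
                    total += frame[cy][cx]
                    count += 1
    # count >= 1 because the reference patch always matches itself.
    return total / count, comparisons


# Illustrative input: five 8x8 frames of a flat scene with Gaussian noise.
random.seed(0)
frames = [[[100.0 + random.gauss(0.0, 5.0) for _ in range(8)]
           for _ in range(8)]
          for _ in range(5)]
value, comparisons = denoise_pixel(frames, t=2, y=4, x=4)
print(f"noisy {frames[2][4][4]:.1f} -> denoised {value:.1f} "
      f"using {comparisons} patch comparisons")
```

Even in this small example a single output pixel requires a patch comparison for every position in the search window in every buffered frame, and all of those frames must be held in memory at once. That is the cost and memory burden the paragraph above refers to.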
In addition, because noise and motion produce similar pixel changes in low light, separating motion from noise is unstable in low-contrast regions of the image. Typically, 3D filtering methods use thresholds to separate noise from motion. When an object moves against a low-contrast background, such threshold-based separation of motion from noise creates undesirable artifacts in the video, because the two are difficult to distinguish. Additionally, temporal filtering degrades in performance when performed after typical processing steps in a video camera. For example, after demosaicking, i.e., interpolating the missing color values at each pixel from neighboring pixels, a frame's data size is typically three times the original size, which adds complexity. Steps such as defect correction, white balancing, color correction, gamma correction, and video compression could alter pixel values differently in different frames, leading to suboptimal temporal filtering performance.
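The instability of threshold-based separation can be made concrete with a small calculation. The sketch below assumes a simple motion test that flags a pixel as moving when the absolute difference between two frames exceeds a threshold, and it assumes independent zero-mean Gaussian noise of equal standard deviation in both frames. Both the test and the noise model are illustrative assumptions; actual 3D filtering methods may use different criteria.

```python
import math


def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def detection_rates(contrast, sigma, threshold):
    """Return (false_alarm, miss) for the test |frame difference| > threshold.

    contrast:  true intensity change caused by motion at the pixel.
    sigma:     per-frame noise standard deviation (assumed Gaussian).
    The difference of two noisy frames has std sigma * sqrt(2).
    """
    s = sigma * math.sqrt(2.0)
    # Static pixel wrongly flagged as moving.
    false_alarm = 2.0 * (1.0 - normal_cdf(threshold / s))
    # Moving pixel wrongly treated as static (difference stays within +/- threshold).
    miss = (normal_cdf((threshold - contrast) / s)
            - normal_cdf((-threshold - contrast) / s))
    return false_alarm, miss


# Assumed, illustrative values: noise sigma 10, threshold 20.
for contrast in (5.0, 20.0, 100.0):
    fa, miss = detection_rates(contrast, sigma=10.0, threshold=20.0)
    print(f"contrast {contrast:6.1f}: false alarm {fa:.3f}, miss {miss:.3g}")
```

Under these assumptions, a high-contrast moving object is almost never missed, whereas a low-contrast one is missed most of the time at the same threshold. Lowering the threshold to catch it raises the false-alarm rate on static pixels, so no single threshold works well in low-contrast regions.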
In addition to temporal filtering, spatial filtering could also be used to reduce noise. Typical methods of spatial filtering involve averaging a noisy pixel with its surrounding pixels. While such filtering reduces noise in smooth regions of an image, it blurs the edges in the image. Edge blur represents poor spatial resolution. That is, in images or videos with edge blur, small details in the scene are not resolved. Traditional edge-preserving methods, such as bilateral filtering, preserve edges during denoising, but do so by performing computationally complex operations such as comparing the current pixel value with other pixel values in the neighborhood.
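The difference between plain neighborhood averaging and an edge-preserving filter can be seen on a one-dimensional step edge. The sketch below compares a box (mean) filter with a bilateral filter, whose weights depend on both spatial distance and intensity difference. The signal, the radius, and the parameters `sigma_s` and `sigma_r` are assumptions chosen for illustration.

```python
import math


def box_filter(signal, radius):
    """Replace each sample by the mean of its neighbourhood."""
    out = []
    for i in range(len(signal)):
        lo, hi = max(0, i - radius), min(len(signal), i + radius + 1)
        window = signal[lo:hi]
        out.append(sum(window) / len(window))
    return out


def bilateral_filter(signal, radius, sigma_s, sigma_r):
    """Weighted mean; weights fall off with distance AND intensity difference."""
    out = []
    for i, centre in enumerate(signal):
        num = den = 0.0
        lo, hi = max(0, i - radius), min(len(signal), i + radius + 1)
        for j in range(lo, hi):
            spatial = math.exp(-((j - i) ** 2) / (2.0 * sigma_s ** 2))
            # Comparing each neighbour with the centre value is the extra
            # per-pixel work that preserves edges.
            similarity = math.exp(-((signal[j] - centre) ** 2)
                                  / (2.0 * sigma_r ** 2))
            weight = spatial * similarity
            num += weight * signal[j]
            den += weight
        out.append(num / den)
    return out


# Illustrative step edge: dark region followed by a bright region.
edge = [10.0] * 8 + [200.0] * 8
boxed = box_filter(edge, radius=2)
bilat = bilateral_filter(edge, radius=2, sigma_s=2.0, sigma_r=20.0)
print("box      :", [round(v, 1) for v in boxed[5:11]])
print("bilateral:", [round(v, 1) for v in bilat[5:11]])
```

The box filter smears the step across several samples, whereas the bilateral filter keeps it sharp at the cost of two exponential evaluations and one intensity comparison per neighbor per pixel.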
Accordingly, there is a need for an enhanced video image processing technique that decreases noise while minimizing motion blur and edge blur, without requiring a complex architecture, large memory, and/or high bandwidth.