In long distance imaging applications, such as long distance surveillance, a captured video can appear blurry, geometrically distorted, and unstable due to camera movement, atmospheric turbulence, or other disturbances.
Typically, atmospheric turbulence is the main cause of the geometric distortion and blur in the captured videos. Long distance surveillance over water or a hot surface is particularly challenging because the refractive index along the imaging path varies greatly and randomly. Lens quality and sensor size usually have less impact on the resolution of long distance imaging than the turbulence does.
Atmospheric turbulence is mainly due to fluctuations in the refractive index of the atmosphere. The refractive index variation of the atmosphere involves many factors, including wind velocity, temperature gradients, and elevation.
Light in a narrow spectral band approaching the atmosphere from a distant light source, such as a star, is well modelled by a plane wave. The planar nature of this wave remains unchanged as long as the wave propagates through free space, which has a uniform index of refraction. The atmosphere, however, contains a multitude of randomly distributed regions of uniform index of refraction, referred to as turbulent eddies. The index of refraction varies from eddy to eddy. As a result, the light wave that travels in the atmosphere from a faraway scene is no longer planar by the time the light wave reaches the camera. FIG. 1 illustrates the effect of Earth's atmosphere on the wavefront of a distant point source. In FIG. 1, after the plane wave passes through a turbulent layer in the atmosphere, its wavefront becomes perturbed. Excursions of this wave from a plane wave are manifested as random aberrations in imaging systems. The general effects of optical aberrations include broadening of the point spread function and lower resolution. Although some blurring effects can be corrected by fixed optics in the design of the lens, the spatially random and temporally varying nature of atmospheric turbulence makes it difficult to correct for.
Traditionally in long distance imaging, multiple frames (typically 10-100 frames) are needed to remove the turbulence effect. For example, Lou et al., “Video Stabilization of Atmospheric Turbulence Distortion,” Inverse Problems and Imaging, vol. 7, no. 3, pp. 839-861, 2013, use a spatial and temporal diffusion method to reduce geometric distortion in each captured frame and, at the same time, stabilize the video across frames. Other methods, such as the bispectrum method of Carrano and Brase, “Adapting High-Resolution Speckle Imaging to Moving Targets and Platforms,” SPIE Defense and Security Symposium, Orlando, April 2004, try to extract the long exposure point spread function (PSF) of the atmospheric turbulence from a large number of frames and apply the PSF to deblur each frame.
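The long-exposure PSF idea can be illustrated with a minimal sketch. This is an illustrative simplification, not the bispectrum method itself: assuming a static scene, temporal averaging of registered frames cancels the turbulence-induced geometric jitter, and the remaining long-exposure blur is removed by Wiener deconvolution with an estimated PSF (here an assumed Gaussian).

```python
import numpy as np

def long_exposure_average(frames):
    """Temporal mean of registered frames; turbulence-induced geometric
    jitter averages out, leaving a blurred but geometrically stable image."""
    return np.mean(np.stack(frames), axis=0)

def gaussian_psf(size, sigma):
    """Assumed Gaussian stand-in for the estimated long-exposure PSF."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    psf = np.exp(-(xx ** 2 + yy ** 2) / (2 * sigma ** 2))
    return psf / psf.sum()

def wiener_deblur(blurred, psf, nsr=1e-2):
    """Frequency-domain Wiener deconvolution.
    nsr is the assumed noise-to-signal ratio (regularization)."""
    # Embed the PSF in a full-size array with its center at the origin
    # (circular convolution convention).
    pad = np.zeros_like(blurred, dtype=np.float64)
    k = psf.shape[0]
    pad[:k, :k] = psf
    pad = np.roll(pad, (-(k // 2), -(k // 2)), axis=(0, 1))
    H = np.fft.fft2(pad)
    G = np.fft.fft2(blurred)
    W = np.conj(H) / (np.abs(H) ** 2 + nsr)
    return np.real(np.fft.ifft2(W * G))
```

In practice the PSF is extracted from the frame stack rather than assumed, and the noise-to-signal ratio controls how aggressively attenuated frequencies are restored.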
In a typical long distance surveillance situation, however, the region of interest is often a moving object, such as a person, a vehicle, or a vessel, and multiple-frame turbulence correction methods, which assume a static scene, blur such moving objects. This defeats the purpose of video surveillance, as details of the moving object are often the goal of surveillance.
One solution is to detect and extract the moving object in the video as foreground, so that the still background and the moving foreground can be processed separately. While this method works reasonably well in many short distance video surveillance applications, it still faces the quasi-periodic disturbance from atmospheric turbulence. In other words, the turbulence effect in the captured frames makes background extraction unreliable. In particular, because of the geometric distortion and blurring caused by turbulence, still background objects such as mountains and roads appear to be moving, which causes many false positive errors in moving object detection.
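The false-positive problem can be seen with a minimal background-subtraction sketch. This uses a simple exponential running-average model as an illustrative baseline (practical systems often use mixture-of-Gaussians models such as OpenCV's MOG2): when turbulence jitters a static edge by even one pixel, the pixels along that edge exceed the difference threshold and are wrongly flagged as moving foreground.

```python
import numpy as np

class RunningAverageBackground:
    """Exponential running-average background model (illustrative baseline)."""

    def __init__(self, first_frame, alpha=0.05, threshold=25.0):
        self.bg = first_frame.astype(np.float64)
        self.alpha = alpha          # background adaptation rate
        self.threshold = threshold  # per-pixel difference threshold

    def apply(self, frame):
        """Return a boolean foreground mask for one frame."""
        frame = frame.astype(np.float64)
        diff = np.abs(frame - self.bg)
        mask = diff > self.threshold
        # Update the model only where the pixel is believed to be background.
        self.bg = np.where(mask, self.bg,
                           (1 - self.alpha) * self.bg + self.alpha * frame)
        return mask
```

A static scene produces an empty mask, but a one-pixel geometric shift of a high-contrast edge, as turbulence routinely causes, lights up the whole edge as spurious foreground.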
Other methods avoid problematic regions in the frame by monitoring certain features, such as SURF (speeded up robust features) features, and choosing regions with dense SURF features to perform rigid frame registration. However, because the geometric distortion caused by atmospheric turbulence is random and local, rigid frame registration does not correct the turbulence effect. Furthermore, SURF features are not consistent across blurred and distorted video frames with turbulence.
Recently, new methods that correct for false positive and false negative errors in moving object detection have been proposed, in which a convolutional neural network (CNN) performs semantic segmentation. That is, regions in each frame are classified semantically as people, car, boat, bicycle, and the like. Because video surveillance applications often have a clearly defined monitoring task, the detected foreground usually belongs to a very limited collection of objects. In this limited application, one can simply decide that if a blob in the frame is not semantically recognized as part of the group of objects of interest, it can be considered background. For example, if vehicles are the main target of a video surveillance task, any objects that are not classified by the CNN as vehicles will be treated as background, even if they were classified as foreground by the base background detection algorithm.
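The semantic post-filter described above can be sketched as follows. The names (`classify_blob`, `VEHICLE_CLASSES`) are hypothetical and for illustration only; `classify_blob` stands in for a trained CNN classifier mapping a detected blob to a semantic label.

```python
# Hypothetical monitored class set for a vehicle-surveillance task.
VEHICLE_CLASSES = {"car", "truck", "bus"}

def filter_foreground(blobs, classify_blob, classes_of_interest=VEHICLE_CLASSES):
    """Keep only foreground blobs whose semantic label is in the monitored
    set; relabel everything else as background, suppressing false positives
    such as turbulence-jittered background structures."""
    foreground, background = [], []
    for blob in blobs:
        if classify_blob(blob) in classes_of_interest:
            foreground.append(blob)
        else:
            background.append(blob)
    return foreground, background
```

This shows the two-fold limitation directly: an unexpected but genuinely interesting object whose label is outside `classes_of_interest` is silently discarded, and the quality of the filter rests entirely on the trained classifier.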
The limitation of the above CNN-based method is two-fold: first, it requires a clearly defined class of objects, and the system has no capacity to monitor unexpected objects; second, it requires a large amount of training data, long training time, and extensive computation power.