1. Field of the Invention
The present invention relates to methods for stabilizing vibrating video images, and more particularly to a method for stabilizing vibrating video images utilized in a Digital Video Recorder (DVR).
2. Description of the Prior Art
Image and video processing has been under development for many years. Image and video processing techniques, including segmentation, tracking, counting, image enhancement, and image registration, are widely adopted in both civilian and military applications. Video is a very important source of information for multimedia applications. However, most video processing techniques operate on already captured, stable images and do not account for the influence of vibrating images. For example, an unstable picture is very uncomfortable for a user to view. Tracking, counting, segmentation, and other video processing techniques are also affected by vibration, which causes errors and/or loss of data. Related algorithms may then make erroneous judgments in later computer vision processing, possibly even leading to a serious crash of the entire image system. Thus, removal of image vibration is very important. It primarily focuses on removing vibration between consecutive frames, converting an unstable image sequence into one that is more stable for human vision.
A Digital Video Recorder System (DVR System) is already widely applied in environmental safety, traffic monitoring, and crime prevention systems. Government agencies, schools, train stations, airports, military installations, roads, and historic sites all employ some form of monitoring recording system. However, digital monitoring recording systems are susceptible to external forces (internal vehicle vibration, vibration from passing vehicles, shaking due to wind, and human vibration), which cause instability in the video image, reduce the video data compression ratio, and interfere with video recognition processing of the DVR System. Thus, maintaining stability in video images is an important issue for the DVR System.
Digital monitoring recorders may be affected by external forces and generate unstable monitoring recording images. Within a system, video image instability leads to calculation errors and increased noise, thereby causing video image blurring, video compression ratio reduction, background determination errors, and object removal segmentation failures, which in turn cause erroneous judgments by the computer vision system. If the monitoring recording system is utilized in a critical monitoring application or a safety monitoring application, e.g. a vehicle monitoring system, a traffic flow monitoring system, a financial monitoring system, disaster detection, risk prevention detection, or a military application, an erroneous judgment may lead to loss of property, or even loss of life.
Not only does maintaining stability in video images aid the digital monitoring recording system, it is also important to many computer vision applications, such as remote image transmission, robot guidance, ego-motion recovery, scene modeling, video compression, and object motion detection. To keep the video image stable, two important factors are considered: frame vibration vector detection, and frame vibration vector correction.
Methods for performing frame vibration vector detection and correction have been proposed. The following are related contributions in the literature:
    [1] Z. Duric and A. Rosenfeld, “Stabilization of image sequences,” University of Maryland, Tech. Rep. CAR-TR-778, July 1995.
    [2] M. Hansen, P. Anadan, K. Dana, G. van de Wal, and P. Burt, “Real-time scene stabilization and mosaic construction,” in Proc. IEEE Computer Vision and Pattern Recognition Conf., pp. 54-62, 1994.
    [3] B. K. P. Horn and B. G. Schunck, “Determining optical flow,” Artificial Intelligence, vol. 17, pp. 185-203, 1981.
    [4] B. D. Lucas and T. Kanade, “An iterative image registration technique with an application to stereo vision,” Proceedings of Imaging Understanding Workshop, pp. 121-129, 1981.
    [5] Huang-Chang Chang, Shang-Hong Lai, and Kuang-Rong Lu, “A robust real-time video stabilization algorithm,” Journal of Visual Communication and Image Representation, vol. 17, no. 3, pp. 659-673, June 2006.
    [6] C. Morimoto and R. Chellappa, “Fast electronic digital image stabilization,” Univ. Maryland, Tech. Rep. of CVL, 1996.
Detection of the frame vibration vector in an image depends primarily on which vibration model is chosen and which vibration mode is utilized to calculate the frame vibration vector. Reference [1] exploits the human eye's greater sensitivity to rotational vibration, and requires no extra information, to compensate for vibration. Reference [2] estimates the frame vibration vector and defines an affine model; however, when the frame vibration vector is too large, the method generates a very large error. References [1] and [3] utilize a singular line (defined as an infinite straight line passing through two singularity points) to detect the frame vibration vector, based on the theory that the singular line is not affected by slight rotation of the image, and assuming that singular lines for outdoor scenes have large gradient variation. However, the validity of this assumption is limited in that the singular line must be longer than the average length of the image. Thus, three conditions may occur that increase the likelihood of error: the singular line is unclear, blurriness makes the singular line impossible to determine, or the singular line obtained is shorter than the average length of the image.
Frame vibration vector correction keeps the necessary pixels of a vibrating image and removes the unnecessary information of the vibrating image. First, a reference image is acquired, and the following images and the reference image are sequenced. Then, corresponding reference points of the images are compared to obtain a frame vibration vector between the images. An image restored based on the frame vibration vector is a stable image. The method is applicable only when the frame vibration vector is not large and the scene sampled by the lens is fixed. Thus, when the vibration model corresponds to repeated, intense vibration or translational vibration, internal calculation variables accumulate and exceed the real frame vibration vectors by too much, so that an erroneous result is generated. Reference [1], in its study of image vibration caused by translational sampling, assumes that a videographer intentionally makes a smooth motion, so that a stable image may be obtained by adjusting a vibration curve in each time interval.
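The correction steps described above (acquire a reference image, compare it with a later image to obtain a frame vibration vector, then restore the image using that vector) can be illustrated with a minimal sketch. It assumes a purely translational vibration, grayscale frames stored as lists of rows, and an exhaustive search that minimizes the mean absolute difference over the overlapping region. The names `estimate_shift`, `compensate`, `max_shift`, and `fill` are hypothetical and are not taken from any of the cited references.

```python
def estimate_shift(ref, cur, max_shift=2):
    """Return (dy, dx) such that cur[y + dy][x + dx] best matches ref[y][x].

    Assumption: the vibration is a pure translation of at most max_shift
    pixels, and max_shift is smaller than the frame height and width.
    """
    h, w = len(ref), len(ref[0])
    best, best_cost = (0, 0), float("inf")
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            total, count = 0, 0
            # Only compare pixels where both frames are defined.
            for y in range(max(0, -dy), min(h, h - dy)):
                for x in range(max(0, -dx), min(w, w - dx)):
                    total += abs(ref[y][x] - cur[y + dy][x + dx])
                    count += 1
            cost = total / count
            if cost < best_cost:
                best, best_cost = (dy, dx), cost
    return best


def compensate(cur, shift, fill=0):
    """Undo the estimated shift; pixels with no source data receive `fill`."""
    dy, dx = shift
    h, w = len(cur), len(cur[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sy, sx = y + dy, x + dx
            if 0 <= sy < h and 0 <= sx < w:
                out[y][x] = cur[sy][sx]
    return out
```

Because the search cost grows with the square of `max_shift`, a sketch of this kind is only practical for small frame vibration vectors, which is consistent with the limitation noted above.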
The result is stable for consecutive images within each time interval. However, the motion speed of each tracked point varies greatly from one time interval to the next. Thus, in order to make video frames appear smooth, the method must insert a delay of a few image frames between time intervals. Reference [3] proposes a moving camera motion model. This model separates the frame vibration vector of an image background from the frame vibration vector of a moving object. The method is very effective when the motion variables of the camera or videographer are known and the camera setup is very stable. If the camera is installed on a moving object, such as a vehicle or aircraft, for filming, the motion variables include: rate of motion of the moving object, characteristic variables of the suspension system of the moving object, relative distance between foreground and background, and environmental gravity. Thus, in practical situations, references [1], [2], and [3] leave a large margin for improvement.
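The idea of adjusting a vibration curve under a smooth-motion assumption, and the frame delay it implies, can be illustrated with a simple centered moving average over the accumulated camera path. This is a substitute technique chosen for illustration only, not the procedure of reference [1]; the names `smooth_path`, `corrections`, and `radius` are hypothetical.

```python
def smooth_path(path, radius=2):
    """Centered moving average of a 1-D camera path (one value per frame).

    Smoothing frame i uses frames up to i + radius, so an online system
    would have to delay its output by `radius` frames.
    """
    n = len(path)
    out = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        out.append(sum(path[lo:hi]) / (hi - lo))
    return out


def corrections(path, radius=2):
    """Per-frame offset that moves each frame from the raw path onto the smoothed path."""
    return [s - p for s, p in zip(smooth_path(path, radius), path)]
```

A larger `radius` gives a smoother path at the cost of a longer delay, which reflects the trade-off described above.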
References [3] and [4] provide an optical flow method for obtaining a motion vector of a characteristic object. Reference [3] utilizes the relationship between the spatial and temporal domains of video, using iterative recursion to obtain a motion vector of a characteristic object. Reference [4] segments the original video into smaller regions, assumes that each smaller region has its own motion vector, and uses weighted least-squares fitting to obtain an optical flow value. For a fixed camera over an extremely short period of time, the optical flow value obtained through the optical flow method is the motion vector. However, the optical flow method has the following weaknesses: (1) it is not suitable for vibrations in an optical zoom environment; (2) it is affected by changes in the light source during dynamic filming; (3) calculations take too long, so the method cannot be applied in real time. Although reference [5] proposes a homogeneous region to reduce the number of calculations, the method is still too slow for real-time systems.
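The least-squares step of the region-based optical flow approach can be sketched as follows. The sketch assumes a single region, uniform weights instead of the weighted fit mentioned above, central-difference spatial gradients, and the brightness constancy constraint Ix*u + Iy*v + It = 0. The function name `lucas_kanade_window` is hypothetical, and the code is an illustration rather than the implementation of reference [4].

```python
def lucas_kanade_window(prev, curr):
    """Estimate one flow vector (u, v) for a whole region by least squares.

    u is the horizontal (column) motion and v the vertical (row) motion.
    Returns None when the 2x2 normal matrix is singular (no usable texture).
    """
    h, w = len(prev), len(prev[0])
    sxx = sxy = syy = sxt = syt = 0.0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            ix = (prev[y][x + 1] - prev[y][x - 1]) / 2.0  # spatial gradient in x
            iy = (prev[y + 1][x] - prev[y - 1][x]) / 2.0  # spatial gradient in y
            it = curr[y][x] - prev[y][x]                  # temporal gradient
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
            sxt += ix * it
            syt += iy * it
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        return None
    # Solve [[sxx, sxy], [sxy, syy]] * [u, v] = -[sxt, syt] by Cramer's rule.
    u = (-syy * sxt + sxy * syt) / det
    v = (sxy * sxt - sxx * syt) / det
    return u, v
```

Because the constraint is a first-order approximation, the estimate is only accurate for small displacements, and every pixel in the region contributes several multiplications, which gives some sense of why dense optical flow is costly for real-time use.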
Because video image stabilization is a pre-processing function among all digital monitoring recording system functions, eliminating video image vibration in real-time is a basic requirement. Research on real-time processing of video vibration is being performed in various fields, including military applications, communications organizations, and even image stabilization for handheld camcorders sold in the marketplace. Reference [2] realizes a system capable of stabilizing video images called a parallel pyramid system (VFE-100), which estimates consecutive vibration variables through a multi-resolution method of gradual approximation. The method primarily utilizes error of moving object momentum corresponding to a set affine motion model to derive a motion variable. If a video image requires compensation, a stored, verified input image is used to compensate the vibrating image, so as to output a stable result. The system is capable of processing at 10 frames/sec (fps) for 128×128 pixel images having a frame vibration vector within ±32 pixels. Reference [6] utilizes the same method to realize a Datacube Max Video 200, also called a PIPE (parallel pipeline image processing machine). An experimental sample thereof utilizes 128×120 pixel images, and achieves an experimental result of 10.5 fps, allowing the method to achieve a stabilization effect for video images having a frame vibration vector within ±35 pixels.
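The multi-resolution method of gradual approximation mentioned above can be illustrated with a coarse-to-fine search. The sketch below is restricted to pure translation, whereas the cited systems fit an affine motion model, and it uses 2x2 averaging to build the pyramid. The names `downsample`, `match_cost`, `coarse_to_fine_shift`, `levels`, and `radius` are hypothetical.

```python
def downsample(img):
    """Halve the resolution by averaging each 2x2 block."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [[(img[2 * y][2 * x] + img[2 * y][2 * x + 1]
              + img[2 * y + 1][2 * x] + img[2 * y + 1][2 * x + 1]) / 4.0
             for x in range(w)] for y in range(h)]


def match_cost(ref, cur, dy, dx):
    """Mean absolute difference between ref[y][x] and cur[y + dy][x + dx] on the overlap."""
    h, w = len(ref), len(ref[0])
    total, count = 0.0, 0
    for y in range(max(0, -dy), min(h, h - dy)):
        for x in range(max(0, -dx), min(w, w - dx)):
            total += abs(ref[y][x] - cur[y + dy][x + dx])
            count += 1
    return total / count if count else float("inf")


def coarse_to_fine_shift(ref, cur, levels=3, radius=1):
    """Estimate (dy, dx) by refining a small local search from coarse to fine."""
    pyramid = [(ref, cur)]
    for _ in range(levels - 1):
        r, c = pyramid[-1]
        pyramid.append((downsample(r), downsample(c)))
    dy = dx = 0
    for r, c in reversed(pyramid):
        dy, dx = 2 * dy, 2 * dx  # carry the coarser estimate to the finer level
        candidates = [(dy + a, dx + b)
                      for a in range(-radius, radius + 1)
                      for b in range(-radius, radius + 1)]
        dy, dx = min(candidates, key=lambda s: match_cost(r, c, s[0], s[1]))
    return dy, dx
```

With `levels=3` and `radius=1`, the search covers shifts of up to ±7 pixels at full resolution while evaluating only 27 candidates, compared with 225 for an exhaustive search over the same range.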
The above methods fail to provide an effective stabilization method that can be applied to real-time video stabilization processing.