The present invention relates to the field of global motion compensation in video imagery. More specifically, in one embodiment the invention provides a method and apparatus for compensating for global motion in a video scene.
Recording a scene with a video camera is known in the art. The camera records many individual images, or frames, taken at regular intervals. Motion in the recorded scene translates to differences in the still images of successive frames, and if the intervals are small enough, displaying the succession of frames will recreate the motion of the recorded scene. The motion in the scene, or the difference between successive frames, is due to either real motion of a physical object in the scene being recorded, or apparent motion caused by camera movement or zoom adjustment. Because apparent motion is caused by camera movements, it generally results in global motion, where the entire scene shifts, as opposed to local motion due to a small part of the scene moving against a steady background.
Often, the amount and direction of the apparent movement are of interest. For example, since apparent motion is caused by camera movements, the unwanted jitter in a recording can be eliminated if the apparent motion is known, by applying an equal and opposite apparent motion to the recorded frames.
Another use for apparent motion information is in image compression, since the amount of data needed to represent a slowly moving or steadily recorded scene is less than the amount needed for a fast-moving scene. For example, a static scene of many frames can be compressed by merely eliminating all but one frame. The recording can be decompressed by copying the one remaining frame to reconstruct all the eliminated frames, since they are all identical. In a moving scene, where the recording comprises frames which differ from one another, compression is more difficult, though not impossible.
Compression of data depends on the complexity of the data. If the data contains duplicate information, the duplicate data can be replaced with a much smaller reference indicating where the duplicated data is to be found. Similarly, if the data, such as image data, contains simple patterns which can be described in less data than the image itself, the data can be compressed. Thus, compression is greater for simpler images. With motion picture image data, high compression is possible by replacing all but an initial full image with difference frames. A difference frame is an image comprising the differences between successive frames. Since the initial frame is not replaced, all the other frames can be recreated by successively adding the difference frames to the initial frame. Of course, if nothing in the view of a camera capturing the motion picture is moving, there are no interframe differences, and the difference frames are all zero. Maximum compression is available with zero differences, since only the initial frame and data indicating the lack of motion need be stored.
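The difference-frame scheme described above can be illustrated with a minimal sketch. It assumes frames are small grayscale images held as lists of rows of integers; the function names `encode_differences`, `decode_differences`, and `count_nonzero` are hypothetical and chosen only for this illustration.

```python
def encode_differences(frames):
    """Keep the initial frame whole and replace every later frame
    by its pixel-wise difference from the preceding frame."""
    initial = frames[0]
    diffs = []
    for prev, cur in zip(frames, frames[1:]):
        diffs.append([[c - p for p, c in zip(prev_row, cur_row)]
                      for prev_row, cur_row in zip(prev, cur)])
    return initial, diffs


def decode_differences(initial, diffs):
    """Recreate every frame by successively adding the difference
    frames to the initial frame."""
    frames = [initial]
    for diff in diffs:
        prev = frames[-1]
        frames.append([[p + d for p, d in zip(prev_row, diff_row)]
                       for prev_row, diff_row in zip(prev, diff)])
    return frames


def count_nonzero(diff):
    """Number of pixels that changed; zero for a static scene."""
    return sum(1 for row in diff for value in row if value != 0)
```

For a static scene every difference frame produced by this sketch is all zero, which is the maximum-compression case; the fewer nonzero entries a difference frame has, the less data is needed to represent it.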
While it is rarely possible to record only static scenes, some scenes can be rendered fairly static by removing the global motion from frame to frame before taking difference frames. Thus, to improve the compressibility of a recording, the global motion is removed from the frame data, and the parameters of the global motion, such as direction and speed, are stored for use during decompression, thereby simplifying the difference frames.
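As a hedged illustration of removing global motion before differencing, the sketch below assumes the global motion between two frames is a pure translation of whole pixels, given as a hypothetical `(dx, dy)` pair, and that pixels uncovered by the shift are filled with zero. Zoom and rotation are not modeled, and all function names are illustrative.

```python
def shift(frame, dx, dy, fill=0):
    """Translate a frame by (dx, dy) pixels; uncovered pixels take `fill`."""
    height, width = len(frame), len(frame[0])
    out = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            src_y, src_x = y - dy, x - dx
            if 0 <= src_y < height and 0 <= src_x < width:
                out[y][x] = frame[src_y][src_x]
    return out


def difference(a, b):
    """Pixel-wise difference b - a."""
    return [[pb - pa for pa, pb in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]


def compensated_difference(prev, cur, motion):
    """Difference frame taken after the global motion has been removed.
    `motion` must be stored alongside the result for decompression."""
    dx, dy = motion
    return difference(shift(prev, dx, dy), cur)


def reconstruct(prev, residual, motion):
    """Decompression: re-apply the stored motion, then add the residual."""
    dx, dy = motion
    predicted = shift(prev, dx, dy)
    return [[p + r for p, r in zip(row_p, row_r)]
            for row_p, row_r in zip(predicted, residual)]
```

When the only change between two frames is a camera shift, the compensated difference is all zero even though the plain difference is not, so only the motion parameters need to be stored for that frame.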
For these reasons, knowledge of camera movement is desirable. One known method for determining camera movement is to digitally process recorded frames and determine the camera movement based on the global motion detected between recorded frames. This requires substantial computing power and will result in inaccurate estimations of camera movement where a large, moving object dominates the scene. Although global motion calculations relative to a moving object might be advantageous for compression, typically, global motion is calculated relative to a stationary background.
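To give a sense of the computing power such frame-based estimation requires, the following sketch uses an exhaustive translation search that minimizes the mean absolute difference over the overlapping region. This particular search is an assumption made for illustration, not necessarily the known method referred to above; the function name and the `radius` parameter are hypothetical.

```python
def estimate_global_shift(prev, cur, radius=2):
    """Return the (dx, dy) translation, within +/- radius pixels, that best
    maps `prev` onto `cur`, judged by mean absolute difference on the overlap."""
    height, width = len(prev), len(prev[0])
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            total = 0
            count = 0
            # Visit only pixels of `cur` whose source pixel lies inside `prev`.
            for y in range(max(0, dy), min(height, height + dy)):
                for x in range(max(0, dx), min(width, width + dx)):
                    total += abs(cur[y][x] - prev[y - dy][x - dx])
                    count += 1
            if count == 0:
                continue
            score = total / count
            if best is None or score < best[0]:
                best = (score, dx, dy)
    return best[1], best[2]
```

Every candidate shift visits nearly every pixel, so the cost grows with the frame area multiplied by the square of the search range. Because the score is taken over the whole frame, a large moving object can also pull the minimum away from the true background motion, which is the inaccuracy noted above.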
Furthermore, the amount of computing power available in a camera is limited by space, electrical power, and the need for real-time processing. Consequently, any means to simplify the determination of camera movement would be a needed improvement.
Many cameras offer global motion compensation in the form of image stabilization. Typically, stabilization of the image is accomplished through gyroscopes which prevent the camera from rotating, or through motion sensor-mirror combinations in which the camera is free to move, but a movable mirror is placed between the lens of the camera and the recording sensor, usually a charge coupled device (CCD). The mirror is moved in the direction opposite to that detected by the motion detector to cancel out global motion.
Such systems require a means for distinguishing undesirable high-frequency motion, such as the hand vibrations of a person holding the camera or the vibrations of an automobile to which the camera is mounted, from desirable low-frequency motion such as panning. Also, with such compensation systems, the correction must be imposed at the time of image capture, or it is lost. If the motion signals are used, they are used to move a mirror placed before a recording sensor such as a CCD, so that the image recorded is one where the motion has been restrained by the action of the compensating mirror.
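The distinction between high-frequency and low-frequency motion can be sketched as follows. This is only an illustration under stated assumptions: the camera position along one axis is assumed to be available as one sample per frame, and a centered moving average stands in for whatever filter a real stabilizer uses. The names `moving_average` and `split_motion` and the `window` parameter are hypothetical.

```python
def moving_average(values, window=5):
    """Centered moving average; the window is truncated at the ends.
    The effective window is 2 * (window // 2) + 1 samples."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def split_motion(positions, window=5):
    """Split a per-frame position track into a low-frequency component
    (treated as intentional panning) and a high-frequency remainder
    (treated as jitter to be cancelled)."""
    pan = moving_average(positions, window)
    jitter = [p - s for p, s in zip(positions, pan)]
    return pan, jitter
```

Under these assumptions only the jitter component would drive the compensating mirror, while the pan component passes through to the recording.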
If the motion signals are not used, the motion information is discarded and is not available for later processing. Thus, sensor-mirror systems can only be used to stabilize a scene. While stabilization is convenient for a typical video hobbyist, stabilization does not address the problem of video compression when low-frequency motion is present in a scene. Furthermore, the automatic correction of a gyroscopic stabilizer or sensor-mirror system reduces the amount of control a user has over the recording, and even where the camera allows a choice between accurate recording and stabilized recording, the choice must be made as the recording is made and is permanent thereafter. Thus, an apparatus is needed to allow for both a stable, but processed, playback and a jittery, but accurate, playback.
In a camera with a sensor-mirror system, the resulting recording is not, by itself, optimized for compression in the way that a recording produced by a digitally compensating camera would be. In a sensor-mirror or gyroscope camera, low-frequency apparent motion must be ignored to allow the user to pan the camera. Because this motion is ignored, it appears in the recorded frames, and the only way to remove the apparent motion before compression is to compute the low-frequency global motion from the frame data.
Because the sensor-mirror and gyroscopic compensation methods alter what is recorded, these methods cannot be used to compensate for camera zoom, since to do so would totally cancel the effect of the zoom.
From the above it is seen that an improved method for global motion compensation in recorded scenes is needed.