1. Field of the Invention
The present invention relates generally to methods and systems for estimating Global Motions (GMs) in a video sequence, and more particularly, to methods and systems for estimating and compensating for GMs in a video sequence through a novel non-iterative motion estimation.
2. Background of the Invention
Utilization of a video camera or a digital still camera (DSC) to record a scene is well known in the art. The scene recorded by the video camera is formed of a video sequence that comprises a number of individual images, or frames, taken at regular intervals. When the intervals are sufficiently small, displaying the successive frames adequately recreates the motion of the recorded scene.
In general, the motion in the video sequence, or the differences between successive frames, is due to movements of an object being recorded or the motion of the camera itself, resulting from adjustments by the user to the camera functionalities, such as zooming, involuntary movements, or jitters. The motions caused by camera movements result in Global Motions (GMs) in the video sequence, meaning the entire scene shifts and moves, as opposed to a local motion, such as a movement by an object being recorded, against a steady background. Some GMs such as jitters are generally unintended and undesired during a recordation process. A number of systems and methods have been proposed to estimate and compensate for GMs.
It is known in the art that GMs in a video sequence are often modeled by parametric transforms of 2D images. The process of estimating the transform parameters from images is known as Global Motion Estimation (GME). GME is an important tool widely used in computer vision, video processing, and other related fields. As an example, for MPEG-4 GME, global motions are described in a parametric form, with models ranging from a simple translational model with two parameters to a general perspective model with eight parameters. Among these models, the model with eight parameters is the most general in MPEG-4 GME. According to this model, the GM between a reference frame and a current frame can be represented by mapping each pixel at coordinates (x, y) to coordinates (x′, y′) calculated by the following equations:
x′ = (m0·x + m1·y + m2) / (m6·x + m7·y + 1),
y′ = (m3·x + m4·y + m5) / (m6·x + m7·y + 1)
GMs can be calculated only by estimating all eight parameters, m0 to m7, for the frames. Many algorithms have been proposed for MPEG-4 GME, both in the pixel domain and in the compressed domain. Most of the algorithms dealing with the perspective model, however, are iterative, because the perspective transform model is nonlinear with respect to the GM parameters. Although acceptable performance can be achieved through the iterative approach, the computational cost may be prohibitive for real-time encoding or for applications with limited computational power, such as those in wireless devices.
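The eight-parameter perspective mapping above can be sketched as a short Python function. This is an illustrative sketch only; the function name and the list-based parameter layout are assumptions for clarity and are not part of the MPEG-4 specification.

```python
def perspective_transform(x, y, m):
    """Map a pixel (x, y) to (x', y') under the eight-parameter
    perspective model m = [m0, m1, ..., m7]."""
    # Shared denominator; equals 1 when m6 = m7 = 0 (affine case).
    denom = m[6] * x + m[7] * y + 1.0
    x_prime = (m[0] * x + m[1] * y + m[2]) / denom
    y_prime = (m[3] * x + m[4] * y + m[5]) / denom
    return x_prime, y_prime

# With m0 = m4 = 1 and m1 = m3 = m6 = m7 = 0, the model reduces to a
# pure translation by (m2, m5): here (10, 20) maps to (12, 19).
print(perspective_transform(10.0, 20.0, [1, 0, 2, 0, 1, -1, 0, 0]))
```

Setting m6 = m7 = 0 recovers the simpler affine sub-model, which is one reason the nonlinearity of the full perspective model (the denominator depending on m6 and m7) forces most estimation algorithms to iterate.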
Furthermore, the conventional GME algorithm is considered the most time-consuming and least cost-effective operation in modern MPEG-4 Advanced Simple Profile (ASP) video coding. As computational cost is the major concern for some applications involving GME, it is desirable to design an algorithm with lower computational complexity.