Estimating the three-dimensional (3D) orientation of a camera in a video sequence within a global frame of reference is a problem that may occur when addressing video stabilization in a virtual 3D environment, as well as in navigation and other applications. This task requires the input of one or more orientation sensors (e.g., gyroscope, accelerometer, and/or compass) that may be attached to the camera to provide 3D orientation in a geographical frame of reference. However, high-frequency noise in the sensor readings may make it difficult to achieve the accurate orientation estimates that are required for visually stable presentation of a video sequence. This may be particularly true when the video is acquired while the camera undergoes high-frequency orientation changes (i.e., jitter). Examples may include video shot from a moving car or while walking. Moreover, sensor quality may be a common problem in such contexts, especially for the low-cost sensors available in consumer-grade and cellphone cameras, which may yield poor accuracy, particularly in dynamic conditions. Typical values for angular root-mean-square (RMS) error may range from 0.5 to more than 2 degrees. Therefore, such sensors may not measure camera jitter accurately, resulting in video sequences that may not show a stable scene when displayed in the context of a 3D environment.
On the other hand, image-based alignment has proven to be somewhat successful for image stabilization, providing accurate frame-to-frame orientation estimates. However, image-based alignment may be prone to drift over time, because frame-to-frame errors and biases accumulate and no absolute orientation reference is available to correct them.
In the drawings, the leftmost digit(s) of a reference number identifies the drawing in which the reference number first appears.