Vertical misalignments occurring in multi-view or stereo images are mainly caused by improperly adjusted cameras and/or by lens distortions. This can happen when the optical axes of the cameras are not exactly parallel, the image sensors are not exactly coplanar, or the lens characteristics deviate from an ideal pin-hole camera.
When working in a very controlled environment, these errors can be avoided by calibrating the camera setup. The simplest solution is to place calibration patterns in front of the cameras and to apply an algorithm that determines the intrinsic and extrinsic camera parameters as well as the lens distortion parameters. Knowing all these parameters, it is possible to compensate for improperly adjusted cameras and lens distortions by a process called rectification. A description of such a rectification process is given in A. Fusiello et al.: “A compact algorithm for rectification of stereo pairs”, Mach. Vis. Appl. Vol. 12 (2000), pp. 16-22.
When something is changed about the camera setup, e.g. when the zoom or even just the focus is changed by the camera operator, or when a non-rigid camera setup is moved from one place to another, the calibration parameters will become invalid. As a consequence it is almost impossible to apply a correct compensation by means of pre-determined camera calibration parameters when shooting real-life footage.
As a further complication, exact rectification is only possible for stereo image pairs. For multi-view images, exact rectification is restricted to objects located on a given plane floating in 3D space in front of the cameras. Although the location of the plane can be chosen freely, objects not located on the given plane can only be approximately rectified.
Research has been done to find methods for estimating camera parameters and lens parameters on-the-fly during shooting.
These approaches are typically based on feature point trackers. It is intuitively clear that it is not always possible to distinguish motion of objects in front of the camera from camera motion or lens modifications. See, for example, M. Pollefeys et al.: “Some Geometric Insight in Self-Calibration and Critical-Motion-Sequences”, Technical Report Nr. KUL/ESAT/PSI/0001, Katholieke Universiteit Leuven, 2000.
Vertical misalignments are a serious problem in stereo or multi-view content. They can be corrected by the brain to some extent, but watching misaligned content over an extended period of time can cause fatigue, eye strain or even nausea. It must therefore be ensured that vertically misaligned content is not delivered to the consumer. Estimating the amount of vertical misalignment should, therefore, be part of the analysis performed when offering a 3D certification service.
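As a purely illustrative sketch of what such an estimate could look like, the following Python fragment summarizes the vertical offsets of feature points that have already been matched between the left and right views. The function name `vertical_misalignment`, the use of the median as a summary statistic, and the sample coordinates are assumptions made for this example and are not taken from the cited works.

```python
import numpy as np

def vertical_misalignment(pts_left, pts_right):
    """Summarize vertical offsets between matched points (hypothetical helper).

    pts_left, pts_right: sequences of (x, y) pixel coordinates, where entry i
    in both sequences is assumed to show the same scene point.
    Returns (median vertical offset, largest absolute vertical offset) in pixels.
    """
    pts_left = np.asarray(pts_left, dtype=float)
    pts_right = np.asarray(pts_right, dtype=float)
    # For a perfectly rectified stereo pair the y coordinates of matching
    # points are equal, so any non-zero difference indicates misalignment.
    dy = pts_right[:, 1] - pts_left[:, 1]
    return float(np.median(dy)), float(np.max(np.abs(dy)))

# Made-up example correspondences (assumed data, for illustration only).
left_pts = [(10, 20), (50, 60), (100, 30)]
right_pts = [(5, 22), (44, 62), (93, 31)]
median_dy, max_abs_dy = vertical_misalignment(left_pts, right_pts)
```

A single global statistic like this can only describe a roughly constant vertical shift; misalignments that vary across the image would need a per-region or per-pixel measure.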
Apart from the above issues, vertical misalignments also cause difficulties for disparity estimators. See, for example, H. Hirschmüller et al.: “Stereo Matching in the Presence of Sub-Pixel Calibration Errors”, IEEE Conf. Comp. Vis. Patt. Recog. (2009), pp. 437-444. Disparity estimators typically rely on the epipolar constraint, which on the one hand reduces the disparity search space (leading to lower computational complexity), and on the other hand constrains the solutions to those that are geometrically sound. The epipolar constraint is typically incorporated by restricting the disparity search to a search along horizontal scan lines, assuming that the horizontal scan lines coincide with the epipolar lines. Any vertical misalignment will cause the epipolar lines to deviate from the horizontal scan lines. As a consequence, searching along horizontal lines will yield incorrect disparity estimates.
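The effect can be illustrated with a minimal sketch, assuming a simple block matcher with a sum-of-absolute-differences (SAD) cost and a synthetic image pair. The function `scanline_disparity`, its parameters, and the test images are hypothetical and serve only to show how a search restricted to the same horizontal scan line behaves once the right image is shifted vertically.

```python
import numpy as np

def scanline_disparity(left, right, y, x, max_disp, half=2, dy=0):
    """Block matching for pixel (y, x) of the left image (illustrative only).

    The right image is searched along row y + dy. dy = 0 corresponds to the
    usual assumption that epipolar lines coincide with horizontal scan lines.
    Returns (best disparity, SAD cost of the best match).
    """
    patch = left[y - half:y + half + 1, x - half:x + half + 1]
    best_d, best_cost = 0, np.inf
    for d in range(max_disp + 1):
        xr = x - d  # candidate column in the right image
        cand = right[y + dy - half:y + dy + half + 1, xr - half:xr + half + 1]
        cost = np.abs(patch - cand).sum()
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d, best_cost

rng = np.random.default_rng(0)
left = rng.random((40, 60))
# Synthetic right view: every pixel has a true horizontal disparity of 5.
right = np.roll(left, -5, axis=1)
# Same view with an additional vertical misalignment of 2 rows.
right_misaligned = np.roll(right, 2, axis=0)

# Aligned pair: the scan-line search finds the true disparity at zero cost.
d_ok, cost_ok = scanline_disparity(left, right, 20, 30, max_disp=10)
# Misaligned pair, same scan line: no candidate matches exactly.
d_bad, cost_bad = scanline_disparity(left, right_misaligned, 20, 30, max_disp=10)
# Misaligned pair, search row offset by the known shift: exact match again.
d_fix, cost_fix = scanline_disparity(left, right_misaligned, 20, 30, max_disp=10, dy=2)
```

In the misaligned case the best cost along the original scan line is strictly positive and the returned disparity is essentially arbitrary, whereas searching the correctly offset row recovers the true disparity.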
Except for live broadcast scenarios, problems caused by improperly positioned cameras or by lens distortions can be corrected in post-production. However, in practice it is often necessary to deal with content before it has been corrected.
Consequently, there is a need for disparity estimation methods that are robust with respect to vertical misalignments, camera miscalibrations and/or lens distortions. There is also a need for a method to determine the amount of misalignment or distortion, especially the vertical component of a distortion field.