Computer vision systems often comprise stereo camera systems for capturing stereo images that are subsequently processed digitally. Such camera systems must be calibrated so that the image data is a correct input for further processing. The purpose of camera calibration is to determine the transformation from the three-dimensional (3D) world coordinates to the two-dimensional (2D) image coordinates. Once this transformation is known, 3D information about the real world can be obtained from the 2D information in the image or images.
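By way of illustration only, the transformation from 3D world coordinates to 2D image coordinates is commonly modeled as a 3x4 projection matrix acting on homogeneous coordinates. The following minimal sketch assumes an ideal pinhole camera without lens distortion; the function name `project` and the matrix `P_example` (focal length 800 pixels, principal point at (320, 240)) are hypothetical values chosen for the example and do not come from any of the cited references.

```python
def project(P, X):
    """Project a 3D world point X = (x, y, z) to 2D image coordinates.

    P is a 3x4 projection matrix given as a list of three 4-element rows.
    Assumes an ideal pinhole model (no lens distortion).
    """
    xh = (X[0], X[1], X[2], 1.0)  # homogeneous world point
    u, v, w = (sum(p * c for p, c in zip(row, xh)) for row in P)
    if w == 0:
        raise ValueError("point has no finite image (w == 0)")
    return (u / w, v / w)  # divide out the homogeneous scale


# Hypothetical camera at the world origin looking along +Z:
# focal length 800 px, principal point (320, 240).
P_example = [
    [800.0, 0.0, 320.0, 0.0],
    [0.0, 800.0, 240.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]

if __name__ == "__main__":
    print(project(P_example, (0.5, -0.25, 2.0)))
```

Calibration, in this picture, amounts to estimating the entries of the projection matrix for each camera, so that image measurements can be related back to the scene.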
In “Camera Self-Calibration: Theory and Experiments”, Proc. Second European Conf. Computer Vision, pp. 321-334, May 1992, O. Faugeras, T. Luong and S. Maybank describe a calibration method that requires only point matches from image sequences. The authors state that it is possible to calibrate a camera just by pointing it at the environment, selecting points of interest and then tracking them in the image as the camera moves. It is not necessary to know the movement of the camera. Their camera calibration is computed in two steps. In the first step, the epipolar transformation is found. In the second step, the epipolar transformation is linked to the image of the absolute conic, using the Kruppa equations.
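The epipolar transformation mentioned above is often expressed through a 3x3 fundamental matrix F, such that a point x in one image and its match x' in the other satisfy x'^T F x = 0. The sketch below only evaluates that constraint for a given F; it does not estimate F from point matches and does not implement the Kruppa equations. The function name `epipolar_residual` and the matrix `F_rectified` (the form F takes, up to scale, for an idealized rectified pair displaced purely along the horizontal image axis) are assumptions made for illustration.

```python
def epipolar_residual(F, x, x_prime):
    """Return x'^T F x for image points x = (u, v) and x' = (u', v').

    F is a 3x3 fundamental matrix given as a list of rows. A value
    near zero means the pair is consistent with the epipolar geometry.
    """
    xh = (x[0], x[1], 1.0)               # homogeneous point, first image
    xph = (x_prime[0], x_prime[1], 1.0)  # homogeneous point, second image
    Fx = [sum(F[r][c] * xh[c] for c in range(3)) for r in range(3)]
    return sum(xph[r] * Fx[r] for r in range(3))


# Illustrative F for an ideal rectified stereo pair (pure horizontal
# displacement): the constraint reduces to v - v' = 0, i.e. matching
# points lie on the same image row.
F_rectified = [
    [0.0, 0.0, 0.0],
    [0.0, 0.0, -1.0],
    [0.0, 1.0, 0.0],
]

if __name__ == "__main__":
    print(epipolar_residual(F_rectified, (100.0, 50.0), (80.0, 50.0)))
    print(epipolar_residual(F_rectified, (100.0, 50.0), (80.0, 53.0)))
```

In a self-calibration setting, residuals of this kind over many tracked points would indicate how well a candidate epipolar transformation explains the observed matches.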
In “Maintaining Stereo Calibration by Tracking Image Points”, 1993 CVPR, New York, June 1993, J. Crowley, P. Bobet and C. Schmid address the problem in active 3D vision of updating the camera calibration matrix as the focus, aperture, zoom or vergence angle of the cameras changes dynamically. The authors present a technique for computing the projection matrix from five and a half points in a scene without matrix inversion. They then present a technique for correcting the projective transformation matrix by tracking reference points. The experiments show that a change of focus can be corrected by an affine transform obtained by tracking three points. For a change in camera vergence, a projective correction based on tracking four image points is slightly more precise than an affine correction. They also show how stereo reconstruction makes it possible to “hop” a reference frame from one object to another. Any set of four non-coplanar points in the scene may define such a reference frame. They show how to keep the reference frame locked onto a set of four points as a stereo head is translated or rotated. These techniques make it possible to reconstruct the shape of an object in its intrinsic coordinates without having to match new observations to a partially reconstructed description.
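To illustrate why three tracked points suffice for an affine correction: a 2D affine transform has six unknowns, and each point correspondence supplies two equations. The following minimal sketch solves for the transform with Cramer's rule. It is a generic construction, not the procedure of the cited paper; the function names `affine_from_three_points` and `apply_affine` and the sample coordinates are hypothetical.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def affine_from_three_points(src, dst):
    """Return the 2x3 affine matrix A with dst_i = A * (src_i, 1).

    src and dst are lists of three (x, y) image points: the reference
    points before and after the change (e.g. a change of focus).
    """
    M = [[x, y, 1.0] for x, y in src]
    d = det3(M)
    if abs(d) < 1e-12:
        raise ValueError("reference points are collinear")
    A = []
    for k in range(2):  # k = 0 solves the x' row, k = 1 the y' row
        rhs = [p[k] for p in dst]
        row = []
        for col in range(3):  # Cramer's rule: swap in rhs per column
            Mc = [r[:] for r in M]
            for i in range(3):
                Mc[i][col] = rhs[i]
            row.append(det3(Mc) / d)
        A.append(row)
    return A


def apply_affine(A, pt):
    """Map an image point through the 2x3 affine matrix A."""
    x, y = pt
    return (A[0][0] * x + A[0][1] * y + A[0][2],
            A[1][0] * x + A[1][1] * y + A[1][2])


if __name__ == "__main__":
    before = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
    after = [(2.0, 3.0), (4.0, 3.0), (2.0, 6.0)]
    A = affine_from_three_points(before, after)
    print(A, apply_affine(A, (1.0, 1.0)))
```

A projective correction has eight degrees of freedom instead of six, which is consistent with the need for a fourth tracked point in the vergence case described above.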
In “Online Stereo Camera Calibration”, Tina Memo No. 1992-002, updated Jun. 9, 2005, N. Thacker and P. Courtney argue that successful implementation of stereo vision in a real-world application will require a self-tuning system. They describe a statistical framework for combining many sources of information to calibrate a stereo camera system, resulting in a calibration system that would allow continual recalibration during normal use of the cameras. The authors demonstrate this with the use of measured 3D objects, accurate robot motion and matched stereo features in a working vision system. More specifically, Thacker and Courtney calibrate a stereo camera system through an iterative estimation procedure for the transformation and the camera parameters, using feature points for equivalence matching.