1. Field of the Invention
The present invention relates to a method of rectifying a stereoscopic image pair, and in particular relates to a method of determining a pair of rectification transformations for rectifying the two captured images making up the image pair so as to substantially eliminate vertical disparity from the rectified image pair. The invention is particularly applicable to rectification of a stereoscopic image pair intended for display on a stereoscopic image display device for direct viewing by an observer. The invention also relates to an apparatus for rectifying a stereoscopic image pair.
2. Description of the Related Art
The principles of stereoscopic displays are well known. To create a stereoscopic display, two images are acquired using a stereoscopic image capture device that provides two image capture devices. One image capture device (known as the “left image capture device”) captures an image corresponding to the image that would be seen by the left eye of an observer, and the other image capture device (known as the “right image capture device”) captures an image corresponding to the image that would be seen by the right eye of an observer. The two images thus acquired are known as a pair of stereoscopic images, or stereoscopic image pair. When the two images are displayed using a suitable stereoscopic display device, a viewer perceives a three-dimensional image. The stereoscopic image capture device may contain two separate image capture devices, for example two cameras. Alternatively, the stereoscopic image capture device may contain a single image capture device that can act as both the left image capture device and the right image capture device. For example, a single image capture device, such as a camera, may be mounted on a slide bar so that it can be translated between a position in which it acts as a left image capture device and a position in which it acts as a right image capture device. As another example, the stereoscopic image capture device may contain a single image capture device and a moving mirror arrangement that allows the image capture device to act either as a left image capture device, or a right image capture device.
One problem with conventional stereoscopic displays is that stereoscopic images can be uncomfortable to view, even on high quality stereoscopic display devices. One cause of discomfort is the presence of vertical disparity within a stereoscopic image pair. Vertical disparity means that the image of an object in one of the stereoscopic images has a different vertical position than the image of the same object in the other stereoscopic image. Vertical disparity arises owing to many kinds of mis-alignment of the camera systems, and causes discomfort to a viewer. Image rectification is a process for eliminating vertical disparity between the two images of a stereoscopic image pair, so making the resultant stereoscopic image more comfortable to view.
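The definition of vertical disparity given above can be illustrated with a minimal Python sketch. The function name and the pixel coordinates are hypothetical and chosen only for illustration; points are assumed to be (x, y) pairs with y measured in pixels.

```python
from statistics import mean


def vertical_disparities(left_pts, right_pts):
    """Return y_left - y_right for each pair of corresponding points.

    Each point is an (x, y) tuple; the two lists are assumed to be in
    corresponding order (hypothetical input format).
    """
    return [yl - yr for (_, yl), (_, yr) in zip(left_pts, right_pts)]


# Illustrative (made-up) correspondences between the left and right images.
left_pts = [(120.0, 80.0), (200.0, 150.0), (310.0, 42.0)]
right_pts = [(110.0, 83.0), (188.0, 150.0), (301.0, 40.0)]

dv = vertical_disparities(left_pts, right_pts)
mean_abs_dv = mean(abs(d) for d in dv)  # one possible summary of the disparity
```

A rectified image pair would ideally give a value of zero for every entry of `dv`.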
The origin of vertical disparity within a stereoscopic image pair will now be explained with reference to a simplified model that uses a camera set up consisting of two pin-hole cameras, one for recording the image that would be seen by the left eye of the observer and the other for recording the image that would be seen by the right eye of an observer. The left pin-hole camera—that is, the pin-hole camera for recording the image that would be seen by the left-eye—consists of a pin-hole 1L and an imaging plane 2L, and the right pin-hole camera—that is, the pin-hole camera for recording the image that would be seen by the right eye—also comprises a pin-hole 1R and an imaging plane 2R.
In the two camera set-up of FIG. 1, the base line 3 is the distance between the pin-hole 1L of the left camera and the pin-hole 1R of the right camera. The optical axis of each camera is the axis that is perpendicular to the imaging plane of the camera and that passes through the pin-hole of the camera. For each camera, the “principal point” is the point 5L, 5R in the imaging plane 2L, 2R of the camera that is nearest to the pin-hole 1L, 1R of the camera. Finally, the effective focal length of each camera is the distance fL, fR between the pin-hole of a camera and the principal point of the camera.
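The pin-hole model described above can be sketched in Python as follows. This is an illustrative sketch only: it assumes that a scene point is expressed in the coordinate frame of the camera, with the Z axis along the optical axis, and that image coordinates are measured in the same units as the focal length. The function name is hypothetical.

```python
def project_pinhole(point, focal_length, principal_point=(0.0, 0.0)):
    """Project a 3-D point (X, Y, Z), given in camera coordinates, onto the
    imaging plane of a pin-hole camera.

    The image position is the principal point plus the perspective offset
    f * X / Z (horizontal) and f * Y / Z (vertical).
    """
    X, Y, Z = point
    if Z <= 0:
        raise ValueError("point must lie in front of the pin-hole (Z > 0)")
    cx, cy = principal_point
    return (focal_length * X / Z + cx, focal_length * Y / Z + cy)


# A point 4 units in front of a camera of focal length 2.
image_point = project_pinhole((1.0, 2.0, 4.0), focal_length=2.0)
```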
FIGS. 2(a) and 2(b) illustrate an ideal stereoscopic recording set up. In an ideal set up, the left and right cameras are identical so that, inter alia, the focal length of the left camera is identical to the focal length of the right camera and the principal point of the left camera is identical to the principal point of the right camera. Furthermore, in an ideal camera set up the optical axes of the left and right cameras are parallel, and are also perpendicular to the base line. For brevity, a camera set up such as shown in FIG. 2(a) or 2(b) will be referred to as a “parallel camera set up”.
If a stereoscopic image pair is captured with two identical cameras, or other recording devices, arranged precisely in a parallel camera set up, vertical disparity will not occur between the two images of the stereoscopic image pair. However, vertical disparity is introduced into the stereoscopic image pair when the image pair is captured with a non-ideal camera set up. In practice, a typical low-cost stereoscopic camera system is only an approximation to a parallel camera set up. The two cameras in a typical low-cost stereoscopic camera system will in practice have unmatched focal lengths and unmatched principal points, even if the two cameras are nominally identical. Furthermore, the optical axes of the two cameras are likely not to be exactly orthogonal to the base line, and are likely not to be parallel to one another. Such a typical stereoscopic camera system is illustrated in FIG. 2(c). Stereoscopic images captured using a camera set up having the defects shown in FIG. 2(c) will contain vertical disparity.
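The effect of a non-ideal camera set up can be illustrated numerically. The following Python sketch assumes two pin-hole cameras placed symmetrically on the base line, and models only two of the possible defects: unmatched focal lengths, and a small rotation (roll) of the right camera about its optical axis. All names and numerical values are hypothetical.

```python
import math


def project(point, f):
    """Pin-hole projection of a point given in camera coordinates."""
    X, Y, Z = point
    return (f * X / Z, f * Y / Z)


def to_camera(point, centre_x, roll=0.0):
    """Express a world point in the frame of a camera whose pin-hole is at
    (centre_x, 0, 0) and which is rotated by `roll` radians about its
    optical axis (the Z axis)."""
    X, Y, Z = point
    x, y = X - centre_x, Y
    c, s = math.cos(roll), math.sin(roll)
    return (c * x + s * y, -s * x + c * y, Z)


def vertical_disparity(point, baseline, f_left, f_right, roll_right=0.0):
    """Vertical position in the left image minus that in the right image."""
    _, y_left = project(to_camera(point, -baseline / 2), f_left)
    _, y_right = project(to_camera(point, +baseline / 2, roll_right), f_right)
    return y_left - y_right


P = (0.3, 0.2, 2.0)  # an arbitrary scene point
ideal = vertical_disparity(P, 0.065, 1.0, 1.0)        # parallel set up
mismatched = vertical_disparity(P, 0.065, 1.0, 1.05)  # unmatched focal lengths
rolled = vertical_disparity(P, 0.065, 1.0, 1.0, math.radians(1.0))
```

Under these assumptions the parallel set up gives zero vertical disparity, whereas either defect gives a non-zero value.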
The focal length and principal point are sometimes called the “intrinsic” camera parameters, since these parameters relate to a single camera. The rotation and translation are referred to as “extrinsic” camera parameters, since they relate to the way in which one camera of a stereo camera set up is aligned relative to the other camera.
It is known to process stereoscopic images captured using a non-parallel camera set up, in order to reduce vertical disparity. This process is known as “rectification”. If the rectification process is completely effective, vertical disparity will be eliminated—and a high quality stereoscopic display can be obtained even though the original images were captured using a non-parallel camera alignment. The rectification process can be thought of as a process for virtually aligning the two cameras, since the rectified images correspond to images that would have been acquired using a parallel camera set-up (assuming that the rectification process was carried out correctly).
FIG. 3(a) is a block flow diagram of a prior art rectification process. At step 11 a stereoscopic image pair is captured, and a correspondence detection step is then carried out at step 12 to detect pairs of corresponding points in the two images (that is, each pair consists of a point in one image and a corresponding point in the other image). If there is vertical disparity between the two images, this will become apparent during the correspondence detection step.
At step 13 details of the rectification procedure required to eliminate the vertical disparity between the two stereoscopic images are determined, from the results of the correspondence detection step. At step 14 a pair of rectifying transformations is determined, one transformation for rectifying the left image and one transformation for rectifying the right image. At step 15, the left and right images are operated on by the rectifying transformation determined for that image at step 14; this is generally known as the “warping step”, since the left and right images are warped by the rectifying transformations. The result of step 15 is to produce a rectified image pair at step 16. If the rectifying transformations have been chosen correctly, the rectified image pair should contain no vertical disparity. Finally, the rectified images can be displayed on a stereoscopic imaging device at step 17.
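The warping step can be sketched for the case in which each rectifying transformation is represented as a 3x3 projective transformation (homography) acting on pixel coordinates. That representation, the function name and the example matrices are assumptions made for illustration. The sketch warps individual points, whereas a complete implementation would resample every pixel of each image.

```python
def apply_homography(H, point):
    """Map an image point (x, y) through the 3x3 matrix H, given as a list
    of three rows, using homogeneous coordinates."""
    x, y = point
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (xh / w, yh / w)


# Hypothetical pair of rectifying transformations: the left image is left
# unchanged and the right image is shifted vertically by 3 pixels.
H_left = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
H_right = [[1.0, 0.0, 0.0], [0.0, 1.0, -3.0], [0.0, 0.0, 1.0]]

left_point = apply_homography(H_left, (120.0, 80.0))
right_point = apply_homography(H_right, (110.0, 83.0))
residual = left_point[1] - right_point[1]  # vertical disparity after warping
```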
The rectifying transformations determined at step 14 will depend on the geometry of the camera set up. Once suitable rectifying transformations have been determined from one captured image pair, therefore, it is not necessary to repeat steps 12, 13 and 14 for subsequent image pairs acquired using the same camera set-up. Instead, a subsequent captured image pair acquired using the same camera set-up can be directly warped at step 15 using the rectifying transformations determined earlier.
Apart from the elimination of vertical disparity within a stereoscopic image pair, rectification is also used in the prior art to simplify subsequent stereoscopic analysis. In particular, the stereoscopic matching or correspondence problem is simplified from a two-dimensional search to a one-dimensional search. The rectifying transformations for the left and right images are chosen such that corresponding image features can be matched after rectification.
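The reduction to a one-dimensional search can be illustrated as follows. Once the image pair has been rectified, the point corresponding to a feature in one image lies on the same row of the other image, so only horizontal offsets need to be tested. The sketch below uses a sum of absolute differences as the matching cost; this cost, the function name and the pixel values are assumptions for illustration and are not taken from any of the cited methods.

```python
def match_along_row(left_row, right_row, x_left, half_window, max_disparity):
    """Return the horizontal disparity d >= 0 for which the window centred
    at x_left in the left row best matches the window centred at
    x_left - d in the right row (lowest sum of absolute differences)."""
    best_d, best_cost = 0, float("inf")
    for d in range(max_disparity + 1):
        x_right = x_left - d
        if x_right - half_window < 0:
            break  # the window would fall off the edge of the right row
        cost = sum(
            abs(left_row[x_left + k] - right_row[x_right + k])
            for k in range(-half_window, half_window + 1)
        )
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d


# One row of each rectified image (made-up intensities); the feature
# (9, 5, 7) appears two pixels further to the right in the left image.
left_row = [0, 0, 0, 0, 9, 5, 7, 0, 0, 0]
right_row = [0, 0, 9, 5, 7, 0, 0, 0, 0, 0]

disparity = match_along_row(left_row, right_row, x_left=5,
                            half_window=1, max_disparity=4)
```

Without rectification the same search would have to cover vertical offsets as well.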
Prior art rectification techniques of the type shown generically in FIG. 3(a) fall into two main types. The first type of rectification process requires knowledge of the “camera parameters” of the camera setup. The camera parameters include, for example, the focal lengths of the two cameras, the base line, the principal point of each camera and the angle that the optical axis of each camera makes with the base line. Knowledge of the camera parameters is used to estimate appropriate rectifying transformations. FIG. 3(b) is a block flow diagram for such a prior art rectification process. It will be seen that the method of FIG. 3(b) differs from that of FIG. 3(a) in that knowledge of the camera parameters is used at step 13 to estimate the rectifying transformations.
Prior art rectification methods of the type shown schematically in FIG. 3(b) are disclosed by, for example, N. Ayache et al in “Rectification of images for binocular and trinocular stereovision” in “International Conference of Pattern Recognition” pp11–16 (1998), by P. Courtney et al in “A Hardware Architecture for Image Rectification and Ground Plane Obstacle Detection” in “International Conference on Pattern Recognition”, pp23–26 (1992), by S. Kang et al in “An Active Multibaseline Stereo System with Real-Time Image Acquisition” Tech. Rep. CMU-CS-94-167, School of Computer Science, Carnegie Mellon University (1994), and by A. Fusiello et al in “Rectification with Unconstrained Stereogeometry” in “Proceedings of British Machine Vision Conference” pp400–409 (1997).
Prior art rectification methods of the type shown schematically in FIG. 3(b) have the disadvantage that they are only as reliable as the camera parameters used to estimate the rectifying transformations. In principle, if the exact camera parameters are used to estimate the rectifying transformations, then the vertical disparity can be completely eliminated. In practice, however, the camera parameters will not be known exactly and, in this case, the rectifying transformations will be chosen incorrectly. As a result, the rectified image pair will still contain vertical disparity.
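The sensitivity to errors in the camera parameters can be illustrated with a deliberately simplified Python sketch. It assumes that the only defect of the camera set up is a mismatch of focal lengths, and that rectification consists of rescaling the right image by the ratio of the left focal length to the assumed right focal length. This single-parameter model and all names are assumptions for illustration and do not reproduce any of the cited methods.

```python
def residual_vertical_disparity(Y, Z, f_left, f_right_true, f_right_assumed):
    """Vertical disparity remaining after the right image is rescaled using
    an assumed (possibly inaccurate) right focal length."""
    y_left = f_left * Y / Z
    y_right = f_right_true * Y / Z
    y_right_rectified = y_right * (f_left / f_right_assumed)
    return y_left - y_right_rectified


# Scene point at height 0.2 and depth 2.0; true right focal length 1.05.
exact = residual_vertical_disparity(0.2, 2.0, 1.0, 1.05, 1.05)
inexact = residual_vertical_disparity(0.2, 2.0, 1.0, 1.05, 1.02)
```

With the exact parameter the residual is zero to within rounding error, whereas the inexact parameter leaves a residual vertical disparity.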
An alternative prior art rectification method is illustrated schematically in FIG. 3(c). This method does not use the camera parameters to determine the appropriate rectifying transformations. Rectification that does not involve use of camera parameters is sometimes referred to as “projective rectification”.
In projective rectification, there are degrees of freedom in the choice of the rectifying transformations. Most prior art methods of projective rectification use some heuristics to eliminate these degrees of freedom so as to eliminate all but one pair of rectifying transformations; the one remaining pair of rectifying transformations are then used to rectify the left and right images. The heuristic minimises image distortion, as measured in some way, in the rectified image pair. This prior art method has the feature that the pair of rectifying transformations that is determined does not necessarily correspond to virtually aligning the cameras to give a parallel camera set up. Where the rectified image pair produced by the rectification process is intended for stereoscopic analysis such as stereoscopic correspondence, it is not necessary for the rectifying transformation to correspond to a virtual alignment that gives a parallel camera set-up. However, where the rectified stereoscopic image pair is to be viewed on a stereoscopic imaging device, it is desirable that the rectifying transformation does correspond to a virtual alignment that gives a parallel camera set-up since, if the rectifying transformation does not correspond to a virtual alignment that gives a parallel camera set-up, the perceived three-dimensional image could appear distorted from what would have been observed using a parallel camera set up. For example a rectifying transformation that transforms straight lines in a captured image into curved lines in the rectified image does not correspond to a virtual alignment that gives a parallel camera set-up.
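One source of the degrees of freedom mentioned above can be illustrated with a short Python sketch. If a pair of transformations rectifies an image pair, then following each of them with a further transformation that leaves the vertical coordinate of every point unchanged (for example a horizontal scaling, shear and shift) gives another pair that also yields no vertical disparity. The points, matrix entries and function names below are hypothetical, and the sketch shows only this one family of alternatives.

```python
def apply_homography(H, point):
    """Map an image point (x, y) through the 3x3 matrix H."""
    x, y = point
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (xh / w, yh / w)


def row_preserving(a, b, c):
    """Transformation x' = a*x + b*y + c, y' = y (rows map to themselves)."""
    return [[a, b, c], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]


# Corresponding points of an already rectified pair (made-up values).
rectified_left = [(120.0, 80.0), (200.0, 150.0)]
rectified_right = [(110.0, 80.0), (188.0, 150.0)]

# Two different row-preserving transformations, one per image.
A = row_preserving(1.2, 0.3, -5.0)
B = row_preserving(0.9, -0.1, 4.0)
new_left = [apply_homography(A, p) for p in rectified_left]
new_right = [apply_homography(B, p) for p in rectified_right]

residual = [pl[1] - pr[1] for pl, pr in zip(new_left, new_right)]
```

The horizontal positions, and hence the perceived three-dimensional image, differ between such alternatives even though each of them has zero vertical disparity.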
U.S. Pat. No. 6,011,863 discloses a method of the general type shown in FIG. 3(c) in which an original captured image is projected onto a non-planar surface, so that straight lines in the captured image are transformed to curved lines in the rectified image. As noted above, this transformation does not correspond to a parallel camera alignment.
D. Papadimitriou et al disclose, in “Epipolar Line Estimation and Rectification for Stereoimage Pairs”, “IEEE Transactions on Image Processing”, Vol. 5, pp672–676 (1996), a rectification method in which the camera rotation is restricted to be about a particular axis only. With such a restricted camera geometry, all the camera intrinsic and extrinsic parameters can be estimated from the correspondence detection. The rectifying transformations can then be determined from the camera parameters. This method is limited to one specific camera geometry.
R. Hartley et al disclose, in “Computing matched-epipolar projections” in “Conference on Computer Vision and Pattern Recognition” pp549–555 (1993), a rectification method using the heuristic that (i) the rectifying transformation for one of the images is a rigid transformation at a specific point (typically the centre of the image) and (ii) the horizontal disparity is minimised. Similar heuristics are used in methods disclosed by R. Hartley in “Theory and Practice of Projective Rectification” in “International Journal of Computer Vision” (1998) and by F. Isgro et al in “Projective Rectification Without Epipolar Geometry” in “Conference on Computer Vision and Pattern Recognition” pp94–99 (1999).
These methods have the disadvantage that the rectifying transformations do not necessarily correspond to a virtual alignment to a parallel camera set-up.
C. Loop et al disclose, in “Computing Rectifying Homographies for Stereo Vision”, Tech. Rep. MSR-TR-99-21, Microsoft Research (1999), a rectifying method that uses a heuristic that maintains the aspect ratio and perpendicularity of two lines formed by the mid points of the image boundaries. The rectifying transformations determined by this method again do not necessarily correspond to a virtual alignment to a parallel camera set-up.
Japanese patent Nos. 2058993 and 7050856 describe correcting a stereoscopic video signal to compensate for differences in brightness or colour balance between the left eye video signal and the right eye video signal. These documents do not relate to correcting for vertical disparity between the left eye image and the right eye image.
U.S. Pat. No. 6,191,809 describes correcting for optical misalignment of the two images of a stereoscopic image pair (for example produced by a stereo electronic endoscope). The citation discloses processing the image data electronically by digitising the two images, and digitally rectifying the images by means of a vertical image shift and/or image size change and/or image rotation in order to correct for any mis-alignment between the two images. However, no details of the rectifying transformations are given.
EP-A-1 100 048, which was published after the priority date of this application, describes a method of processing an image pair that includes an image rectification step. However, no details of the image rectification step are given.