The display of 3D imagery can be created from a number of views of a camera subject taken from different angles. Often the captured images are from a limited number of camera positions, so it may be necessary to interpolate between the captured images to give the impression of true 3D across the range of viewing angles.
Traditional view interpolation uses a single pair of cameras with a short baseline. One common application for view interpolation is in gaze control for tele-conferencing systems with a camera setup such as that illustrated in FIG. 2a. FIGS. 15(a), 15(b) and 15(c) show the interpolated results of such a camera setup. FIG. 15(a) is the view taken from a right webcam and FIG. 15(c) is the view taken from a left webcam. FIG. 15(b) is the interpolated view.
In view interpolation applications, a rectification process is normally required to make use of the epipolar constraint in the stereo matching process. The rectification can be done based on a one-time calibration when the positions of the cameras are fixed, or, when calibration is not feasible, with an image processing method (e.g. feature point detection and matching) such as that described in R. I. Hartley, “Theory and Practice of Projective Rectification”, International Journal of Computer Vision 35: 115-127, 1999. The purpose of rectification is to transform the two input images onto a common image plane where the matching is constrained to the same line in the two input images, as illustrated in FIG. 14.
Matching and interpolation are then performed in this common image plane after the rectification process, and the result can be directly shown as the output. No post processing may be necessary for a setup such as that shown in FIG. 2a.
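By way of illustration only, the line-constrained matching that rectification makes possible can be sketched as follows. The sketch assumes two already rectified rows of grayscale pixel values and uses a sum-of-absolute-differences cost over a small window; the function name `scanline_disparity`, its parameters, and the choice of cost are assumptions made for this example and are not taken from the cited references.

```python
import numpy as np

def scanline_disparity(left_row, right_row, x, half_window=2, max_disparity=8):
    """Search along a single rectified row for the best match.

    Returns the disparity d >= 0 that minimizes the sum of absolute
    differences (SAD) between the window centered at x in the left row
    and the window centered at x - d in the right row.
    Assumes x is at least half_window pixels away from both row ends.
    """
    lo, hi = x - half_window, x + half_window + 1
    patch = left_row[lo:hi].astype(np.float64)
    best_d, best_cost = 0, np.inf
    for d in range(max_disparity + 1):
        if lo - d < 0:
            break  # the shifted window would leave the row
        candidate = right_row[lo - d:hi - d].astype(np.float64)
        cost = np.abs(patch - candidate).sum()
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d

# Hypothetical test data: the right row is the left row shifted by 3 pixels.
left = (np.arange(40) ** 2) % 97
right = np.roll(left, -3)
found = scanline_disparity(left, right, x=20)
```

Because both images lie on a common image plane after rectification, the search is one-dimensional: only positions on the same row are compared, rather than a two-dimensional neighborhood.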
View interpolation can also be applied to the output from a camera array with more than two cameras to cover larger viewing zones. FIG. 2b shows two examples of camera arrays with more than two cameras. In these two setups, all the cameras are parallel and the image planes can be considered to coincide with each other. Thus post processing may not be necessary. If however the image planes of all the cameras do not coincide, then post processing may become necessary.
In setups with more than two cameras and where the image planes do not coincide, a problem arises when the interpolated views move across the boundary of each pair, as is illustrated in FIG. 3. This is because rectification can only be done between a pair of cameras. For such cases, post processing is needed to create a continuous viewing effect for the full set of interpolated results from different camera pairs.
Without post processing, pair-to-pair jumping effects exist at the boundaries illustrated in FIG. 3. FIG. 10 shows the abrupt change from one image to the next at the boundary. When viewed on a 3D display, this discontinuity in the interpolated results may cause unnatural abrupt changes when the viewer is moving around the display.
Another unnatural effect of the interpolated results may be an incorrect keystone when viewed around the RayModeller image display system 170 that is illustrated in FIG. 1. Again, this arises because all the interpolated view images are on the same rectified image plane, so the keystone is incorrect for different viewing angles.
In Dirk Farin, Yannick Morvan, Peter H. N. de With, “View Interpolation Along a Chain of Weakly Calibrated Cameras”, IEEE Workshop on Content Generation and Coding for 3D-Television, June 2006, Eindhoven, Netherlands, a post processing step called “un-rectification” is proposed. The idea is to undo the “rectification” for each interpolated view to generate a physically valid viewing effect. At the borders of camera pairs, the “un-rectified” results coincide with the original camera images, so the discontinuity problem is solved.
Transformations between the original images and the rectified images for the extreme views are known from the image-rectification process. However, the transformations for the interpolated results lying between these extreme views are not available. Hence, interpolation between the two transformations for the two extreme views may be performed in order to obtain a visually sensible motion. Simple interpolation of the transformation matrices H(i)b, H(i+1)a may lead to unnatural or even invalid transformations (mapping part of the image to infinity). As an alternative approach, the motion of the four corners of the rectified images may be used as a reference. The positions of these four corners are linearly interpolated, and the intermediate transformation Hi(i+1)(v) is determined as the transformation that maps these four corners to the screen corners.
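By way of illustration only, the four-corner approach may be sketched as follows. The sketch assumes that the corner positions for the two extreme views are already known and estimates the intermediate transformation with a standard direct linear transform (DLT) from four point pairs. The function names, the parameter v in the range 0 to 1, and the corner coordinates are hypothetical and are not taken from the cited reference.

```python
import numpy as np

def homography_from_points(src, dst):
    """Estimate the 3x3 matrix H with dst ~ H @ src from four point pairs (DLT)."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector for the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=np.float64))
    h = vt[-1].reshape(3, 3)
    return h / h[2, 2]  # assumes h[2, 2] is nonzero for this configuration

def apply_homography(h, pts):
    """Map 2D points through H, including the perspective divide."""
    pts = np.asarray(pts, dtype=np.float64)
    homog = np.hstack([pts, np.ones((len(pts), 1))]) @ h.T
    return homog[:, :2] / homog[:, 2:3]

def interpolated_transform(corners_a, corners_b, screen_corners, v):
    """Linearly interpolate the four corners, then map them to the screen corners."""
    corners_a = np.asarray(corners_a, dtype=np.float64)
    corners_b = np.asarray(corners_b, dtype=np.float64)
    corners_v = (1.0 - v) * corners_a + v * corners_b
    return homography_from_points(corners_v, screen_corners), corners_v

# Hypothetical corner positions for the two extreme views and the screen.
corners_a = [(0, 0), (100, 10), (100, 90), (0, 100)]
corners_b = [(0, 10), (100, 0), (100, 100), (0, 90)]
screen = [(0, 0), (100, 0), (100, 100), (0, 100)]

h_mid, corners_mid = interpolated_transform(corners_a, corners_b, screen, v=0.5)
mapped_mid = apply_homography(h_mid, corners_mid)
```

Interpolating corner positions rather than matrix entries avoids one source of invalid results: a homography is defined only up to scale, so two matrices representing nearly the same mapping may differ in sign, and an entry-wise blend of such matrices can be close to singular.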
Although this alternative method can generate visually continuous results, it may not be accurate and it may be complex. The assumption that the four corners move linearly may not hold in general, and the error may become visible when the change in viewing position and direction between the two cameras is large. Also, the algorithm may be complex, as user input of scene corners is needed and an additional estimation step based on the four corner positions is needed.