The perception of depth in a stereoscopic 3D movie is created by viewing left and right images in which objects at different depths are displayed with different horizontal offsets. This offset is called image disparity or pixel parallax. The perceived depth of an object is determined by its parallax angle, which is the angular difference of a 3D point's projection onto the left and right retinas. The parallax angle of an object viewed in a stereoscopic image sequence is determined by the image disparity, display screen size, the location of the viewer, and the characteristics of the viewer's visual system, in particular the viewer's inter-ocular distance. The principles governing the relationship between the depth perceived by a viewer, image disparity, and the other relevant parameters are well known in the art, and are described, for example, in “3D Movie Making” by Bernard Mendiburu, published by Focal Press of Burlington, Mass., which is wholly incorporated herein by reference.
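The dependence of perceived depth on image disparity, screen size, viewer location, and inter-ocular distance can be illustrated with a minimal sketch. It assumes a simplified geometry that the text above does not spell out: a viewer centered in front of a flat screen, a fused point on the viewer's median plane, and a default inter-ocular distance of 0.065 m. The function names are hypothetical, and the parallax angle is taken here as the difference between the vergence angle at the screen plane and the vergence angle at the fused point, which is one common simplification rather than a definitive model.

```python
import math

def screen_parallax_m(disparity_px, image_width_px, screen_width_m):
    """Convert an image disparity in pixels to a physical offset on the screen."""
    return disparity_px * screen_width_m / image_width_px

def perceived_depth_m(parallax_m, viewing_distance_m, interocular_m=0.065):
    """Distance from the viewer at which the two lines of sight intersect.

    Positive parallax places the point beyond the screen, negative parallax
    in front of it. Parallax equal to or larger than the inter-ocular
    distance is treated as a point at infinity (an assumption of this sketch).
    """
    if parallax_m >= interocular_m:
        return math.inf
    return viewing_distance_m * interocular_m / (interocular_m - parallax_m)

def parallax_angle_deg(parallax_m, viewing_distance_m, interocular_m=0.065):
    """Vergence angle at the screen minus vergence angle at the fused point."""
    depth = perceived_depth_m(parallax_m, viewing_distance_m, interocular_m)
    screen_vergence = 2.0 * math.atan(interocular_m / (2.0 * viewing_distance_m))
    if math.isinf(depth):
        point_vergence = 0.0  # parallel lines of sight
    else:
        point_vergence = 2.0 * math.atan(interocular_m / (2.0 * depth))
    return math.degrees(screen_vergence - point_vergence)
```

Under these assumptions, the same pixel disparity yields a larger physical parallax on a larger screen, so the perceived depth of a shot changes when it is moved from a small display to a theater screen.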
During the content authoring process, image disparities may be computed by an algorithm which operates on left-right stereo image pairs. Disparity is represented as a signed quantity, which by convention is zero for objects on the convergence surface, positive for objects beyond it, and negative for objects in front of it. Disparity values in raw footage are determined by the 3D location of an object, the configuration of the cameras, and the camera parameters, including convergence angle, focal length, inter-axial distance, sensor size, and image resolution.
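As an illustration of how the camera parameters listed above combine, the following sketch computes the disparity of a point at a given depth. It assumes an idealized rig that the text does not prescribe: two parallel pinhole cameras whose images are shifted horizontally so that points at the convergence distance have zero disparity (rather than toed-in cameras). The function name and parameter names are hypothetical.

```python
def disparity_px(depth_m, convergence_m, interaxial_m,
                 focal_length_mm, sensor_width_mm, image_width_px):
    """Signed disparity (right-image x minus left-image x) in pixels.

    Assumes parallel pinhole cameras with horizontal image translation,
    so that a point at convergence_m has zero disparity.
    """
    # Focal length expressed in pixels, from sensor size and image resolution.
    focal_px = focal_length_mm / sensor_width_mm * image_width_px
    # Zero at the convergence distance, positive beyond it, negative in front.
    return focal_px * interaxial_m * (1.0 / convergence_m - 1.0 / depth_m)
```

With this model, the result follows the sign convention described above: a point at the convergence distance gives zero, a more distant point gives a positive value, and a nearer point gives a negative value.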
A key aspect of disparity computations is finding the same point in the left image and in the right image, known as solving the correspondence problem. Basic algorithms use correlation to locate similar looking patches in an image pair. In the general case, corresponding image points in the left and right images have both a horizontal and a vertical offset. Determination of the disparities for an image yields a disparity map that is a 2D vector field. In fact, two sets of disparity maps can be computed: one for left-to-right comparisons, and one for right-to-left comparisons, which together yield a map having four values per pixel.
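A correlation-based search of the kind described above can be sketched as follows. This is a minimal illustration, not a production matcher: it uses normalized cross-correlation over a square patch and an exhaustive search over both horizontal and vertical offsets. Images are assumed to be lists of rows of grayscale values, and the names `match_patch`, `half`, and `search` are hypothetical.

```python
import math

def patch(img, y, x, half):
    """Flatten the (2*half+1) x (2*half+1) patch centered at (y, x)."""
    return [v for row in img[y - half:y + half + 1]
            for v in row[x - half:x + half + 1]]

def ncc(a, b):
    """Normalized cross-correlation of two equal-length value lists."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    denom = math.sqrt(sum(v * v for v in da) * sum(v * v for v in db))
    return sum(p * q for p, q in zip(da, db)) / denom if denom > 0 else 0.0

def match_patch(left, right, y, x, half=2, search=4):
    """Find the offset (dx, dy) such that the patch around right[y+dy][x+dx]
    best correlates with the patch around left[y][x].

    The caller must choose (y, x) at least `half` pixels from the border
    of the left image. Returns ((dx, dy), score).
    """
    ref = patch(left, y, x, half)
    height, width = len(right), len(right[0])
    best_score, best_offset = -2.0, (0, 0)  # NCC is never below -1
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            yy, xx = y + dy, x + dx
            # Skip candidates whose patch would fall outside the right image.
            if yy - half < 0 or xx - half < 0 or yy + half >= height or xx + half >= width:
                continue
            score = ncc(ref, patch(right, yy, xx, half))
            if score > best_score:
                best_score, best_offset = score, (dx, dy)
    return best_offset, best_score
```

Running `match_patch` once with the left image as reference and once with the roles of the two images swapped gives the left-to-right and right-to-left offsets for a pixel, which corresponds to the four values per pixel mentioned above.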
Many stereo algorithms reduce the disparity map to one value per pixel. In one approach, this is achieved by first rectifying the two images such that their scan lines align with epipolar lines, then by searching along each scan line to calculate a one-dimensional disparity value for each pixel. In another approach, when the two images are close to rectified, the vertical component of the disparity can be computed and ignored, since in that case the horizontal component alone may be a good measure of disparity.
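For images that are already rectified, the per-scan-line search reduces to a one-dimensional problem. The sketch below illustrates this under stated assumptions: both rows have the same length, the matching cost is a sum of absolute differences over a small window, and disparity is defined as the x position in the right row minus the x position in the left row. The function names and the `half` and `max_disp` parameters are hypothetical.

```python
def window_sad(left_row, right_row, x_left, x_right, half):
    """Sum of absolute differences between two windows of width 2*half+1."""
    return sum(abs(left_row[x_left + k] - right_row[x_right + k])
               for k in range(-half, half + 1))

def scanline_disparity(left_row, right_row, half=2, max_disp=8):
    """One disparity value per pixel of a rectified scan line.

    Returns a list the same length as left_row. Pixels too close to the
    row ends for a full window are left as None.
    """
    n = len(left_row)
    result = [None] * n
    for x in range(half, n - half):
        best_cost, best_d = None, None
        for d in range(-max_disp, max_disp + 1):
            x_right = x + d
            # Skip candidates whose window would leave the right row.
            if x_right - half < 0 or x_right + half >= n:
                continue
            cost = window_sad(left_row, right_row, x, x_right, half)
            if best_cost is None or cost < best_cost:
                best_cost, best_d = cost, d
        result[x] = best_d
    return result
```

Because each pixel is matched independently, this sketch can return inconsistent values in textureless or repetitive regions, where several candidates have similar cost.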
To resolve the ambiguity that arises when a point in one image matches several candidate points in the other, various constraints are employed. One constraint preserves point ordering, since it can be expected that, when moving along a scan line, object points in the left and right images will be ordered in the same manner, unless they are occluded, in which case they are missing from one of the images. A smoothness constraint may also be employed, which relies on the fact that in most parts of an image, disparity does not change suddenly unless moving across a depth edge.
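One way to combine the two constraints, offered here as an illustrative sketch rather than as the method of any particular algorithm, is a dynamic program along a single scan line. It assumes a precomputed table `cost[x][k]` of finite matching costs for pixel `x` at candidate disparity `disparities[k]`, penalizes disparity changes between neighboring pixels in proportion to a hypothetical `smooth_weight`, and forbids transitions that would reverse the left-to-right order of matched points.

```python
def dp_scanline_disparity(cost, disparities, smooth_weight=1.0):
    """Choose one disparity per pixel, minimizing data cost plus a
    smoothness penalty, subject to an ordering constraint.

    cost[x][k] is the (finite) matching cost of pixel x at disparities[k].
    """
    n, k = len(cost), len(disparities)
    total = [list(cost[0])]   # best accumulated cost ending in each state
    back = []                 # back[x-1][j]: best predecessor of state j at x
    for x in range(1, n):
        row, ptr = [], []
        for j in range(k):
            best, arg = float("inf"), 0
            for i in range(k):
                # Ordering: matched position x + d must not move left of the
                # previous pixel's matched position (x - 1) + d_prev.
                if disparities[j] < disparities[i] - 1:
                    continue
                c = total[-1][i] + smooth_weight * abs(disparities[j] - disparities[i])
                if c < best:
                    best, arg = c, i
            row.append(best + cost[x][j])
            ptr.append(arg)
        total.append(row)
        back.append(ptr)
    # Backtrack from the cheapest final state.
    j = min(range(k), key=lambda idx: total[-1][idx])
    path = [j]
    for ptr in reversed(back):
        j = ptr[j]
        path.append(j)
    path.reverse()
    return [disparities[j] for j in path]
```

With a moderate `smooth_weight`, an isolated pixel whose lowest-cost match disagrees with its neighbors is pulled back to the surrounding disparity, while a sustained change in the data cost, as at a depth edge, still produces a jump.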
Directors are increasingly shooting with more than two cameras. The additional cameras can provide views of regions that are occluded in other camera views, and the extra information they provide makes the depth calculations more robust. In the general case, data from all the cameras are used to determine correspondence between 3D points in the world and their projection into each image. This information can be used to derive disparity between any two of the cameras, which can then be used during the editing process to correct, modify or regenerate a left-right image pair for stereoscopic viewing.