The problem of recovering depth information using images captured simultaneously from multiple viewpoints has been studied extensively. In recent years, with advances in computing and imaging technologies, capturing multiple synchronized high-quality video streams has become easier, and the problem of recovering depth maps of dynamic scenes from such synchronized capture has received increasing attention. In the materials that follow, this problem is referred to as dynamic depth recovery. Dynamic depth recovery may be considered an extension of the traditional stereo computation problem, in which the depth solution desirably makes images consistent not only across multiple views but also across different time instants.
A straightforward approach to dynamic depth recovery is to apply a standard stereo estimation algorithm at each time instant. A comprehensive survey of early stereo algorithms can be found in an article by U. R. Dhond et al., entitled “Structure From Stereo: a Review,” IEEE Transactions on System, Man and Cybernetics, vol. 19, no. 6, pp. 1489-1510, 1989. One new approach for finding depth information from two image sequences is described in an article by H. Tao et al., entitled “Global Matching Criterion and Color Segmentation Based Stereo,” Proc. Workshop on the Application of Computer Vision (WACV2000), pp. 246-253, December 2000. The principle underlying these algorithms is to find a depth solution that optimizes an image match measure across views. This measure is referred to as the spatial match measure. This straightforward solution, however, ignores two constraints present in multi-view image sequences.
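By way of a hypothetical illustration, the spatial match measure may be sketched as a sum-of-squared-differences (SSD) cost computed between two rectified views, with the per-pixel disparity chosen to minimize that cost. The function names `spatial_match_cost` and `winner_take_all` below are illustrative assumptions and do not reproduce either cited algorithm:

```python
import numpy as np

def spatial_match_cost(left, right, disparity, patch=2):
    """Sum-of-squared-differences (SSD) cost for one candidate disparity.

    `left` and `right` are 2-D grayscale arrays from a rectified pair;
    a lower cost means the two views agree better under that disparity.
    Border pixels where a full patch does not fit keep an infinite cost.
    """
    h, w = left.shape
    cost = np.full((h, w), np.inf)
    for y in range(patch, h - patch):
        for x in range(patch + disparity, w - patch):
            lp = left[y - patch:y + patch + 1, x - patch:x + patch + 1]
            rp = right[y - patch:y + patch + 1,
                       x - disparity - patch:x - disparity + patch + 1]
            cost[y, x] = np.sum((lp - rp) ** 2)
    return cost

def winner_take_all(left, right, max_disparity):
    """Per-pixel disparity that minimizes the spatial match measure."""
    costs = np.stack([spatial_match_cost(left, right, d)
                      for d in range(max_disparity)])
    return np.argmin(costs, axis=0)
```

In practice the cited methods aggregate such costs more robustly and add regularization; this sketch shows only the core idea of optimizing a cross-view match measure, which is precisely what ignores the temporal constraints discussed next.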
The first constraint encodes the geometric relationship between the 3D motion of a scene point and its 2D projections in multiple synchronized images. This relationship is referred to as the scene flow constraint. By applying this constraint, temporal 2D image correspondences can be used to infer 3D scene motion and, therefore, to constrain the depth information over time. For stereo images processed using optical flow techniques, accurate optical flow calculations are important in order to successfully apply the scene flow constraint directly to depth estimation. These calculations are important because the effects of unreliable flow at object boundaries and in untextured regions propagate into the final depth map.
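The scene flow constraint may be made concrete with a small numerical sketch. Assuming calibrated pinhole cameras with known 3×4 projection matrices, the 2D flow of a point in each view is, to first order, the projection Jacobian applied to the point's 3D motion; stacking these linear constraints over views recovers the 3D motion by least squares. The helper names below (`project`, `projection_jacobian`, `scene_flow_from_image_flows`) are hypothetical:

```python
import numpy as np

def project(P, X):
    """Pinhole projection of a 3-D point X (length 3) by a 3x4 matrix P."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

def projection_jacobian(P, X):
    """2x3 Jacobian of the projection with respect to the 3-D position."""
    u, v, w = P @ np.append(X, 1.0)
    # d(u/w)/dX = (w * du - u * dw) / w^2; rows of P give du, dv, dw.
    return np.vstack([(w * P[0, :3] - u * P[2, :3]) / w**2,
                      (w * P[1, :3] - v * P[2, :3]) / w**2])

def scene_flow_from_image_flows(Ps, X, flows):
    """Least-squares 3-D motion from 2-D flows in several calibrated views.

    Stacks the linearized scene flow constraint J_i @ dX = flow_i for
    each camera i and solves the resulting overdetermined system.
    """
    A = np.vstack([projection_jacobian(P, X) for P in Ps])
    b = np.concatenate(flows)
    dX, *_ = np.linalg.lstsq(A, b, rcond=None)
    return dX
```

This also makes the sensitivity noted above visible: any error in the measured 2D flows enters the right-hand side of the linear system directly, so unreliable flow at boundaries or in untextured regions corrupts the recovered 3D motion and, through it, the depth estimate.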
The second constraint arises from the observation that objects in the scene usually deform or move smoothly. Applying this constraint helps to obtain temporally consistent depth solutions and to eliminate ambiguities that may not be easy to resolve at any single time instant. This constraint has been applied using rigid parametric motion models, such as that described in an article by K. Hanna et al. entitled “Combining Stereo and Motion Analysis for Direct Estimation of Scene Structure,” Proc. Int. Conf. on Computer Vision, pp. 357-365, 1993. The constraint has also been applied using non-rigid parametric motion models, such as that disclosed by Y. Zhang et al. in a paper entitled “Integrated 3D Scene Flow and Structure Recovery from Multiview Image Sequences,” Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR '00), pp. II-674-681, 2000.