The problem of recovering 3D deformable models of non-rigid objects from a video is of intense interest in the field of computer vision. Linear models of variability are particularly desirable. For example, eigenface models have been widely used to model 2D image variability since the 1980s. Eigenface models use a variance-reducing dimensionality reduction for coding and decoding face images; see U.S. Pat. No. 5,164,992 “Face recognition system” issued to Turk, et al. on Nov. 17, 1992. Eigenface methods describe variation between images but do not shed any light on the 3D structure of the scenes and objects from which the images were generated.
A first class of methods addresses special cases of the recovery problem that are well-constrained by additional information. For example, depth estimates are available from multi-camera stereo rigs or laser range-finders; the objects are rigid; object surfaces are specially decorated with textures or markers to make inter-image correspondences obvious; or structured light is used to reveal contours of the object. These constrained methods require various combinations of high-quality, high-resolution videos, calibrated cameras, special lighting, and careful posing.
A second class of methods relaxes image constraints but nevertheless depends on having a pre-computed set of possible models or motions; see Blanz et al., “A morphable model for the synthesis of 3D faces,” Proc. SIGGRAPH99, 1999, and Bregler et al., “Non-rigid 3D shape from image streams,” Proc. CVPR, 2000. However, these methods do not address the case in which the motion is unconstrained and there is no prior understanding of the shape and motion of the object in the video.
Therefore, there is a need to recover 3D models of non-rigid objects from unconstrained videos so that the models can be used to generate an entirely new video in which the objects are posed and deformed in novel ways.