A number of different methods and systems are known in the art for creating depth maps. In the present patent application and in the claims, the term “depth map” refers to a representation of a scene as a two-dimensional matrix of pixels, in which each pixel corresponds to a respective location in the scene and has a respective pixel value indicative of the distance from a certain reference location to the respective scene location. (In other words, the depth map has the form of an image in which the pixel values indicate topographical information, rather than brightness and/or color of the objects in the scene.) Depth maps may be created, for example, by detection and processing of an image of an object onto which a laser speckle pattern is projected, as described in PCT International Publication WO 2007/043036 A1, whose disclosure is incorporated herein by reference.
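By way of illustration only, the definition of a depth map given above can be sketched in a few lines of Python. The sketch assumes that distances are expressed in meters and that a pixel value of zero marks a location with no valid measurement; neither convention is taken from the cited publication, and the function names `nearest_pixel` and `foreground_mask` are hypothetical.

```python
from typing import List, Tuple

# A depth map: a 2D matrix of pixels whose values are distances from a
# reference location (assumed here to be in meters; 0.0 = no measurement).
DepthMap = List[List[float]]


def nearest_pixel(depth_map: DepthMap) -> Tuple[int, int, float]:
    """Return (row, column, distance) of the valid pixel closest to the reference."""
    best = None
    for r, row in enumerate(depth_map):
        for c, d in enumerate(row):
            if d > 0 and (best is None or d < best[2]):
                best = (r, c, d)
    if best is None:
        raise ValueError("depth map contains no valid measurements")
    return best


def foreground_mask(depth_map: DepthMap, max_distance: float) -> List[List[bool]]:
    """Mark pixels whose scene location lies within max_distance of the reference."""
    return [[0 < d <= max_distance for d in row] for row in depth_map]


# A 3x4 scene: a wall about 3 m away with a nearer object in the middle.
example = [
    [3.0, 3.0, 3.1, 3.0],
    [3.0, 1.2, 1.1, 3.0],
    [3.0, 1.3, 0.0, 3.0],
]
print(nearest_pixel(example))         # (1, 2, 1.1)
print(foreground_mask(example, 2.0))  # True only at the three valid near pixels
```

Unlike a brightness image, thresholding the pixel values here separates objects by their distance from the reference location rather than by their appearance.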
Various methods are known in the art for computerized three-dimensional (3D) modeling of human forms from video or other image data. Some of these methods require that dedicated markers be attached to the subject's body at known locations in order to facilitate tracking of the subject's motion. A marker-based method of this sort is described, for example, by Fua et al., in “Human Shape and Motion Recovery Using Animation Models,” 19th Congress, International Society for Photogrammetry and Remote Sensing (Amsterdam, July, 2000), which is incorporated herein by reference.
Other methods use multiple cameras to provide 3D stereo image information, and process this information to model human (or other) forms. For example, Allard et al. describe such a method in “Marker-less Real Time 3D Modeling for Virtual Reality,” Immersive Projection Technology (IPT 2004, Iowa State University), which is incorporated herein by reference. An alternative method, which applies a probabilistic model of human shape and motion to two-dimensional (2D) images, is described by Howe et al., in “Bayesian Reconstruction of 3D Human Motion from Single-Camera Video,” Advances in Neural Information Processing Systems 12 (Denver, Colo., 1999), which is also incorporated herein by reference.
Li et al. describe the use of a “stereo camera” in providing depth information in “Real-time 3D Motion Recognition with Dense Depth Map,” (Report CPSC525, Department of Computer Science, University of British Columbia, Vancouver, BC, 2004), which is incorporated herein by reference. A statistical background subtraction method is applied to the input video stream from the stereo camera, and a correlation-based stereo method is used to give a dense depth map of the foreground object. An optical flow approach is then used to detect motion of the object based on the depth map.
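The first stage of such a pipeline, statistical background subtraction, may be illustrated by the following minimal Python sketch. It is not the method of Li et al.: it assumes a simple per-pixel Gaussian model (mean and standard deviation learned from background-only frames), and the names `fit_background` and `foreground`, the threshold `k`, and the noise floor `min_std` are hypothetical choices made for the example.

```python
import statistics


def fit_background(frames):
    """Per-pixel mean and standard deviation over background-only frames."""
    rows, cols = len(frames[0]), len(frames[0][0])
    mean = [[0.0] * cols for _ in range(rows)]
    std = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            samples = [frame[r][c] for frame in frames]
            mean[r][c] = statistics.fmean(samples)
            std[r][c] = statistics.pstdev(samples)
    return mean, std


def foreground(frame, mean, std, k=3.0, min_std=1.0):
    """Flag pixels deviating from the background mean by more than k sigma.

    min_std is an assumed noise floor that keeps perfectly static
    background pixels from being flagged on tiny fluctuations.
    """
    return [
        [abs(frame[r][c] - mean[r][c]) > k * max(std[r][c], min_std)
         for c in range(len(frame[0]))]
        for r in range(len(frame))
    ]


# Three background-only frames of a 2x2 image, then a frame with a new object.
background_frames = [
    [[10, 10], [10, 10]],
    [[12, 10], [10, 10]],
    [[11, 10], [10, 10]],
]
mean, std = fit_background(background_frames)
mask = foreground([[12, 50], [10, 10]], mean, std)
print(mask)  # [[False, True], [False, False]]
```

In a pipeline of the kind described, only the pixels flagged as foreground would be passed on to the stereo correlation and optical flow stages.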
Grammalidis et al. describe another depth-based method in “3-D Human Body Tracking from Depth Images Using Analysis by Synthesis,” Proceedings of the IEEE International Conference on Image Processing (ICIP 2001, Thessaloniki, Greece), pp. 185-188, which is incorporated herein by reference. The method is directed to estimating MPEG-4 Body Animation Parameters (BAPs) of a human arm. The rigid parts of the MPEG-4 body model are initially scaled to the geometric properties of the corresponding body part, which is imaged by a camera. BAP values are then estimated using a technique based on the Downhill Simplex minimization method.
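The general idea of estimating pose parameters by Downhill Simplex (Nelder-Mead) minimization may be illustrated by the following Python sketch. It is not the implementation of Grammalidis et al.: a planar two-segment arm with hypothetical segment lengths stands in for the MPEG-4 body model, two joint angles stand in for the BAPs, the cost is the squared distance between synthesized and observed joint positions, and the `nelder_mead` routine is a simplified variant that uses only reflection, expansion, inside contraction and shrinking.

```python
import math


def nelder_mead(f, x0, step=0.5, tol=1e-12, max_iter=1000):
    """Simplified Downhill Simplex minimizer (illustrative variant)."""
    n = len(x0)
    # Initial simplex: x0 plus one vertex offset along each axis.
    simplex = [list(x0)]
    for i in range(n):
        v = list(x0)
        v[i] += step
        simplex.append(v)
    for _ in range(max_iter):
        simplex.sort(key=f)
        best, worst = simplex[0], simplex[-1]
        if abs(f(worst) - f(best)) < tol:
            break
        # Centroid of all vertices except the worst one.
        centroid = [sum(v[i] for v in simplex[:-1]) / n for i in range(n)]

        def along(t):
            # Point on the line through the centroid and the worst vertex.
            return [centroid[i] + t * (worst[i] - centroid[i]) for i in range(n)]

        reflected = along(-1.0)
        if f(reflected) < f(best):
            expanded = along(-2.0)
            simplex[-1] = expanded if f(expanded) < f(reflected) else reflected
        elif f(reflected) < f(simplex[-2]):
            simplex[-1] = reflected
        else:
            contracted = along(0.5)
            if f(contracted) < f(worst):
                simplex[-1] = contracted
            else:
                # Shrink every vertex halfway toward the best vertex.
                simplex = [best] + [
                    [best[i] + 0.5 * (v[i] - best[i]) for i in range(n)]
                    for v in simplex[1:]
                ]
    return min(simplex, key=f)


UPPER_ARM, FOREARM = 0.30, 0.25  # hypothetical segment lengths in meters


def arm_joints(shoulder, elbow):
    """Synthesize elbow and wrist positions of a planar arm from two angles."""
    ex = UPPER_ARM * math.cos(shoulder)
    ey = UPPER_ARM * math.sin(shoulder)
    wx = ex + FOREARM * math.cos(shoulder + elbow)
    wy = ey + FOREARM * math.sin(shoulder + elbow)
    return [(ex, ey), (wx, wy)]


# "Observed" joint positions, generated here from known angles for the demo.
observed = arm_joints(0.6, 0.9)


def cost(angles):
    """Squared distance between synthesized and observed joint positions."""
    predicted = arm_joints(angles[0], angles[1])
    return sum((px - ox) ** 2 + (py - oy) ** 2
               for (px, py), (ox, oy) in zip(predicted, observed))


fitted = nelder_mead(cost, [0.0, 0.0])
print(fitted)  # expected to lie close to the generating angles [0.6, 0.9]
```

The minimizer needs only values of the cost function and no derivatives, which makes this family of methods convenient when the cost is obtained by synthesizing a model and comparing it with image data.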