1. Field of the Invention
The present invention relates to a motion capture apparatus and method used in various fields that produce animations, movies, broadcasting contents, games, and the like, and more particularly, to an apparatus and method for high-speed marker-free motion capture.
2. Discussion of the Related Art
As is well known to those skilled in the art, motion capture techniques are widely used to create natural and vivid character animation in various fields that produce 3-dimensional image contents, such as animations, movies, broadcasting contents, games, etc.
Depending on the kind of sensor attached to an actor's joints, conventional motion capture techniques are divided into a magnetic type, which measures positions using the amount of variation in a magnetic field; a mechanical type, which directly measures the bending of joints using a mechanical method; an optical type, which uses camera images of passive (infrared) or active (LED, color) markers; and an optical fiber type, which uses the amount of variation in light transmission according to the degree to which a joint is bent.
However, the conventional motion capture techniques have the disadvantage that sensors or markers must be attached to an actor's clothes or body and that they must be operated under limited space and illumination conditions.
Meanwhile, in one method for detecting a particular portion of a human body, such as the head, hands and feet, the corresponding pixels are made into blob models, and adjacent pixels having a similar attribute in an image are then compared with the blob models. In another method, based on a contour of the human body, the particular portion is detected using the type and strength of the contour. In the case of the blob models, the process of grouping pixels having a similar attribute into blobs is complicated, and when there is much noise it is difficult to form the blobs. Since the topology of the blobs changes greatly with 3-dimensional movement, blob models cannot always be made for every viewing angle, so that it is difficult to achieve stable and high-speed detection.
In practice, when an end portion of a human body is detected based on the contour, there may be types of movement that are not detected.
For example, if the hands are raised over the head so as to cover it, the contour that had been formed at the head is no longer formed, so that the head is not detected. Even when the contour is formed, it is difficult to stably detect the head because the corresponding contour differs greatly from a general contour of the head. In addition, if the hands are positioned in front of the body, the contour corresponding to the hands disappears, so that it is impossible to detect the hands. When detecting a foot, if the foot is fixed to the ground, it can be stably detected without any change of attributes such as the type and strength of the contour. However, if the foot is raised, it is difficult to correctly detect the actual position of the foot due to the variation of the illumination condition, the variation in the size of the image formed on the imaging device, and the like.
In the above-described motion capture techniques, a method for tracking a particular portion of a human body in 3-dimensional space has the disadvantage that it cannot handle much of the information of the 3-dimensional space, such as depth information, because the camera images from which the information on the 3-dimensional space is obtained are tracked in a 2-dimensional plane. In addition, since the body's characteristics are distinguished using a contour model of the actor, it is difficult to extract the correct positions of feature points such as the head, hands, feet, etc. Further, there is the disadvantage that overlapping or disappearance of the feature points cannot be handled.
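By way of illustration only, the loss of depth information that occurs when tracking is performed in a 2-dimensional image plane can be sketched as follows. The sketch assumes an ideal pinhole camera at the origin looking along the z axis; the function name `project`, the focal length, and the sample coordinates are hypothetical and are not taken from any of the techniques described above.

```python
def project(point, focal_length=1.0):
    """Project a 3-D point (x, y, z) onto the image plane of an ideal
    pinhole camera located at the origin and looking along +z.
    This camera model is an assumption made for illustration."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    return (focal_length * x / z, focal_length * y / z)


# Two hypothetical hand positions on the same viewing ray,
# one twice as far from the camera as the other.
near_hand = (0.5, 0.25, 2.0)
far_hand = (1.0, 0.5, 4.0)

# Both positions fall on the same image coordinates, so a tracker that
# works only in the image plane cannot tell them apart.
same_pixel = project(near_hand) == project(far_hand)
print(project(near_hand), project(far_hand), same_pixel)
```

Because every point on a viewing ray maps to the same image position, depth can only be recovered by combining additional information, for example images from more than one camera.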
In particular, as a method for estimating the positions of middle joints of an actor using feature points of the body's end portions such as the head, the hands and the feet, a paper of N. Badler, M. Hollick and J. Granieri, entitled “Real-time Control of a Virtual Human Using Minimal Sensors” (Presence, 2(1):82-86, 1993), discloses that four magnetic sensors are attached to the hands, the waist and the head of an actor, the corresponding position information is captured, and the positions of the middle joints of the upper part of the body are then generated using the captured information. A paper of Ryuya Hoshino, Satoshi Yonemoto, Daisaku Arita and Rin-ichiro Taniguchi, entitled “Real-time Motion Capture System Based on Silhouette Contour Analysis and Inverse Kinematics” (FCV2001, 2001, pp. 157-163), discloses that positions of the feature points of the body are extracted using silhouette information obtained from motion images of an actor captured by six cameras, 3-dimensional coordinates of the middle joints are calculated from the extracted positions, and similar movements are then selected from a previously provided database using the coordinates of the middle joints and the positions of the captured feature points. However, the method disclosed in the former paper is limited to the movement of the upper part of the body only. Noise is also contained in the captured position information because the sensors are sensitive to changes in the environment. In addition, there is the disadvantage that the actor's movement is restricted by the attached magnetic sensors. The method disclosed in the latter paper has the problems that an additional database covering many situations must be established and that it must be decided which movement is to be selected from the database.
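By way of illustration only, the general idea of estimating a middle joint from the position of an end portion can be sketched with a planar two-segment limb. This is a generic analytic construction and is not the method of either paper cited above; the function name `middle_joint_2d`, the segment lengths, and the `bend` parameter that selects one of the two mirror-image solutions are all hypothetical.

```python
import math


def middle_joint_2d(root, end, upper_len, lower_len, bend=1.0):
    """Estimate the middle joint (e.g. an elbow) of a planar two-segment
    limb from its root joint (e.g. a shoulder) and end point (e.g. a hand).

    bend is +1.0 or -1.0 and picks one of the two mirror-image solutions.
    Illustrative sketch only; names and conventions are assumptions.
    """
    dx, dy = end[0] - root[0], end[1] - root[1]
    dist = math.hypot(dx, dy)
    if dist == 0.0:
        raise ValueError("root and end coincide; direction is undefined")
    ux, uy = dx / dist, dy / dist  # unit vector from root toward end

    # Clamp the distance to the range the limb can actually span.
    d = max(abs(upper_len - lower_len), min(dist, upper_len + lower_len))

    # a: distance along the root-to-end line to the point below the joint.
    # h: perpendicular offset of the joint from that line.
    a = (upper_len ** 2 - lower_len ** 2 + d ** 2) / (2.0 * d)
    h = math.sqrt(max(upper_len ** 2 - a ** 2, 0.0))

    return (root[0] + a * ux - bend * h * uy,
            root[1] + a * uy + bend * h * ux)


# Hypothetical example: upper segment 3 units, lower segment 4 units,
# end point 5 units from the root along the x axis.
elbow = middle_joint_2d((0.0, 0.0), (5.0, 0.0), 3.0, 4.0)
print(elbow)
```

The two possible values of `bend` reflect the fact that the end positions alone do not determine the middle joint uniquely, which is one reason such estimation requires additional constraints or information.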