Various solutions currently exist for tracking the motion of objects or anatomic bodies for use in both research and commercial environments. Among others, conventional motion tracking schemes employ optical motion tracking, inertial motion tracking, tracking of on-body markers and sensors, and computer vision techniques.
In optical motion tracking, external cameras and a set of body-worn reflective markers are generally used to provide accurate motion tracking. Because it can capture motion with sub-centimeter accuracy, optical motion tracking is currently used in various applications including entertainment, movie character animation, biomedical applications, and the like. However, optical motion tracking also presents several drawbacks. For example, implementing optical motion tracking typically involves high system costs, discomfort from wearing several tens of on-body markers, limitations in motion capture volume, susceptibility to occlusions and lighting conditions, and time-consuming post-processing.
In inertial motion tracking, a set of inertial sensors is rigidly attached to most of the segments of a subject body, and the body pose is estimated by measuring the orientation of each segment with an attached sensor in relation to a biomechanical model. Commercially available products, such as the Xsens MVN system, which employs inertial motion tracking technologies as disclosed in U.S. Pat. No. 8,165,844, use roughly 17 sensors that are attached to the subject body. Such an approach overcomes several limitations associated with traditional optical motion tracking systems. For instance, as a substantially self-contained on-body solution, this approach provides more freedom of motion and capture volume. This approach also provides a level of accuracy that is very close to that of an optical system.
However, this solution still requires a relatively large number of body-worn sensors, which not only increases system costs but can also cause undesirable obstruction, especially in non-specialized applications and non-dedicated use scenarios, such as gaming, monitoring daily life activities, and the like.
In an effort to mitigate the above-identified limitations, some recent developments have tried to achieve accurate motion tracking using a limited number of sensors or optical markers. Such solutions typically use machine learning and data-driven approaches to estimate the full-body pose. However, the low dimensionality of the input measurements presents yet other drawbacks. In one approach, relatively few (about six to nine) optical markers and two synchronized cameras are used. While this approach certainly decreases system complexity and costs, it still requires a costly external infrastructure and further exhibits fundamental limitations with respect to capture volume, freedom of use, occlusions, and lighting conditions.
In another approach, relatively few inertial sensors are combined with a commercially available external acoustic positioning system. Although the number of body-worn sensors is decreased and performance is close to that of a conventional full-body motion capture system, the need for an external infrastructure to assess sensor position relative to an external frame greatly limits the field of use and the general applicability of the approach. In yet another approach, about four accelerometer sensors placed on the wrists and ankles are used to track motion of the subject body. Because the sensors are low in cost and have low power consumption requirements, they can be relatively easily integrated into bracelets, anklets, straps, or the like, and conveniently worn with minimal obstruction. However, the resulting performance is far from that desired in actual (high-fidelity) motion tracking.
In typical computer vision techniques, only a single external depth camera is used to capture the motion of the user. While this approach has the advantage of not requiring any sensors or markers to be worn by the tracked subject, there are still obvious limitations in terms of accuracy, susceptibility to occlusions and lighting conditions, and capture volume. Furthermore, the tracked subject is confined to a relatively small area of space that is constrained by the location of the depth camera and the associated system. Although such limitations may be insignificant with regard to a specific application, such as a gaming application, they impose significant constraints on the overall freedom of motion, which may be undesirable for other application types or in a more general context.
The present disclosure is directed to systems and methods that address one or more of the problems set forth above. However, it should be appreciated that the solution of any particular problem is not a limitation on the scope of this disclosure or of the attached claims except to the extent expressly noted. Additionally, the inclusion of any problem or solution in this Background section is not an indication that the problem or solution represents known prior art except as otherwise expressly noted.