1. Technical Field
This invention relates to pose and action recognition, and specifically to an improved system, method, and computer-readable instructions for recognizing the pose and action of articulated objects with a collection of planes in motion.
2. Discussion of the Background
Conventional techniques for pose estimation detect various objects of a subject, such as the body parts of a human. In the study of human motion, a set of body points is widely used to represent a human body pose. Additionally, pose estimation can determine the orientation of a body part. Previously proposed projective invariants deal with sets of stationary points and are almost exclusively derived from the cross-ratio, which is an invariant of a set of points on a rigid object. The only exception in the literature is the study of invariants defined with Cartan mobile frames. The difficulty with the latter is that it deals only with invariants of the evolution of a curve, which are non-linear, are not easy to generalize to point sets in 3D space, and cannot be decomposed into motions of planes with well-studied properties.
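By way of illustration only, the projective invariance of the cross-ratio mentioned above can be sketched numerically for four collinear points. The function names `cross_ratio` and `projective_map`, the sample coordinates, and the matrix `H` are hypothetical choices made for this sketch and are not taken from the cited literature.

```python
def cross_ratio(a, b, c, d):
    """Cross-ratio (a, b; c, d) of four collinear points given as 1D coordinates."""
    return ((c - a) * (d - b)) / ((c - b) * (d - a))


def projective_map(x, h):
    """Apply a 1D projective transformation x -> (h11*x + h12) / (h21*x + h22)."""
    (h11, h12), (h21, h22) = h
    return (h11 * x + h12) / (h21 * x + h22)


# Four stationary collinear points (illustrative values).
points = [0.0, 1.0, 3.0, 7.0]

# An arbitrary non-singular 2x2 matrix standing in for a change of viewpoint.
H = ((2.0, 1.0), (0.5, 3.0))

mapped = [projective_map(x, H) for x in points]

before = cross_ratio(*points)
after = cross_ratio(*mapped)

# The two values agree up to floating-point error, even though the
# individual coordinates and the distances between points have changed.
print(before, after)
```

The sketch covers only stationary points on a single line, which is the setting of the previously proposed invariants described above.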
Specifically, human action recognition has been the subject of extensive study. The main challenges are due to perspective distortions, differences in viewpoint, unknown camera parameters, anthropometric variations, and the large number of degrees of freedom of articulated bodies. To make the problem more tractable, researchers have made simplifying assumptions on one or more of the following aspects: (1) camera model, such as a scaled orthographic or calibrated camera; (2) camera pose, i.e., little or no viewpoint variation; (3) anatomy, such as isometry, coplanarity of a subset of body points, etc.
There are two main lines of research that tackle view invariance: one is based on the assumption that the actions are viewed by multiple cameras, and the other is based on the assumption that the actions are captured in monocular sequences by stationary cameras. The obvious limitation of the multi-camera approach is that most practical applications are limited to a single camera. In the second category, several ideas have been explored, e.g., the invariants associated with a given camera model, such as affine or projective; rank constraints on the action space represented by a set of basis functions; or the use of the epipolar geometry induced by the same pose in two views.
A number of patents exist which relate to pose and action recognition, including U.S. Pat. Nos. 7,317,836, 7,158,656, 6,944,319, 6,941,239, 6,816,632, 6,741,756, 20030169906, 20030235334, 20040120581, 20040240706, and 20050265583, all of which are incorporated herein by reference.
Accordingly, there is a need in the art for a computer-implemented system and method for recognizing the pose and action of articulated objects with a collection of planes in motion. The present invention is designed to address this need.