Systems that employ head and body tracking are an emerging area of research and development. Several algorithmic paradigms for head and body tracking have been investigated, including model-based approaches, which use rigid or non-rigid models, and statistical classification, which relies upon generating a feature set for detection and then going through a training stage, e.g. with randomized decision forests. When performing object tracking, objects such as faces are typically easier to detect than textureless objects such as fingers.
Model-based approaches typically operate by applying model-dependent hypotheses to observable visual data. This can be framed as an optimization problem in which an objective function, measuring the divergence between collected visual data and data hypothesized by a model, is minimized. Geometric models are commonly used to represent body parts. For the head, ellipsoidal and cylindrical models, as well as more sophisticated ones, are commonly employed. The head itself can be modeled as a rigid element, and facial muscles can optionally be incorporated as non-rigid moving components. One drawback of model-based approaches is the computational cost of solving the optimization problem, especially when more sophisticated geometric models are involved. The advantage, however, is that model-based approaches do not require a training stage.
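The optimization framing above can be illustrated with a minimal sketch. It assumes a deliberately simplified stand-in for a head model: a 2D circle with parameters (cx, cy, r) fitted to observed contour points by plain gradient descent on the mean squared radial residual. The circle model, the function names, the step size, and the iteration count are all assumptions chosen for illustration and are not taken from any particular tracking system.

```python
import math

def objective(params, points):
    """Mean squared divergence between observed points and the circle model."""
    cx, cy, r = params
    return sum((math.hypot(x - cx, y - cy) - r) ** 2 for x, y in points) / len(points)

def gradient(params, points):
    """Analytic gradient of the objective with respect to (cx, cy, r)."""
    cx, cy, r = params
    gx = gy = gr = 0.0
    for x, y in points:
        d = math.hypot(x - cx, y - cy)
        if d == 0.0:
            continue  # residual direction is undefined at the model center
        res = d - r
        gx += 2.0 * res * (cx - x) / d
        gy += 2.0 * res * (cy - y) / d
        gr += -2.0 * res
    n = len(points)
    return gx / n, gy / n, gr / n

def fit_circle(points, init, step=0.2, iters=1000):
    """Minimize the objective by fixed-step gradient descent (hypothetical settings)."""
    params = tuple(init)
    for _ in range(iters):
        grad = gradient(params, points)
        params = tuple(p - step * g for p, g in zip(params, grad))
    return params

def sample_circle(cx, cy, r, count=36):
    """Generate synthetic, noise-free contour points for the demonstration."""
    return [(cx + r * math.cos(2 * math.pi * k / count),
             cy + r * math.sin(2 * math.pi * k / count)) for k in range(count)]
```

Calling `fit_circle(sample_circle(3.0, -2.0, 5.0), init=(0.0, 0.0, 1.0))` runs the full iterative minimization for every new set of observations, which is the run-time cost the paragraph refers to; a more sophisticated geometric model would add parameters and make each iteration more expensive.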
Training-based approaches typically operate by matching key points from a feature set, i.e. salient points on an image, between sample inputs and collected visual data. The crucial step in training-based approaches is to train on as large a set of sample images as possible. Increasing the number of variations in anatomy, pose, orientation, perspective, and scale can result in greater robustness, though at the expense of longer training time. The advantage of training-based approaches over model-based approaches is increased run-time efficiency. Thus, training-based approaches can be regarded as converting an optimization problem to a classification problem, thereby shifting the computational burden to a training stage.
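The shift of computational burden from run time to a training stage can be sketched as follows. For brevity, this example substitutes a nearest-centroid classifier for the randomized decision forests mentioned earlier; the labels, the two-element feature descriptors, and the function names are hypothetical and serve only to show the division of work.

```python
import math
from collections import defaultdict

def train(samples):
    """Training stage: average the descriptors of each label into one centroid.

    `samples` is a list of (label, descriptor) pairs; all descriptors are
    assumed to have the same length.
    """
    sums = {}
    counts = defaultdict(int)
    for label, descriptor in samples:
        if label not in sums:
            sums[label] = [0.0] * len(descriptor)
        for i, value in enumerate(descriptor):
            sums[label][i] += value
        counts[label] += 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}

def classify(descriptor, centroids):
    """Run-time stage: return the label whose centroid is closest."""
    return min(centroids, key=lambda label: math.dist(descriptor, centroids[label]))

# Hypothetical labeled descriptors standing in for features of sample images.
TRAINING_SAMPLES = [
    ("head", [0.9, 0.1]),
    ("head", [0.8, 0.2]),
    ("hand", [0.1, 0.9]),
    ("hand", [0.2, 0.7]),
]
```

Here `train(TRAINING_SAMPLES)` is run once, and its cost grows with the number of sample variations, while each later call to `classify` only compares a descriptor against the stored centroids, with no iterative optimization at run time.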