To develop a practical head pose tracking method, not only accuracy but also time efficiency and robustness must be taken into account.
An RGB-Depth camera may provide both color and depth information of a scene captured thereby. Most previous head pose estimation/tracking methods use only the color information. As RGB-Depth cameras become affordable, more and more research is focused on depth information, which is less sensitive to illumination changes and therefore makes head pose tracking across adjacent frames more robust. One class of depth-based head pose estimation methods works on a frame-by-frame basis, but typically has lower accuracy and higher complexity. Other classes track the head pose using either a sparse face model consisting of dozens of face vertices or a dense face template consisting of thousands of vertices. However, such a face template is either reconstructed offline or extracted from the first frame of the depth video, which makes the pose estimation less practical and less robust.