Hand gestures are an efficient means for humans to interact with computers [9, 13, 17]. The simplest and most basic gesture is pointing. A pointing gesture can resolve ambiguities arising from verbal communication, thus opening up the possibility of humans interacting or communicating intuitively with computers or robots by indicating objects or locations either in three-dimensional (3D) space or on a screen. However, estimating the 3D hand pointing direction automatically and reliably from streams of video data is a challenging task, owing to the great variety and adaptability of hand movement and the indistinguishable features of the hand joints. Some previous work has shown success in hand detection and tracking using multi-colored gloves [16] and depth-aware cameras [8], or using hand feature detection based on background subtraction [14], color [7, 8], stereo vision [2, 4, 18], or binary patterns [5, 10]. However, accurate hand detection and tracking under varied hand rotations remains a major challenge.
Recent advances in feature detection [1, 5, 6, 8, 10, 11, 12, 19] make it possible to determine hand gestures.
An expectation-maximization framework for view-independent recognition of hand postures is provided in [21]. That framework is intended to determine a gesture, not a pointing direction.