In recent years, gesture input, which tracks a user's hand from the silhouette of the hand in an image captured by a colour camera, has been increasingly incorporated into products. This has increased the need for high-speed, high-accuracy three-dimensional hand pose estimation, the technique used to perform such hand tracking.
At the same time, there is demand for more accurate three-dimensional hand pose estimation and for its introduction into application fields such as information processors and game machines. The aim is to enable intuitive operation and to eliminate the need, imposed by conventional gesture input, to memorise special body actions and poses and to become proficient in operation. For instance, a technique that detects “hand shape and pose”, in addition to hand tracking, is required to keep pace with actions in an information processor, game machine, etc.
One approach to three-dimensional hand pose estimation is the two-dimensional appearance-based technique, in which an information processing device compares an input image directly with images stored in a matching database, without extracting features from the image captured by the camera. This technique allows the hand silhouette shape captured by a camera to be used as the input image; therefore, the information processor may estimate the approximate hand silhouette shape from the displayed hand shape.
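The direct comparison described above can be illustrated with a minimal sketch. It assumes binary silhouettes of identical size and uses a simple count of differing pixels as the similarity measure; the names `pixel_mismatch`, `match_silhouette`, and `DATABASE`, the labels, and the 4x4 toy images are all hypothetical and are not taken from the cited work, which may use a different measure and database layout.

```python
from typing import List, Sequence, Tuple

# A binary silhouette: 1 marks a hand pixel, 0 marks background.
Silhouette = Sequence[Sequence[int]]


def pixel_mismatch(a: Silhouette, b: Silhouette) -> int:
    """Count the pixels at which two same-sized silhouettes differ."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("silhouettes must have identical dimensions")
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def match_silhouette(
    query: Silhouette, database: List[Tuple[str, Silhouette]]
) -> Tuple[str, int]:
    """Return (label, mismatch) for the stored silhouette closest to the query.

    The query is compared pixel by pixel with every stored image;
    no features are extracted beforehand.
    """
    if not database:
        raise ValueError("matching database is empty")
    label, image = min(database, key=lambda entry: pixel_mismatch(query, entry[1]))
    return label, pixel_mismatch(query, image)


# Hypothetical toy database of two 4x4 silhouettes.
OPEN_HAND = [
    [1, 1, 1, 1],
    [1, 1, 1, 1],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
]
FIST = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
]
DATABASE = [("open_hand", OPEN_HAND), ("fist", FIST)]

# A query that differs from OPEN_HAND by a single pixel.
query = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
]

print(match_silhouette(query, DATABASE))
```

An exhaustive comparison of this kind scales linearly with the number of stored images, which is one reason the structure of the matching database matters for estimation speed.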
Conventionally, when “hand shape and pose” are detected from an image captured by a camera, it is difficult to estimate the hand shape correctly from its silhouette shape because the hand has the following three characteristics, (a), (b), and (c):
(a) The hand shape changes in complex ways because the hand has an articulated structure.
(b) When finger joints are flexed or the fist is clenched, the fingers are often hidden by the back and palm of the hand due to self-shadowing in the silhouette shape of the hand.
(c) Although the hand is small in proportion to the whole body, it has a wide range of motion.
Because the “hand shape and pose” cannot be accurately detected from the silhouette shape of the hand alone, as mentioned above, the inventor of the present invention added nail position information so as to estimate the “hand shape and pose” more accurately (see, for instance, non-patent document 1). Moreover, the inventor of the present invention demonstrated that using the nail position information in the structure of the matching database, which contains the images compared directly with input images, could improve the efficiency of hand shape and pose estimation (see, for instance, non-patent document 2).