In recent years, research has been proceeding on user interfaces that detect human motion and produce an input signal corresponding to the motion. The feasibility of applying such a user interface to an information terminal such as a personal computer or a smartphone is also being investigated. In particular, when this type of user interface is applied to operate a personal computer or the like used in an office, a problem arises in that it is difficult for the user to perform operations involving large gestures. There is therefore a need for a user interface that can recognize subtle gestures made with the hands, fingers, or the like. Recognizing such subtle gestures, however, demands a high degree of detection accuracy from the user interface.
In view of the above, there is proposed a technique for recognizing a gesture or the like by capturing left and right parallax images with a predetermined angle of parallax using a stereoscopic camera and by determining, based on the images, the parallax value for an object contained in the captured images (for example, refer to Japanese Laid-open Patent Publication No. 2011-175347).
The information processing apparatus disclosed in Japanese Laid-open Patent Publication No. 2011-175347 converts one parallax image into a grayscale image having two or more levels, and extracts an object from the grayscale image by detecting a group of pixels that have the same level and are contiguous in a predetermined direction. Then, for each extracted object, based on the position of the object and a predetermined maximum allowable parallax, the information processing apparatus sets a reference region in the one parallax image and a search area in the other parallax image. Next, using the image of the reference region as a template, the information processing apparatus performs template matching within the search area to search for a region similar to the reference region, and determines the parallax value for the object based on the positional displacement between the reference region and the similar region.
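For illustration only, the processing flow described above may be sketched as follows. This is a minimal sketch and is not taken from the cited publication. It assumes 8-bit single-channel images held as NumPy arrays, a horizontal direction as the predetermined direction, a sum of absolute differences (SAD) as the similarity measure, and an object that appears shifted to the left in the other parallax image. The function names `quantize`, `horizontal_runs`, and `find_parallax`, and all of their parameters, are hypothetical.

```python
import numpy as np


def quantize(image, levels=4):
    """Reduce an 8-bit image to `levels` gray levels (0 .. levels - 1)."""
    return (image.astype(np.int32) * levels) // 256


def horizontal_runs(quantized, level, min_length=3):
    """Yield (row, col_start, col_end) for each horizontal run of `level`.

    col_end is exclusive. Runs shorter than `min_length` are skipped.
    """
    for r, row in enumerate(quantized):
        # Pad with zeros so runs touching the image border are still closed.
        mask = np.concatenate(([0], (row == level).astype(np.int8), [0]))
        edges = np.diff(mask)
        starts = np.flatnonzero(edges == 1)
        ends = np.flatnonzero(edges == -1)
        for s, e in zip(starts, ends):
            if e - s >= min_length:
                yield r, int(s), int(e)


def find_parallax(reference_img, other_img, top, left, height, width,
                  max_parallax):
    """Return the shift d in 0..max_parallax that minimizes the SAD.

    The reference region is reference_img[top:top+height, left:left+width].
    The search area in other_img covers the same rows, shifted left by d
    (assumption: the object appears further left in the other image).
    """
    template = reference_img[top:top + height,
                             left:left + width].astype(np.int32)
    best_d, best_cost = 0, None
    for d in range(max_parallax + 1):
        x = left - d
        if x < 0:
            break  # the search area would leave the image
        candidate = other_img[top:top + height, x:x + width].astype(np.int32)
        cost = int(np.abs(template - candidate).sum())
        if best_cost is None or cost < best_cost:
            best_d, best_cost = d, cost
    return best_d
```

In this sketch, limiting the loop to `max_parallax` plays the role of the predetermined maximum allowable parallax, since it bounds the width of the search area and therefore the cost of the template matching. A practical implementation would likely use a normalized similarity measure and sub-pixel refinement, neither of which is shown here.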