In recent years, it has become common practice to mount cameras on personal computers, game machines, and the like and to capture images of users for use in a variety of forms. For example, technologies that transmit an image of a user as it is to the other party through a network, such as videophone and video chat, and technologies that recognize a motion of a user by image analysis so as to provide input information for games and information processing have been put to practical use (refer to PTL 1 below, for example). Further, accurate detection of the motion of an object in a three-dimensional space including the depth direction has recently made it possible to realize games and image expressions that provide a greater sense of presence than before.
As a general technique for obtaining the position of an object in a three-dimensional space, a stereo image technique is known. In the stereo image technique, corresponding points are detected in stereo images of the same space taken simultaneously with two cameras horizontally separated from each other by a known interval and, on the basis of the resulting parallax between the detected points, the distance to the imaged surface of the object is computed by use of the principle of triangulation.
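The triangulation step described above can be illustrated with a minimal sketch. It assumes an idealized arrangement that the text does not spell out: two identical pinhole cameras with parallel optical axes and rectified images, so that a corresponding point differs only in its horizontal coordinate. Under that assumption the distance Z follows from Z = f * B / d, where f is the focal length in pixels, B is the known interval between the cameras, and d is the parallax in pixels. The function name, parameter names, and numeric values are hypothetical and chosen only for illustration.

```python
def depth_from_disparity(x_left, x_right, focal_length_px, baseline_m):
    """Estimate the distance to a point from its horizontal parallax.

    Assumes rectified images from two parallel pinhole cameras
    (an illustrative assumption, not a detail given in the text).

    x_left, x_right  -- horizontal pixel coordinates of the same point
                        in the left and right images
    focal_length_px  -- focal length expressed in pixels
    baseline_m       -- known interval between the cameras, in meters
    """
    disparity = x_left - x_right  # parallax between corresponding points
    if disparity <= 0:
        # Zero parallax corresponds to a point at infinity; a negative
        # value indicates a wrong correspondence under this camera model.
        raise ValueError("parallax must be positive for a finite distance")
    # Similar triangles: Z / B = f / d, hence Z = f * B / d.
    return focal_length_px * baseline_m / disparity


# Hypothetical values: a 20-pixel parallax, f = 800 px, B = 0.1 m.
z = depth_from_disparity(420.0, 400.0, focal_length_px=800.0, baseline_m=0.1)
print(f"estimated distance: {z:.2f} m")
```

Because the distance is inversely proportional to the parallax, a fixed error of one pixel in the detected corresponding points produces a larger distance error for far objects than for near ones.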