The present invention relates to an information input apparatus and method for inputting a user's gestures, and to a recording medium.
As an input device to a computer, a mouse is widely used. The mouse serves as a two-dimensional pointing device for operations such as moving the cursor and selecting a menu.
Actual operations follow a given rule. For example, a menu is selected by pressing a mouse button twice in quick succession (double-clicking). That is, the actual operation is not intuitive. For this reason, elderly users often cannot double-click.
In order to solve this problem, studies have been made on realizing intuitive operations, e.g., moving the cursor on the screen to the right by moving the hand to the right. One such study is gesture recognition, which recognizes, e.g., motions of the hand by image processing.
For example, studies have been made on recognizing the hand shape by analyzing a moving image such as a video picture. When the hand shape is extracted using color, since the hand is skin-colored, only the skin-colored portion may be extracted. However, if beige clothes or a beige wall is present in the background, it is hard to distinguish the skin color from the background. Even when beige is distinguished from skin color by adjustment, the color tone changes if the illumination changes. Hence, it is difficult to extract a skin-colored portion stably.
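The difficulty described above can be illustrated with a minimal sketch. The fixed RGB threshold rule below is an assumption chosen for illustration only (it is not taken from any of the studies referred to), and the function names `is_skin`, `skin_mask`, and `scale_illumination` are hypothetical.

```python
def is_skin(rgb):
    """Classify one (R, G, B) pixel with a fixed threshold rule.

    The thresholds are illustrative assumptions, not values from the source.
    """
    r, g, b = rgb
    return (r > 95 and g > 40 and b > 20
            and max(rgb) - min(rgb) > 15
            and abs(r - g) > 15
            and r > g and r > b)


def skin_mask(image):
    """Apply the rule to every pixel of an image given as rows of (R, G, B) tuples."""
    return [[is_skin(px) for px in row] for row in image]


def scale_illumination(rgb, factor):
    """Crudely model a change in illumination by scaling each channel."""
    return tuple(min(255, int(c * factor)) for c in rgb)
```

With these assumed thresholds, a hand pixel such as (220, 170, 140) and a beige background pixel such as (225, 198, 153) are both classified as skin, while the same hand pixel dimmed by `scale_illumination(..., 0.4)` is rejected. This mirrors the two failure modes noted above: confusion with beige backgrounds and sensitivity to illumination.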
Alternatively, a method of calculating motion vectors between frames and analyzing moving objects is available. In this case, no problem is posed when the number of moving objects is small. However, if the number of moving objects is large, the number of motion vectors increases abruptly, and the load of calculating the motion vectors between frames becomes heavier. Hence, the calculation cannot keep up, and the analysis falls behind.
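As a rough illustration of why the load grows with the number of moving objects, the following sketch estimates motion vectors by exhaustive block matching on grayscale frames stored as lists of rows. Block matching with a sum of absolute differences is assumed here as one common way to obtain motion vectors; the source does not name a specific method, and the names `sad` and `motion_vectors` are hypothetical.

```python
def sad(prev, curr, py, px, cy, cx, size):
    """Sum of absolute differences between a block of prev at (py, px)
    and a block of curr at (cy, cx)."""
    return sum(abs(prev[py + i][px + j] - curr[cy + i][cx + j])
               for i in range(size) for j in range(size))


def motion_vectors(prev, curr, size=2, search=2):
    """Return {(block_y, block_x): (dy, dx)} for every block that changed.

    Unchanged blocks are skipped, so the amount of work grows with the
    number of blocks touched by moving objects. Blocks that an object
    moved into also count as changed and receive a vector.
    """
    h, w = len(prev), len(prev[0])
    vectors = {}
    for by in range(0, h - size + 1, size):
        for bx in range(0, w - size + 1, size):
            if sad(prev, curr, by, bx, by, bx, size) == 0:
                continue  # block did not change between frames
            best = None
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    cy, cx = by + dy, bx + dx
                    if 0 <= cy <= h - size and 0 <= cx <= w - size:
                        cost = sad(prev, curr, by, bx, cy, cx, size)
                        if best is None or cost < best[0]:
                            best = (cost, dy, dx)
            vectors[(by, bx)] = (best[1], best[2])
    return vectors
```

Each changed block triggers a search over up to (2 × search + 1)² candidate positions, so every additional moving object adds a comparable amount of work per frame.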
In this manner, in a conventional method of capturing and analyzing an image using an imaging means such as a video camera, the analysis flow and the information to be analyzed are fixed. Consequently, when the image to be analyzed changes with external conditions, the load concentrates on a specific processing block, and the analysis cannot be completed in time.
As one method of solving this problem, a high-performance computer and a high-speed transmission system can be used to realize real-time processing (e.g., processing 30 images per second) even when the load becomes heavier. However, when the external conditions do not change much, the high-performance computer and high-speed transmission system are not used to their full capacity, resulting in very poor cost performance.
To address this problem, an information input apparatus has been developed, as disclosed in, e.g., U.S. Ser. No. 08/953,667. The apparatus captures light reflected by an object in synchronism with light emission means, so it can easily separate the object image from the background, extract the motion of the user's hand, and thereby accept information input by gesture.
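One way to picture capture in synchronism with light emission is sketched below. It assumes that one frame is captured while the emitter is on and one while it is off, and that the two are subtracted so that only the reflected component remains. This differencing scheme, the threshold value, and the names `reflected_light_image` and `extract_object` are illustrative assumptions, not details taken from the cited application.

```python
def reflected_light_image(lit, unlit):
    """Subtract the emitter-off frame from the emitter-on frame.

    Ambient light appears in both frames and cancels out; what remains
    is approximately the light reflected from nearby objects.
    """
    return [[max(0, a - b) for a, b in zip(row_lit, row_unlit)]
            for row_lit, row_unlit in zip(lit, unlit)]


def extract_object(lit, unlit, threshold=50):
    """Mark pixels whose reflected-light intensity reaches the threshold.

    The threshold of 50 is an arbitrary illustrative value.
    """
    diff = reflected_light_image(lit, unlit)
    return [[v >= threshold for v in row] for row in diff]
```

Because a nearby hand reflects much more of the emitted light than a distant wall, a simple threshold on the difference image separates the hand without relying on color, which sidesteps the beige-background and illumination problems described earlier.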
Using such an information input apparatus, e.g., in the home, the ON/OFF states and the like of a TV, audio equipment, lighting equipment, and the like can be remote-controlled. To allow input whenever the user desires, the information input apparatus must be kept ON. Unlike a mouse or the like, the apparatus must actively emit light, so it requires electric power for emission.
As described above, with a conventional image processing method, it is hard to attain low-cost, robust analysis under external conditions that vary constantly. To attain robust analysis even under varying external conditions, a high-performance computer and a high-speed transmission system must be used, resulting in excessive cost. Hence, such a system cannot be used in homes.