The present invention relates to an image processing technology for using an image taken by an image pickup apparatus, such as a video camera, as an interface for inputting commands, etc.
A keyboard, mouse, controller, etc. are input devices often used with a computer, video game machine, etc. The operator enters desired commands by operating these input devices, causing the computer, etc. to execute processing according to the commands entered. The operator then views images on a display device and listens to sound, etc. from a speaker, both obtained as the processing results.
The operator enters commands by operating many buttons provided on the input device while watching a cursor shown on the display device.
Such operations depend greatly on the operating experience of the operator. For example, for a person who has never touched a keyboard before, entering desired commands using the keyboard is troublesome and time-consuming, and is prone to input errors due to mistyping. For this reason, there is a demand for a man-machine interface that is easy for the operator to use.
On the other hand, with the progress of multimedia technologies, people in general households can now readily capture images taken by a video camera into a computer, etc., and edit and display those images on a display device. Such technologies are also used for personal authentication, in which an image of a part of the body such as a face is analyzed and its characteristic parts are extracted to identify an individual.
Conventionally, these images are used as information to be processed by a computer, for example by editing or analysis. However, images taken have not so far been used for a purpose such as entering commands to a computer.