This application is based on Japanese Patent Application No. 10-66382, filed Mar. 17, 1998, the contents of which are incorporated herein by reference.
The present invention relates to an information input apparatus and method for inputting information in a three-dimensional space, and to a recording medium.
A mouse is widely used as an input device for a computer. However, the mouse merely serves as a two-dimensional pointing device for operations such as moving a cursor and selecting a menu. Since the mouse can process only two-dimensional information, it can hardly select, e.g., an object having depth in a three-dimensional space. Likewise, when the mouse is used to animate a character in creating an animation, it cannot easily give the character natural motion. In order to compensate for such difficulties in pointing in a three-dimensional space, three-dimensional pointing devices have been developed. For example, a three-dimensional pointing device 150 shown in FIG. 1 allows six operations, i.e., pushing a central round portion forward, pressing the center of that portion, pressing the rear end of that portion, lifting the entire portion upward, turning the entire portion clockwise, and turning the entire portion counterclockwise, and thus has six degrees of freedom. By assigning these six degrees of freedom to various instructions, the position (x, y, z) and directions (x-, y-, and z-axes) of a cursor in the three-dimensional space can be controlled, or the view point position (x, y, z) and directions (x-, y-, and z-axes) with respect to the three-dimensional space can be controlled.
However, when this device is actually operated, the cursor or view point cannot be controlled as desired. For example, when the operator wants to turn the round portion clockwise or counterclockwise, he or she may inadvertently press its front or rear end, and the cursor or view point may move in an unexpected direction.
In place of such a three-dimensional pointing device, devices that can input instructions using hand or body actions have been developed. Such devices are called, e.g., a data glove, a data suit, or a cyber glove. For example, the data glove is a glove-like device with optical fibers running along its surface. Each optical fiber runs to a joint of a finger, and when the finger is bent, the transmission state of light changes. By measuring the transmission state of light, the degree of bending of the joint of each finger can be detected. The position of the hand itself in the three-dimensional space is measured by a magnetic sensor attached to the back of the hand. If an action is assigned to a given instruction (e.g., if the index finger is pointed up, a forward movement instruction is issued), the operator can walk through the three-dimensional space by variously changing the view point using the data glove (walkthrough).
However, some problems must be solved. Such a device is expensive and can hardly be used at home. Since the angle of the finger joint is measured, even when, for example, stretching only the index finger and bending the other fingers is defined as a forward movement instruction, stretching a finger includes various states. That is, since the second joint of the index finger rarely makes an angle of 180°, it is difficult to recognize the stretched state in any state other than the 180° state of the index finger, unless a given margin is assured. Since the operator must wear the data glove, his or her natural movement is disturbed. Every time the operator puts on the data glove, he or she must calibrate the transmission state of light in correspondence with the stretched and bent finger states, resulting in troublesome operations. Since optical fibers are used, failures such as disconnection of fibers may take place after continuous use of the data glove, and the durability of the data glove is as low as that of a consumable item. Even though the data glove is such an expensive, troublesome device, if the glove size does not exactly fit the operator's hand, the input value may deviate from the calibrated value during use due to slippage of the glove, and delicate hand actions can hardly be recognized. Owing to the various problems described above, the data glove has not become as widespread as initially expected, although it served as a trigger for VR (virtual reality) technology. For this reason, the data glove is still expensive, and has many problems in terms of its use.
By contrast, some studies have been made on inputting hand and body actions without requiring the operator to wear any special device such as a data glove. For example, a method of recognizing the hand shape by analyzing a moving image such as a video image has been studied.
However, with such a method, it is very hard to extract a target image portion (e.g., in the case of hand action recognition, the hand image alone) from the background image. For example, assume that a target image is extracted using colors. Since the hand has skin color, only a skin color portion may be extracted. However, if a beige clothing article or wall is present in the background, it is hard to distinguish it from skin color. Even when beige is distinguished from skin color by adjustment, if the illumination changes, the color tone also changes. Hence, it is difficult to extract a skin color portion stably.
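The difficulty described above can be illustrated with a minimal sketch. It is not taken from any particular prior-art system: the threshold values, the sample pixel colors, and the helper names `looks_like_skin` and `apply_gains` are all hypothetical and chosen only for illustration. The sketch classifies a pixel as skin with fixed hue, saturation, and brightness thresholds, and shows that a beige background pixel passes the test while a skin pixel under tinted illumination fails it.

```python
import colorsys

# Hypothetical thresholds, chosen only for this illustration.
HUE_RANGE = (0.02, 0.10)   # fraction of the hue circle (0.0 to 1.0)
MIN_SATURATION = 0.20
MIN_VALUE = 0.40


def looks_like_skin(rgb):
    """Classify one 8-bit RGB pixel with fixed HSV thresholds."""
    r, g, b = (c / 255.0 for c in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    in_hue = HUE_RANGE[0] <= h <= HUE_RANGE[1]
    return in_hue and s >= MIN_SATURATION and v >= MIN_VALUE


def apply_gains(rgb, gains):
    """Simulate a change of illumination color with per-channel gains."""
    return tuple(min(255, int(c * k)) for c, k in zip(rgb, gains))


skin = (224, 172, 138)         # hypothetical skin-colored pixel
beige_wall = (222, 196, 160)   # hypothetical beige background pixel
bluish_light = (0.7, 0.9, 1.2) # hypothetical illumination change

print(looks_like_skin(skin))                             # skin is accepted
print(looks_like_skin(beige_wall))                       # beige is also accepted
print(looks_like_skin(apply_gains(skin, bluish_light)))  # tinted skin is rejected
```

Tightening the thresholds to reject the beige pixel would make the classifier even more sensitive to the illumination change, which is the trade-off the paragraph above points out.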
In order to avoid such problems, a method that facilitates extraction by imposing a constraint on the background image, e.g., by placing a blue mat in the background, may be used. Alternatively, a method that colors the fingertips so that they can be easily extracted from the background, or that makes the operator wear colored rings, may be used. However, such constraints are not practical; they are used for experimental purposes but have not been put into practical use.
The above-mentioned video image recognition processing, such as extraction, requires a very large amount of computation. For this reason, existing personal computers cannot process all video images (as many as 30 images per second) in real time. Hence, it is hard to attain real-time motion capture by video image processing.
A device called a range finder for inputting a distance image is known. The typical principle of the range finder is to irradiate an object with spot light or slit light and to obtain a distance image, by the principle of triangulation, from the position at which the light reflected by the object is received. The range finder mechanically scans the spot light or slit light to obtain two-dimensional distance information. This device can generate a distance image with very high precision, but requires a large-scale arrangement, resulting in high cost. Also, a long input time is required, and it is difficult for this device to process information in real time.
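As a rough illustration of the triangulation principle mentioned above, the following sketch assumes a simplified geometry that is not specified in this description: the light beam is parallel to the optical axis of the receiving lens and separated from it by a baseline, so that, by similar triangles, the depth equals the focal length times the baseline divided by the offset of the received spot on the sensor. The function name and its parameters are hypothetical.

```python
import math


def depth_from_triangulation(baseline_m, focal_length_m, spot_offset_m):
    """Depth of the lit point under the simplified geometry assumed above.

    baseline_m     -- distance between the light beam and the lens axis
    focal_length_m -- focal length of the receiving lens
    spot_offset_m  -- offset of the received spot from the sensor center
    """
    if spot_offset_m <= 0:
        raise ValueError("spot offset must be positive")
    # Similar triangles: spot_offset / focal_length = baseline / depth
    return focal_length_m * baseline_m / spot_offset_m


# Example with assumed values: 10 cm baseline, 10 mm lens, 1 mm spot offset.
depth = depth_from_triangulation(0.10, 0.010, 0.001)
print(math.isclose(depth, 1.0))
```

Because one spot or one slit yields only one point or one line of depth values per measurement, the light must be scanned mechanically to cover a whole image, which is consistent with the long input time noted above.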
A device that detects, from an image, a color marker or light-emitting unit attached to a hand or body portion, and captures the shape, motion, and the like of the hand or body portion may be used, and has already been put into some applications. However, such a device is seriously inconvenient for the user, since the user must wear it for every operation, and its application range is very limited. As in the example of the data glove, when the user wears the device on a movable portion such as a hand, durability often becomes a problem.
In addition to the aforementioned input devices, the problems in a conventional camera technique will be explained below. With the conventional camera technique, in order to synthesize (chromakey) a character with a background, the character image must be photographed in front of a blue background to facilitate character extraction. For this reason, the photographing place is limited to, e.g., a studio in which an image can be photographed in front of a blue background. Alternatively, in order to extract a character from an image photographed without a blue background, the character extraction range must be manually edited frame by frame, resulting in very cumbersome operations.
Similarly, in order to generate a character in a three-dimensional space, a three-dimensional model is created in advance, and a photograph of the character is pasted onto the model (texture mapping). However, creating a three-dimensional model and performing texture mapping are tedious operations, and are rarely used except in applications, such as movie production, that justify the considerable cost required.
In order to solve these problems, for example, a technique disclosed in U.S. Ser. No. 08/953,667 (now U.S. Pat. No. 6,144,366) is known. This technique acquires a distance image by extracting a reflected light image. However, this technique cannot obtain hue information of an object, since it extracts only the reflected light image. For this reason, two different types of cameras, i.e., a conventional imaging camera and a camera for extracting a reflected light image, are required.
It is an object of the present invention to provide an information input apparatus and method that can acquire a reflected light image using a versatile image sensor, and a recording medium.
It is another object of the present invention to provide an information input apparatus and method which can attain high-level image processing such as extraction of an object image alone from the background in a normal image, and the like, and a recording medium.
In order to achieve the above objects, according to the first aspect of the present invention, an information input apparatus comprises: a light emitter for irradiating an object with light; an area image sensor for outputting a difference between charges received by light-receiving cells arranged in an array pattern from reflected light of the object caused by the light emitter irradiating the object with light; a timing signal generator for generating a timing signal comprised of a pulse signal or a modulation signal for controlling an intensity of light of the light emitter; a control signal generator for generating a control signal for individually controlling light-receiving timings of the light-receiving cells of the area image sensor on the basis of the timing signal from the timing signal generator; and an image processing section for extracting a reflected light image of the object from the difference outputted from the area image sensor.
According to the second aspect of the present invention, an information input apparatus comprises: a timing signal generator for generating a timing signal comprised of a pulse signal or a modulation signal; a light emitter for emitting light, an intensity of which changes on the basis of the timing signal from the timing signal generator; a first light-receiving section for receiving light emitted by the light emitter and reflected by an object in synchronism with the timing signal from the timing signal generator; and a second light-receiving section for receiving light other than the light emitted by the light emitter and reflected by the object.
According to the third aspect of the present invention, an information input method comprises the steps of: generating a pulse signal or a modulation signal; generating, on the basis of the pulse or modulation signal, a control signal for separately controlling light-receiving timings of light-receiving cells of an area image sensor for obtaining a difference between charges received by light-receiving cells which are arranged in an array pattern; emitting light, an intensity of which changes on the basis of the generated control signal; and detecting a light image reflected by an object of the emitted light.
According to the fourth aspect of the present invention, an information input method comprises the steps of: generating a pulse signal or modulation signal; emitting light, an intensity of which changes on the basis of the pulse or modulation signal; and receiving light reflected by an object of the emitted light and light other than the reflected light in synchronism with the pulse or modulation signal.
According to the fifth aspect of the present invention, an article of manufacture comprises: a computer usable medium having computer readable program code means embodied therein for causing an area image sensor for obtaining a difference between charges received by light-receiving cells which are arranged in an array pattern to be controlled, the computer readable program code means in the article of manufacture comprising: computer readable program code means for causing a computer to generate a pulse signal or a modulation signal; computer readable program code means for causing a computer to generate a control signal for separately controlling light-receiving timings of the light-receiving cells of the area image sensor on the basis of the pulse or modulation signal; computer readable program code means for causing a computer to cause a light emitter to emit light, an intensity of which changes on the basis of the generated pulse signal or modulation signal; and computer readable program code means for causing a computer to extract a light image reflected by an object of the emitted light from the difference outputted from the area image sensor.
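The extraction of a reflected light image from a difference between charges, as recited in the above aspects, can be sketched as follows. This is only an illustrative model under stated assumptions, not the implementation of the apparatus: it assumes that one charge of each cell is accumulated while the light emitter is on and the other while it is off, so that the ambient light component cancels in the difference. The function names, the sample values, and the threshold are hypothetical.

```python
def reflected_light_image(charges_lit, charges_unlit):
    """Per-cell difference between the charge received while the emitter
    is on and the charge received while it is off (assumed timing)."""
    return [
        [lit - unlit for lit, unlit in zip(row_lit, row_unlit)]
        for row_lit, row_unlit in zip(charges_lit, charges_unlit)
    ]


def object_mask(image, threshold):
    """Mark cells whose reflected light component exceeds a threshold."""
    return [[value > threshold for value in row] for row in image]


# Hypothetical 2x2 example: ambient light alone, and the extra light
# reflected by a nearby object while the emitter is on.
ambient = [[10, 12], [60, 11]]
reflection = [[0, 5], [0, 30]]

charges_unlit = ambient
charges_lit = [
    [a + r for a, r in zip(row_a, row_r)]
    for row_a, row_r in zip(ambient, reflection)
]

difference = reflected_light_image(charges_lit, charges_unlit)
print(difference)                  # ambient component cancels
print(object_mask(difference, 3))  # cells lit by the emitter's reflection
```

In this model the bright ambient cell (value 60) is excluded from the mask because it contributes nothing to the difference, which is why the difference output lends itself to separating an object from its background.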
According to the present invention, since a reflected light image can be acquired using a versatile image sensor, the cost of the apparatus can be reduced.
According to the present invention, high-level image processing such as extraction of an object image alone from the background in a normal image, and the like can be easily implemented.
Furthermore, according to the present invention, a reflected image and an image based on other light components can be simultaneously obtained.
Additional objects and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objects and advantages of the invention may be realized and obtained by means of the instrumentalities and combinations particularly pointed out hereinafter.