This application is based on Japanese Patent Application No. 10-31659, filed Feb. 23, 1998, the contents of which are incorporated herein by reference.
The present invention relates to an information input apparatus which attains pointing in a three-dimensional space using an image.
As an input device for a computer, and in particular as a pointing device, the mouse is prevalent, since most computers are equipped with one. However, the mouse serves merely as a two-dimensional pointing device, for operations such as moving the cursor, selecting a menu item, and the like.
Since the information the mouse can process is two-dimensional, the mouse can hardly select, e.g., an object with depth in a three-dimensional space. Likewise, when the mouse is used to animate a character in creating an animation, it cannot easily animate the character naturally.
In order to compensate for such difficulties in pointing in a three-dimensional space, various three-dimensional pointing devices have been developed.
As a typical three-dimensional pointing device, for example, a device shown in FIG. 1 is known. This three-dimensional pointing device allows six ways of operations, i.e., “pushing a central round control knob 150 forward”, “pressing the center of the knob 150”, “pressing the rear end of the knob 150”, “lifting the entire knob upward”, “turning the entire knob 150 clockwise”, and “turning the entire knob 150 counterclockwise”, and has six degrees of freedom.
By assigning these six degrees of freedom to various operation instructions, the position (x, y, z) and directions (x-, y-, and z-axes) of a cursor in a three-dimensional space can be controlled, or the view point position (x, y, z) and directions (x-, y-, and z-axes) with respect to the three-dimensional space can be controlled.
However, in actual operation of this device, the cursor or view point often cannot be controlled as intended.
For example, when the operator wants to turn the knob clockwise or counterclockwise, he or she may press its forward or rear end, and the cursor or view point may move in an unexpected direction.
In place of such a three-dimensional pointing device, devices that can input instructions using hand or body actions have been developed.
Such devices are called, e.g., a data glove, data suit, cyber glove, and the like. For example, the data glove is a glove-like device, and optical fibers run on its surface. Each optical fiber runs to a joint of a finger, and when the finger bends, the transmission state of light changes. By measuring the transmission state of light, the degree of bending of the joint of each finger can be detected. The position of the hand itself in the three-dimensional space is measured by a magnetic sensor attached to the back of the hand. If an action is assigned to a given instruction (e.g., if the index finger is pointed up, a forward movement instruction is issued), the operator can walk in the three-dimensional space by variously changing the view point using the data glove (walkthrough).
However, such a device suffers from several problems.
First, such a device is expensive, and is hardly suitable for home use.
Second, operation may often be erroneously recognized. Since the angle of the finger joint is measured, even when, for example, a state wherein the operator stretches only his or her index finger and bends the other fingers is defined as a forward movement instruction, such a state may be erroneously recognized as another instruction. More specifically, stretching a finger includes various states. That is, since the second joint of the index finger rarely makes exactly 180°, it is difficult to recognize the stretched state in anything other than this 180° state of the index finger, unless a given margin is allowed.
Third, since the operator must wear the data glove, his or her natural movement is disturbed.
Fourth, every time the operator wears the data glove, he or she must calibrate the transmission state of light in correspondence with the stretched and bent finger states, resulting in troublesome operations.
Fifth, the problem of failures remains unsolved. That is, after continuous use of the data glove, failures such as disconnection of fibers may take place, and the data glove is about as durable as a consumable item.
Sixth, despite the fact that the data glove is such an expensive, troublesome device, if the glove size does not exactly fit the operator's hand, the input value may deviate from the calibrated value during use due to slippage of the glove, and delicate hand actions can hardly be recognized.
Owing to the various problems described above, the data glove has not become as widespread as initially expected, although it served as a trigger device for VR (virtual reality) technology. For this reason, the data glove is still expensive, and has many problems in terms of its use.
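The second problem above, the need for a margin when deciding whether a finger is stretched, can be illustrated with a minimal sketch. The function name, the margin values, and the 170° reading are hypothetical and are chosen only for illustration; they do not describe how any actual data glove classifies joint angles.

```python
def is_stretched(joint_angle_deg: float, margin_deg: float) -> bool:
    """Treat a joint as stretched if its angle is within margin_deg of 180 degrees."""
    return joint_angle_deg >= 180.0 - margin_deg


# Hypothetical reading: a finger the operator considers straight,
# whose second joint actually measures 170 degrees.
reading = 170.0

print(is_stretched(reading, margin_deg=0.0))   # strict 180-degree test rejects it
print(is_stretched(reading, margin_deg=20.0))  # a 20-degree margin accepts it
```

A wider margin accepts more natural hand postures, but it also narrows the gap between the "stretched" and "bent" states, which is one way in which one instruction can be mistaken for another.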
By contrast, some studies have been made to input hand and body actions without wearing any special devices such as a data glove.
As a typical study for inputting hand or body actions, for example, a method of recognizing hand shape by analyzing a moving image such as a video image is known.
However, in this method, an objective image (in case of hand action recognition, a hand image alone) must be extracted from the background image, but it is very hard to extract the objective image portion.
For example, assume that a “hand” as an objective image is extracted using colors. Since the hand has skin color, only a skin color portion may be extracted. However, if a beige clothing article or wall is present in the background, it is hard to distinguish skin color from it, and such a method is far from practical. Even when beige is distinguished from skin color by adjustment, if the illumination changes, the color tone also changes. Hence, it is difficult to extract a skin color portion reliably.
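The weakness of color-based extraction described above can be sketched as follows. The RGB bounds and the sample pixel values are hypothetical, made up for this illustration; a fixed color range is only one simple way such an extraction might be attempted.

```python
# Hypothetical RGB bounds for "skin color"; the values are assumptions.
SKIN_MIN = (180, 130, 100)
SKIN_MAX = (255, 200, 180)


def looks_like_skin(rgb):
    """Return True if every channel lies inside the fixed skin-color range."""
    return all(lo <= c <= hi for c, lo, hi in zip(rgb, SKIN_MIN, SKIN_MAX))


def dim(rgb, factor):
    """Scale a pixel to imitate weaker illumination."""
    return tuple(int(c * factor) for c in rgb)


hand = (224, 172, 150)        # hypothetical hand pixel
beige_wall = (222, 184, 160)  # hypothetical background pixel

print(looks_like_skin(hand))            # the hand is accepted
print(looks_like_skin(beige_wall))      # but so is the beige wall
print(looks_like_skin(dim(hand, 0.6)))  # and the dimly lit hand is rejected
```

With these sample values, the same fixed range both accepts a background pixel and rejects the hand once the illumination drops, which corresponds to the two failure cases named in the text.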
In order to avoid such problems, a method that facilitates extraction by imposing a constraint on the background image, e.g., by placing a blue mat on the background may be used. Alternatively, a method that colors finger tips to easily extract them from the background or makes the operator wear color rings may be used. However, such constraints are not practical; they are used for experimental purposes but are not put into practical applications.
The above-mentioned video image recognition, such as extraction and the like, requires a very large amount of computation. For this reason, existing personal computers cannot process all video images (as many as 30 images per second) in real time. Hence, it is hard to attain motion capture by video image processing in real time.
As another method of inputting hand or body actions by analyzing a moving image such as a video image, a method using a device called a range finder for inputting a distance image is known.
The typical principle of the range finder is to irradiate an object with spot light or slit light and to obtain a distance image, by the principle of triangulation, from the position at which the light reflected by the object is received. The range finder mechanically scans the spot light or slit light to obtain two-dimensional distance information. This device can generate a distance image with very high precision, but requires a large-scale arrangement, resulting in high cost. Also, a long input time is required, and it is difficult for this device to process information in real time.
As still another method of inputting hand or body actions by analyzing a moving image such as a video image, a device for detecting a color marker or light-emitting unit attached to a hand or body portion from an image, and capturing the shape, motion, and the like of the hand or body portion may be used. This device has already been put into some applications. However, the device has the serious drawback of inconveniencing the user, since the user must wear the device for every operation, and its application range is very limited. As in the example of the data glove, when the user wears the device on a movable portion such as a hand, the durability problem often arises.
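The triangulation principle mentioned above can be sketched for one simple geometry. The sketch assumes that the spot light beam runs parallel to the optical axis of the receiving lens at a known baseline distance from it, so that a spot at depth z is imaged at an offset of f·b/z from the image center. This geometry, the function name, and the numeric values are assumptions made for illustration and are not taken from any particular range finder.

```python
def depth_from_spot(focal_length_mm: float, baseline_mm: float,
                    spot_offset_mm: float) -> float:
    """Depth of the illuminated spot by similar triangles: z = f * b / offset.

    Assumes the beam is parallel to the optical axis, baseline_mm away from it.
    """
    if spot_offset_mm <= 0:
        raise ValueError("spot offset must be positive for a finite depth")
    return focal_length_mm * baseline_mm / spot_offset_mm


# Hypothetical values: 8 mm lens, 100 mm baseline.
# One measured spot offset per scan position gives one depth sample.
offsets_mm = [2.0, 1.6, 1.0]
depths_mm = [depth_from_spot(8.0, 100.0, x) for x in offsets_mm]
print(depths_mm)
```

Because each scan position yields only one depth sample (or one line of samples for slit light), covering a whole scene requires many mechanical scan steps, which is consistent with the long input time noted above.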
As described above, various three-dimensional pointing device systems are available. However, a promising system in the future is presumably the one that analyzes and uses a moving image such as a video image without forcing the operator to wear any device or to operate any device directly.
With a conventional camera technique, in order to synthesize (chromakey) a character with a background, a character image must be photographed in front of a blue back to facilitate character extraction. For this reason, the photographing place is limited to, e.g., a studio that can photograph an image in front of a blue back. Alternatively, in order to extract a character from an image photographed in a non-blue back state, the character extraction range must be manually edited in units of frames, resulting in very cumbersome operations.
Similarly, in order to generate a character in a three-dimensional space, a three-dimensional model is created in advance, and a photograph of the character is pasted onto the model (texture mapping). However, creation of a three-dimensional model and texture mapping are tedious operations and are rarely used outside applications, such as movie production, that can justify the extravagant cost involved.
In order to solve these problems, for example, a technique disclosed in U.S. Ser. No. 08/953,667 is known. This technique acquires a distance image by extracting a reflected light image. However, this technique cannot use commercially available sensor arrays.
As described above, in recent years, needs and requirements for three-dimensional inputs are increasing, but no direct-pointing input apparatuses that can easily input a gesture or motion without making the user wear any special devices are available.
Hence, development of a practical, simple three-dimensional input apparatus which can easily attain pointing or a change in view point in a three-dimensional space has been demanded.
It is an object of the present invention to provide a practical three-dimensional information input apparatus which can easily attain pointing or a change in view point in a three-dimensional space, and naturally animate an animation character directly using a user's gesture or motion.
In order to achieve the above object, according to the present invention, an information input apparatus for obtaining a difference image between object images corresponding to irradiated and non-irradiated states, comprises a light emitter for irradiating an object with light, an area image sensor having imaging units constructed by a two-dimensional matrix of a plurality of light-receiving elements that perform photoelectric conversion, and a plurality of CCD type charge transfer sections for transferring and outputting charges obtained by the imaging units, and a controller for controlling charge transfer timings from the light-receiving elements to the CCD type charge transfer sections to alternately arrange charges received when the light emitter emits light and charges received when the light emitter does not emit light in a predetermined sequence in all or the individual CCD type charge transfer sections of the area image sensor.
Further, according to the present invention, an apparatus for obtaining a difference image between object images corresponding to irradiated and non-irradiated states, comprises: a light emitter for irradiating an object with light; an area image sensor having imaging units constructed by a two-dimensional matrix of a plurality of light-receiving elements that perform photoelectric conversion, and a plurality of CCD type charge transfer means for transferring and outputting charges obtained by the imaging units; a controller for controlling charge transfer timings from the light-receiving elements to the CCD type charge transfer means to alternately arrange charges received when the light emitter emits light and charges received when the light emitter does not emit light in a predetermined sequence in all or the individual CCD type charge transfer means of the area image sensor; a delay line for delaying an output signal from the area image sensor by one horizontal scan time; and a difference circuit, one input of which is connected to the delay line, the other input of which is connected to the area image sensor, and which outputs a difference between two input signals.
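The role of the delay line and the difference circuit can be illustrated with a minimal software sketch. It assumes that the sensor outputs horizontal lines in the order non-irradiated, irradiated, non-irradiated, irradiated, and so on; this ordering, the function name, and the sample values are assumptions made for illustration, and the sketch models only the signal arithmetic, not the circuit itself.

```python
def line_differences(lines):
    """Yield (current line - previous line), like a one-line delay feeding a subtractor."""
    delayed = None  # plays the role of the one-horizontal-scan delay line
    for line in lines:
        if delayed is not None:
            yield [cur - prev for cur, prev in zip(line, delayed)]
        delayed = line


# Hypothetical sensor output: unlit, lit, unlit, lit.
lines = [
    [10, 10, 10],  # unlit: ambient light only
    [10, 60, 10],  # lit: ambient plus reflection in the middle pixel
    [12, 12, 12],  # unlit
    [12, 50, 12],  # lit
]

diffs = list(line_differences(lines))
useful = diffs[0::2]  # keep only the (lit - unlit) outputs
print(useful)
```

Under the assumed ordering, every other output of the subtractor is a lit-minus-unlit line; the outputs in between are unlit-minus-lit and would be discarded or handled separately.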
Further, according to the present invention, an information input apparatus for obtaining a difference image between object images corresponding to irradiated and non-irradiated states, comprises: an invisible-radiation emitting section for irradiating an object with invisible radiation; an area image sensor having imaging units constructed by a two-dimensional matrix of a plurality of invisible-radiation-receiving elements which convert invisible radiation into electrical signals, and a plurality of CCD type charge transfer sections for transferring and outputting charges obtained by the imaging units; and a controller for controlling charge transfer timings from the invisible-radiation-receiving elements to the CCD type charge transfer sections to alternately arrange charges received when the invisible-radiation emitting section emits invisible radiation and charges received when the invisible-radiation emitting section does not emit invisible radiation in a predetermined sequence in all or the individual CCD type charge transfer sections of the area image sensor.
With this arrangement, by controlling the timings of the two-dimensional matrix of light-receiving elements of the CCD type area image sensor, an image in which object image pixels corresponding to the emission and non-emission states are alternately arranged in units of pixels can be acquired directly from the sensor. A difference image can therefore be obtained in real time by extracting differences between pixels, and a reflected image of, e.g., a hand can be easily acquired in real time. This obviates the need for extraction of an object image, which is the most difficult step in conventional image processing and a bottleneck in its application. Hence, the present invention can easily and stably provide, at low cost and using commercially available components, various kinds of image processing that are difficult to put into practice with conventional methods, and can bring about drastic innovations in a broad range of markets such as industry, the home, entertainment, and the like.
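Why a difference between emission and non-emission pixels isolates the reflected image can be shown with a small synthetic example. The sketch assumes that each charge transfer column holds charges in the order emission, non-emission, emission, non-emission, and that ambient light is the same in both states of a pair; the helper names and all pixel values are hypothetical.

```python
def interleave(ambient_col, reflected_col):
    """Build one transfer column: (ambient + reflection), then ambient, per pixel pair."""
    column = []
    for a, r in zip(ambient_col, reflected_col):
        column += [a + r, a]
    return column


def difference_image(columns):
    """Subtract each non-emission charge from its emission partner, column by column."""
    result = []
    for column in columns:
        lit, unlit = column[0::2], column[1::2]
        result.append([max(l - u, 0) for l, u in zip(lit, unlit)])
    return result


# Hypothetical scene: very uneven ambient light, reflection only from a nearby hand.
ambient = [[30, 80, 200], [40, 90, 210]]
reflected = [[0, 50, 0], [0, 45, 0]]

columns = [interleave(a, r) for a, r in zip(ambient, reflected)]
print(difference_image(columns))
```

In this sketch the ambient term cancels regardless of how bright or uneven the background is, so only the reflected component remains and no separate color-based or shape-based extraction step is needed.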
Additional objects and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objects and advantages of the invention may be realized and obtained by means of the instrumentalities and combinations particularly pointed out in the appended claims.