The present invention relates to an information extraction apparatus and a method to extract information such as a moving vector or a position of an object from a distance image.
For playing apparatuses such as arcade game machines, games using human movement, such as boxing games and shooting games, have been developed. In the boxing game, the pressure of the user's actual blow to a sandbag is physically measured. In this case, only the pressure at the moment when the user's punch hits the sandbag is measured. Information such as the path and speed of the user's punch is not used. Therefore, the game is monotonous, merely measuring the strength of a simple punch.
In the shooting game, a light emission source is attached to the tip of a toy gun, and the position on a screen where the light from the toy gun hits is detected. When the trigger of the gun is pulled, the lighted object is determined to be hit by a shot. Therefore, this is not a game that tests shooting skill, but a simple contest of the user's reflexes in quickly finding and shooting a target.
On the other hand, a method is used to recognize the position or movement of a human body by analyzing an image input by a CCD camera. In this method, the CCD camera inputs an image taken facing the direction in which the human advances. When the human is extracted from the image, use is made of the fact that, for example, the user's face and hands are skin colored. As preprocessing, useless parts such as the background are excluded, and the human part is extracted from the image by using the skin color. Then, the shape and movement of the human part are recognized.
First, the preprocessing for a recognition object is explained. In the prior art, when the object is extracted from the input image, the extraction is executed by using a feature that differs between the object and the other parts. As this feature, a change of hue or a difference image is used. When the change of hue is used, a part whose difference of hue is large is extracted, thinning is applied to that part, and an edge is extracted. When a human is the object, only the part having the skin-color hue is extracted, by using the skin color of the human's face and hands. However, the skin color itself changes depending on the color and angle of the illumination. If the hue of the background is similar to the skin color, it is difficult to discriminate the human part from the background in the image. Under conditions such as no illumination, all or part of the input image becomes dark. Therefore, it is also difficult to discriminate the human part in the input image.
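The hue-based preprocessing described above, and the two failure cases it suffers from, can be illustrated with a minimal sketch. The hue window, the saturation and brightness thresholds, and the function name `skin_mask` are all assumptions chosen for illustration; they are not taken from the prior art being discussed.

```python
import colorsys

# Hypothetical thresholds, chosen only for illustration.
SKIN_HUE_MIN = 0.0     # lower bound of the assumed skin-color hue window
SKIN_HUE_MAX = 0.14    # upper bound of the assumed skin-color hue window
MIN_SATURATION = 0.15  # below this a pixel is nearly gray and has no usable hue
MIN_VALUE = 0.2        # below this a pixel is too dark to classify


def skin_mask(image):
    """Return a 2D list of booleans, True where a pixel looks skin colored.

    `image` is a 2D list of (r, g, b) tuples with components in 0..255.
    """
    mask = []
    for row in image:
        mask_row = []
        for r, g, b in row:
            h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            is_skin = (
                v >= MIN_VALUE
                and s >= MIN_SATURATION
                and SKIN_HUE_MIN <= h <= SKIN_HUE_MAX
            )
            mask_row.append(is_skin)
        mask.append(mask_row)
    return mask


# One row: a skin-like pixel, a blue background pixel, an orange background
# pixel whose hue resembles skin, and the skin-like pixel under dim light.
row = [(224, 172, 105), (0, 0, 255), (230, 150, 80), (22, 17, 10)]
print(skin_mask([row]))
```

In this sketch the orange background pixel is accepted as skin and the dimly lit skin pixel is rejected, which are the two difficulties noted in the paragraph above.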
When information such as the movement or the shape of a target object is extracted from the image, a plurality of target objects are sometimes included in the image. For example, the user works clay, or deforms a virtual photograph by pulling its corners with both hands. In short, when both hands are used for the operational motion, each hand (right hand, left hand) must be recognized independently. In this case, an image for the right hand and an image for the left hand are respectively input and recognized. In order to recognize both hands independently, two cameras must be used. As a result, the cost of this method is high and the computational load is heavy.
As another method, a moving object is analyzed by calculating a moving vector (called an optical flow) between frames of a video image. In this method, if many objects are included in the video image, the number of optical flows increases sharply. Therefore, the load of calculating the optical flows between frames also increases, and the processing cannot keep up.
Furthermore, when human movement is extracted from the image, many useless movements are included in the image. Therefore, it is very difficult to correctly extract the movement of a target. For example, when a theatrical shooting game is played with the user's hand instead of a gun, the hand is unstable for positioning. Assume that the user shoots at a target by forming his hand into the shape of a gun. The user determines the shooting position and direction with the forefinger as the muzzle of the gun, and virtually fires at the moment when the user pulls the thumb as a trigger. In this case, the direction of the forefinger changes slightly between the positioning moment and the firing moment. If the user's shooting skill improves, he can pull the thumb without changing the direction of the forefinger. However, requiring this special skill of all users prevents them from easily enjoying this virtual shooting game.
Furthermore, instead of pulling the thumb after positioning the forefinger, assume that the user moves the gun-shaped hand toward the target in order to virtually pull the trigger and fire. In this case, while the user's hand approaches the target, the direction and the position of the forefinger change. If the user moves the hand toward the target without moving the forefinger, the user feels an unnatural force in the lower limbs and has a cramp.
The same problem is well known as hand movement when taking a picture. In the case of a camera, the object to be photographed stands still. In short, on the assumption of a still picture, the camera provides a function to prevent the effect of hand movement. However, in the case of a gesture using the hands or body, movement of the hands or the body is itself assumed. Therefore, a method to stabilize the hand's operation, such as this prevention function, is not adopted.
It is an object of the present invention to provide an information extraction apparatus and a method to extract the movement and the position of each object from a distance image even if a plurality of objects are included in the distance image.
It is another object of the present invention to provide an information extraction apparatus and a method to correctly extract the movement of an object without being affected by useless movement of other parts of the distance image.
According to the present invention, there is provided an information extraction apparatus, comprising: distance image input means for inputting a distance image including a plurality of objects to be respectively recognized; area division means for dividing the distance image into a plurality of areas in correspondence with an active event; and image processing means for recognizing the plurality of objects respectively included in the areas, and for supplying the recognition result to the active event.
Further in accordance with the present invention, there is also provided an information extraction apparatus, comprising: distance image input means for inputting a plurality of distance images including an object to be recognized; basis image extraction means for extracting, from the plurality of distance images, a part of the object that serves as the basis of its movement as a basis image; and image processing means for calculating a difference image between the basis image and the distance image, and for recognizing the movement of the object in the difference image.
Further in accordance with the present invention, there is also provided an information extraction method, comprising the steps of: inputting a distance image including a plurality of objects to be recognized; dividing the distance image into a plurality of areas in correspondence with an active event; recognizing the plurality of objects respectively included in the areas; and supplying a recognition result to the active event.
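The steps of this method can be sketched as follows. This is only an illustrative sketch under several assumptions not stated in the text: the distance image is a 2D list in which a larger value means a nearer object and 0 means nothing is present, each active event is associated with one rectangular area, "recognition" is reduced to finding the centroid of the object in that area, and the names `centroid_in_area` and `dispatch` are hypothetical.

```python
def centroid_in_area(distance_image, area, threshold=1):
    """Return the (x, y) centroid of pixels >= threshold inside `area`.

    `area` is (top, left, bottom, right) with bottom and right exclusive.
    Returns None when no object pixel lies in the area.
    """
    top, left, bottom, right = area
    count = sum_x = sum_y = 0
    for y in range(top, bottom):
        for x in range(left, right):
            if distance_image[y][x] >= threshold:
                count += 1
                sum_x += x
                sum_y += y
    if count == 0:
        return None
    return (sum_x / count, sum_y / count)


def dispatch(distance_image, events):
    """Recognize the object in each event's area and supply the result.

    `events` maps an event name to (area, handler); each handler receives
    the recognition result for its own area only.
    """
    results = {}
    for name, (area, handler) in events.items():
        position = centroid_in_area(distance_image, area)
        results[name] = position
        handler(position)
    return results


# A 4x8 distance image containing two objects, e.g. a left and a right hand.
frame = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 5, 5, 0, 0, 0, 0, 0],
    [0, 5, 5, 0, 0, 0, 7, 7],
    [0, 0, 0, 0, 0, 0, 7, 7],
]
received = {}
events = {
    "left_hand": ((0, 0, 4, 4), lambda p: received.update(left_hand=p)),
    "right_hand": ((0, 4, 4, 8), lambda p: received.update(right_hand=p)),
}
results = dispatch(frame, events)
print(results)
```

Because each area is processed separately, both objects are recognized from a single distance image, and each event receives only the result for its own area.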
Further in accordance with the present invention, there is also provided an information extraction method, comprising the steps of: inputting a plurality of distance images including an object to be recognized; extracting, from the plurality of distance images, a part of the object that serves as the basis of its movement as a basis image; calculating a difference image between the basis image and the distance image; and recognizing the movement of the object in the difference image.
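One possible reading of these steps is sketched below. How the basis image is obtained is not specified at this point in the text; taking the per-pixel minimum over the input distance images, so that only the part present in every frame remains, is an assumption made purely for illustration. The pixel convention (larger value means nearer, 0 means empty) and the function names are likewise hypothetical.

```python
def basis_image(frames):
    """Per-pixel minimum over all frames (assumed basis extraction)."""
    height, width = len(frames[0]), len(frames[0][0])
    return [
        [min(f[y][x] for f in frames) for x in range(width)]
        for y in range(height)
    ]


def difference_image(frame, basis):
    """Subtract the basis from a frame, clipping negative values to 0."""
    return [
        [max(p - b, 0) for p, b in zip(frame_row, basis_row)]
        for frame_row, basis_row in zip(frame, basis)
    ]


def centroid(image):
    """Return the (x, y) centroid of nonzero pixels, or None if there are none."""
    points = [
        (x, y)
        for y, row in enumerate(image)
        for x, value in enumerate(row)
        if value > 0
    ]
    if not points:
        return None
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def movement(frames):
    """Displacement of the moving part between the first and last frame."""
    basis = basis_image(frames)
    start = centroid(difference_image(frames[0], basis))
    end = centroid(difference_image(frames[-1], basis))
    if start is None or end is None:
        return None
    return (end[0] - start[0], end[1] - start[1])


# The stationary part occupies the two left columns in both frames;
# a small part (value 4) moves from the top right to the bottom right.
frame_a = [[9, 9, 0, 4], [9, 9, 0, 0], [9, 9, 0, 0]]
frame_b = [[9, 9, 0, 0], [9, 9, 0, 0], [9, 9, 0, 4]]
print(movement([frame_a, frame_b]))
```

In this sketch the stationary part cancels out in the difference image, so only the moving part contributes to the recognized movement.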
Further in accordance with the present invention, there is also provided a computer readable memory containing computer readable instructions, comprising: instruction means for causing a computer to input a distance image including a plurality of objects to be recognized; instruction means for causing a computer to divide the distance image into a plurality of areas in correspondence with an active event; instruction means for causing a computer to recognize the plurality of objects respectively included in the areas; and instruction means for causing a computer to supply a recognition result to the active event.
Further in accordance with the present invention, there is also provided a computer readable memory containing computer readable instructions, comprising: instruction means for causing a computer to input a plurality of distance images including an object to be recognized; instruction means for causing a computer to extract, from the plurality of distance images, a part of the object that serves as the basis of its movement as a basis image; instruction means for causing a computer to calculate a difference image between the basis image and the distance image; and instruction means for causing a computer to recognize the movement of the object in the difference image.