The present invention relates to an image processing apparatus and image processing method used for three-dimensional CAD (computer-aided design) for aiding in designing industrial parts and so on, the creation of images using three-dimensional CG (computer graphics), or HI (human interface) using portraits, and mobile robots, and suitable for aiding in forming the 3D model of objects.
The present invention also relates to an image processing apparatus and method for realizing head tracking that follows the movement of a person's head, video compression that decreases the amount of data required for image communication by extracting the motion vector of a person in teleconferencing, videophone, and the like, a three-dimensional pointer for pointing in virtual reality, including application to games, and so on.
Recently, there have been increasing demands for computer processing of images, including image generation, image processing, and image recognition, such as three-dimensional CAD (computer-aided design) for aiding in designing industrial parts, the creation of images using three-dimensional CG (computer graphics), or HI (human interface) using portraits.
In those processes, it is necessary to digitize the geometrical shape of a target object, its surface attributes and, if necessary, its motion data before inputting them. This input processing is referred to as modeling, and the digitized data that is input is called a model. At present, modeling work requires a lot of manual labor. Therefore, there have been strong demands for automation from the points of view of productivity and cost.
In this connection, approaches have been studied for automatically determining the shape and surface attributes of a target object by analyzing the images obtained from a camera. Specifically, attempts have been made to automatically model an object in the environment by the stereographic method, which determines distance by the principle of triangulation using a plurality of cameras, or by determining distance through analysis of the images obtained by changing the focal length of a single camera.
When only a single camera is used, the object has to stay still while the focal length is changed. When a plurality of cameras are used in the stereographic method, the cost is higher than when a single camera is used. To shoot a moving object from different directions with a single camera and use the stereographic method, the problem of correlating the direction of the object with the relative change in the direction of illumination must still be solved. This problem, however, has not been solved. Accordingly, a technique for automatically creating a model of a moving object with a complex shape, such as the head of a person, with practical accuracy and processing speed has not yet been achieved.
In recent years, as images using three-dimensional CG (computer graphics), rather than two-dimensional CG, are used more and more in the fields of video games and VR (virtual reality), the need for a three-dimensional mouse serving as a pointing device for three-dimensional images has been increasing. To meet this need, various types of three-dimensional mouse have been developed.
For instance, a three-dimensional mouse enabling movement and a pointing action in a three-dimensional space by operating the buttons on the device has been developed. Although three-dimensional pointing can be done, it is difficult for users to work in a three-dimensional space because the display on which the images appear is two-dimensional.
Gloves provided with sensors for sensing the movement of the joints of the fingers or the like have also been developed. Users can work in a three-dimensional space while wearing the gloves. Although this type of device provides more sensory operation than the three-dimensional mouse, it has the disadvantage that the user has to wear special gloves. Three-dimensional pointing using the whole body requires sensing of the posture and the movement of the head. To do this, a special device provided with a sensor has to be put on each part to be sensed. This is not practical.
To overcome this problem, the following method can be considered: the movement of the operator is shot with a television camera; the movement and state of the image are analyzed; and three-dimensional pointing is done to give operation instructions on the basis of the result of the analysis. This enables the operator to give instructions by moving the fingers and body, which makes it easier to operate the system. Therefore, this method seems to solve the problem at a stroke.
Concrete steps to solve the problem are now under study. At present, no practical approach to the problem has been established.
In the fields of teleconference and videophone, efforts have been made to reduce the amount of transmitted data by storing specific models beforehand, generating a blink or creating the shape of a mouth according to speech, and adding these to the models. At present, however, a technique for acquiring information on the movement of a target object in the image has not been established.
As described above, with the conventional methods, when the focal length is changed, the object must keep still. Using a plurality of cameras in the stereographic method costs more than using a single camera.
To shoot a moving object from different directions with a single camera and use the stereographic method, the problem of correlating the direction of the object with the relative change in the direction of illumination must still be solved. This problem, however, has not been solved. Therefore, it has been impossible to form a practical model of a moving object with a complex shape, such as the head of a person.