1. Field of the Invention
The present invention relates to an image processing method and apparatus for reconstructing, based on images photographed by many cameras, an image viewed from an established point of view (hereinafter called POV) and direction.
2. Related Background Art
A conventional digital camera merely photographs an image as viewed from the position where the camera is set up, whereby it is impossible for the conventional digital camera to reconstruct an image viewed from a different position. Meanwhile, in the field of computer graphics (CG), a technique called image-based rendering, by which an image of an arbitrary POV is generated from many images, has been investigated.
Hereinafter, a method of reconstructing the image of the arbitrary POV from many images through image-based rendering will be explained. For convenience of explanation, a camera model as shown in FIG. 9 is used. That is, in FIG. 9, the range spreading between the dotted lines centered on the camera position (POV position) is the angle of view, the pixel positioned at the intersection point between the image constitution surface and the beam from the subject shows a color corresponding to that beam, and the collection of such pixels constitutes the entire image photographed by the digital camera.
FIG. 10 is a diagram for explaining the existing image-based rendering technique on the basis of the camera model shown in FIG. 9. In FIG. 10, symbols (A), (B), (C) and (D) respectively denote actual camera photographing positions (also simply called cameras (A), (B), (C) and (D)), and symbol (X) denotes a virtual camera POV position at which camera photographing is not actually performed (also simply called virtual camera (X)). Here, if it is assumed that the color is the same everywhere along the beam passing through the POV position of the virtual camera (X) and the POV position of the camera (B) (that is, no beam attenuation or the like occurs), the color of a pixel x2 and the color of a pixel b2 are the same, whereby the pixel x2 can be inferred from the pixel b2. Likewise, a pixel x1 can be inferred from a pixel c1 of the camera (C). In the same way, an image at the virtual camera POV position (X), at which camera photographing is not actually performed, can be inferred by gathering pixel information from the images photographed at the various POV positions.
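The inference of a pixel of the virtual camera (X) from a pixel of an actual camera can be illustrated by the following minimal sketch, which is not taken from the related art itself. It assumes a two-dimensional version of the camera model (a one-dimensional image line at unit distance from the POV position), an angle of view smaller than 180 degrees, and an actual camera lying in front of the virtual camera, between it and the subject. The names Camera, pixel_for_direction and infer_pixel, and all coordinates, are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Camera:
    x: float        # POV position
    y: float
    heading: float  # viewing direction in radians
    fov: float      # full angle of view in radians (assumed < pi)
    width: int      # number of pixels on the 1-D image line


def pixel_for_direction(cam, angle):
    """Index of the pixel crossed by a beam travelling along `angle`,
    or None if the beam is outside the camera's angle of view."""
    # Signed offset of the beam from the viewing direction, wrapped to (-pi, pi].
    off = math.atan2(math.sin(angle - cam.heading), math.cos(angle - cam.heading))
    half = cam.fov / 2
    if abs(off) >= half:
        return None
    # Position on the image line, normalised to the range (0, 1).
    u = (math.tan(off) / math.tan(half) + 1) / 2
    return min(int(u * cam.width), cam.width - 1)


def infer_pixel(virtual, real, real_image):
    """Return (virtual pixel index, color) copied from `real`, or None.

    Assumes the color is constant along the beam through both POV positions
    (no attenuation), as stated in the text.
    """
    # Direction of the line joining the two POV positions.
    angle = math.atan2(real.y - virtual.y, real.x - virtual.x)
    xi = pixel_for_direction(virtual, angle)
    ri = pixel_for_direction(real, angle)
    if xi is None or ri is None:
        return None
    return xi, real_image[ri]


# Hypothetical layout: both cameras look along +y with a 90-degree angle of view.
X = Camera(0.0, 0.0, math.pi / 2, math.pi / 2, 10)
B = Camera(1.0, 2.0, math.pi / 2, math.pi / 2, 10)
b_image = [f"c{i}" for i in range(10)]  # placeholder colors of camera (B)
result = infer_pixel(X, B, b_image)
print(result)
```

Repeating infer_pixel for every actual camera fills in one pixel of the virtual image per camera, which is why many photographing positions are needed to obtain a complete image.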
Incidentally, in the case of the POV position and direction of the camera (A) or (D), the beam passing through the POV position of the virtual camera (X) and the POV position of the camera (A) or (D) is outside the angle of view of the virtual camera (X), whereby these cameras provide no pixel that can be used to reconstruct the image viewed from the virtual camera (X). For this reason, it is necessary to photograph many images from POV positions and directions that lie within the angle of view of the virtual camera (X), such as those of the cameras (B) and (C).
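The angle-of-view condition described above can be sketched as a simple test on the direction from the virtual camera (X) to each actual camera. The following is only an illustration in two dimensions; the function name in_angle_of_view, the camera coordinates and the 60-degree angle of view are assumptions and are not read from FIG. 10.

```python
import math


def in_angle_of_view(pov, heading, fov, target):
    """True if the beam from `pov` toward `target` lies inside the angle of view."""
    angle = math.atan2(target[1] - pov[1], target[0] - pov[0])
    # Signed offset from the viewing direction, wrapped to (-pi, pi].
    off = math.atan2(math.sin(angle - heading), math.cos(angle - heading))
    return abs(off) < fov / 2


# Hypothetical layout: virtual camera (X) at the origin, looking along +y.
x_pov = (0.0, 0.0)
x_heading = math.radians(90)
x_fov = math.radians(60)
cameras = {"A": (-3.0, 1.0), "B": (-0.5, 2.0), "C": (0.5, 2.0), "D": (3.0, 1.0)}

# Only cameras whose POV position falls inside the angle of view can contribute.
usable = [name for name, pos in cameras.items()
          if in_angle_of_view(x_pov, x_heading, x_fov, pos)]
print(usable)
```

With this assumed layout, the cameras placed far to either side of the viewing direction are rejected, mirroring the situation of the cameras (A) and (D) in the text.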
Consequently, in the above conventional technique, the many photographed images are all first stored in a memory and then processed, whereby a vast memory capacity is necessary. In addition, when many images are photographed by using a single camera, it is necessary to photograph these images while changing the POV position and direction of the camera one by one, whereby there is a problem that image photographing takes a long time. Besides, there is a problem that an animation cannot be reproduced based on the images photographed by the single camera. To cope with these problems, a method has been devised in which many cameras are disposed on a network, images are photographed simultaneously by these cameras, and the many photographed images are processed by a server computer. However, in that case, it is necessary to transmit a large amount of photographed image data to the server computer, whereby there is a problem that the load on the network becomes huge.