Attempts have been made to sense a real space with an image sensing apparatus mounted on a moving body such as a vehicle, and to express the sensed real space as a virtual space using a computer on the basis of the sensed photo-realistic image data (see, e.g., Endo, Katayama, Tamura, Hirose, Watanabe, & Tanikawa: “Method of Generating Image-Based Cybercities By Using Vehicle-Mounted Cameras” (IEICE Society, PA-3-4, pp. 276-277, 1997), or Hirose, Watanabe, Tanikawa, Endo, Katayama, & Tamura: “Building Image-Based Cybercities By Using Vehicle-Mounted Cameras (2): Generation of Wide-Range Virtual Environment by Using Photo-realistic Images” (Proc. of the Virtual Reality Society of Japan, Vol.2, pp. 67-70, 1997)).
One known method of expressing a sensed real space as a virtual space, on the basis of photo-realistic image data sensed by an image sensing apparatus mounted on a moving body, reconstructs a geometric model of the real space from the photo-realistic image data and expresses the virtual space using a conventional CG technique. However, this method is limited in the accuracy, precision, and realism of the model. On the other hand, Image-Based Rendering (IBR) techniques, which express a virtual space using photo-realistic images without reconstructing a model, have attracted attention. An IBR technique generates an image viewed from an arbitrary viewpoint on the basis of a plurality of photo-realistic images. Since it is based on photo-realistic images, it can express a realistic virtual space.
In order to build a virtual space that allows walkthrough using such an IBR technique, an image must be generated and presented in correspondence with the position of the user (who experiences the walkthrough) in the virtual space. For this reason, in such a system, the respective frames of photo-realistic image data and positions in the virtual space are saved in correspondence with each other, and the corresponding frame is acquired and reproduced on the basis of the user's position and visual axis direction in the virtual space.
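As an illustration only, the correspondence between saved frames and positions described above may be sketched as follows. The function name `nearest_frame`, the two-dimensional coordinates, and the nearest-neighbor selection rule are assumptions made for this sketch and are not taken from the cited systems; handling of the visual axis direction is omitted here.

```python
import math

def nearest_frame(frames, user_xy):
    """Return the id of the saved frame recorded closest to the user's position.

    frames  : list of (frame_id, (x, y)) pairs, i.e. each frame saved together
              with its position in the virtual space (hypothetical layout).
    user_xy : (x, y) position of the user in the virtual space.
    """
    if not frames:
        raise ValueError("no frames have been saved")
    # Pick the frame whose recorded position has the smallest Euclidean distance.
    frame_id, _ = min(frames, key=lambda f: math.dist(f[1], user_xy))
    return frame_id

if __name__ == "__main__":
    saved = [("f0", (0.0, 0.0)), ("f1", (5.0, 0.0)), ("f2", (10.0, 0.0))]
    print(nearest_frame(saved, (6.0, 1.0)))
```

A linear scan is used for brevity; a system holding many frames would more likely use a spatial index, which is outside the scope of this sketch.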
As a method of acquiring position data in a real space, a satellite-based positioning system such as GPS (Global Positioning System), as used in car navigation systems and the like, is generally used. As a method of associating position data obtained from the GPS or the like with photo-realistic image data, a method using a time code has been proposed (Japanese Patent Laid-Open No. 11-168754). With this method, each frame of photo-realistic image data is associated with position data by matching the time data contained in the position data against the time code appended to that frame.
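The time-code correspondence described above may be sketched, under stated assumptions, as follows. The name `position_at`, the `(t, x, y)` sample layout, and the use of linear interpolation between the two position samples that bracket a frame's time code are all assumptions of this sketch; the cited publication is not quoted here as specifying interpolation.

```python
from bisect import bisect_left

def position_at(time_code, gps_samples):
    """Estimate the position at which a frame with the given time code was sensed.

    gps_samples : list of (t, x, y) tuples sorted by time t (hypothetical layout).
    Time codes outside the sampled range are clamped to the first or last sample.
    """
    if not gps_samples:
        raise ValueError("no position data")
    times = [s[0] for s in gps_samples]
    i = bisect_left(times, time_code)
    if i == 0:
        return gps_samples[0][1:]
    if i == len(gps_samples):
        return gps_samples[-1][1:]
    t0, x0, y0 = gps_samples[i - 1]
    t1, x1, y1 = gps_samples[i]
    # Here t0 < time_code <= t1, so the denominator is never zero.
    w = (time_code - t0) / (t1 - t0)
    return (x0 + w * (x1 - x0), y0 + w * (y1 - y0))

if __name__ == "__main__":
    track = [(0.0, 0.0, 0.0), (1.0, 10.0, 0.0), (2.0, 10.0, 20.0)]
    for frame_time in (0.5, 1.0, 1.5):
        print(frame_time, position_at(frame_time, track))
```

Interpolation is chosen in this sketch because position samples and image frames are generally not recorded at the same instants; selecting the nearest sample would be an equally valid simplification.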
The walkthrough process in such a virtual space allows the user to look in a desired direction at each position. For this purpose, the image at each position may be saved as a panoramic photo-realistic image that covers a broader range than the field angle used upon reproduction; a partial image to be reproduced may then be extracted from the panoramic photo-realistic image on the basis of the user's position and visual axis direction in the virtual space, and the extracted partial image may be displayed.
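By way of example, extraction of a partial image from a full-view panoramic image may be sketched as below. The sketch assumes that the panorama is stored as rows of pixels whose columns span 360° of azimuth uniformly, with column 0 at azimuth 0°; the name `extract_view` and its parameters are hypothetical, and no perspective correction is applied.

```python
def extract_view(panorama, azimuth_deg, fov_deg):
    """Cut out the columns of a 360-degree panorama centered on a viewing direction.

    panorama    : list of rows, each a list of pixel values (all rows equal length).
    azimuth_deg : visual axis direction of the user, in degrees.
    fov_deg     : horizontal field angle of the reproduced image, in degrees.
    """
    width = len(panorama[0])
    view_cols = max(1, round(width * fov_deg / 360.0))
    center = (azimuth_deg % 360.0) / 360.0 * width
    start = int(round(center - view_cols / 2.0))
    # The modulo wraps the window around the seam between the last and first column.
    cols = [(start + k) % width for k in range(view_cols)]
    return [[row[c] for c in cols] for row in panorama]

if __name__ == "__main__":
    pano = [list(range(8))]             # one row, eight columns, 45 degrees each
    print(extract_view(pano, 180, 90))  # window around the middle of the panorama
    print(extract_view(pano, 0, 90))    # window that wraps around the seam
```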
Conventionally, in order to obtain a panoramic photo-realistic image, for example, a plurality of cameras are used. In general, a plurality of cameras are arranged in a radial pattern so that their visual fields can cover a desired visual field (e.g., a full-view (360°) visual field or the like). Images obtained by the respective cameras are temporarily stored in a storage device using, e.g., an optical or magnetic storage medium.
Then, by joining images sensed by the respective cameras at a given position, a panoramic photo-realistic image at that position can be obtained. In this case, in order to reduce hue or brightness discontinuities generated at the seams of neighboring images sensed by the respective cameras, overlapping portions of images sensed by neighboring cameras are continuously blended.
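The continuous blending of overlapping portions may be illustrated by the following sketch, which joins one scanline from each of two neighboring cameras. It assumes that the last `overlap` pixels of the left scanline and the first `overlap` pixels of the right scanline show the same scene and are already aligned; the name `blend_rows` and the linear weighting ramp are assumptions of this sketch rather than a method prescribed by the text.

```python
def blend_rows(left_row, right_row, overlap):
    """Join two scanlines, cross-fading linearly over their overlapping pixels.

    left_row, right_row : lists of brightness values from neighboring cameras.
    overlap             : number of pixels shared by the two scanlines.
    """
    if not 0 < overlap <= min(len(left_row), len(right_row)):
        raise ValueError("overlap must be between 1 and the shorter row length")
    blended = []
    for k in range(overlap):
        # The weight of the right image rises steadily across the overlap.
        w = (k + 1) / (overlap + 1)
        a = left_row[len(left_row) - overlap + k]
        b = right_row[k]
        blended.append((1.0 - w) * a + w * b)
    return left_row[:-overlap] + blended + right_row[overlap:]

if __name__ == "__main__":
    darker = [100, 100, 100, 100]
    brighter = [200, 200, 200, 200]
    print(blend_rows(darker, brighter, 3))
```

With a brightness step of 100 between the two cameras, the joined scanline rises gradually across the overlap instead of jumping at a single seam.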
As a data format of a panoramic photo-realistic image, broad visual field (including full-view) images sensed at an identical time from one viewpoint are preferably used. In order to sense such images, an apparatus in which a plurality of cameras sense images reflected by the respective surfaces of a pyramid mirror is used. FIG. 3 shows an example of such an apparatus.
As shown in FIG. 3, a pyramid mirror 12 is made up of as many plane mirrors as there are cameras in a camera unit 11. Each of the cameras which form the camera unit 11 senses the surrounding visual scene reflected by the corresponding plane mirror. If the cameras are laid out so that the virtual images of the lens centers of the respective cameras, as formed by the plane mirrors, coincide, images can be sensed at an identical time from one viewpoint. Hence, by compositing the images sensed by the respective cameras, a full-view panoramic image at an identical time from a single viewpoint can be generated.
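The condition that the virtual images of the lens centers coincide can be checked numerically, since the virtual image of a point formed by a plane mirror is the reflection of that point across the mirror plane. The following sketch assumes a four-sided pyramid mirror with faces inclined at 45° and a common virtual viewpoint at the origin; these dimensions, and the name `reflect_point`, are hypothetical and are not taken from FIG. 3.

```python
import math

def reflect_point(p, plane_point, normal):
    """Reflect point p across the plane through plane_point with the given normal."""
    n_len = math.sqrt(sum(c * c for c in normal))
    n = tuple(c / n_len for c in normal)
    # Signed distance from p to the mirror plane.
    d = sum((pc - qc) * nc for pc, qc, nc in zip(p, plane_point, n))
    return tuple(pc - 2.0 * d * nc for pc, nc in zip(p, n))

if __name__ == "__main__":
    h = 0.5  # assumed distance from the common viewpoint to each mirror plane
    for k in range(4):
        angle = math.pi / 2 * k
        normal = (math.cos(angle), math.sin(angle), -1.0)  # 45-degree face
        unit = tuple(c / math.sqrt(2.0) for c in normal)
        plane_point = tuple(h * c for c in unit)
        lens_center = tuple(2.0 * h * c for c in unit)     # assumed camera layout
        # Each printed virtual image lies at the origin up to rounding error.
        print(k, reflect_point(lens_center, plane_point, normal))
```

In this assumed layout, each lens center is placed at twice the mirror distance along the face normal, so that all four virtual images fall on the same point.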
However, since the image sensing apparatus shown in FIG. 3 produces almost no overlapping portions between the images sensed by the cameras corresponding to neighboring plane mirrors, the conventional method of reducing hue or brightness discontinuities by using the overlapping portions upon compositing images cannot be applied, and the seams of the images stand out.