1. Field of the Invention
The present invention relates to methods, apparatuses and programs for generating images combining real images and virtual images such as computer graphics images (CG images).
2. Description of the Related Art
A so-called mixed reality (MR) technique provides mixed reality including a CG image combined with a real space image (i.e., a background), according to which a user can feel as if a virtual object is present in the real space. The MR technique can be used in the fields of communication, traffic, recreation, and other various industries.
To smooth the processing for combining a virtual space with the real space and prevent a user from experiencing any sense of incongruity, the following consistencies are desirable in the MR technology:
    (1) geometrical consistency for mixing an object of the real space with an object of the virtual space while maintaining a correct spatial relationship;
    (2) optical consistency for naturally mixing a light source of the real space with a light source of the virtual space; and
    (3) temporal consistency for equalizing the time of the real space with the time of the virtual space.
To achieve these consistencies, it is important to correctly recognize the information and state of the real space and accurately input the information of the real space to the virtual space.
Various systems have conventionally been proposed to achieve optical consistency (i.e., one of the above-described consistencies).
The system discussed in literature 1 (I. Sato, Y. Sato, and K. Ikeuchi, “Acquiring a radiance distribution to superimpose virtual objects onto a real scene,” IEEE Transactions on Visualization and Computer Graphics, Vol. 5, No. 1, pp. 1-12, January-March 1999) obtains an omni-directional image of a real space captured by a fish-eye camera, estimates position information of a light source in the real environment from the obtained omni-directional image, and reflects the estimated position information of the light source in a virtual space.
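As a rough illustration of the kind of geometry involved in reading light source information from a fish-eye image (this is not the method of literature 1 itself), the following sketch maps a bright pixel to a viewing direction in the camera frame. It assumes an ideal equidistant fish-eye projection (r = f * theta) with the optical axis along +z; the function name and all parameters are hypothetical. A single direction does not give a position, so recovering a light source position would require additional information.

```python
import math

def fisheye_pixel_to_direction(u, v, cx, cy, f):
    """Map pixel (u, v) to a unit direction in the camera frame.

    Assumption: ideal equidistant fish-eye model, r = f * theta, where
    (cx, cy) is the principal point in pixels and f is the focal length
    in pixels. The optical axis is +z; x and y follow the image axes.
    """
    dx, dy = u - cx, v - cy
    r = math.hypot(dx, dy)          # radial distance from the image center
    if r == 0.0:
        return (0.0, 0.0, 1.0)      # pixel on the optical axis
    theta = r / f                   # angle from the optical axis
    s = math.sin(theta)
    return (s * dx / r, s * dy / r, math.cos(theta))
```

For example, under these assumptions a pixel at the image center maps to the optical axis, and a pixel at radius f * pi / 2 maps to a direction perpendicular to it.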
Furthermore, the system discussed in literature 2 (M. Kanbara, T. Iwao, and N. Yokoya, “Shadow Representation for Augmented Reality by Dynamic Shadow Map Method”, a lecture memoir for an image recognition and comprehension symposium (MIRU2005), pp. 297-304, July 2005) realizes a real-time estimation of the light source environment (the positions of light sources of the real world) based on a three-dimensional marker which combines a two-dimensional square marker and a mirror ball placed at the center of the marker, and disposes virtual light sources based on the estimated positions.
According to literature 2, the system calculates a relative position between a viewpoint and a virtual object based on the position and orientation of the two-dimensional square marker. Then, to estimate the light source environment, the system causes a camera mounted on a video see-through head mounted display (HMD) to capture a light source of the real environment reflected in the mirror ball (i.e., a highlight region of the mirror ball) based on the calculated relative position.
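The relation between a highlight on a mirror ball and the direction of the light that causes it can be sketched with the standard mirror-reflection formula. The example below is a minimal illustration only, not the procedure of literature 2: it assumes an orthographic view of the ball, a known ball center and radius in image pixels, and a hypothetical function name. The surface normal at the highlight is n = (x, y, z) with z = sqrt(1 - x^2 - y^2), the vector toward the camera is v = (0, 0, 1), and the direction toward the light is l = 2(n.v)n - v.

```python
import math

def light_direction_from_highlight(px, py, cx, cy, radius):
    """Estimate the unit direction toward a light from a mirror-ball highlight.

    Assumptions (illustrative only): orthographic viewing, ball silhouette
    centered at (cx, cy) with the given radius in pixels, camera located
    along +z from the ball. x and y follow the image axes.
    """
    # Normalize the highlight position to the unit disc of the silhouette.
    x = (px - cx) / radius
    y = (py - cy) / radius
    d2 = x * x + y * y
    if d2 > 1.0:
        raise ValueError("highlight lies outside the ball silhouette")
    z = math.sqrt(1.0 - d2)         # surface normal is n = (x, y, z)
    n_dot_v = z                     # v = (0, 0, 1) points toward the camera
    # Mirror reflection of the view vector about the normal: l = 2(n.v)n - v
    return (2.0 * n_dot_v * x, 2.0 * n_dot_v * y, 2.0 * n_dot_v * z - 1.0)
```

Under these assumptions, a highlight at the center of the ball corresponds to a light located behind the camera, while a highlight at about 0.707 of the radius from the center corresponds to a light perpendicular to the viewing direction.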
The system discussed in literature 1 can obtain light source information of the real environment in a single image-capturing operation. However, the system cannot perform a real-time estimation of the light source environment in response to a change in the illumination environment, and cannot reflect the change in the illumination environment in the virtual space.
The system discussed in literature 2 performs a real-time estimation of light source information and can reflect the estimation result in the virtual environment. However, a user is required to constantly keep the two-dimensional marker in the field of view and is also required to prepare a complicated marker (i.e., a three-dimensional marker including the two-dimensional marker and the mirror ball) beforehand.