1. Field of the Invention
The present invention relates to an image processing method and an image processing apparatus for combining a virtual image with a shot image to obtain a combined image and displaying the combined image.
2. Description of the Related Art
In the video production field, chroma key technology is used, in which a specified region is extracted from a live image picked up by a video camera and is combined with a computer-graphic (CG) image.
Chroma key technology is also used in the field of mixed reality (MR), in which a real space and a virtual space are combined naturally so that the user does not perceive them as separate. In the MR field, in order to extract only an object region from a shot image and display the object region, a technology has been suggested in which information is extracted from the shot image to generate, for example, a look-up table (LUT), which is a type of parameter information used to extract the object region.
An MR apparatus combines an image obtained in the virtual space, which is rendered using computer graphics, with an image obtained in a real space, which is picked up by an image-pickup apparatus, such as a camera, to obtain a combined image. The combined image is displayed on a display device, such as a head-mounted display (HMD), thereby presenting MR to a user.
When a CG image of a virtual object is superimposed and displayed on a shot image obtained in the real space, the combined image is generated without superimposing the CG image on a particular region of the shot image, for example, the region showing a hand of the user. This is performed using an object-region-extracting process. With this process, when the hand of a user wearing the HMD is extracted as an object, the hand is not hidden by the virtual object in the combined image. Accordingly, the MR presented to the user feels closer to an experience in the real space.
Next, the object-region-extracting process in the related art used to present MR to the user will be briefly described.
In order to extract an object region, color information concerning the hand is set in advance as parameter information, such as the LUT. The color information concerning the object upon which a CG image is not to be superimposed or displayed is registered, for example, by one of the following methods. One method is to capture an image of the object picked up by a camera, represent the values of its pixels as a distribution on a CbCr plane, and specify a coordinate range for each axis of the color space. Another method is to sample points along each axis of the color space and register values indicating whether or not the sampled points represent the object; the table of registered values is called the LUT.
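The second registration method can be illustrated with a minimal sketch. It assumes 8-bit RGB input, the full-range BT.601 (JFIF) conversion to YCbCr, and a 256 x 256 table indexed by (Cb, Cr); none of these choices is stated in the related art, and the function names are hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert 8-bit RGB to full-range YCbCr (BT.601 / JFIF coefficients)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b

    def clamp(v):
        # Round to the nearest integer and keep the result in 0..255.
        return max(0, min(255, int(round(v))))

    return clamp(y), clamp(cb), clamp(cr)


def make_lut():
    """Create an empty LUT: lut[cb][cr] is True when that color is the object."""
    return [[False] * 256 for _ in range(256)]


def register_colors(lut, pixels):
    """Mark the CbCr coordinates of each sampled RGB pixel as object colors."""
    for r, g, b in pixels:
        _, cb, cr = rgb_to_ycbcr(r, g, b)
        lut[cb][cr] = True


def is_object(lut, pixel):
    """Return True when the pixel's CbCr coordinates are registered in the LUT."""
    _, cb, cr = rgb_to_ycbcr(*pixel)
    return lut[cb][cr]
```

Because luminance (Y) is discarded in this sketch, a registered color matches regardless of brightness, which is one reason a separate luminance range may be specified in addition to the LUT.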
Next, a process in which the user wearing the HMD experiences an MR image obtained using the object-region-extracting process will be briefly described.
The image-pickup apparatus, which is the HMD worn by the user, picks up an image showing the region in front of the user wearing the HMD to generate a shot image. The shot image includes a background image that the CG image is to be combined with, and an image of the hand of the user.
After the shot image is captured, the CG image to be superimposed and displayed is generated. The CG image is generated so as to exclude the region of the hand of the user, i.e., the object region. By superimposing this CG image, which does not include the region of the hand, upon the shot image, a combined image in which the hand remains visible can be generated.
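The superimposition described above can be sketched as a per-pixel selection. This is an illustrative assumption rather than the method of any particular apparatus: images are represented as nested lists of pixel values, a CG pixel of `None` means that nothing is rendered there, and `object_mask` is a hypothetical boolean image that is `True` inside the object region.

```python
def composite(shot, cg, object_mask):
    """Combine a shot image with a CG image, leaving the object region visible.

    For each pixel, the shot pixel is kept when it belongs to the object
    region or when no CG is rendered there; otherwise the CG pixel is used.
    """
    combined = []
    for shot_row, cg_row, mask_row in zip(shot, cg, object_mask):
        combined.append([
            s if (m or c is None) else c
            for s, c, m in zip(shot_row, cg_row, mask_row)
        ])
    return combined
```

For example, with one row in which the CG covers the first and third pixels and the mask marks the third pixel as the hand, only the first pixel of the result comes from the CG image.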
In the related art, the combined image is generated by performing the object-region-extracting process using a combination of the LUT, a process in which a range of luminance values is specified for the shot image, a noise reduction process in which regions of pixels smaller than a predetermined size are recognized as noise and are not rendered in the CG image, and so forth. By generating the combined image in such a manner, MR more similar to reality can be provided for the user.
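The luminance range and the noise reduction process can be sketched as two filters applied to a boolean mask. The sketch assumes that a "region" means a 4-connected group of mask pixels and that a region with fewer than `min_size` pixels is treated as noise; the related art does not specify the connectivity or the data layout, and the function names are hypothetical.

```python
from collections import deque


def apply_luminance_range(mask, luma, y_min, y_max):
    """Keep only mask pixels whose luminance lies in [y_min, y_max]."""
    return [
        [m and y_min <= y <= y_max for m, y in zip(mask_row, luma_row)]
        for mask_row, luma_row in zip(mask, luma)
    ]


def remove_small_regions(mask, min_size):
    """Clear 4-connected regions that contain fewer than min_size pixels."""
    height = len(mask)
    width = len(mask[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    cleaned = [row[:] for row in mask]
    for y in range(height):
        for x in range(width):
            if not mask[y][x] or seen[y][x]:
                continue
            # Breadth-first search collects one connected region.
            region = []
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                region.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < height and 0 <= nx < width:
                        if mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            if len(region) < min_size:
                for cy, cx in region:
                    cleaned[cy][cx] = False
    return cleaned
```

Both functions return new masks and leave their inputs unchanged, so the effect of each parameter can be inspected separately.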
As a method for setting the parameter information used to generate the combined image using the object-region-extracting process, a method in which a luminance range is specified, or a method in which the number of pixels is specified for the noise reduction process, can be used. However, another method is also necessary, in which a range of the shot image is specified by dragging a mouse, and the colors of the pixels included in the specified region are registered in the LUT as color information specifying a region upon which a CG image is not to be superimposed or displayed. As another example of an operation method for performing such a registration in the LUT, a method can be used in which colors are specified directly in a CbCr color space.
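Registration by dragging a rectangle over the shot image might look like the following sketch. It assumes that the image has already been converted to (Cb, Cr) pairs, that the LUT is held as a set of such pairs, and that the dragged rectangle is given by two inclusive corner coordinates lying inside the image; these representations and the function names are hypothetical.

```python
def colors_in_rect(cbcr_image, x0, y0, x1, y1):
    """Collect the distinct (Cb, Cr) pairs inside a dragged rectangle.

    The corners may be given in any order, since a drag can run in
    any direction. Both corners are inclusive.
    """
    left, right = sorted((x0, x1))
    top, bottom = sorted((y0, y1))
    return {
        cbcr_image[y][x]
        for y in range(top, bottom + 1)
        for x in range(left, right + 1)
    }


def register_region(lut, cbcr_image, rect):
    """Add every color found in the rectangle to the LUT (a set of pairs)."""
    lut.update(colors_in_rect(cbcr_image, *rect))


def delete_region(lut, cbcr_image, rect):
    """Remove every color found in the rectangle from the LUT."""
    lut.difference_update(colors_in_rect(cbcr_image, *rect))
```

Deleting a region removes its colors even if they were registered from a different part of the image, so registration and deletion generally have to be repeated to converge on a suitable LUT.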
However, for a general user who lacks expertise or is not familiar with how to deal with colors, it was difficult to understand the contents of this operation and the display of the color space used for the operation. Accordingly, a general-user-friendly operation method has been suggested. In this method, while the shot image is being displayed, the object upon which a CG image is not to be superimposed or displayed, such as the hand described in the object-region-extracting process, is specified using colors of the shot image, and the specified colors are provided to an MR-presenting apparatus.
Although operability in registering colors in the LUT by selecting an image region in the shot image has been improved to some extent, the operation has the disadvantage described below.
An appropriate value of a parameter used to determine the region of the CG image that is not to be superimposed on the shot image, or the object region, is determined by the subjective judgment of the user while the user checks the effect of a setting on the combined image. Accordingly, the user needs to repeat the registration and deletion of the color information of the LUT so that the parameter can be brought closer to the appropriate value.
With the technology disclosed in Japanese Patent Laid-Open No. 2005-228140, operations for extracting the color information from an image can be easily performed. However, because the parameter needs to be finely adjusted as the operations progress, an image display appropriate to the progress of the operations is necessary. There is no image-displaying method for displaying the shot image, the combined image, and an image to be processed such that these images can be related to one another. To finely adjust the parameter, the user must switch the operations or the displays of the images one by one.
The registration or deletion of the color information of the LUT is an operation that requires fine adjustment. Moreover, when a combined image is generated using both the color information of the LUT and settings of other parameter information, the overall effect of all of these settings needs to be applied to the combined image in a simple manner. However, there is no technology in which the image display can be changed on the basis of a mask region of a CG image while the combined image is being generated with the current settings applied.
When the color information is extracted from the shot image as the parameter information used in the object-region-extracting process, a user interface (UI) for user operations is used because, with the UI, the user can check the effect on the combined image on an image display and thereby bring the parameter closer to an appropriate value. However, because of the above-described technical problems, the UI has the disadvantages described below.
There is no image-displaying method for displaying the shot image, the combined image, and an image to be processed such that these images can be related to one another in accordance with the progress of the operations for extracting the color information from an image. To adjust the parameter, the user must continue the operations while alternately switching between and watching the displays of the shot image, the combined image, and the image to be processed.
Furthermore, the user must adjust the parameter while checking, on a display of the combined image, the overall effect of the settings of the parameter information. Additionally, the user must manually change a transparency parameter of the CG image as the adjustment of the parameter progresses so that the transparency parameter is set to an appropriate value.