1. Field of the Invention
The present invention relates to a technique of encoding images from a plurality of viewpoints.
2. Description of the Related Art
Recently, digital cameras and digital video cameras equipped with a plurality of lenses have come onto the market. These cameras aim to capture images from different viewpoints under capturing conditions that differ from lens to lens, and to provide the user with a high-quality, advanced-function image using the captured images. One purpose is to generate an HDR (High Dynamic Range) image by compositing a plurality of images captured with different exposure times from almost the same viewpoint. According to this technique, the exposure time is changed between lenses to capture one image in which a dark region of an object is rendered clearly and another image in which a bright region of the object is rendered clearly. The captured images are then composited to generate an image in which the entire object is rendered clearly.
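The compositing step described above can be illustrated with a minimal sketch. It assumes a linear sensor response, 8-bit pixel values, and a simple triangular weighting that discounts nearly clipped pixels. The function names `hat_weight` and `merge_exposures` and all numeric values are hypothetical and are not taken from any of the cited literature.

```python
def hat_weight(value, max_value=255):
    """Triangular weight: 1.0 at mid-gray, 0.0 at the clipped ends (0 and max_value)."""
    mid = max_value / 2.0
    return 1.0 - abs(value - mid) / mid


def merge_exposures(captures):
    """Merge differently exposed captures into relative radiance estimates.

    captures: list of (exposure_time_in_seconds, list_of_pixel_values).
    Assumes a linear sensor, so pixel / exposure_time estimates radiance.
    """
    pixel_count = len(captures[0][1])
    merged = []
    for i in range(pixel_count):
        weighted_sum = 0.0
        weight_total = 0.0
        for exposure_time, pixels in captures:
            w = hat_weight(pixels[i])
            weighted_sum += w * pixels[i] / exposure_time
            weight_total += w
        if weight_total == 0.0:
            # Every capture is clipped at this pixel: fall back to a plain average.
            estimates = [p[i] / t for t, p in captures]
            merged.append(sum(estimates) / len(estimates))
        else:
            merged.append(weighted_sum / weight_total)
    return merged


# Hypothetical two-pixel scene: one dark pixel and one bright pixel.
short_capture = (0.01, [5, 200])    # bright pixel is well exposed
long_capture = (0.04, [20, 255])    # bright pixel is clipped at 255
radiance = merge_exposures([short_capture, long_capture])
```

In this sketch the clipped value in the long exposure receives zero weight, so the bright pixel is recovered from the short exposure alone, while the dark pixel draws mostly on the long exposure.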
HDR image generation is implemented either by a system which performs all processes from capturing to composition inside the camera, or by a system which saves the captured images and, after capturing, composites the images the user wants on a PC or the like.
The former system has the advantage that a single camera can generate an HDR image, but the circuit which composites images adds cost. Further, this system cannot meet user requests made after capturing, such as a request to see a dark region of an object more clearly or a request to see a bright portion more clearly. In contrast, the latter system cannot composite images into an HDR image with the camera alone, but can generate an image requested by the user after capturing.
However, when a plurality of captured images are saved and composited into an HDR image afterwards, as in the latter system, the data amount becomes large. The data amount increases especially for a moving image. Moving images from a plurality of viewpoints have an enormous data amount, which makes it difficult to save the data in a memory card or transmit it to a server or the like via a communication network. The data amount grows further as the number of viewpoints increases.
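As a rough illustration of how the data amount scales with the number of viewpoints, the following sketch computes the uncompressed data rate of a multi-viewpoint moving image. The resolution, frame rate, bytes per pixel, and viewpoint counts are hypothetical figures chosen for illustration only, and `raw_data_rate` is a hypothetical helper.

```python
def raw_data_rate(viewpoints, width, height, bytes_per_pixel, frames_per_second):
    """Uncompressed data rate in bytes per second for a multi-viewpoint moving image."""
    return viewpoints * width * height * bytes_per_pixel * frames_per_second


# Hypothetical capture settings: 1920x1080, 3 bytes per pixel, 30 frames per second.
single_view = raw_data_rate(1, 1920, 1080, 3, 30)
nine_views = raw_data_rate(9, 1920, 1080, 3, 30)

print(f"1 viewpoint : {single_view / 1e6:.1f} MB/s")
print(f"9 viewpoints: {nine_views / 1e6:.1f} MB/s")
```

Under these assumptions the rate grows linearly with the number of viewpoints, which is why saving or transmitting the unencoded images quickly becomes impractical.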
To solve this problem, techniques for encoding images from respective viewpoints and compositing them into an HDR image have been disclosed (patent literature 1 (Japanese Patent Laid-Open No. 2003-299067) and patent literature 2 (Japanese Patent Laid-Open No. 2006-54921)). There is also disclosed a technique of reducing the data amount by using H.264 MVC (Multiview Video Coding), which performs inter-viewpoint prediction, that is, predicts an image from one viewpoint using an image from another viewpoint (patent literature 3 (Japanese Patent Laid-Open No. 2009-536793)).
However, the techniques disclosed in patent literatures 1 and 2 encode each viewpoint independently and perform no prediction between viewpoints, so the code amount increases.
In patent literature 3, prediction between viewpoints is performed, but the central viewpoint is used as the reference viewpoint. The reference viewpoint corresponds to the base view in H.264 MVC. When the reference viewpoint is encoded, inter-viewpoint prediction from other viewpoints is not executed. Each of the other viewpoints is encoded with inter-viewpoint prediction either directly from the reference viewpoint or from a viewpoint which has itself undergone inter-viewpoint prediction using the reference viewpoint. Consequently, when the exposure time of the central viewpoint greatly differs from those of the other viewpoints, for example, when a shadow detail loss or highlight detail loss occurs, setting the central viewpoint as the reference viewpoint makes the prediction residual large and increases the code amount.
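The dependence of the prediction residual on the choice of reference viewpoint can be illustrated with a simplified sketch. It assumes a linear sensor clipped at 8 bits, predicts every other viewpoint directly from the reference, ignores parallax between viewpoints, and uses the sum of absolute differences (SAD) as a stand-in for the code amount. An actual H.264 MVC encoder uses disparity-compensated prediction and entropy coding, so this is an illustration of the tendency only. All names and values are hypothetical.

```python
def capture(radiance, exposure_time, max_value=255):
    """Simulate a linear sensor: scale by exposure time and clip at max_value."""
    return [min(max_value, round(r * exposure_time)) for r in radiance]


def sad(a, b):
    """Sum of absolute differences between two equally sized images."""
    return sum(abs(x - y) for x, y in zip(a, b))


def total_residual(views, reference_index):
    """Total SAD when every other view is predicted directly from the reference."""
    reference = views[reference_index]
    return sum(
        sad(reference, view)
        for index, view in enumerate(views)
        if index != reference_index
    )


# Hypothetical scene radiance and three viewpoints (left, central, right).
scene = [100, 400, 1600, 6400, 25600]
exposure_times = [0.01, 0.16, 0.02]  # the central viewpoint is exposed far longer
views = [capture(scene, t) for t in exposure_times]

for index in range(len(views)):
    print(f"reference = viewpoint {index}: residual = {total_residual(views, index)}")
```

With these values the long exposure of the central viewpoint clips most of its pixels to 255 (highlight detail loss), and using it as the reference yields the largest total residual of the three candidates.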