A. Technical Field
The present invention pertains generally to camera or vision systems, and relates more particularly to generating novel views of a scene.
B. Background
Novel view interpolation (NVI) is the task of interpolating or synthesizing a new, or novel, image view from other views. There is a wide range of approaches to synthesizing new views, which may be classified into three general categories: NVI without geometry, NVI with explicit geometry, and NVI with implicit geometry.
Light field rendering belongs to the first category. It makes no assumption about the scene geometry, but it requires a large number of cameras to capture the input images, which limits its application.
Methods in the second category produce a virtual view by projecting pixels from all of the reference images, and therefore require accurate geometry to synthesize the novel view. Typical methods of this category include view-dependent texture-mapping, 3D warping, layered-depth images, and wide-baseline stereo. These methods generally rely on stereo matching to obtain accurate geometry, which is itself a significant challenge in the field of stereo vision.
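As an illustration of why explicit-geometry methods depend on accurate depth, the following is a minimal sketch of projecting a single reference pixel into a novel view under a pinhole camera model. It is not taken from any particular method named above; the function name `warp_pixel`, the intrinsic matrices `K_ref` and `K_new`, and the pose `R`, `t` are hypothetical names chosen for this example.

```python
import numpy as np

def warp_pixel(u, v, depth, K_ref, K_new, R, t):
    """Project reference pixel (u, v) with known depth into a new view.

    Assumes a pinhole model; R and t map reference-camera coordinates
    to new-camera coordinates. All names here are illustrative.
    """
    # Back-project the pixel to a 3D point in the reference camera frame.
    p_ref = depth * (np.linalg.inv(K_ref) @ np.array([u, v, 1.0]))
    # Move the point into the new camera's coordinate frame.
    p_new = R @ p_ref + t
    # Project onto the new image plane and dehomogenize.
    uvw = K_new @ p_new
    return uvw[0] / uvw[2], uvw[1] / uvw[2]

# Hypothetical cameras: same intrinsics, new camera offset sideways.
K = np.array([[100.0, 0.0, 50.0],
              [0.0, 100.0, 50.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.array([0.1, 0.0, 0.0])

correct = warp_pixel(50, 50, 2.0, K, K, R, t)  # depth estimated correctly
wrong = warp_pixel(50, 50, 1.0, K, K, R, t)    # depth underestimated by half
```

In this toy setup, the same pixel lands at a different column in the novel view when its depth is misestimated, which is the sense in which errors in stereo matching propagate directly into the synthesized image.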
NVI with implicit geometry seeks a trade-off between the first and second categories, demanding fewer images and requiring less accurate geometry. In the methods of this category, the novel view and its depth are estimated simultaneously. These methods model NVI as a maximum likelihood estimation (MLE) problem. Because the problem is poorly constrained, a powerful prior is needed to obtain a good solution. For example, a texture dictionary has been used as the prior in a Markov Random Field (MRF) model. This work has been extended by using different priors, field of experts, and pairwise dictionaries. These methods have the disadvantage of assuming independence over the observed data. Conditional Random Field (CRF)-based NVI methods have been suggested to remove this limitation. These methods appear to yield good results, but the input images are always of high quality. Current algorithms of this category tend to focus on the occlusion problem, with some attention to the effect of large view changes on NVI. No research work has explored other complex scenes, for example, scenes with radiometric variation, textureless surfaces, or non-Lambertian surfaces.
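One generic way to write the estimation problem described above is sketched below. The notation is an assumption made for illustration and is not drawn from any specific cited method: V denotes the novel view, D its depth, and I_1, ..., I_n the observed input images.

```latex
\[
(\hat{V}, \hat{D})
  = \arg\max_{V,\,D} \; p(I_1, \dots, I_n \mid V, D)\, p(V, D),
\qquad
p(I_1, \dots, I_n \mid V, D) = \prod_{k=1}^{n} p(I_k \mid V, D).
\]
```

The first expression combines the likelihood of the observed images with the prior p(V, D), such as a texture dictionary. The second expression is the independence assumption over the observed data, which is the limitation that the CRF-based methods are intended to remove.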
Although several novel view interpolation algorithms attempt to extract information from other views in order to generate a novel view, the challenges presented by complex scenes have traditionally been a barrier to good novel view interpolation. These challenges include, but are not limited to, ill-positioned pose, transparency within the scene, occlusion, deformation, lighting, and large view changes. Furthermore, novel view interpolation in complex scenes can suffer from several other issues, including, by way of example, radiometric variation, textureless and non-Lambertian surfaces, and complicated structures (e.g., hair, trees, etc.). These difficult scenarios cannot provide reliable information for point correspondences, and typically generate a large number of false positive matches. While specialized methods for NVI may be used to address one or two of these difficult scenarios, their inherent weaknesses make them unsuitable in other scenarios.
Because scenes can contain or suffer from one or more of these challenges, it is difficult to correctly interpolate a new view of a scene. Accordingly, what is needed is a more robust system that can generate a novel view of a scene.