Image-based rendering (IBR) is the process of synthesizing new “virtual” views from a set of “real” views. Because it obviates the need to create a full geometric 3D model, IBR is relatively inexpensive compared to traditional rendering while still providing high photorealism. Because IBR time is independent of the geometrical and physical complexity of the scene being rendered, IBR is also extremely useful for efficient rendering of both real scenes and complex synthetic 3D scenes. For these reasons, IBR has attracted a lot of research interest recently. Its applications can be found in many areas such as 3DTV, free-viewpoint TV, telepresence, video conferencing, and computer graphics.
Depth IBR (DIBR) combines 2D color images with per-pixel depth information of the scene to synthesize novel views. Depth information can be obtained by stereo matching or depth estimation algorithms. These algorithms, however, are usually complicated, inaccurate, and unsuitable for real-time applications. Furthermore, conventional DIBR implementations use images from cameras placed in a 1D or 2D array to create a virtual 3D view. This requires very expensive camera configurations and high processing resources, and it prevents development of real-time DIBR applications.
Thanks to the recent development of new range sensors that measure the time delay between transmission of a light pulse and detection of the reflected signal over an entire frame at once, per-pixel depth information can be obtained in real time from depth cameras. This makes the DIBR problem less computationally intensive and more robust than other techniques. Furthermore, it helps significantly reduce the number of necessary cameras.
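As a rough illustration of the time-of-flight principle described above, the following sketch converts a measured round-trip delay into a distance. It assumes an idealized sensor in which the light pulse travels to the surface and back along the same path; the function name `depth_from_round_trip` is hypothetical and does not correspond to any particular depth camera interface.

```python
# Speed of light in vacuum, in meters per second (exact by definition).
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def depth_from_round_trip(delay_s: float) -> float:
    """Return the distance in meters to a surface, given the round-trip
    delay in seconds between emitting a light pulse and detecting its
    reflection. Assumes an idealized sensor (hypothetical helper)."""
    if delay_s < 0:
        raise ValueError("delay must be non-negative")
    # The pulse covers the sensor-to-surface distance twice, so halve it.
    return SPEED_OF_LIGHT_M_PER_S * delay_s / 2.0


def depth_frame(delays_s):
    """Apply the conversion to every pixel of a frame of delays,
    given as a list of rows."""
    return [[depth_from_round_trip(d) for d in row] for row in delays_s]
```

Under these assumptions, a round-trip delay of 20 nanoseconds corresponds to a surface roughly 3 meters away.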
Some approaches for solving DIBR have been proposed in the literature. McMillan's warping method maps a point in one image to the corresponding point in another image at a different view, provided that its depth value is known. However, this work considers only single views and does not take advantage of multiple views. Furthermore, warping is only the first step of the synthesis work. An additional problem is how to deal with newly exposed areas (holes) appearing in the warped image, which will be discussed in more detail later. Some approaches to handle this problem have also been proposed. However, these approaches consider only the 1D case, where the virtual camera is forced to be on the same line as the real cameras, and they assume that depth images are given from the same views as the color images. This assumption may not be appropriate because not all depth cameras provide color information. Furthermore, standard color cameras are much cheaper and provide much higher color resolution than depth cameras. A combination of a few depth cameras and many color cameras may therefore be more feasible, as will be explored in more detail later. With such a configuration, the depth and color camera views will necessarily be different.
Another approach, which focuses on signal processing techniques, is a one-dimensional (1D) propagation algorithm developed in part by the assignees of the present application: H. T. Nguyen and M. N. Do, “Image-based Rendering with Depth Information Using the Propagation Algorithm,” Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), March 2005. Using depth information, surface points that correspond to pixels in the real images are reconstructed and re-projected onto the virtual view. In this sense, the real pixels are said to be propagated to the virtual image plane. Again, the Nguyen and Do reference implies that the color cameras and the depth (or range) camera have the same resolution and the same location, and it considers only the 1D case.
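The reconstruct-and-re-project idea described above can be sketched for a single pixel as follows. This is a minimal illustration under a standard pinhole camera model, not the algorithm of the cited reference: the function name `reproject_pixel`, the `(fx, fy, cx, cy)` intrinsics tuples, and the convention that `rotation` and `translation` map source-camera coordinates to virtual-camera coordinates are all assumptions made for the example.

```python
def reproject_pixel(u, v, depth, src, dst, rotation, translation):
    """Propagate one pixel with known depth from a real view to a
    virtual view (hypothetical helper, pinhole model assumed).

    u, v        -- pixel coordinates in the source image
    depth       -- distance along the source optical axis (must be > 0)
    src, dst    -- intrinsics (fx, fy, cx, cy) of source and virtual cameras
    rotation    -- 3x3 nested list R, with X_dst = R * X_src + t
    translation -- length-3 sequence t
    Returns (u', v', depth') in the virtual view, or None if the point
    falls behind the virtual camera.
    """
    fx, fy, cx, cy = src
    # Back-project the pixel to a 3D surface point in the source frame.
    point = (
        (u - cx) / fx * depth,
        (v - cy) / fy * depth,
        depth,
    )
    # Move the point into the virtual camera's coordinate frame.
    moved = [
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    ]
    if moved[2] <= 0:
        return None  # not visible: behind the virtual camera
    # Project the 3D point onto the virtual image plane.
    fx2, fy2, cx2, cy2 = dst
    return (
        fx2 * moved[0] / moved[2] + cx2,
        fy2 * moved[1] / moved[2] + cy2,
        moved[2],
    )
```

With an identity rotation and zero translation, a pixel maps back to itself. With a purely horizontal translation, the pixel shifts horizontally by an amount inversely proportional to its depth, which is why nearer surfaces move more between views and why holes appear where they uncover the background.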