A 3D stereo camera has two lenses. The pictures taken through the two lenses are similar but not identical, because they are taken from two slightly different angles (much as human eyes capture stereo images). The idea of the panoramic stereo view is to capture not just one viewing angle but an entire 360-degree view of the scene in 3D. Taking this idea one step further, one can capture spherical panoramic video, which also captures the top and bottom viewing angles of the scene in 3D. In this application, the terms spherical and panoramic viewing are used interchangeably, since capturing the spherical view is just an extension of the panoramic view and the same principles apply.
One way to record the panoramic view in 3D is to use multiple stereo cameras that record the panoramic scene simultaneously. Enough stereo cameras are needed to capture the entire scene. Unfortunately, because there are two eyes, the pair of views changes continuously as the eyes rotate to view the surroundings, so there are an infinite number of viewing angles to cover.
Referring to FIG. 4, from position A human eyes will see a stereo pair of images A1, A2 and their overlapped section. From position B, human eyes will see a stereo pair of images B1, B2 and their overlapped section. Clearly, it would be impossible to position an infinite number of cameras around the circle.
One approach used to address this problem is to capture the surrounding images using a finite number of cameras and stitch those images into a pair of panoramic images. The images for the in-between viewing angles are then interpolated from this pair of panoramic images. With a sufficient number of cameras, this can approximate the stereo panoramic view. However, this simple approach creates a significant parallax error, because each interpolated image is only an approximation of the exact image from that particular angle, and the error is largest for objects that are close to the cameras. Parallax error is a well-known and well-documented problem in panoramic stitching.
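The dependence of parallax error on object distance can be illustrated with a minimal sketch. It assumes an ideal pinhole camera and a purely sideways offset between the true viewing position and the nearest real camera. The function name `parallax_shift_px` and the sample focal length and offset are hypothetical values chosen for illustration, not parameters of the described system.

```python
def parallax_shift_px(focal_px: float, offset_m: float, depth_m: float) -> float:
    """Approximate image shift, in pixels, of a point at depth_m when the
    viewpoint is displaced sideways by offset_m.

    Assumes an ideal pinhole camera with focal length focal_px (in pixels).
    """
    if depth_m <= 0:
        raise ValueError("depth_m must be positive")
    # Similar triangles: shift / focal length = offset / depth.
    return focal_px * offset_m / depth_m


if __name__ == "__main__":
    # Hypothetical values: 1000 px focal length, 5 cm viewpoint offset.
    for depth in (0.5, 2.0, 10.0, 50.0):
        shift = parallax_shift_px(1000.0, 0.05, depth)
        print(f"object at {depth:5.1f} m -> shift of {shift:6.1f} px")
```

Under these assumptions the shift is inversely proportional to depth, which is consistent with the observation that near objects suffer the largest parallax error.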
In order to minimize this parallax problem, a complex optical system using convex or concave lenses, with or without mirrors, can be used to capture more image information from different viewing angles. The problem, however, is that such a system can be very difficult and costly to implement, or at the least results in lower-resolution images. In addition, reconstructing the stereo pair for each viewing angle becomes a very difficult problem, since any optical flaws in the lenses and mirrors are amplified.
Another method would be to take a 3D scan of the surrounding scene, capturing not just the RGB information of the scene but also the position (X, Y, and Z coordinates) of every pixel and the viewing angle of each pixel with respect to the scanner. FIG. 5 shows a pixel that has the following information: RGB (color), pixel position (X, Y, Z), and scanner position (Xs1, Ys1, Zs1). There are already a number of well-known ways to capture a 3D scan, and they therefore will not be described here. Using multiple 3D scanners, multiple 3D scans can be taken simultaneously from enough viewing angles to pick up as much pixel information as possible from the surrounding area. If enough pixel information is captured, the surrounding scene can be easily reconstructed using the position and RGB information. Unfortunately, this requires very expensive 3D scanners and significant post-processing, which may work for static images but not for dynamic images. Just as importantly, reconstruction by this method results in a grainy image, since the image is constructed pixel by pixel and does not have a smooth, high-resolution look.
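The per-pixel record described with reference to FIG. 5 can be sketched as a small data structure. This is an illustrative sketch only: the class name `ScanPoint`, its field names, and the `viewing_direction` helper are hypothetical, and the viewing angle is represented here, by assumption, as a unit vector from the scanner position to the pixel position.

```python
import math
from dataclasses import dataclass
from typing import Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class ScanPoint:
    """One scanned pixel: color, 3D position, and the scanner that saw it."""
    rgb: Tuple[int, int, int]   # color of the pixel
    position: Vec3              # (X, Y, Z) of the pixel in the scene
    scanner_position: Vec3      # (Xs1, Ys1, Zs1) of the scanner

    def viewing_direction(self) -> Vec3:
        """Unit vector pointing from the scanner to the pixel."""
        dx, dy, dz = (p - s for p, s in zip(self.position, self.scanner_position))
        norm = math.sqrt(dx * dx + dy * dy + dz * dz)
        if norm == 0.0:
            raise ValueError("pixel and scanner positions coincide")
        return (dx / norm, dy / norm, dz / norm)


if __name__ == "__main__":
    point = ScanPoint(rgb=(255, 0, 0),
                      position=(1.0, 2.0, 5.0),
                      scanner_position=(1.0, 2.0, 3.0))
    print(point.viewing_direction())
```

Storing the scanner position alongside each pixel is what allows the viewing angle to be recovered later, at the cost of a much larger record per pixel than a plain RGB image.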
In this invention, a new approach is presented for interpolating the scene at new viewing angles. The approach uses depth information of the kind obtained by the 3D scanning technique, but uses that information only to modify the captured high-resolution 2D images so that they best approximate the image from the new viewing angle. Using a stereometric triangulation technique, the depth information can also be derived from a pair of high-resolution 2D images taken from two different positions, without the need for a separate 3D scanner. While the stereometric solution results in a less than perfect scan, this new method of interpolating is much more tolerant of scanning errors. Therefore, this new interpolating method has the distinct advantage of being able to use a low-cost depth extraction technique while still providing the highest image resolution with minimal processing requirements. The tradeoff of this new method versus pixel-by-pixel 3D mapping is that there may be some distortion in the interpolated images, since the scene is not mapped exactly, pixel for pixel, to the depth information. Such distortion should be minimal in most cases and tolerable to the human brain.
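The relationship between stereo disparity, depth, and an in-between viewpoint can be sketched as follows. This is not the interpolation method of the invention itself; it is the textbook geometry for a rectified pinhole stereo pair, assumed here purely for illustration. The function names, the parameter `t` (the fraction of the baseline at which the virtual viewpoint sits, 0 for the left camera and 1 for the right), and the sample numbers are all hypothetical.

```python
def depth_from_disparity(focal_px: float, baseline_m: float,
                         disparity_px: float) -> float:
    """Depth Z = f * B / d for a rectified stereo pair (pinhole model)."""
    if disparity_px <= 0:
        raise ValueError("disparity_px must be positive")
    return focal_px * baseline_m / disparity_px


def interpolated_column(x_left_px: float, disparity_px: float, t: float) -> float:
    """Column at which a left-image pixel appears in a virtual view placed a
    fraction t of the way from the left camera to the right camera.

    With disparity defined as x_left - x_right, the pixel moves linearly
    from x_left (t = 0) to x_right (t = 1).
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    return x_left_px - t * disparity_px


if __name__ == "__main__":
    # Hypothetical values: 1000 px focal length, 6.5 cm baseline, 20 px disparity.
    print("depth (m):", depth_from_disparity(1000.0, 0.065, 20.0))
    print("column at midpoint view:", interpolated_column(400.0, 20.0, 0.5))
```

Because the shift applied to each pixel scales with its disparity, an error in the estimated depth moves a pixel slightly rather than removing it, which gives one intuition for why an image-based interpolation can tolerate an imperfect depth estimate.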