Three-dimensional (3D) videos, also known as stereoscopic videos, are videos that enhance the illusion of depth. 3D movies have existed in some form since the 1950s, but 3D-at-home has only recently begun to gain popularity. One bottleneck inhibiting its adoption is that sufficient suitable 3D content is not yet available, and few live broadcasts are viewable in 3D. This is because the creation of stereoscopic content is still a very expensive and difficult process. Filming in 3D requires highly trained stereographers, expensive stereo rigs, and a redesign of existing monoscopic content workflows. As a result, techniques for converting 2D content into 3D are required, both for new productions and for the conversion of existing legacy footage.
The general problem of creating a high quality stereo pair from monoscopic input is highly under-constrained. The typical conversion pipeline consists of estimating the depth of each pixel, projecting the pixels into a new view, and then filling in the holes that appear around object boundaries. Each of these steps is difficult and, in the general case, requires large amounts of manual input, making the pipeline unsuitable for live broadcast. Existing automatic methods cannot guarantee the quality and reliability necessary for television (TV) broadcast applications.
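The three pipeline steps can be illustrated with a minimal sketch, given here only as a naive illustration and not as any particular existing method. It assumes a depth map is already available (the hard step in practice), a grayscale image, a linear depth-to-disparity mapping, and a leftward shift for nearer pixels. The function name `synthesize_view` and the parameter `max_disparity` are hypothetical.

```python
import numpy as np

def synthesize_view(image, depth, max_disparity=4):
    """Forward-warp a grayscale image into a horizontally shifted view.

    image: (H, W) array of intensities.
    depth: (H, W) float array in [0, 1], where 1 is nearest to the camera
           (assumed to be given; estimating it is the hard part).
    Returns the new view and a boolean mask of the holes before filling.
    """
    h, w = image.shape

    # Step 1 (assumed): convert per-pixel depth to an integer disparity.
    disparity = np.rint(depth * max_disparity).astype(int)

    out = np.zeros_like(image)
    best = np.full((h, w), -1.0)  # depth stored at each target pixel, -1 = empty

    # Step 2: project each pixel to its shifted column; nearer pixels win.
    for y in range(h):
        for x in range(w):
            tx = x - disparity[y, x]
            if 0 <= tx < w and depth[y, x] > best[y, tx]:
                best[y, tx] = depth[y, x]
                out[y, tx] = image[y, x]

    holes = best < 0  # target pixels that no source pixel landed on

    # Step 3: naive hole filling. With a leftward shift, the uncovered
    # background lies to the right of a hole, so copy from the right first,
    # then from the left for any holes remaining at the end of a row.
    filled = ~holes
    for y in range(h):
        for x in range(w - 2, -1, -1):
            if not filled[y, x] and filled[y, x + 1]:
                out[y, x] = out[y, x + 1]
                filled[y, x] = True
        for x in range(1, w):
            if not filled[y, x] and filled[y, x - 1]:
                out[y, x] = out[y, x - 1]
                filled[y, x] = True

    return out, holes
```

Even in this toy form, the holes appear exactly at object boundaries where background is uncovered, and the copied values are only a guess at the hidden content, which is why the filling step tends to produce visible artifacts.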
Converting monoscopic video to stereoscopic video for live or existing broadcast data is a difficult problem, as it requires the use of a view synthesis technique to generate a second view that closely represents the original view. One reason why the conversion is difficult is that it requires some knowledge of scene depth. As a result, existing conversion methods rely on some form of manual input (such as user-specified normals, creases, and silhouettes), on manual tracing of objects at key frames in a video, or on some prior scene knowledge.
Some methods of automatic stereoscopic video conversion from monoscopic video work by reconstructing a dense depth map using parallax between frames, or structure from motion. However, these methods require static scenes and specific camera paths, and they do not work in cases where parallax does not exist in a video sequence, such as with a rotating camera.
It would be desirable to provide automated conversion techniques which produce high quality stereoscopic video from monoscopic video inputs without the need to assume static content.