Stereoscopic imaging is the process of visually combining at least two images of a scene, taken from slightly different viewpoints, to produce the illusion of three-dimensional depth. This technique relies on the fact that human eyes are spaced some distance apart and do not, therefore, view exactly the same scene. Providing each eye with an image from a different perspective tricks the viewer into perceiving depth. Typically, where two distinct perspectives are provided, the component images are referred to as the “left” and “right” images, also known as a reference image and complementary image, respectively. However, those skilled in the art will recognize that more than two viewpoints may be combined to form a stereoscopic image.
In 3D post-production, VFX workflow and 3D display applications, an important process is to infer a depth map from stereoscopic images, consisting of left-eye view and right-eye view images, in order to create stereoscopic motion pictures. For instance, recently commercialized autostereoscopic 3D displays require an image-plus-depth-map input format, so that the display can generate different 3D views to support multiple viewing angles.
The process of inferring the depth map from a stereo image pair is called stereo matching in the field of computer vision research, since pixel or block matching is used to find the corresponding points in the left-eye and right-eye view images. Depth values are inferred from the relative distance (the disparity) between two pixels in the images that correspond to the same point in the scene.
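By way of illustration only, block matching and the conversion from pixel distance to depth may be sketched as follows. The sketch assumes a rectified image pair, so that corresponding points lie on the same row, uses a sum-of-absolute-differences cost, and converts disparity to depth with the standard pinhole relation depth = focal length × baseline / disparity. The function names, the window size and the parameters `focal_px` and `baseline` are hypothetical and are not taken from any particular system.

```python
def sad(left, right, row, col_l, col_r, half):
    """Sum of absolute differences between two (2*half+1)-square blocks."""
    total = 0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            total += abs(left[row + dy][col_l + dx] - right[row + dy][col_r + dx])
    return total


def match_disparity(left, right, row, col, half=1, max_disp=8):
    """Find the disparity d (in pixels) whose block in the right image,
    centered at (row, col - d), best matches the block centered at
    (row, col) in the left image. Assumes a rectified pair, so the
    search runs along a single row only."""
    best_d, best_cost = 0, None
    for d in range(max_disp + 1):
        if col - d - half < 0:  # block would fall off the left edge
            break
        cost = sad(left, right, row, col, col - d, half)
        if best_cost is None or cost < best_cost:
            best_d, best_cost = d, cost
    return best_d


def depth_from_disparity(disparity, focal_px, baseline):
    """Depth of a scene point under the assumed pinhole model.
    focal_px is the focal length in pixels; baseline is the camera
    separation, and the result is in the same unit as baseline."""
    if disparity <= 0:
        return float("inf")  # zero disparity: point at infinity
    return focal_px * baseline / disparity
```

Here `left` and `right` are grayscale images given as lists of rows. A larger disparity corresponds to a point closer to the cameras, which is why depth varies inversely with the measured pixel distance.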
Stereo matching of digital images is widely used in many computer vision applications (for example, fast object modeling and prototyping for computer-aided drafting (CAD), object segmentation and detection for human-computer interaction (HCI), video compression, and visual surveillance) to provide three-dimensional (3-D) depth information. Stereo matching obtains images of a scene from two or more cameras positioned at different locations and orientations in the scene. These digital images are obtained from each camera at approximately the same time, and points in each of the images are matched to the corresponding 3-D point in space. In general, points from different images are matched by searching a portion of the images and using constraints (such as an epipolar constraint) to correlate a point in one image to a point in another image. The matched images and depth map can then be employed to create stereoscopic 3D motion pictures.
One of the main problems with current stereoscopic 3D motion pictures is that the audience may experience eyestrain after watching for some time. Therefore, when directors make 3D films, they have to consider how to shoot the scene or edit the film in such a way that the eyestrain felt by audiences is minimized. This is part of the reason that making 3D motion pictures is much more difficult and time-consuming than making conventional 2D motion pictures.
The challenge in making 3D motion pictures is that it is very difficult for directors or editors to visually estimate the potential eyestrain felt by audiences. Several factors contribute to this difficulty. First, the director or editor has to watch a 3D motion picture long enough to feel eyestrain, because eyestrain is a cumulative effect that builds up over the course of watching the motion picture; it is usually not caused by a small number of segments. Second, eyestrain can also be caused by abrupt depth changes between two segments. It is difficult for editors to measure the potential eyestrain caused by such abrupt depth changes when they concatenate segments during editing. They would need to use a time-consuming trial-and-error process to concatenate different segments and “feel” the potential eyestrain caused by each depth transition.
Therefore, a need exists for techniques that can measure the potential eyestrain felt while viewing a 3D presentation such as a stereoscopic motion picture. Furthermore, there is a need for automatic systems and methods that can measure the potential eyestrain during the process of editing a 3D motion picture.