The present invention relates to the processing of digital stereoscopic video signals with accurate rendering of three-dimensional (3D) effects.
In the stereoscopic three-dimensional viewing of a scene, distinct but similar images are presented to the left and right eyes. Disparities between the observed left and right images act as depth cues to the human visual system (HVS), creating the illusion of depth in the perceived image when the HVS combines the left-eye and right-eye images in the visual cortex. Extending this idea to video, when a time-varying sequence of left-eye images and right-eye images is rapidly presented with appropriate disparities between corresponding left-eye and right-eye images, an illusion of depth in a moving scene can be created.
Various stereoscopic 3D display technologies exist, or can be envisaged, that present over a given time period a sequence of correlated image pairs to the left and right eyes.
In some stereoscopic 3D display technologies, isolated images are displayed separately to the left and right eyes using independent display systems, for instance head-mounted displays. In general, such systems are only suitable for a single viewer.
In some technologies, the left-eye and right-eye images are displayed simultaneously by being merged into a single image seen by both eyes. Filters respectively placed in front of the two eyes then extract the relevant images from the merged image. The extraction of the images intended for the left and right eyes can be based on frequency separation, as in the so-called Dolby-3D system. Another technology uses different polarization states for the two eyes, as in the so-called RealD system.
On the other hand, in frame-sequential 3D systems, images intended for the left eye are displayed at one time and images intended for the right eye are displayed at another time. The display system alternates between the display of left-eye and right-eye images. During the display of the left-eye image, the path to the right eye is blocked and likewise during the display of the right-eye image, the path to the left eye is blocked. Thus, each eye sees its intended image sequence and sees blackness when an image intended for the other eye is being displayed. For maximum viewer comfort, the system alternates between image and blackness at a sufficiently high rate such that the viewer does not perceive flicker.
In such frame-sequential 3D systems, the blocking of light may be achieved by active eyewear, for example eyewear embedding a pi-cell (an optically compensated bend mode LCD, i.e. a surface mode device with parallel rub direction) into each lens of the eyewear. The pi-cell is alternately switched between clear and opaque, in synchronism with the frame rate of the television set. Therefore, if the TV alternately supplies left-eye and right-eye images, the active eyewear can steer the corresponding image to each eye, creating the 3D stereoscopic effect.
Alternatively, in another type of frame-sequential 3D system, the blocking may be achieved by passive eyewear incorporating polarizing filters and a switchable polarizer on the display device that can be switched between two opposed polarization states.
Thus, stereoscopic display systems can generally be categorized as (1) simultaneous-type, i.e. a left-eye image and a right-eye image are displayed at the same instant, or (2) staggered-type, i.e. the display shows a left-eye image followed by a right-eye image followed by a left-eye image, etc.
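The two display categories can be illustrated by listing the instants at which the left-eye and right-eye images are presented. The following is a minimal sketch only: the function name `display_times`, the constant pair rate, and the assumption that a staggered display shows the right-eye image exactly half a pair period after the left-eye image are illustrative choices, not features of any particular display.

```python
from fractions import Fraction

def display_times(num_pairs, pair_rate_hz, staggered):
    """Return (left_times, right_times) in seconds for num_pairs stereo pairs.

    pair_rate_hz: number of left/right image pairs shown per second
                  (assumed constant for this sketch).
    staggered:    if True, each right-eye image is shown half a pair period
                  after its left-eye image (assumed evenly spaced schedule);
                  if False, both images of a pair are shown at the same instant.
    """
    period = Fraction(1, pair_rate_hz)
    offset = period / 2 if staggered else Fraction(0)
    left = [k * period for k in range(num_pairs)]
    right = [t + offset for t in left]
    return left, right

# Simultaneous-type display: left and right instants coincide.
sim_left, sim_right = display_times(3, 60, staggered=False)

# Staggered-type display: right instants fall midway between left instants.
stg_left, stg_right = display_times(3, 60, staggered=True)
```

With exact fractions, the staggered schedule at 60 pairs per second places the right-eye images 1/120 s after the corresponding left-eye images, whereas the simultaneous schedule gives identical instants for both eyes.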
Likewise, two types of video recording apparatus can be distinguished. The first type of apparatus shoots two images at the same time for the left and right eyes, for example using two separate optics spaced apart to account for the stereo effect and supplying incoming light to two respective sensors sampled simultaneously. In the other type, the left-eye and right-eye images are captured at different times, for example using a single light sensor and a prism or mirror in the optical path to switch back and forth between two viewpoints providing the stereo effect. When the 3D video is made of synthesized images, the left-eye and right-eye images are generally built as representations of a 3D scene at the same instant.
Due to the various technologies used for generating and displaying stereoscopic video signals, there can be a slight discrepancy in the time sampling structure between capture and display, for example when left-eye and right-eye images captured at the same instant are displayed at different instants, or vice versa. For an image sequence representing a static scene, the different time sampling structures between the left and right image sequences create no problem. For a dynamically changing scene, the apparent movement of objects within the scene is slightly altered. However, the perception of movement by the viewer is almost unaffected because the fluctuation in the object speeds is hardly noticeable at the usual frame rates of video content.
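The size of such a discrepancy can be estimated with simple arithmetic. The sketch below is illustrative only and rests on stated assumptions: the object moves at a constant image-plane speed, the left-eye image is taken as the time reference, and the function name `positional_discrepancy_px` and its parameters are hypothetical.

```python
def positional_discrepancy_px(speed_px_per_s, capture_offset_s, display_offset_s):
    """Estimate how far a moving object is from where it 'should' appear.

    speed_px_per_s:   image-plane speed of the object (assumed constant).
    capture_offset_s: delay of the right-eye capture relative to the left-eye
                      capture (0 for simultaneous capture).
    display_offset_s: delay of the right-eye display relative to the left-eye
                      display (0 for simultaneous display).

    The right-eye image shows the object where it was at the capture instant,
    but the viewer sees it at the display instant; the difference between the
    two offsets, multiplied by the speed, gives the positional discrepancy.
    """
    return speed_px_per_s * (display_offset_s - capture_offset_s)

# Static scene: no discrepancy whatever the sampling structures are.
static = positional_discrepancy_px(0.0, 0.0, 1 / 120)

# Object moving at 600 px/s, captured simultaneously, displayed on a
# staggered display whose right-eye image lags by 1/120 s.
moving = positional_discrepancy_px(600.0, 0.0, 1 / 120)
```

Under these assumptions, the static case gives zero, consistent with the observation that a static scene creates no problem, and the moving case gives a discrepancy of about 5 pixels. When the capture and display offsets are equal, the discrepancy vanishes regardless of speed.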
It would be desirable to optimize the rendering of 3D effects in stereoscopic video applications, which is currently based on spatial disparities between the left-eye and right-eye image sequences.