The present invention relates in general to video clips and in particular to methods and systems for generating three-dimensional (“3-D”), or stereoscopic, video clips with improved depth control.
Human beings normally see the world using stereoscopic vision. The right eye and the left eye each perceive slightly different views of the world, and the brain fuses the two views into a single image that provides depth information, allowing a person to perceive the relative distance to various objects. Movies filmed with a single camera do not provide depth information to the viewer and thus tend to look flat.
Achieving depth in a motion picture has long been desirable, and 3-D movie technology dates back a century. Most of the early efforts used anaglyphs, in which two images of the same scene, with a relative offset between them, are superimposed on a single piece of movie film, with the images being subject to complementary color filters (e.g., red and green). Viewers donned special glasses so that one image would be seen only by the left eye while the other would be seen only by the right eye. When the viewer's brain fused the two images, the result was the illusion of depth. In the 1950s, “dual-strip” projection techniques were widely used to show 3-D movies: two films were projected side-by-side in synchronism, with the light from each projector being oppositely polarized. Viewers wore polarizing glasses, and each eye would see only one of the two images. More recently, active polarization has been used to distinguish left-eye and right-eye images. Left-eye and right-eye frames are projected sequentially using an active direction-flipping circular polarizer that applies opposite circular polarization to the left-eye and right-eye frames. The viewer dons glasses with opposite fixed circular polarizers for each eye, so that each eye sees only the intended frames. Various other systems for projecting 3-D movies have also been used over the years.
Unlike 3-D projection technology, the camera positioning techniques used to create 3-D movies have not changed significantly over the years. As shown in FIG. 1A, in one conventional technique, two cameras 102 and 104 are set up, corresponding to the left eye and right eye of a hypothetical viewer. Each camera 102, 104 has a lens 106, 108 with a focal length f and a film back 110, 112 positioned at a distance f from lenses 106, 108. Lenses 106 and 108 each define an optical axis 111, 113. Cameras 102 and 104 are spaced apart by an “interaxial” distance di (i.e., the distance between optical axes 111, 113 as measured in the plane of lenses 106, 108, as shown) and are “toed in” by an angle θ (the angle between the optical axis and a normal to the screen plane 115), so that the images converge on a point 114 at a distance z0 from the plane of the camera lenses 106, 108. When the films from cameras 102 and 104 are combined into a 3-D film, any objects closer to the cameras than z0 will appear to be in front of the screen, while objects farther from the cameras will appear to be behind the screen.
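By way of illustration only, the relationship among the interaxial distance di, the toe-in angle θ, and the convergence distance z0 of FIG. 1A can be sketched in Python. The sketch assumes a symmetric rig in which both cameras are toed in by the same angle θ, so that the optical axes meet on the centerline and tan θ = (di/2)/z0. The function names are hypothetical and are not part of any described system.

```python
import math

def toe_in_angle(d_i, z0):
    """Toe-in angle (radians) at which both optical axes meet at distance z0.

    Assumption: symmetric rig, each optical axis starts d_i/2 off the centerline.
    """
    return math.atan2(d_i / 2.0, z0)

def convergence_distance(d_i, theta):
    """Distance z0 from the lens plane to the convergence point (inverse of the above)."""
    return (d_i / 2.0) / math.tan(theta)

def apparent_position(z, z0):
    """Where an object at distance z from the lens plane appears relative to the screen."""
    if z < z0:
        return "in front of screen"
    if z > z0:
        return "behind screen"
    return "at screen"

# Example with assumed values: 65 mm interaxial distance, convergence at 2 m.
theta = toe_in_angle(0.065, 2.0)
z0 = convergence_distance(0.065, theta)
```

Any consistent unit of length may be used for d_i and z0. Under the stated symmetry assumption, the two functions are inverses of each other for a fixed interaxial distance.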
With the rise of computer-generated animation, the technique shown in FIG. 1A has also been used to position virtual cameras to render 3-D stereo images. The description herein is to be understood as pertaining to both live-action and computer-generated movies.
3-D images generated using the technique of FIG. 1A tend to suffer from distortion. Objects toward the left or right of the image are significantly closer to one camera than the other, and consequently, the right-eye and left-eye images of peripheral objects can be significantly different in size. Such distortions can distract the viewer.
One known technique for reducing such distortions is shown in FIG. 1B. Cameras 122 and 124 are spaced apart by an interaxial distance di, but rather than being toed in as in FIG. 1A, the film backs 126 and 128 are each offset from the respective optical axis 121, 123 by a distance dB as shown. Lenses 130 and 132 are oriented such that optical axes 121 and 123 are normal to screen plane 125. For each camera 122, 124, a film-lens axis 127, 129 is defined by reference to the center of film back 126, 128 and the center of lens 130, 132. Film-lens axes 127, 129 are effectively toed in at toe-in angle θ, and their meeting point 134 defines the convergence distance z0. Because the optical axes remain normal to the screen plane, this technique, which has been used for computer-generated animation, reduces eye-to-eye distortion.
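By way of illustration only, the film-back offset dB of FIG. 1B can be related to the convergence distance z0 by similar triangles. The sketch below assumes a symmetric rig and assumes that each film back lies at the focal distance f behind its lens, as in FIG. 1A; under those assumptions dB/f = (di/2)/z0. The function names are hypothetical.

```python
import math

def film_back_offset(f, d_i, z0):
    """Offset d_B of the film-back center from the optical axis.

    Assumptions: symmetric rig; film back at distance f behind the lens.
    The film-lens axis passes through the lens center and the convergence
    point, so similar triangles give d_B / f = (d_i / 2) / z0.
    """
    return f * (d_i / 2.0) / z0

def effective_toe_in(f, d_b):
    """Effective toe-in angle (radians) of the film-lens axis."""
    return math.atan2(d_b, f)

# Example with assumed values: 35 mm focal length, 65 mm interaxial, 2 m convergence.
d_b = film_back_offset(0.035, 0.065, 2.0)
theta = effective_toe_in(0.035, d_b)
```

Under these assumptions, the effective toe-in angle of the film-lens axes equals the toe-in angle that a FIG. 1A rig would need for the same di and z0, while the optical axes themselves stay normal to the screen plane.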
Regardless of which technique is used, 3-D movies suffer from problems that have limited their appeal. For example, the interaxial distance di and toe-in angle θ are usually selected for each shot as the movie is being created. In close-up shots, di and θ are normally selected to create a relatively short convergence distance z0; in wide shots, a longer z0 is usually desired. During post-processing, the director often intercuts different shots to form scenes. To the extent that di and θ differ significantly between successive shots, the viewer's eyes must adjust discontinuously to different convergence distances. Frequent discontinuous adjustments are unnatural for human eyes and can induce headaches or other unpleasant effects.
It would therefore be desirable to provide improved techniques for creating 3-D movies.