1. Field of the Invention
The present invention relates, in general, to stereoscopic or three dimensional (3D) image generation and projection, and, more particularly, to systems and methods for producing stereoscopic images for 3D projection or display, with at least a portion of the images being generated to provide enhanced depth rendering and, in some cases, unique depth effects in the rendered computer generated (CG) or computer-animated images.
2. Relevant Background
Computer animation has become a standard component in the digital production process for animated works such as animated films, television animated shows, video games, and works that combine live action with animation. The rapid growth in this type of animation has been made possible by the significant advances in computer graphics software and hardware that are utilized by animators to create CG images. Producing computer animation generally involves modeling, rigging, animation, and rendering. First, the characters, elements, and environments used in the computer animations are modeled. Second, the modeled virtual actors and scene elements can be attached to the motion skeletons that are used to animate them by techniques called rigging. Third, computer animation techniques range from key framing animation, where start and end positions are specified for all objects in a sequence, to motion capture, where all positions are fed to the objects directly from live actors whose motions are being digitized. Fourth, computer rendering is the process of representing visually the animated models with the aid of a simulated camera.
There is a growing trend toward using 3D projection techniques in theatres and in home entertainment systems including video games and computer-based displays, and, to render CG images for 3D projection (e.g., stereoscopic images), a pair of horizontally offset, simulated cameras is used to visually represent the animated models. More specifically, using 3D projection techniques, the right eye and the left eye images can be delivered separately to display the same scene or images from separate perspectives so that a viewer sees a three dimensional composite, e.g., certain characters or objects appear nearer than the screen and others appear farther away than the screen. Stereoscopy, stereoscopic imaging, and 3D imaging are labels for any technique capable of recording 3D visual information or creating the illusion of depth in an image. The illusion of depth in a photograph, movie, or other two-dimensional image is created by presenting a slightly different image to each eye. In most animated 3D projection systems, depth perception in the brain is achieved by providing two different images to the viewer's eyes representing two perspectives of the same object with a minor deviation similar to the perspectives that both eyes naturally receive in binocular vision.
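By way of illustration only, the horizontally offset camera pair described above may be sketched in Python as follows. The fragment projects one scene point through two simulated pinhole cameras and reports the horizontal offset between the resulting left eye and right eye image positions. The parallel-camera model with a horizontal image shift, the function name `project_stereo`, and the parameters `focal`, `interaxial`, and `convergence` are assumptions introduced for this sketch and are not taken from any particular rendering system.

```python
def project_stereo(x, z, focal, interaxial, convergence):
    """Project a point at lateral position x and depth z into both eye images.

    Assumed model (illustrative only): two parallel pinhole cameras separated
    horizontally by `interaxial`, with each image shifted so that points at
    depth `convergence` land at the same position in both images.
    Returns (x_left, x_right, offset), where offset = x_right - x_left.
    """
    if z <= 0 or convergence <= 0:
        raise ValueError("depths must be positive")
    half = interaxial / 2.0
    # Image shift that makes the offset zero at the convergence depth.
    shift = focal * half / convergence
    x_left = focal * (x + half) / z - shift   # left camera sits at -half
    x_right = focal * (x - half) / z + shift  # right camera sits at +half
    return x_left, x_right, x_right - x_left


for depth in (50.0, 100.0, 200.0):
    print(depth, project_stereo(0.0, depth, focal=50.0, interaxial=6.0, convergence=100.0))
```

Under these assumptions, a point nearer than the convergence depth yields a negative offset, a point at the convergence depth yields zero offset, and a farther point yields a positive offset.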
The images or image frames used to produce such a 3D output are often called stereoscopic images or a stereoscopic image stream because the 3D effect is due to stereoscopic perception by the viewer. A frame is a single image at a specific point in time, and motion or animation is achieved by showing many frames per second (fps) such as 24 to 30 fps. The frames may include images or content from a live action movie filmed with two cameras or a rendered animation that is imaged or filmed with two camera locations. Stereoscopic perception results from the presentation of two horizontally offset images or frames with one or more objects slightly offset to the viewer's left and right eyes, e.g., a left eye image stream and a right eye image stream of the same object. The amount of offset between the elements of the left and right eye images determines the depth at which the elements are perceived in the resulting stereo image. An object appears to protrude toward the observer and away from the neutral plane or screen when the position or coordinates of the left eye image are crossed with those of the right eye image (e.g., negative parallax). In contrast, an object appears to recede or be behind the screen when the position or coordinates of the left eye image and the right eye image are not crossed (e.g., a positive parallax results).
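The relationship between image offset and perceived depth described above may be summarized in a short sketch. The function name `classify_parallax`, the `tolerance` parameter, and the sign convention (parallax measured as the right eye position minus the left eye position) are assumptions made for illustration.

```python
def classify_parallax(x_left, x_right, tolerance=0.0):
    """Classify where an element is perceived relative to the screen.

    x_left, x_right: horizontal positions of the same element in the left eye
    and right eye images (same units, e.g., pixels).
    Assumed convention: parallax = x_right - x_left.
    """
    parallax = x_right - x_left
    if parallax < -tolerance:
        # Crossed positions: the element appears to protrude toward the viewer.
        return "negative parallax: in front of the screen"
    if parallax > tolerance:
        # Uncrossed positions: the element appears to recede behind the screen.
        return "positive parallax: behind the screen"
    return "zero parallax: at the screen plane"


print(classify_parallax(x_left=10, x_right=4))
print(classify_parallax(x_left=4, x_right=10))
print(classify_parallax(x_left=7, x_right=7))
```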
Many techniques have been devised and developed for projecting stereoscopic images to achieve a 3D effect. One technique is to provide left and right eye images for a single, offset two-dimensional image and to display them alternately, e.g., using 3D switching or similar devices. A viewer is provided with liquid crystal shuttered spectacles to view the left and the right eye images. The shuttered spectacles are synchronized with the display signal to admit a corresponding image one eye at a time. More specifically, the liquid crystal shutter for the right eye is opened when the right eye image is displayed and the liquid crystal shutter for the left eye is opened when the left eye image is displayed. In this way, the observer's brain merges or fuses the left and right eye images to create the perception of depth.
Another technique for providing a stereoscopic view is the use of an anaglyph. An anaglyph is an image generally consisting of two distinctly colored, and preferably complementary colored, images. The theory of the anaglyph is the same as that of the technique described above in which the observer is provided separate left and right eye images, and the horizontal offset in the images provides the illusion of depth. The observer views the anaglyph consisting of two images of the same object in two different colors, such as red and blue-green, and shifted horizontally. The observer wearing anaglyph spectacles views the images through lenses of matching colors. In this manner, the observer sees, for example, only the blue-green tinted image with the blue-green lens, and only the red tinted image with the red lens, thus providing separate images to each eye. The advantages of this implementation are that the cost of anaglyph spectacles is lower than that of liquid crystal shuttered spectacles and there is no need for providing an external signal to synchronize the anaglyph spectacles. In other 3D projection systems, the viewer may be provided glasses with appropriate polarizing filters such that the alternating right-left eye images are seen with the appropriate eye based on the displayed stereoscopic images having appropriate polarization (two images are superimposed on a screen, such as a silver screen to preserve polarization, through orthogonal polarizing filters). Other devices have been produced in which the images are provided to the viewer concurrently with a right eye image stream provided to the right eye and a left eye image stream provided to the left eye. Still other devices produce an auto-stereoscopic display via stereoscopic conversion from an input color image and a disparity map, which typically is created based on offset right and left eye images.
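As a minimal sketch of how an anaglyph of the kind described above might be assembled, the following Python fragment combines a left eye image and a right eye image into one image by taking the red channel from one and the green and blue channels from the other. The function name `make_anaglyph`, the nested-list image representation, and the assignment of red to the left eye are assumptions for illustration; the text above does not specify which eye receives which color.

```python
def make_anaglyph(left, right):
    """Combine two equally sized RGB images into a red/blue-green anaglyph.

    left, right: lists of rows, each row a list of (r, g, b) tuples.
    Assumed convention: the red channel comes from the left eye image and the
    green and blue channels come from the right eye image.
    """
    if len(left) != len(right) or any(len(a) != len(b) for a, b in zip(left, right)):
        raise ValueError("left and right images must have the same dimensions")
    return [
        [(lp[0], rp[1], rp[2]) for lp, rp in zip(left_row, right_row)]
        for left_row, right_row in zip(left, right)
    ]


left_image = [[(255, 10, 20), (0, 0, 0)]]
right_image = [[(1, 200, 150), (90, 80, 70)]]
print(make_anaglyph(left_image, right_image))
```

Viewed through a red lens and a blue-green lens, each eye would then receive predominantly the image intended for it.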
While these display or projection systems may differ, each typically requires a stereographic image as input in which a left eye image and a slightly offset right eye image of a single scene from offset cameras or differing perspectives are provided to create a presentation with the appearance of depth.
With the recent growing surge in development and sale of 3D projection systems and devices, there is an increased demand for high quality stereoscopic images that provide high quality and pleasant viewing experiences. One challenge facing stereographers is how to create an aesthetically appealing image while avoiding the phenomenon of “cardboarding,” which refers to a stereoscopic scene or image that appears to include a series of flat image planes arrayed at varying depths (e.g., similar to a pop-up book). Rendering of left and right eye images is performed with linear depth processing using ray casting or ray tracing techniques that involve following a straight line connecting objects, light sources, and the simulated stereo cameras. CG images rendered with linear depth variation throughout the scene provide a real world view, but such rendering can produce cardboarding due to various combinations of lens focal lengths selected for the cameras and staging of the scene being imaged by the cameras. For example, there are generally trade-offs between a viewer's comfort (e.g., limiting parallax to acceptable ranges) and cardboarding problems.
Another problem that arises in the staging and later rendering of a stereoscopic image is wasted space. The storytelling space for a stereographer includes the screen plane (i.e., at zero pixel shift), screen space into or behind the screen, and theater space toward the viewer or audience from the screen plane. The theater space is used by creating crossed or negative parallax while the screen space is used by creating divergent or positive parallax in the stereoscopic images. The total display space may be measured in pixels and is often limited to less than about 70 pixels in total depth. Wasted space occurs when a long lens is used for the cameras or when a foreground figure is ahead of an object with a normal lens. In these cases, there often is a relatively large amount of depth (e.g., a large percentage of the 70 available pixels) located between a foreground figure and objects or environment elements located behind the foreground figure or object. Simply adding overall stereoscopic depth (e.g., increasing the storytelling depth parameter) to a scene often is not productive because it produces undesirable results such as excessive positive and/or negative parallax (e.g., the added depth can cause the amount of parallax to exceed limits of parallax at which images viewed can be comfortably fused). Some efforts to eliminate or limit the wasted storytelling space have included multi-rigging or using multiple camera pairs for each or several select objects to give better depth or volume to the CG image. For example, one camera rig or pair may be focused on a foreground figure while another is focused on a background object, and the resulting CG image levels are composited or combined to form the final CG image. The result can be a better rounded foreground figure (e.g., less cardboarding) and less wasted space.
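The notion of wasted space within a limited pixel depth budget may be illustrated with a small sketch. The 70 pixel figure is the approximate limit mentioned above; the function name `depth_usage`, the representation of each object as a (near, far) parallax interval in pixels, and the example values are hypothetical.

```python
DEPTH_BUDGET_PX = 70  # approximate total depth limit mentioned in the text


def depth_usage(object_ranges):
    """Return (total_span, wasted) in pixels for a staged scene.

    object_ranges: list of (near_px, far_px) parallax intervals, near <= far,
    with negative values in theater space and positive values in screen space.
    total_span is the depth between the nearest and farthest element;
    wasted is the part of that span not occupied by any object.
    """
    intervals = sorted(object_ranges)
    total_span = max(far for _, far in intervals) - min(near for near, _ in intervals)
    occupied = 0
    cur_near, cur_far = intervals[0]
    for near, far in intervals[1:]:
        if near > cur_far:
            # Gap between objects: close the current run and start a new one.
            occupied += cur_far - cur_near
            cur_near, cur_far = near, far
        else:
            # Overlapping objects extend the current run.
            cur_far = max(cur_far, far)
    occupied += cur_far - cur_near
    return total_span, total_span - occupied


# Hypothetical staging: a foreground figure well ahead of a background element.
span, wasted = depth_usage([(-20, -14), (30, 40)])
print(span, wasted, span <= DEPTH_BUDGET_PX)
```

In this hypothetical staging, 60 of the roughly 70 available pixels are spanned by the scene, yet 44 of them lie in the empty gap between the foreground figure and the background element, leaving only 6 pixels of depth for the figure itself.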
Using multiple camera pairs is relatively complex in some environments and is not always a useful solution because it does not produce acceptable results if there is a physical connection between the two objects that are the focus of the camera pairs. If both objects are shown to be touching the ground, disconnects or unwanted visual artifacts are created during compositing and rendering of the CG image, such as where the ground contacts one or both of the objects. A limitation of multi-rig techniques is that they depend upon being able to divide the scene into non-interconnected image levels, since the depth tailoring offered by this technique creates a discrete set of linear depth functions and does not allow for seamless, blended transitions between the depth functions.
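The limitation described above may be pictured with a deliberately simplified sketch in which each camera rig is modeled as its own linear function from scene depth to parallax. All names (`make_linear_depth`, `foreground_rig`, `background_rig`, `multirig_parallax`, `seam_jump`) and all numeric values are hypothetical and chosen only to show how two independent linear depth functions can disagree where the image levels meet.

```python
def make_linear_depth(slope, offset):
    """Return a linear depth function mapping scene depth z to parallax (pixels)."""
    return lambda z: slope * z + offset


# Hypothetical rigs: the foreground rig gives more depth per unit of scene
# distance (a rounder figure) than the background rig.
foreground_rig = make_linear_depth(slope=2.0, offset=-30.0)
background_rig = make_linear_depth(slope=0.5, offset=0.0)


def multirig_parallax(z, boundary):
    """Parallax of a point at depth z when the levels are split at `boundary`."""
    return foreground_rig(z) if z < boundary else background_rig(z)


def seam_jump(boundary):
    """Parallax discontinuity where the two depth functions meet."""
    return background_rig(boundary) - foreground_rig(boundary)


# A ground plane that runs continuously through the boundary depth of 10
# would be rendered with a sudden 15 pixel change in parallax at the seam.
print(multirig_parallax(9.9, 10.0), multirig_parallax(10.0, 10.0), seam_jump(10.0))
```

Because nothing in this simplified model blends one function into the other, any surface that spans both image levels, such as shared ground, inherits the jump at the boundary.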