Three-dimensional (3D) imaging is a technique for creating the illusion of depth in an image so that the depth is perceived by a viewer. With stereoscopic imaging, the illusion of depth (e.g., for a two-dimensional (2D) image, photograph, or movie) can be created by presenting to each eye a slightly different image of the scene depicted within the media. Typically, to perceive the depth of the media, the viewer must view the stereoscopic images through some type of special viewing apparatus, such as special headgear or glasses. Auto-stereoscopic imaging, in contrast to stereoscopic imaging, is a technique of displaying 3D images that can be viewed without the use of any special viewing apparatus.
While the media industry has made advances in 3D imaging, many challenges remain in efficiently and accurately extracting objects from an image and properly creating depth information for those objects. Color segmentation is a deductive process that can be used to extract large homogeneous regions based on color and/or texture. Color segmentation takes the original 2D image, which can have hundreds or thousands of colors, and narrows the number of colors in the 2D image down to a smaller subset of different colors. The resulting color-segmented images can be used to generate a depth image that is representative of the depth information for each pixel or object within the image.
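The color-reduction step described above can be illustrated with a minimal sketch. It assumes 8-bit RGB pixels stored as nested lists of tuples and uses uniform per-channel quantization as a simple stand-in; the function name `quantize_colors` and the `levels` parameter are hypothetical, and an actual color segmentation process would also account for texture and spatial homogeneity.

```python
def quantize_colors(pixels, levels=4):
    """Reduce an 8-bit RGB image to at most levels**3 distinct colors.

    Assumption: uniform per-channel quantization, used here only to
    illustrate narrowing many colors down to a smaller subset.
    """
    if not 1 <= levels <= 256:
        raise ValueError("levels must be between 1 and 256")
    step = 256 / levels  # width of each quantization bin

    def quantize_channel(value):
        # Find the bin the value falls into, then return the bin center.
        bin_index = min(int(value / step), levels - 1)
        return int(bin_index * step + step / 2)

    return [
        [tuple(quantize_channel(c) for c in pixel) for pixel in row]
        for row in pixels
    ]


# Usage: four slightly different input colors collapse to two.
image = [
    [(10, 200, 30), (12, 198, 35)],
    [(250, 5, 128), (0, 255, 127)],
]
segmented = quantize_colors(image, levels=2)
distinct = {pixel for row in segmented for pixel in row}
```

With `levels=2`, each channel is mapped to one of two bin centers, so nearby colors such as `(10, 200, 30)` and `(12, 198, 35)` end up identical and can be treated as one homogeneous region.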
Additionally, one solution for reducing the computation time required to generate 3D images from 2D images is to employ automatic rotoscoping. Rotoscoping refers to the process of drawing out objects within an image. In its most traditional form, rotoscoping referred to creating a matte (which is used to combine two or more image elements into a single, final image) for an element on a live-action plate so the element can be composited over another background. Bezier curves can be employed to automatically define 2D curves by evaluating an object at variously spaced points and then using the approximating sequence of line segments to represent the 2D outline of the object. However, when only Bezier curves are used, the object is often only "loosely" outlined, because the number of Bezier points is limited to increase the efficiency of the 3D rendering (e.g., the fewer points that need to be transposed from frame to frame, the shorter the processing time, since fewer points need to be manipulated). It is therefore advantageous to use as few Bezier points as possible, which results in only a coarsely traced outline.
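The trade-off between point count and outline fidelity can be seen in a short sketch. It assumes a single cubic Bezier segment with four 2D control points and samples it at evenly spaced parameter values to obtain an approximating sequence of line segments; the names `cubic_bezier_point` and `bezier_polyline` are hypothetical and are not taken from any particular rotoscoping tool.

```python
def cubic_bezier_point(p0, p1, p2, p3, t):
    """Evaluate a cubic Bezier curve at parameter t in [0, 1]."""
    u = 1.0 - t
    # Bernstein basis weights for the four control points.
    b0, b1, b2, b3 = u ** 3, 3 * u * u * t, 3 * u * t * t, t ** 3
    x = b0 * p0[0] + b1 * p1[0] + b2 * p2[0] + b3 * p3[0]
    y = b0 * p0[1] + b1 * p1[1] + b2 * p2[1] + b3 * p3[1]
    return (x, y)


def bezier_polyline(p0, p1, p2, p3, segments=8):
    """Approximate the curve with `segments` straight line segments."""
    if segments < 1:
        raise ValueError("segments must be at least 1")
    return [
        cubic_bezier_point(p0, p1, p2, p3, i / segments)
        for i in range(segments + 1)
    ]


# Usage: the same control points traced coarsely and more finely.
control = ((0.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, 0.0))
coarse = bezier_polyline(*control, segments=2)   # 3 vertices
fine = bezier_polyline(*control, segments=16)    # 17 vertices
```

Only the four control points need to be carried from frame to frame, which is why keeping their number small is efficient; the cost is that a small set of control points can follow the true object boundary only loosely.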
Depth maps can be generated to indicate which areas of objects within the 2D images are closer to the viewer and which are farther away. While depth maps can be generated based on a single 2D image, depth maps are often generated using stereo pairs (a pair of images taken by a corresponding pair of cameras, where the cameras are configured such that each camera captures a different vantage point and there is a known relationship between the cameras). In order for the depth map to be generated, a common point between the images is often required to correlate the vantage point information properly. For example, fragments (e.g., a predetermined pixel square) can be compared between the images to determine the common point. However, this process is extremely dependent on the accuracy of the selection of a common point within the two images, which is a non-trivial, time-consuming selection. Therefore, while various techniques have been adapted to increase the efficiency and speed of creating 3D images, the process is still time-consuming, complicated, and expensive.
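The fragment comparison described above can be sketched as a basic block-matching search. The sketch assumes grayscale images stored as nested lists, a rectified stereo pair (so a common point lies on the same row in both images), and a sum-of-absolute-differences cost; the names `block_cost` and `best_disparity` and the parameters `half` and `max_disp` are hypothetical choices for illustration.

```python
def block_cost(left, right, row, col_left, col_right, half):
    """Sum of absolute differences between two square pixel fragments."""
    total = 0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            a = left[row + dy][col_left + dx]
            b = right[row + dy][col_right + dx]
            total += abs(a - b)
    return total


def best_disparity(left, right, row, col, half=1, max_disp=4):
    """Find the horizontal shift whose right-image fragment best matches
    the left-image fragment centered at (row, col).

    Assumption: (row, col) lies at least `half` pixels from every border
    of the left image, and both images have the same size.
    """
    best_d, best_cost = 0, None
    for d in range(max_disp + 1):
        if col - d - half < 0:
            break  # candidate fragment would leave the right image
        cost = block_cost(left, right, row, col, col - d, half)
        if best_cost is None or cost < best_cost:
            best_d, best_cost = d, cost
    return best_d


# Usage: a synthetic pair in which the left view is the right view
# shifted two pixels to the right.
right_img = [[c * c + r for c in range(10)] for r in range(5)]
left_img = [[row[c - 2] if c >= 2 else 0 for c in range(10)]
            for row in right_img]
shift = best_disparity(left_img, right_img, row=2, col=5, max_disp=3)
```

A larger shift generally corresponds to a point closer to the cameras. The search has to be repeated for every pixel and every candidate shift, and it can pick the wrong match in flat or repetitive regions, which reflects the accuracy and time costs noted above.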