1. Field
Example embodiments relate to a video processing method, and more particularly, to a video processing method for a three-dimensional (3D) display based on a multi-cue process.
2. Description of the Related Art
Recently, the three-dimensional (3D) display market has been expanding rapidly in various fields, including medicine, education, entertainment, and manufacturing. Consumers may have access to a large amount of 3D content, in particular, 3D films. Thus, the 3D display market is expected to expand even more rapidly in the years to come.
In the movie industry, numerous 3D films are produced each year. However, most of the produced films may correspond to image content captured by a single camera and stored in a two-dimensional (2D) format. Since a monocular 2D video may not have depth information corresponding to an object photographed by a camera, the video may not be directly displayed as a 3D image.
Thus, owing to the large potential of the 3D display market, a technology for converting a 2D image to a 3D image has been commanding attention from people in the related field.
Existing processes and technologies for converting a 2D image to a 3D image, for example, TRIDEF 3D EXPERIENCE of Dynamic Digital Depth (DDD) Inc., may follow a similar process. After a likelihood depth map is estimated from an input video sequence, a 3D view may be composed by combining the video with the likelihood depth map. To recover depth information of a video scene, the video may be analyzed using various depth cues, for example, a shadow, a motion estimation, a texture pattern, a focus/defocus, a geometric perspective, and a statistical model. Even though a conventional conversion process may produce a noticeable effect, it has not been ready for practical application for the following reasons. First, the process may rest on an extreme assumption: a depth cue may have a favorable effect only with respect to a predetermined visual scene, and that assumption may be disturbed in general video content. Second, it may be difficult to generate a consistent depth result by combining various cues. Third, recovering a depth from a monocular image or video may be an ill-posed problem. On some occasions, a visual depth may not be measured unless multi-angle information is available.
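As a rough illustration of the composition step described above, in which a 3D view is formed by combining a video frame with a depth map, the following minimal sketch shifts each pixel horizontally in proportion to its depth to synthesize a second view. This is a generic depth-based pixel-shifting approach chosen for illustration, not the method of any product named above. The function name `render_shifted_view`, the parameter `max_disparity`, the depth convention (values in [0, 1], with 1.0 nearest to the viewer), the grayscale frame layout, and the hole-filling rule are all assumptions.

```python
def render_shifted_view(frame, depth, max_disparity=2):
    """Synthesize a second view by shifting pixels according to depth.

    frame: list of rows of grayscale pixel values (hypothetical layout).
    depth: list of rows of floats in [0, 1]; 1.0 is assumed nearest.
    max_disparity: assumed largest horizontal shift, in pixels.
    """
    height, width = len(frame), len(frame[0])
    view = [[None] * width for _ in range(height)]
    for y in range(height):
        # Paint far pixels first so that nearer pixels overwrite them.
        order = sorted(range(width), key=lambda x: depth[y][x])
        for x in order:
            shift = int(round(depth[y][x] * max_disparity))
            target = x + shift
            if 0 <= target < width:
                view[y][target] = frame[y][x]
        # Fill uncovered positions from the left neighbor; at the start
        # of a row, fall back to the original first pixel.
        for x in range(width):
            if view[y][x] is None:
                view[y][x] = view[y][x - 1] if x > 0 else frame[y][0]
    return view


# One row in which the middle pixel is nearest to the viewer.
frame = [[10, 20, 30, 40, 50]]
depth = [[0.0, 0.0, 1.0, 0.0, 0.0]]
left_view = frame
right_view = render_shifted_view(frame, depth, max_disparity=1)
```

The original frame and the shifted view may then be presented as a stereo pair. The quality of such a pair depends directly on the depth map, which is why an inconsistent or unreliable depth estimate limits the conventional process.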
A saliency image may visually indicate how strongly each part of a visual scene attracts attention. The saliency image has been studied for more than two decades in the fields of brain science and vision science.
FIG. 1 illustrates an exemplary visual scene and a related saliency image. As illustrated in FIG. 1, a bright region of the saliency image may indicate an object that commands attention from an observer. Since the saliency image may provide relatively valuable low-level information about a scene, the saliency image is widely used in a great number of machine vision processes, for example, automatic target detection and video compression.
However, existing technology using saliency may not be applied to conversion from a 2D image to a 3D image. Even though a saliency image generated through an existing process may sufficiently express an important object in a scene, the saliency image may have the following drawbacks.
A blocky appearance may occur, saliency information may not accurately conform to the boundary of an object, a relatively large object may appear excessively bright, and an object as a whole may not be filled in.
A further drawback may be that only static characteristics, for example, intensity/saturation, brightness, and location, may be processed, whereas dynamic cues, for example, an object in motion or a person, which provide important visual information in video content, may not be processed.
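The difference between a static characteristic and a dynamic cue may be illustrated with the following minimal sketch. It is not an implementation of any existing saliency process. It assumes a simple global-contrast measure for the static case and plain frame differencing for the dynamic case, and the function names `static_saliency` and `motion_cue` are hypothetical.

```python
def static_saliency(frame):
    """Static cue: absolute difference of each pixel from the mean intensity.

    frame: list of rows of grayscale pixel values (hypothetical layout).
    Only a single frame is examined, so motion is not taken into account.
    """
    pixels = [p for row in frame for p in row]
    mean = sum(pixels) / len(pixels)
    return [[abs(p - mean) for p in row] for row in frame]


def motion_cue(prev_frame, next_frame):
    """Dynamic cue: per-pixel absolute difference between two frames."""
    return [
        [abs(b - a) for a, b in zip(row_prev, row_next)]
        for row_prev, row_next in zip(prev_frame, next_frame)
    ]


# A bright object moves one pixel to the left between two frames.
prev_frame = [[10, 10, 90, 10]]
next_frame = [[10, 90, 10, 10]]
static_map = static_saliency(next_frame)
motion_map = motion_cue(prev_frame, next_frame)
```

In this sketch, the static map responds only to contrast within a single frame, whereas the motion map responds to the positions that changed between frames. A process that computes only the former may therefore fail to capture a moving object whose contrast against the background is low.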