1. Field
One or more embodiments of the following description relate to an apparatus, a method, and a computer-readable medium for generating a depth map, and more particularly, to an apparatus, a method, and a computer-readable medium for automatically generating a depth map corresponding to a two-dimensional (2D) image in each frame of a general video.
2. Description of the Related Art
Recently, three-dimensional (3D) TV has been emerging as a popular topic in both research fields and business markets. A 3D TV differs from a conventional two-dimensional (2D) TV in that it can display stereo video. Viewers may perceive a depth effect, as if they were watching a real 3D scene. The depth effect is based on the human binocular vision model. People see the real world using both eyes, each of which may receive a different image when viewing a 3D scene. When two different images are projected independently, one to each eye, people can reconstruct a 3D scene in their minds.
However, most current media, such as films or video, and most image capturing devices, such as digital cameras or film cameras, are still based on a mono system using a single camera. When such media are directly displayed on a 3D TV, 3D effects may not be shown. To convert the media to 3D video, one solution is to hire many workers to manually label a depth map for each region in the video. The conversion result is quite satisfactory, but it is not easily achievable because too much manpower is required.
Although other solutions have already been proposed, their use with general video sequences is limited. For example, one solution is to provide a depth display system that requires computer interaction. However, in this example, it is difficult to fully realize unattended operation of a 3D TV, and to operate the 3D TV in real time, since user input is required. Alternatively, assuming that only an object in an image moves horizontally while the background of the image remains stationary, a stereo video disparity may be simulated based on motion parallax. However, it is difficult to process a general video in real time under this assumption, and as a result the video that can be processed is limited.