Over the last decade, a substantial amount of research has been directed at the realization of 3D display technology for use in and around the home. As a result, there has been a flurry of both stereoscopic and autostereoscopic displays.
In stereoscopic displays, the eyes of the viewer are generally assisted by e.g. glasses or foils positioned between the viewer's eyes and the display, which direct a left eye image to the left eye of the viewer and a right eye image to the right eye of the viewer, either in a time-multiplexed manner or simultaneously (e.g. through spectral separation). As users generally find wearing glasses bothersome, autostereoscopic displays have likewise received considerable attention. Autostereoscopic displays, often multi-view displays, generally allow the visualization of multiple images or views, e.g. 5-9 or more, which are multiplexed in a viewing cone directed at the viewers. By observing separate views from a cone with the left and right eye respectively, a stereoscopic effect is obtained by the unaided eye.
An important issue for both stereoscopic displays and autostereoscopic displays is the delivery of content. Various approaches are known to deliver three-dimensional content to a display. Some of these approaches explicitly encode all views, whereas others encode one or some views together with additional depth and/or disparity information for one or all of these views. The advantage of providing depth information is that it facilitates manipulation of the three-dimensional content, e.g. when rendering additional views based on the images provided.
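By way of illustration only, rendering an additional view from one image and its depth map can be sketched as a horizontal shift of each pixel by a disparity derived from its depth. The function name `render_view`, the parameter `max_disparity`, the depth convention (1.0 nearest to the viewer) and the simple hole filling are all assumptions made for this sketch; none of them is taken from a particular content delivery format.

```python
def render_view(image, depth, max_disparity=2):
    """Render a shifted view from an image and a per-pixel depth map.

    image: list of rows of pixel values.
    depth: list of rows with values in [0.0, 1.0]; 1.0 is assumed to be
           nearest to the viewer (a convention chosen for this sketch).
    max_disparity: hypothetical maximum horizontal shift in pixels.
    """
    height, width = len(image), len(image[0])
    out = [[None] * width for _ in range(height)]
    for y in range(height):
        # Visit far pixels first so that nearer pixels overwrite them,
        # which gives a crude form of occlusion handling.
        order = sorted(range(width), key=lambda x: depth[y][x])
        for x in order:
            tx = x + round(depth[y][x] * max_disparity)
            if 0 <= tx < width:
                out[y][tx] = image[y][x]
        # Fill holes (disoccluded areas) with the nearest filled pixel
        # to the left.
        last = None
        for x in range(width):
            if out[y][x] is None:
                out[y][x] = last
            else:
                last = out[y][x]
        # Any holes at the start of the row take the first filled value.
        first = next((v for v in out[y] if v is not None), None)
        out[y] = [first if v is None else v for v in out[y]]
    return out
```

In this sketch a pixel with depth 0.0 stays in place, while a pixel with depth 1.0 moves by `max_disparity` pixels. Practical renderers use more careful occlusion and hole handling than shown here.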
Although such depth information can be acquired through analysis of e.g. disparity of stereo images, or using range finders, this is generally only possible for newly acquired content. Moreover, stereo or multiview acquisition generally also comes at a higher cost. As a result, there has been substantial research directed towards the acquisition of depth information from monocular images or monocular image sequences. A variety of applications of such algorithms may be envisaged, ranging from fully automated conversion to user-assisted 2D-to-3D conversion for high-quality content. In the case of user-assisted 2D-to-3D conversion, computer-assisted depth map generation may represent a substantial time saving.
An example of an approach to obtain a depth map from a monocular image is presented in “Depth Map Generation by Image Classification”, by S. Battiato et al., published in Proceedings of SPIE Electronic Imaging 2004, Three-Dimensional Image Capture and Applications VI—Vol. 5302-13.
In the above paper, a depth map is generated by combining a depth map based on an estimated global depth profile of an image with a further depth map which comprises image structure. The resulting combination, however, does not always provide a satisfactory depth perception in the resulting image.
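To make the notion of combining two depth maps concrete, the following minimal sketch blends a global depth profile with a second, structure-based depth map using a fixed weight. It does not reproduce the combination rule of the cited paper. The linear top-to-bottom profile, the function names `global_profile` and `blend`, and the `weight` parameter are hypothetical choices made purely for illustration.

```python
def global_profile(height, width):
    """Hypothetical global depth profile: 0.0 (far) at the top row,
    rising linearly to 1.0 (near) at the bottom row."""
    if height == 1:
        return [[1.0] * width]
    return [[y / (height - 1)] * width for y in range(height)]


def blend(global_depth, structure_depth, weight=0.5):
    """Per-pixel weighted average of two depth maps of equal size.

    weight is the (assumed) contribution of the structure-based map;
    the global profile contributes 1 - weight.
    """
    return [
        [(1 - weight) * g + weight * s for g, s in zip(g_row, s_row)]
        for g_row, s_row in zip(global_depth, structure_depth)
    ]
```

For example, `blend(global_profile(3, 2), structure, weight=0.3)` would keep most of the global profile while adding some local detail from a 3-by-2 map `structure`.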