Because visual saliency can be used to simulate the human visual attention mechanism, it has received wide attention and has become a research subject in neuroscience, robotics, computer vision and other fields. Identification of salient regions may be applied to object recognition, image retargeting, visual tracking and image segmentation, and may also be applied to the analysis of human fixation selection. Currently, methods for extracting saliency are generally oriented to a single two-dimensional image. With the development of information science and technology, stereoscopic images are increasingly common in everyday life, which poses a challenge to saliency extraction technology. Building on research into the saliency of two-dimensional images, the saliency of a stereoscopic image should be analyzed and sampled in a different way, new factors and features should be considered, and stereoscopic saliency should be estimated comprehensively.
In 2012, C. Lang provided a theoretical basis and a method for extracting a depth feature. First, four conclusions were derived from a large number of fixation-tracking experiments on 2D and 3D images: (1) a depth feature tends to modulate visual saliency toward a farther range of depth; however, the human visual system is still more likely to focus on scenes with relatively small depths; (2) a large number of fixations land on a small number of objects of interest, and this characteristic applies to both 2D and 3D images; (3) the relationship between depth features and saliency is nonlinear; (4) as depth information accumulates, the difference in the distribution of fixations between 2D and 3D increases, especially for images that have salient stimuli in different ranges of depth. The four conclusions provide an important theoretical basis for applying depth feature extraction to the detection of salient objects, and demonstrate that the presence of depth features significantly affects the size and distribution of visual saliency in an image. In that work, after the four conclusions were reached, depth features were extracted by fitting a Gaussian probability density function of the depth of decomposition, and a saliency algorithm for stereoscopic images was obtained by combining the result with a 2D saliency algorithm. In 2013, K. Desingh further optimized and extended C. Lang's study, obtaining a three-dimensional algorithm by adding to the experiments a test of blurred images in backgrounds with high depths and central bias, extracting depth features using the global contrast-based idea of M. M. Cheng together with point cloud segmentation technology, and combining the result with a two-dimensional saliency algorithm. In 2012, Niu proposed two methods, one of which is an estimation and comparison method based on global parallax (CSS), and the other of which is a method based on stereoscopic rules. He combined the two methods to extract the saliency of stereoscopic images.
However, none of these methods detects salient regions with sufficiently high accuracy.
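For illustration only, the global contrast idea mentioned above can be sketched for a depth map as follows. This is a minimal sketch under stated assumptions and does not reproduce the implementation of any cited work: the function name `depth_global_contrast`, the `levels` parameter and the histogram-based formulation are hypothetical choices made here, and the point cloud segmentation step and the combination with a 2D saliency algorithm are omitted. Each quantized depth level is scored by its frequency-weighted distance to all other depth levels, so depths that differ from most of the scene score higher.

```python
from collections import Counter


def depth_global_contrast(depth_map, levels=16):
    """Score each pixel by the global contrast of its quantized depth.

    Hypothetical helper for illustration; `depth_map` is a list of rows
    of numeric depth values, and `levels` is an assumed bin count.
    """
    flat = [d for row in depth_map for d in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1  # avoid division by zero on a constant map

    def quantize(d):
        # Map a depth value to a bin index in [0, levels - 1].
        return min(int((d - lo) / span * levels), levels - 1)

    bins = [[quantize(d) for d in row] for row in depth_map]
    counts = Counter(b for row in bins for b in row)
    total = len(flat)

    # Contrast of a bin: frequency-weighted distance to every occupied bin.
    contrast = {
        b: sum(n / total * abs(b - other) for other, n in counts.items())
        for b in counts
    }
    peak = max(contrast.values()) or 1  # keep a constant map at zero

    # Normalize so the most contrasting depth level scores 1.0.
    return [[contrast[b] / peak for b in row] for row in bins]


if __name__ == "__main__":
    # Assumed toy input: one pixel at a depth unlike the rest of the scene.
    toy = [[1, 1, 1], [1, 9, 1], [1, 1, 1]]
    for row in depth_global_contrast(toy):
        print(row)
```

In the toy input, the single pixel whose depth differs from the rest receives the highest score. Note that this sketch treats near and far depths symmetrically, whereas conclusion (1) above suggests that a practical method would also need to weight smaller depths more heavily.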