As a fundamental problem in computer vision, image salient object segmentation is attracting increasing interest and attention from researchers. In image salient object segmentation, the most critical step is to use visual attributes to highlight salient objects and suppress non-salient objects. However, for a complex scenario, it is not clear which visual attribute is capable of consistently highlighting the salient objects. In addition, where the salient objects and the non-salient objects share the same visual attributes, it is not clear how to segment them and distinguish them from each other correctly. Therefore, it is necessary to investigate what is and what is not a salient object before researching and developing a salient object segmentation model.
During the past ten years, much research has been done to arrive at a comprehensive and convincing definition of the salient object. For instance, Jiang et al. published a paper in the CVPR conference of 2013 proposing that salient objects are commonly characterized by uniqueness, focusness and objectness. In the work published by Cheng et al. in the CVPR conference of 2013, the salient object was considered to be unique and to have a compact spatial distribution. In the work published by Goferman et al. in the TPAMI of 2012, the salient object was considered to be distinctive with respect to its local or global surrounding image context. Based on these findings, many studies have proposed salient object segmentation models that determine saliency at the level of an image region, a superpixel or a pixel by designing different heuristic features. Generally, these salient object segmentation models have achieved good performance in simple scenarios in which the salient objects are clearly distinguishable. However, in complex scenarios, salient objects and non-salient objects usually have common visual attributes, making it difficult for the segmentation models to correctly distinguish the salient objects from the non-salient objects.
At present, a sufficiently complex image salient object segmentation model may be trained by using a large-scale image benchmark dataset. For instance, in 2015, He et al. used a deep neural network to train a salient object segmentation model at the superpixel level in the IJCV, and in 2016, Liu et al. proposed in the CVPR to use a recurrent neural network to obtain a hierarchical saliency segmentation model. These models may partially solve the problems that arise in complex scenarios, but training them is very difficult, and the requirement for a large amount of benchmark image data for training is not easy to satisfy. In addition, due to the “black box” nature of deep learning techniques such as the deep neural network, the recurrent neural network and the like, it is not clear which visual attributes contribute the most to the distinction between the salient objects and the non-salient objects.
Therefore, exploring the respective essential characteristics of salient objects and non-salient objects is not only instructive for designing visual attribute descriptions that characterize a candidate object set, but also provides guidance for constructing an image salient object segmentation model that can adapt to various complex scenarios.