Understanding images is often easy for people but difficult for computers. A person may view a sequence of images and quickly recognize the important objects in those images, the relations of those objects to each other, and the relations of the images to each other. To a computer, an image is a set of data points called “pixels,” each associated with values such as coordinates defining its location and color values, often defined using a “red-green-blue” (RGB) color scheme. The computer attempts to understand the image by comparing its pixels to each other or to pixels of other images. For example, if a set of pixels from another image has previously been labeled as a cat, the computer may recognize an association between that set of pixels and pixels in the image that the computer is analyzing. Based on the association, the computer determines that the analyzed image includes a cat.
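The pixel comparison described above can be illustrated with a minimal sketch. It assumes a pixel is stored as coordinates plus an RGB triple, and that two pixel sets are compared by their average per-channel color difference. The names `Pixel`, `mean_color_distance`, and `best_label`, the distance measure, and the sample values are all hypothetical choices made for illustration. Practical recognition systems use far richer comparisons.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pixel:
    """Hypothetical pixel record: a location plus an RGB color value."""
    x: int
    y: int
    rgb: tuple  # (red, green, blue), each in the range 0-255


def mean_color_distance(a, b):
    """Average absolute per-channel difference between two pixel sets.

    The sets are compared position by position, so they must have the
    same length. Coordinates are ignored; only colors are compared.
    """
    if not a or len(a) != len(b):
        raise ValueError("pixel sets must be non-empty and equally sized")
    total = sum(abs(p.rgb[c] - q.rgb[c]) for p, q in zip(a, b) for c in range(3))
    return total / (3 * len(a))


def best_label(pixels, labeled_sets):
    """Return the label of the previously labeled set closest in color."""
    return min(labeled_sets,
               key=lambda label: mean_color_distance(pixels, labeled_sets[label]))


# Illustrative data only: two labeled pixel sets from earlier images.
labeled = {
    "cat": [Pixel(0, 0, (200, 150, 100)), Pixel(1, 0, (210, 160, 110))],
    "table": [Pixel(0, 0, (90, 60, 30)), Pixel(1, 0, (95, 65, 35))],
}
# Pixels taken from the image currently being analyzed.
analyzed = [Pixel(5, 7, (205, 155, 105)), Pixel(6, 7, (205, 155, 105))]
label = best_label(analyzed, labeled)
```

Here the analyzed pixels differ from the set labeled "cat" by 5 color levels per channel on average, which is much closer than the "table" set, so the sketch associates them with "cat".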
To understand an image without reference to other images, a computer may generate a saliency map by measuring the visual importance of each pixel in the image. The saliency map is then analyzed further to extract a salient object from the image. Salient objects may include parts of the image that are to some degree distinct from other parts of the image. For example, in a picture of a cat sitting on a table, the cat and possibly the table may be recognized by a computer as salient objects. For images with complex objects or complex backgrounds, however, saliency maps are often a difficult and ineffective way of locating salient objects. In addition, transforming the image into a saliency map often causes image information to be lost, making that information unavailable for salient object detection.
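As a rough illustration of the saliency-map approach, the sketch below scores each pixel by how far its color lies from the mean color of the whole image, then thresholds the scores to pick out a salient region. This global-contrast measure is only one simple stand-in for "visual importance" and is an assumption of this sketch. The function names `saliency_map` and `salient_mask`, the threshold of 0.5, and the sample image are likewise hypothetical.

```python
from math import sqrt


def saliency_map(image):
    """Score each pixel by its RGB distance from the image's mean color.

    `image` is a list of rows, each row a list of (r, g, b) tuples.
    Scores are scaled so the most distinct pixel has saliency 1.0.
    """
    pixels = [p for row in image for p in row]
    n = len(pixels)
    mean = tuple(sum(p[c] for p in pixels) / n for c in range(3))
    raw = [[sqrt(sum((p[c] - mean[c]) ** 2 for c in range(3))) for p in row]
           for row in image]
    peak = max(max(row) for row in raw)
    if peak == 0:  # uniform image: nothing stands out
        return [[0.0 for _ in row] for row in raw]
    return [[v / peak for v in row] for row in raw]


def salient_mask(saliency, threshold=0.5):
    """Mark pixels whose saliency meets the (assumed) threshold."""
    return [[v >= threshold for v in row] for row in saliency]


# Illustrative 3x3 image: gray background with one orange pixel in the center.
gray, orange = (100, 100, 100), (255, 128, 0)
image = [
    [gray, gray, gray],
    [gray, orange, gray],
    [gray, gray, gray],
]
saliency = saliency_map(image)
mask = salient_mask(saliency)
```

The map keeps a single number per pixel in place of three color channels, so two differently colored regions that are equally far from the mean color become indistinguishable. This is a small instance of the information loss noted above.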