With recent advancements in the field of computer vision and video processing, various models have been proposed for automatic and/or computational identification of salient objects in an image and/or a video stream. Identification of the salient objects has various applications in fields such as video surveillance, image retargeting, video summarization, robot control, navigation assistance, object recognition, adaptive compression, and/or the like. The identification of the salient objects is further useful in image processing techniques, such as auto-focus algorithms, in which a focus area is detected automatically for video and/or image capturing devices.
Typically, a salient object may be identified based on detection of a region of attention (or region-of-interest) of a viewer. This region-of-interest may appear amongst the foreground objects within the scene. Most computer vision models represent such regions by means of saliency maps, and require a set of basic visual characteristics, such as color contrast, intensity, orientation, texture, motion, spatial distance, and/or the like, to generate the saliency maps. Examples of the saliency maps may include, but are not limited to, a spatial saliency map, a spatio-temporal saliency map, or a ground truth saliency map. The salient objects are then identified based on the generated saliency maps.
In one scenario, the spatial saliency map may be generated to highlight the salient objects based on the disparity of their visual features with respect to the surroundings, which suppresses the non-salient objects. Alternatively, the spatio-temporal saliency map may be generated to highlight the salient objects based on the disparity of motion features of one or more objects in each frame of a video scene, while taking spatial features into account. Further, the ground truth saliency map may be generated to highlight the salient objects based on the eye fixation data of the viewer. However, in such scenarios, the identified salient objects may differ from one saliency map to another. Thus, it may be desirable to determine consolidated salient objects in the scene, based on a combination of such different types of saliency maps.
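By way of illustration only, the following minimal Python sketch shows how two such maps might be computed and consolidated for a small grayscale frame. Everything in it is an assumption made for the example and is not taken from the present disclosure: the function names (`normalize`, `spatial_saliency`, `motion_saliency`, `combine`), the use of contrast against the frame mean as the spatial feature, the use of a plain frame difference as a stand-in for motion disparity, and the weighted average used for the combination.

```python
from typing import List, Sequence

Map = List[List[float]]


def normalize(m: Map) -> Map:
    """Scale a map to [0, 1]; a constant map becomes all zeros."""
    lo = min(min(row) for row in m)
    hi = max(max(row) for row in m)
    span = hi - lo
    if span == 0:
        return [[0.0 for _ in row] for row in m]
    return [[(v - lo) / span for v in row] for row in m]


def spatial_saliency(frame: Map) -> Map:
    """Assumed spatial cue: contrast of each pixel against the frame mean."""
    count = sum(len(row) for row in frame)
    mean = sum(sum(row) for row in frame) / count
    return normalize([[abs(v - mean) for v in row] for row in frame])


def motion_saliency(prev: Map, curr: Map) -> Map:
    """Assumed motion cue: absolute difference between consecutive frames."""
    diff = [[abs(c - p) for p, c in zip(prev_row, curr_row)]
            for prev_row, curr_row in zip(prev, curr)]
    return normalize(diff)


def combine(maps: Sequence[Map], weights: Sequence[float]) -> Map:
    """Assumed consolidation rule: weighted average of equally sized maps."""
    total = sum(weights)
    rows, cols = len(maps[0]), len(maps[0][0])
    return [
        [sum(w * m[r][c] for m, w in zip(maps, weights)) / total
         for c in range(cols)]
        for r in range(rows)
    ]


# Hypothetical 3x3 grayscale frames: two pixels change between frames.
prev_frame = [[10, 10, 10], [10, 10, 10], [10, 10, 10]]
curr_frame = [[10, 10, 10], [10, 90, 10], [10, 10, 50]]

spatial = spatial_saliency(curr_frame)
motion = motion_saliency(prev_frame, curr_frame)
combined = combine([spatial, motion], weights=[0.5, 0.5])

# Location of the most salient pixel in the consolidated map.
peak = max(
    ((r, c) for r in range(3) for c in range(3)),
    key=lambda rc: combined[rc[0]][rc[1]],
)
print(peak)
```

In this sketch, the two maps assign different relative scores to the same pixels, and the equal weights are arbitrary. A practical system would likely use richer features and a different combination rule.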
Further limitations and disadvantages of conventional and traditional approaches will become apparent to one of skill in the art, through comparison of described systems with some aspects of the present disclosure, as set forth in the remainder of the present application and with reference to the drawings.