Most tracking algorithms assume a predetermined target location and size for track initialization. Hence, in many applications, the target size and location must be supplied as input by a human user. However, target initialization can drastically change the performance of the tracker, since the initial window tells the tracker what to track, i.e. the features, appearance and contours. Any insignificant or false information in that window, such as parts of the object that resemble the background or patches of the background itself, may therefore cause the target appearance to be learned incorrectly. Moreover, in real-time applications the user usually provides erroneous input, because the target has to be marked instantly. This erroneous input usually results in premature track loss. Therefore, if long-term tracking performance is to be achieved, this erroneous input must be compensated for. To this end, a target initialization method that enables long-term tracking is proposed.
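The effect of a loosely drawn initial window can be illustrated with a small sketch. It is not part of the proposed method; the function name `background_fraction`, the binary target mask and the box coordinates are all hypothetical and serve only to show how much background a hurried selection can hand to the tracker as "target".

```python
def background_fraction(mask, box):
    """Fraction of pixels inside `box` that do not belong to the target.

    mask: 2D list of 0/1 values, 1 marking true target pixels (hypothetical
          ground truth, used here for illustration only).
    box:  (x0, y0, x1, y1) with x1 and y1 exclusive.
    """
    x0, y0, x1, y1 = box
    pixels = [mask[y][x] for y in range(y0, y1) for x in range(x0, x1)]
    return 1.0 - sum(pixels) / len(pixels)


# Toy 10x10 scene with a 4x4 target occupying rows 3..6 and columns 3..6.
mask = [[1 if 3 <= x <= 6 and 3 <= y <= 6 else 0 for x in range(10)]
        for y in range(10)]

tight = background_fraction(mask, (3, 3, 7, 7))  # box matches the target
loose = background_fraction(mask, (1, 1, 9, 9))  # hurried, oversized box
```

In this toy case the oversized box contains three background pixels for every target pixel, so an appearance model learned from it would be dominated by background.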
Chinese patent document CN101329767 discloses an automatic inspection method for a significant object sequence, based on learning from videos. In that method, static significant features and then dynamic significant features are calculated and adaptively combined according to the spatial continuity within each image frame and the temporal continuity of significant objects across neighboring frames. Since this method generally takes several seconds to process an image, it is not suitable for real-time applications.
United States patent document US2012288189, an application in the state of the art, discloses an image processing method which includes a segmentation step that segments an input image into a plurality of regions by using an automatic segmentation algorithm, and a computation step that calculates a saliency value of one region of the plurality of segmented regions by using a weighted sum of color differences between the one region and all other regions. Accordingly, it is possible to automatically analyze visually salient regions in an image, and the result of the analysis can be used in application areas including significant object detection, object recognition, adaptive image compression, content-aware image resizing, and image retrieval. However, a change in image resolution results in a change in processing time, which may exceed the limits of real-time applications.
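The computation step described above can be sketched as follows. This is a minimal illustration rather than the implementation of the cited document: it assumes that segmentation has already produced one mean color and one pixel count per region, that the color difference is the Euclidean distance between mean colors, and that each weight is the relative size of the other region. The function name `region_saliency` is hypothetical.

```python
import math


def region_saliency(mean_colors, sizes):
    """Return one saliency value per region.

    mean_colors: list of (r, g, b) mean colors, one per segmented region.
    sizes:       list of pixel counts per region, used as weights
                 (an assumption made for this sketch).
    """
    total = sum(sizes)
    saliency = []
    for k, color_k in enumerate(mean_colors):
        score = 0.0
        for i, color_i in enumerate(mean_colors):
            if i == k:
                continue  # a region is compared only with the other regions
            # Color difference: Euclidean distance between mean colors.
            diff = math.dist(color_k, color_i)
            # Weight: relative size of the other region.
            score += (sizes[i] / total) * diff
        saliency.append(score)
    return saliency


# Toy input: two large gray background regions and one small red object.
colors = [(128, 128, 128), (120, 120, 120), (255, 0, 0)]
sizes = [9000, 800, 200]
scores = region_saliency(colors, sizes)
```

With this toy input the small red region receives the highest score, because it differs strongly in color from the large regions that carry most of the weight. The nested loop compares every region with every other region, so the cost of this step grows quadratically with the number of regions.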
United States patent document US2008304740, an application in the state of the art, discloses methods for detecting a salient object in an input image. The salient object in an image may be defined using a set of local, regional, and global features including multi-scale contrast, center-surround histogram, and color spatial distribution. These features are optimally combined through conditional random field learning. The learned conditional random field is then used to locate the salient object in the image. The methods can also use image segmentation, where the salient object is separated from the image background. However, this approach is not suitable for real-time use.
United States patent document US20120294476, an application in the state of the art, discloses methods for detecting a salient object in an input image. To find the salient objects, a computing device determines saliency measures for locations in the image based on the costs of composing those locations from parts of the image outside of them. At the beginning of the process, the input image is segmented into parts; saliency measures are then calculated, based on appearance and spatial distances, for locations defined by sliding windows. Consequently, this system is not suitable for real-time use.
The document titled “Learning to detect a salient object (Tie Liu et al.)” discloses an approach for salient object detection, which is formulated as a binary labeling problem using a set of local, regional, and global salient object features.