Image segmentation is the process of partitioning a digitized image into meaningful segments with homogeneous characteristics, e.g., color, intensity or texture. Because images vary widely in context, quality and complexity, fully automatic segmentation remains challenging; consequently, semi-automatic or interactive segmentation has been studied extensively in recent years. As the name implies, the segmentation algorithm is guided by interaction with the user over one or several rounds, so that it can learn from user-marked examples of foreground and background, infer how the user defines the target, and apply the learned rule to the unmarked areas.
A general interactive segmentation framework includes a user interface that enables a user to impose hard constraints anywhere within the image domain, marking certain pixels as definitely belonging to the image foreground or background. Although the total number of marked pixels is very limited, sometimes no more than several hundred, they are the interactive clues that a machine learning algorithm can use to train a model online. This online-trained model is then applied to the rest of the image for prediction. The prediction results are presented to the user, who can provide a further round of interaction if modifications are needed. This interactive procedure repeats until the user is satisfied. FIG. 1 illustrates such an online model learning-based interaction framework.
Extensive studies have sought to optimize each of the three modules of this framework: user interaction, model learning, and prediction.
Regarding interaction with users, the trend is toward loose inputs for marking foreground and background. Loosely positioned marking lines, such as strokes drawn with a brush, are preferable to precise boundary definition. An exemplary case is shown in FIG. 2, where red (200) and blue (202) strokes indicate foreground and background, respectively.
Regarding model learning, supervised learning has been studied extensively in computer vision, and many classic supervised learning algorithms have been developed, such as the Support Vector Machine (SVM), boosting and others. Statistics-based modeling, such as the Gaussian Mixture Model (GMM), is also frequently used in the literature.
Regarding prediction, the learned model is applied to the rest of the image. The output of this inference procedure is a likelihood map that indicates, for each pixel, the probability of being classified as foreground or background.
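The learning and prediction steps can be sketched as follows. This is a minimal illustration under stated assumptions, not a method taken from any particular system: the image is grayscale and stored as a list of rows, each class is modeled by a single Gaussian over intensity (the one-component special case of a GMM), and the two classes are given equal priors. The names `fit_gaussian`, `gaussian_pdf` and `likelihood_map` are hypothetical.

```python
import math

def fit_gaussian(samples):
    """Return (mean, variance) of the marked intensities.
    The variance is floored so that identical samples do not give a zero variance."""
    mean = sum(samples) / len(samples)
    var = sum((s - mean) ** 2 for s in samples) / len(samples)
    return mean, max(var, 1e-4)

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def likelihood_map(image, fg_marks, bg_marks):
    """Train one Gaussian per class from the marked pixels, then return
    the foreground probability of every pixel (equal priors assumed)."""
    fg = fit_gaussian([image[r][c] for r, c in fg_marks])
    bg = fit_gaussian([image[r][c] for r, c in bg_marks])
    result = []
    for row in image:
        out_row = []
        for x in row:
            pf = gaussian_pdf(x, *fg)
            pb = gaussian_pdf(x, *bg)
            total = pf + pb
            # Fall back to 0.5 if both densities underflow to zero.
            out_row.append(pf / total if total > 0 else 0.5)
        result.append(out_row)
    return result

# Toy 3x4 image: bright object on the left, dark background on the right.
image = [
    [0.90, 0.80, 0.20, 0.10],
    [0.85, 0.90, 0.15, 0.10],
    [0.80, 0.75, 0.20, 0.05],
]
fg_marks = [(0, 0), (1, 0), (2, 1)]  # user-marked foreground pixels (row, col)
bg_marks = [(0, 3), (1, 2), (2, 2)]  # user-marked background pixels (row, col)

prob = likelihood_map(image, fg_marks, bg_marks)
# Thresholding the likelihood map gives a binary segmentation.
mask = [[1 if p > 0.5 else 0 for p in row] for row in prob]
```

Only six of the twelve pixels are marked here; the remaining pixels are labeled purely from the models trained on those marks, which mirrors the train-then-predict flow described above.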
An alternative to model learning and prediction is to define a cost function that integrates the user-marked foreground and background, and to find the segmentation by optimizing this cost function over all possible segmentations. A typical example is graph cuts, which has been widely used in interactive segmentation.
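To make the cost-function formulation concrete, the following is a minimal sketch under assumptions that are not part of the description above: a one-dimensional row of pixels, a data term derived from a per-pixel foreground probability, and a constant smoothness penalty `lam` for each pair of neighbors that receive different labels. The functions `energy` and `best_segmentation` are hypothetical names. The minimum is found here by exhaustive enumeration, which is feasible only for tiny inputs; for binary energies of this form, graph cuts reach the same global minimum far more efficiently.

```python
import itertools
import math

def energy(labels, p_fg, marks, lam):
    """Cost of one labeling (1 = foreground, 0 = background)."""
    cost = 0.0
    for i, lab in enumerate(labels):
        # Hard constraint: a labeling that contradicts a user mark is forbidden.
        if i in marks and marks[i] != lab:
            return math.inf
        p = p_fg[i] if lab == 1 else 1.0 - p_fg[i]
        cost += -math.log(max(p, 1e-9))  # data term
    # Smoothness term: penalize label changes between neighboring pixels.
    cost += lam * sum(1 for a, b in zip(labels, labels[1:]) if a != b)
    return cost

def best_segmentation(p_fg, marks, lam):
    """Search all 2**n labelings and return the one with the lowest cost."""
    candidates = itertools.product((0, 1), repeat=len(p_fg))
    return min(candidates, key=lambda labels: energy(labels, p_fg, marks, lam))

p_fg = [0.9, 0.8, 0.4, 0.7, 0.2, 0.1]  # assumed foreground probability per pixel
marks = {0: 1, 5: 0}                   # user marks: pixel 0 foreground, pixel 5 background

smooth = best_segmentation(p_fg, marks, lam=1.0)
independent = best_segmentation(p_fg, marks, lam=0.0)
```

With `lam=0.0` each pixel is decided on its own, so the ambiguous pixel at index 2 falls to the background and splits the foreground in two. With `lam=1.0` the smoothness term outweighs that pixel's weak preference, and the foreground becomes one contiguous run.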