Instance segmentation may combine the requirements of semantic segmentation and object detection: it may require both pixel-wise semantic labeling, which assigns a class to each pixel, and instance labeling, which differentiates individual objects at the pixel level. Because semantic labeling may be obtained from an existing semantic segmentation approach, most instance segmentation methods focus on the instance labeling problem. This may be achieved by assigning a unique identifier to all of the pixels belonging to an object instance.
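The difference between semantic labeling and instance labeling can be illustrated with a minimal sketch. The function `label_instances` and the toy `semantic` map below are hypothetical names introduced only for illustration. The sketch assumes, as a strong simplification, that each 4-connected region of a single class is exactly one object; that assumption fails when objects touch or occlude one another, which is the situation dedicated instance labeling methods are meant to handle.

```python
from collections import deque

def label_instances(semantic, background=0):
    """Give each 4-connected region of one class a unique positive id.

    Illustrative assumption: one connected region == one object instance.
    """
    h, w = len(semantic), len(semantic[0])
    instance = [[0] * w for _ in range(h)]  # 0 means "no instance"
    next_id = 0
    for r in range(h):
        for c in range(w):
            if semantic[r][c] == background or instance[r][c] != 0:
                continue
            # Unvisited foreground pixel: start a new instance here.
            next_id += 1
            instance[r][c] = next_id
            queue = deque([(r, c)])
            while queue:  # breadth-first flood fill over same-class pixels
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and instance[ny][nx] == 0
                            and semantic[ny][nx] == semantic[r][c]):
                        instance[ny][nx] = next_id
                        queue.append((ny, nx))
    return instance, next_id

# Toy semantic map: 0 = background, 1 and 2 = object classes.
# Two separate objects share class 1, so semantic labels alone
# cannot tell them apart.
semantic = [
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 1, 1],
    [0, 0, 0, 0, 1, 1],
    [2, 2, 0, 0, 0, 0],
]
instance, count = label_instances(semantic)
for row in instance:
    print(row)
print("instances found:", count)
```

In this toy input the two class-1 objects receive different identifiers even though their semantic labels are identical, which is the distinction instance labeling is meant to provide.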
Instance labeling may become more challenging when occlusions occur, or when the number of objects in a cluttered scene varies widely. Techniques for instance segmentation may include proposal-based methods and proposal-free methods. In proposal-based methods, a set of object proposals and their classes is first predicted, and foreground-background segmentation is then performed within each bounding box. In contrast, proposal-free methods do not predict object proposals. Both of these approaches may include two stages: 1) learning a representation (e.g., a feature vector, an energy level, breakpoints, or object boundaries) at the pixel level; and 2) grouping the pixels by applying a clustering algorithm to the learned representation. Additionally, the proposal-free methods may focus on instance labeling and directly use the categorical predictions from semantic segmentation for the semantic labeling.
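The two stages described above, followed by the reuse of semantic predictions, can be sketched as follows. Everything here is an illustrative assumption rather than the method of any particular system: the per-pixel `embeddings` are hand-written stand-ins for a learned representation, `group_pixels` is a simple greedy threshold clustering chosen for brevity, `radius` is a hypothetical tuning parameter, and `classes_by_majority` is one possible way to take an instance's class from the semantic predictions (a majority vote over its pixels).

```python
import math
from collections import Counter

def group_pixels(embeddings, radius):
    """Stage 2 (sketch): greedy clustering of per-pixel embeddings.

    A pixel joins the first cluster whose seed embedding lies within
    `radius`; otherwise it becomes the seed of a new cluster.
    """
    seeds = []
    labels = []
    for emb in embeddings:
        for idx, seed in enumerate(seeds):
            if math.dist(emb, seed) <= radius:
                labels.append(idx + 1)  # instance ids start at 1
                break
        else:
            seeds.append(emb)
            labels.append(len(seeds))
    return labels

def classes_by_majority(instance_labels, semantic_labels):
    """Assign each instance the most common semantic class of its pixels."""
    votes = {}
    for inst, cls in zip(instance_labels, semantic_labels):
        votes.setdefault(inst, Counter())[cls] += 1
    return {inst: counts.most_common(1)[0][0] for inst, counts in votes.items()}

# Stage 1 stand-in: one 2-D embedding per foreground pixel. In practice
# these would come from a learned model; here they are made up so that
# pixels of the same object lie close together.
embeddings = [
    (0.00, 0.10), (0.10, 0.00), (0.05, 0.05),   # object A
    (1.00, 0.90), (0.90, 1.00), (0.95, 1.05),   # object B
]
# Per-pixel classes from a semantic segmentation step, with one
# deliberately noisy prediction in the last pixel.
semantic_labels = [1, 1, 1, 1, 1, 2]

instance_labels = group_pixels(embeddings, radius=0.5)
instance_classes = classes_by_majority(instance_labels, semantic_labels)
print(instance_labels)
print(instance_classes)
```

Greedy seeding depends on pixel order and on the choice of `radius`; a practical system would more likely use an established clustering algorithm, and the sketch is only meant to show how grouping and semantic labeling can be kept as separate steps.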