Superpixel algorithms are a useful and increasingly popular preprocessing step for a wide range of computer vision applications, such as segmentation, image parsing, and classification. Grouping similar pixels into so-called superpixels greatly reduces the number of image primitives. This increases the computational efficiency of subsequent processing steps, permits more complex algorithms that would be computationally infeasible at the pixel level, and creates spatial support for region-based features.
As indicated in X. Ren et al.: “Learning a classification model for segmentation”, IEEE International Conference on Computer Vision (ICCV) 2003, pp. 10-17, superpixels are local, coherent, and preserve most of the structure necessary for segmentation at the scale of interest. As further stipulated in that document, superpixels should be roughly homogeneous in size and shape. Most superpixel approaches target still images and therefore provide little or no temporal consistency when applied to video sequences, but some approaches target video sequences directly. See, for example, O. Veksler et al.: “Superpixels and Supervoxels in an Energy Optimization Framework”, in Computer Vision – ECCV 2010, vol. 6315, K. Daniilidis et al., Eds. Springer Berlin/Heidelberg, 2010, pp. 211-224, or A. Levinshtein et al.: “Spatiotemporal Closure”, in Computer Vision – ACCV 2010, vol. 6492, R. Kimmel et al., Eds. Springer Berlin/Heidelberg, 2011, pp. 369-382. These approaches begin to address the issue of temporal consistency.
One state-of-the-art approach for generating temporally consistent superpixels is detailed in European Patent Application EP 2 680 226 A1. The approach is based on energy-minimizing clustering; that is, it treats the generation of superpixels as a clustering problem.
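To illustrate what treating superpixel generation as an energy-minimizing clustering problem can look like, the following is a minimal sketch for a single grayscale frame. It is a generic k-means-style scheme and not the method of EP 2 680 226 A1. The energy (squared intensity difference plus a weighted squared spatial distance), the function names, and the `spatial_weight` parameter are all assumptions made for illustration.

```python
def pixel_energy(value, y, x, center, spatial_weight):
    """Assumed energy of assigning a pixel to a cluster center (cy, cx, cv)."""
    cy, cx, cv = center
    color_term = (value - cv) ** 2
    spatial_term = (y - cy) ** 2 + (x - cx) ** 2
    return color_term + spatial_weight * spatial_term


def cluster_superpixels(image, centers, spatial_weight=0.5, iterations=5):
    """Alternate between assigning pixels and updating cluster centers.

    image:   list of rows of grayscale values
    centers: initial cluster centers as (y, x, value) triples
    Returns the label map and the final centers.
    """
    height, width = len(image), len(image[0])
    centers = [tuple(float(v) for v in c) for c in centers]
    labels = [[0] * width for _ in range(height)]
    for _ in range(iterations):
        # Assignment step: each pixel joins the center with the lowest energy.
        for y in range(height):
            for x in range(width):
                labels[y][x] = min(
                    range(len(centers)),
                    key=lambda k: pixel_energy(
                        image[y][x], y, x, centers[k], spatial_weight
                    ),
                )
        # Update step: each center moves to the mean of its assigned pixels.
        sums = [[0.0, 0.0, 0.0, 0] for _ in centers]
        for y in range(height):
            for x in range(width):
                acc = sums[labels[y][x]]
                acc[0] += y
                acc[1] += x
                acc[2] += image[y][x]
                acc[3] += 1
        centers = [
            (acc[0] / acc[3], acc[1] / acc[3], acc[2] / acc[3]) if acc[3] else old
            for acc, old in zip(sums, centers)
        ]
    return labels, centers
```

In this sketch a larger `spatial_weight` favors compact, regularly shaped clusters, while a smaller value lets clusters follow intensity boundaries more closely. A practical implementation would typically restrict the search to a local window around each center rather than comparing every pixel with every center.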
European Patent Application EP 2 733 666 A1 describes a further solution for generating temporally consistent superpixels, which includes life-cycle management of the superpixels. A life-span, i.e. a duration, is determined for temporally consistent superpixels. Superpixels that grow too large are split, and superpixels that become too small are terminated. The number of splits and terminations is kept balanced. For this purpose, the development of the area occupied by each superpixel over time is monitored. In addition, a similarity check is introduced for the instances of a temporally consistent superpixel in a sliding window. The similarity between two or more instances of a temporally consistent superpixel within the sliding window is determined. If it is below a certain threshold, the instances of the superpixel in all future frames of the sliding window are replaced by instances of a new temporally consistent superpixel starting at the first future frame.
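The life-cycle rules and the similarity check described above can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the procedure of EP 2 733 666 A1. The area thresholds, the way splits and terminations are balanced (taking equally many of each), the use of histogram intersection as the similarity measure, and the choice to compare the current instance with the last instance in the window are all hypothetical, as are the function and parameter names.

```python
def plan_lifecycle(areas, min_area, max_area):
    """Pick superpixels to split and to terminate, in equal numbers.

    areas: dict mapping superpixel id to its current area in pixels.
    Assumption: balance is kept by pairing the largest oversized
    superpixels with the smallest undersized ones.
    """
    too_large = sorted(
        (i for i, a in areas.items() if a > max_area), key=lambda i: -areas[i]
    )
    too_small = sorted(
        (i for i, a in areas.items() if a < min_area), key=lambda i: areas[i]
    )
    count = min(len(too_large), len(too_small))
    return too_large[:count], too_small[:count]


def histogram_intersection(hist_a, hist_b):
    """Similarity in [0, 1] between two normalized histograms (assumed measure)."""
    return sum(min(a, b) for a, b in zip(hist_a, hist_b))


def replace_if_dissimilar(window_labels, window_hists, current, old_id, new_id,
                          threshold):
    """Relabel future instances of a superpixel whose appearance has drifted.

    window_labels: one flat list of labels per frame in the sliding window
    window_hists:  one normalized histogram of superpixel old_id per frame
    current:       index of the current frame; later indices are future frames
    """
    similarity = histogram_intersection(window_hists[current], window_hists[-1])
    if similarity >= threshold:
        return window_labels
    # Below threshold: a new superpixel starts at the first future frame.
    return [
        [new_id if (t > current and label == old_id) else label for label in frame]
        for t, frame in enumerate(window_labels)
    ]
```

For example, `plan_lifecycle({1: 500, 2: 20, 3: 100}, min_area=30, max_area=400)` selects superpixel 1 for splitting and superpixel 2 for termination, leaving the total number of superpixels unchanged.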