Superpixel algorithms are a useful and increasingly popular preprocessing step for a wide range of computer vision applications, such as segmentation, image parsing, and classification. Grouping similar pixels into so-called superpixels greatly reduces the number of image primitives. This increases the computational efficiency of subsequent processing steps, allows the use of more complex algorithms that would be computationally infeasible at the pixel level, and creates spatial support for region-based features.
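The reduction in image primitives and the notion of a region-based feature can be illustrated with a minimal sketch. It does not implement any of the superpixel algorithms cited in this document: a fixed square grid stands in for a real superpixel method, and the function names `grid_superpixels` and `region_means`, the cell size, and the synthetic image are all assumptions chosen for illustration only.

```python
def grid_superpixels(height, width, cell):
    """Assign each pixel a label by square grid cell of side `cell`.

    Assumption: a regular grid is only a stand-in for a real superpixel
    algorithm, which would group pixels by similarity instead.
    """
    cols = (width + cell - 1) // cell  # number of grid cells per row
    return [[(y // cell) * cols + (x // cell) for x in range(width)]
            for y in range(height)]


def region_means(image, labels):
    """Compute the mean intensity per label, a simple region-based feature."""
    sums, counts = {}, {}
    for image_row, label_row in zip(image, labels):
        for value, label in zip(image_row, label_row):
            sums[label] = sums.get(label, 0.0) + value
            counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


# Synthetic 8x8 grayscale image: left half dark (0), right half bright (100).
image = [[0 if x < 4 else 100 for x in range(8)] for y in range(8)]

labels = grid_superpixels(height=8, width=8, cell=4)
features = region_means(image, labels)

num_pixels = sum(len(row) for row in image)  # primitives at pixel level
num_regions = len(features)                  # primitives at region level
print(f"{num_pixels} pixels reduced to {num_regions} regions")
```

Under these assumptions, a subsequent processing step operates on one feature per region rather than one value per pixel, which is the source of the efficiency gain described above.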
As indicated in X. Ren et al.: “Learning a classification model for segmentation”, IEEE International Conference on Computer Vision (ICCV) 2003, pp. 10-17, superpixels are local and coherent, and preserve most of the structure necessary for segmentation at the scale of interest. As further stipulated in that document, superpixels should be roughly homogeneous in size and shape. Most superpixel approaches target still images and therefore provide little or no temporal consistency when applied to video sequences. Some approaches, however, target video sequences and begin to address the issue of temporal consistency. See, for example, O. Veksler et al.: “Superpixels and Supervoxels in an Energy Optimization Framework”, in Computer Vision—ECCV 2010, vol. 6315, K. Daniilidis et al., Eds. Springer Berlin/Heidelberg, 2010, pp. 211-224, or A. Levinshtein et al.: “Spatiotemporal Closure”, in Computer Vision—ACCV 2010, vol. 6492, R. Kimmel et al., Eds. Springer Berlin/Heidelberg, 2011, pp. 369-382.