Image semantic segmentation is intended to identify the image regions corresponding directly to objects in an image by labeling each pixel in the image to a semantic category. Contrary to the object recognition which merely detects the objects in the image, semantic segmentation assigns a category label to each pixel to indicate an object to which the pixel belongs. As such, semantic segmentation aims to assign a categorical label to every pixel in an image, which plays an important role in image analysis and self-driving systems.
Researchers have developed an array of weakly supervised segmentation algorithms. The main idea is to take a pool of images known to contain the same object category, and exploit the repeated patterns to jointly segment out the foreground per image. On the one hand, this paradigm is attractive for its low manual effort, especially because such weakly labeled images are readily available on the Web via keyword searches. On the other hand, the resulting segmentations are imperfect. As a result, conventional techniques rely on human-provided segmentations, which are accurate but too expensive, or automatic segmentations, which are inexpensive but too inaccurate.