It could be beneficial, when performing image processing on an image of a natural scene, to take into account the locations and boundaries of discrete people, animals, plants, objects, buildings, or other contents of such an image. Such image processing tasks could include filtering of the image data (e.g., to remove noise or artifacts), colorization of an image, determining information about contents of the image (e.g., a depth of each pixel in the image, a semantic category label assigned to each pixel in the image), or some other image processing tasks. For example, when colorizing an image, it could be beneficial to apply a small number of manually-specified color values to the rest of the object in order to generate a colorized version of the image. In another example, it may be beneficial to filter and/or upsample a depth map of an image (e.g., a depth map determined from a stereoscopic image pair, or generated by a depth sensor that is aligned with a camera or other image-generating apparatus). Such image processing tasks could be computationally intensive (e.g., requiring large amounts of time, memory, or other computational resources). It could be beneficial to develop improved methods of performing such image processing tasks.