This description relates to image and data segmentation.
The problem of segmenting data sets, or of identifying objects or groups within them, has applications in a wide range of fields. In the field of automated image processing and computer vision, detecting boundaries between objects and detecting the presence of objects can be difficult problems. For example, the problem of boundary detection has a long history in computer vision. In some approaches in a first era, boundaries were detected using gradient filters. In some approaches in a second era, contextual information was incorporated through optimization of objective functions based on Markov random fields, graph partitioning, and other formalisms. A more recent trend is to use machine learning to improve accuracy by training a computer to emulate human boundary judgments. In a typical approach, a boundary detector is trained by minimizing its pixel-level disagreement with human labelers.
One area of image processing in which boundary or object detection is important is the analysis of images of biological tissue. For instance, recent advances in electron microscopy (EM) have enabled the automated collection of nanoscale images of brain tissue. The resulting image datasets have renewed interest in automated computer algorithms for analyzing EM images. From the neuroscience perspective, development of such algorithms is important for the goal of finding connectomes, which are complete connectivity maps of a brain or piece of brain. To find connectomes, two image analysis problems must be solved. First, each synapse must be identified in the images. Second, the “wires” of the brain, its axons and dendrites, must be traced through the images. If both problems were solved, it would be possible to trace the wires running from every synapse to the cell bodies of its parent neurons, thereby identifying all pairs of neurons connected by synapses. Boundary or object detection in biological tissue is also important in the analysis of medical images formed using computed tomography (CT) and magnetic resonance imaging (MRI) techniques, for example, in locating and determining the extent of a tumor in the imaged tissue.
One approach to these problems in image processing makes use of boundary detection based on local image characteristics. For instance, each pixel in an image is automatically labeled as either boundary or object, and based on the labeled pixels, the image is segmented into objects. The automated labeling of boundary pixels can be based on a parametric technique, in which the parameters are determined in a training procedure that uses a set of training images and their corresponding boundary labelings, and the resulting parameters are then used in processing unknown images. For example, training images are analyzed manually to identify boundary pixels in the images, and the parameters are optimized to minimize the error between predicted and hand-labeled boundary pixels (e.g., using a Hamming distance metric). A measure of accuracy of such approaches can instead use a metric based on the resulting segmentation of the image. For example, the Rand Index provides the fraction of pairs of pixels in an image that are correctly identified as belonging to the same versus different segments or objects in the image.
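The two kinds of measure described above can be illustrated with a minimal sketch in Python using NumPy. The function names `hamming_pixel_error` and `rand_index` are hypothetical and chosen only for illustration, and the sketch assumes that boundary maps are binary arrays and that segmentations are integer label arrays of equal size. It is not presented as the implementation used in any particular system.

```python
import numpy as np


def hamming_pixel_error(predicted, labeled):
    """Count pixels whose boundary/object label differs between two binary maps."""
    predicted = np.asarray(predicted, dtype=bool)
    labeled = np.asarray(labeled, dtype=bool)
    if predicted.shape != labeled.shape:
        raise ValueError("boundary maps must have the same shape")
    return int(np.count_nonzero(predicted != labeled))


def _pairs(counts):
    """Sum of C(c, 2) over group sizes: number of pixel pairs sharing a group."""
    counts = np.asarray(counts, dtype=np.int64)
    return int((counts * (counts - 1) // 2).sum())


def rand_index(seg_a, seg_b):
    """Fraction of pixel pairs on which two segmentations agree.

    A pair counts as an agreement when both segmentations place the two
    pixels in the same segment, or both place them in different segments.
    """
    a = np.asarray(seg_a).ravel()
    b = np.asarray(seg_b).ravel()
    if a.size != b.size:
        raise ValueError("segmentations must have the same number of pixels")
    n = a.size
    if n < 2:
        raise ValueError("at least two pixels are needed to form a pair")

    # Map arbitrary segment labels to consecutive integers 0..k-1.
    _, a_ids = np.unique(a, return_inverse=True)
    _, b_ids = np.unique(b, return_inverse=True)
    n_b = int(b_ids.max()) + 1

    # Contingency counts: pixels falling in each (segment of a, segment of b) cell.
    joint = np.bincount(a_ids * n_b + b_ids)

    total = n * (n - 1) // 2
    same_both = _pairs(joint)            # pairs joined in both segmentations
    same_a = _pairs(np.bincount(a_ids))  # pairs joined in seg_a
    same_b = _pairs(np.bincount(b_ids))  # pairs joined in seg_b

    # Agreements = pairs joined in both + pairs separated in both.
    agreements = total + 2 * same_both - same_a - same_b
    return agreements / total


if __name__ == "__main__":
    hand_labeled = np.array([[1, 1, 2, 2]])
    merged = np.array([[1, 1, 1, 1]])  # the two objects merged into one
    print(rand_index(hand_labeled, merged))
```

In this sketch the Rand Index depends only on how pixels are grouped, not on the label values themselves, so relabeling the segments of either input leaves the result unchanged.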