There are two main approaches to statistical classification of image data, termed supervised and unsupervised. Supervised approaches take examples of spectra provided by a user and identify which example most resembles a pixel in an image dataset. Unsupervised methods break the image into a number of similar clusters, and require a user to assign a posteriori labels to these clusters.
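The distinction between the two approaches can be illustrated with a minimal sketch. The code below is illustrative only and is not taken from any particular software package: the function names, the two-band toy pixels, and the choice of Euclidean distance and plain k-means are all assumptions made for the example.

```python
import math

def distance(a, b):
    """Euclidean distance between two spectra of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def classify_supervised(pixel, examples):
    """Return the label of the user-provided example spectrum closest to the pixel."""
    return min(examples, key=lambda label: distance(pixel, examples[label]))

def cluster_unsupervised(pixels, k, iterations=10):
    """Plain k-means: returns one cluster index per pixel, with no labels attached."""
    centers = [list(p) for p in pixels[:k]]  # naive initialisation (assumption)
    assignment = [0] * len(pixels)
    for _ in range(iterations):
        # Assign every pixel to its nearest cluster center.
        assignment = [min(range(k), key=lambda c: distance(p, centers[c]))
                      for p in pixels]
        # Move each center to the mean of its current members.
        for c in range(k):
            members = [p for p, a in zip(pixels, assignment) if a == c]
            if members:
                centers[c] = [sum(band) / len(members) for band in zip(*members)]
    return assignment

# Hypothetical two-band pixels and user-supplied example spectra.
pixels = [(0.10, 0.50), (0.12, 0.48), (0.60, 0.20), (0.58, 0.22)]
examples = {"vegetation": (0.10, 0.50), "soil": (0.60, 0.20)}

supervised_labels = [classify_supervised(p, examples) for p in pixels]
cluster_ids = cluster_unsupervised(pixels, k=2)
```

In the supervised case each pixel receives a meaningful label directly, whereas the unsupervised result is only a set of cluster indices that a user must still interpret and label after the fact.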
Supervised statistical methods in image processing, however, typically do not support automated methods of collecting pixels for contrasting features in the image. In addition, although existing image processing software calculates and presents probability estimates from supervised methods, typical methods do not use that probability information intelligently, either to extend the collection of samples from a user-specified point or to drive any established iteration logic. As a result, many existing methods and tools are labor intensive or require substantial user experience to be used effectively with large multidimensional datasets, or otherwise do not utilize user experience effectively, much less in a relatively simple manner.
Although existing methods for unsupervised classification generally use iterative methods, they have limited ability to identify specific features and can suffer from one or more of the following disadvantages:
Labor intensive. A posteriori identification of statistical clusters that relate to “the feature(s)” can require tedious manual assessment.
Difficulty dealing with false positives. Without a priori definition of “the feature(s)” of interest in the data, there is a relatively high probability that a clustering algorithm will split “the feature(s)” among a number of clusters that each have unique problems with false positives.
Failure to use relevant statistical measures. Any measure of spectral similarity or probability that results from the clustering will be relative to each of the various cluster centers that are selected, rather than representing a consistent probabilistic relationship to the mean and variance for “the feature(s)” in the various dimensions of measurement in the dataset. Such data is typically of reduced utility and can be incompatible with an automated, iterative process for extracting (or suppressing) particular features (false positives).
A number of methods exist for extracting information from datasets that are not statistical in nature. Such methods are typically based on assessing the geometry of the spectral curve associated with each pixel relative to either a single example (e.g., matched filtering) or multiple examples (e.g., spectral mixture modeling). Simple calculations of the similarity of pixels to a single example spectrum, such as matched filtering, generally do not benefit from the properties of statistical modeling, which provides a framework for variable selection and for weighting spectral ranges that carry more information (signal) over those that are very similar to other targets, carry little information, or introduce excessive noise. The process of selecting wavelength ranges to optimize the performance of methods like matched filtering typically requires detailed manual interaction, technical proficiency, and phenomenological knowledge. Even then, the ability to weight different spectral ranges is generally not present.
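The effect of band weighting can be sketched as follows. This is a simplified illustration, not an implementation of matched filtering itself: cosine similarity is used here as a stand-in for a purely geometric comparison against a single example spectrum, and a per-band variance-weighted squared distance is used as a stand-in for a statistical measure. The spectra, the variances, and the function names are all hypothetical.

```python
import math

def cosine_similarity(pixel, reference):
    """Geometric similarity of two spectra; every band counts equally."""
    dot = sum(p * r for p, r in zip(pixel, reference))
    norms = (math.sqrt(sum(p * p for p in pixel))
             * math.sqrt(sum(r * r for r in reference)))
    return dot / norms

def weighted_sq_distance(pixel, mean, variance):
    """Squared distance to the feature mean, with each band scaled by its variance.

    Bands in which the feature varies little (informative bands) are weighted
    heavily; noisy bands with large variance contribute little.
    """
    return sum((p - m) ** 2 / v for p, m, v in zip(pixel, mean, variance))

# Hypothetical three-band feature: bands 1 and 2 are stable, band 3 is noisy.
feature_mean = (0.20, 0.40, 0.60)
feature_variance = (0.0001, 0.0001, 0.04)

pixel_a = (0.25, 0.40, 0.60)  # deviates in a stable, informative band
pixel_b = (0.20, 0.40, 0.70)  # deviates only in the noisy band

cos_a = cosine_similarity(pixel_a, feature_mean)
cos_b = cosine_similarity(pixel_b, feature_mean)
dist_a = weighted_sq_distance(pixel_a, feature_mean, feature_variance)
dist_b = weighted_sq_distance(pixel_b, feature_mean, feature_variance)
```

With these toy values the geometric score ranks pixel_a as slightly closer to the reference than pixel_b, while the variance-weighted distance ranks pixel_b as far closer, because its only deviation falls in a band where the feature itself is highly variable.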
A further technique called “mixture-tuned matched filtering” gives more sophisticated consideration to the interrelationships of various image components by using overall image statistics to flag possible false positives. However, the aforementioned benefits of a statistical approach are not realized. In spectral mixture modeling methods, the user develops a number of reference spectra, referred to as endmembers, to describe all the significant spectral variation in the image (target and non-target). This approach typically requires intensive interaction and technical knowledge to develop appropriate endmembers. There is an automated form of spectral mixture modeling in the ENVI software package, but it has the same problems that were mentioned above for unsupervised methods when faced with features that are well represented by individual ideal examples, and it does not support any of the efficient manual interaction described herein.
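As a rough illustration of the idea behind spectral mixture modeling, the sketch below estimates how much of a pixel is explained by one endmember under a linear two-endmember model. The linear mixing assumption, the restriction to two endmembers, the sum-to-one constraint, and all names and values are assumptions made for the example; practical tools handle many endmembers and are considerably more involved.

```python
def unmix_two_endmembers(pixel, endmember_a, endmember_b):
    """Estimate the fraction of endmember_a in a pixel.

    Assumes pixel ~= f * endmember_a + (1 - f) * endmember_b, solves for f
    by least squares, and clamps f to the range [0, 1].
    """
    diff = [a - b for a, b in zip(endmember_a, endmember_b)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        raise ValueError("endmembers must differ in at least one band")
    numer = sum((p - b) * d for p, b, d in zip(pixel, endmember_b, diff))
    return max(0.0, min(1.0, numer / denom))

# Hypothetical three-band endmembers for a target and a background component.
target = (0.10, 0.50, 0.40)
background = (0.60, 0.20, 0.30)

# Synthetic pixel built as 70% target and 30% background.
mixed_pixel = tuple(0.7 * t + 0.3 * b for t, b in zip(target, background))
fraction = unmix_two_endmembers(mixed_pixel, target, background)
```

The estimate is only as good as the endmembers: if the target varies substantially from pixel to pixel, a single reference spectrum per component will misattribute that variation to the mixture fractions.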
The aforementioned non-statistical methods have demonstrated difficulty in reliably identifying features that have a substantial amount of inherent variability in measured characteristics. For example, in a dataset comprising a satellite-based index of photosynthetic activity at different times of the year that is sampled over a range of map coordinates (e.g., latitude/longitude), different trees of a given species may be of varying size, blooming at different times, diseased, damaged, or stressed, and may therefore produce quite different measurements.
One example of a semi-automated feature extraction method that combines a user-specified spectral curve with an unsupervised method of explaining residual image variability is described in Boardman and Kruse, 1994, Automated spectral analysis: a geological example using AVIRIS data, north Grapevine Mountains, Nev., in ERIM Tenth Thematic Conference on Geologic Remote Sensing (Environmental Research Institute of Michigan, Ann Arbor, Mich.) pp. I-407 to I-418, incorporated by reference herein in its entirety. The method they describe, however, is based on the non-statistical method of spectral mixture modeling, so it typically does not perform well with features (user-selected or computer-selected) that have substantial variability in measurements or where there are undesired image components that have similar measurements to the feature of interest. Their method does not allow variable selection from the set of measurements or weighting of variables to maximize statistical differences. Once a result is calculated, that method has no logic to refine the result, either automatically or based on user feedback. It also generally provides no information on a data transformation that would convert measurements directly into an estimate of the probability of purity of the feature in an image.
Another recent approach in the general area of spectral feature recognition is the VIPER-tools product developed by Dar Roberts and others. Like the Boardman and Kruse method above, VIPER-tools uses a mixture modeling approach that measures the degree to which a pixel matches individual reference spectra. However, VIPER-tools does attempt to characterize the potential heterogeneity of target features by providing tools to collect a number of characteristic spectra for the image. The method presented herein is substantially different from VIPER-tools in its ease of use, effective utilization of per-pixel probability information, and iterative framework. VIPER-tools presents the user with complex abstract plots to describe the variation in images prior to classification; the user is required to review an extensive list of tabular feedback to help select representative spectra; there is no framework for variable selection to select the best wavelengths; and the approach assumes substantial technical expertise. Though VIPER-tools is based on multiple endmember spectral mixture modeling, which is designed to deal with variable mixtures of image components, as of this writing there is no straightforward way to get a result that simply presents the proportion of a given image component independent of all other features in the image (this limitation was confirmed by Kerry Halligan, one of the VIPER-tools developers).