Multispectral and hyperspectral images contain more data than can practically be analyzed manually. These data include multiple spectral bands that are not readily visualized or assessed. Conventional multispectral sensors provide only a few spectral bands of imagery, nominally covering pre-specified portions of the visible to near-infrared spectrum. Conventional hyperspectral sensors may cover hundreds of spectral bands spanning a pre-specified portion of the electromagnetic spectrum. Thus, hyperspectral sensors may provide greater spectral discrimination than multispectral sensors and allow non-literal processing of data to detect and classify material content as well as structure.
An image may be represented mathematically as a matrix of m rows and n columns of elements. An element of such a matrix defining a two-dimensional (2-D) image is termed a picture element, or pixel. An image is usable when a viewer is able to partition the image into a number of recognizable regions that correspond to known features, such as trees, lakes, and man-made objects. Once this level of imaging is attained, each distinct feature and object may be identified since each is represented by an essentially uniform field. The process that generates such uniform fields is known as segmentation.
Many techniques have been used to segment images. Segmentation may be class-interval based, edge-based, or region-based.
For 8-bit precision data, a given image may assume pixel (element) values from a minimum of zero to a maximum of 255. By mapping into one category those pixels whose intensity values are within a certain range or class interval, e.g., 0-20, a simple threshold method may be used to segment.
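The class-interval threshold method described above may be sketched as follows. This is a minimal illustration only; the function name, the sample image, and every interval other than 0-20 are assumptions introduced for the example.

```python
def class_interval_segment(image, intervals):
    """Label each 8-bit pixel with the index of the class interval containing it.

    `intervals` is a list of inclusive (low, high) ranges. Pixels that fall
    outside every interval receive the label -1.
    """
    labels = []
    for row in image:
        label_row = []
        for value in row:
            label = -1
            for index, (low, high) in enumerate(intervals):
                if low <= value <= high:
                    label = index
                    break
            label_row.append(label)
        labels.append(label_row)
    return labels


# Hypothetical 3x4 image with 8-bit values (0-255).
image = [
    [5, 12, 200, 210],
    [18, 20, 190, 255],
    [90, 100, 110, 21],
]
# Illustrative class intervals; only 0-20 is taken from the text.
intervals = [(0, 20), (21, 127), (128, 255)]
segmented = class_interval_segment(image, intervals)
print(segmented)
```

Each distinct label in the result corresponds to one category, so pixels with intensities of 0-20 form a single uniform field regardless of their exact values.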
An edge may be defined by observing the difference between adjacent pixels. Edge-based segmentation generates an edge map, linking the edge pixels to form a closed contour. In conventional edge-based segmentation, well-defined mathematical formulae are used to define an edge. After edges are extracted, another set of mathematical rules may be used to join, eliminate, or both join and eliminate edges, thus generating a closed contour around a uniform region. That is, the scene itself is not used to define an edge even though, globally, an edge may be defined by the scene.
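The first step of edge-based segmentation, observing the difference between adjacent pixels, may be sketched as follows. The fixed threshold and the choice of right and lower neighbors are assumptions made for illustration; the sketch produces only an edge map and does not perform the subsequent linking of edge pixels into a closed contour.

```python
def edge_map(image, threshold):
    """Mark a pixel as an edge (1) when it differs from its right or lower
    neighbor by more than `threshold`; otherwise mark it 0."""
    rows, cols = len(image), len(image[0])
    edges = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Differences are taken as 0 where no neighbor exists.
            right = abs(image[r][c] - image[r][c + 1]) if c + 1 < cols else 0
            down = abs(image[r][c] - image[r + 1][c]) if r + 1 < rows else 0
            if max(right, down) > threshold:
                edges[r][c] = 1
    return edges


# Hypothetical image: a dark field on the left, a bright field on the right.
image = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
]
edges = edge_map(image, threshold=50)
print(edges)
```

The rule that defines an edge here is a fixed formula applied identically everywhere, which illustrates the point made above: the scene itself is not used to define the edge.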
Region-based segmentation is the antithesis of edge-based segmentation. It begins at the interior of a potentially uniform field rather than at its outer boundary. It may be initiated with any two interior adjacent pixels. One or more rules, such as a Markov Random Field (MRF) approach, are used to decide whether merging of these two candidates should occur. In general, conventional region-based segmentation is performed on an image within but a single spectral band, follows well-defined mathematical decision rules, is computationally intensive, and thus expensive, and is not self-determining or self-calibrating.
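Region-based merging from an interior pixel may be sketched as follows. The merge rule used here, a simple intensity tolerance between adjacent pixels, is a deliberately simple stand-in chosen for illustration; it is not a Markov Random Field approach, and the function name and sample data are assumptions.

```python
from collections import deque


def grow_region(image, seed, tolerance):
    """Grow a region from `seed` (row, col), merging 4-connected neighbors whose
    value is within `tolerance` of the pixel from which they are reached."""
    rows, cols = len(image), len(image[0])
    region = {seed}
    frontier = deque([seed])
    while frontier:
        r, c = frontier.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in region:
                if abs(image[nr][nc] - image[r][c]) <= tolerance:
                    region.add((nr, nc))
                    frontier.append((nr, nc))
    return region


# Hypothetical single-band image with one dark field and two other fields.
image = [
    [10, 12, 200],
    [11, 13, 205],
    [90, 14, 210],
]
region = grow_region(image, seed=(0, 0), tolerance=5)
print(sorted(region))
```

Consistent with the limitations noted above, the sketch operates on a single spectral band and relies on a tolerance that must be supplied by the user rather than determined from the scene.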
Color-based segmentation requires input of three spectrally distinct bands or colors. A true color video image may be generated from a scene taken by three bands of blue, green and red. They may be combined into a composite image using individual filters of the same three colors. The resultant color image may be considered a segmented image because each color may represent a uniform field.
If a region or an edge may be generated from the content of the scene, it should be possible to integrate both region-based and edge-based segmentation methods into a single, integrated process. The process by which a segment, or region, is matched with a rule set, or model, is termed identification.
Identification occurs after segmentation. It results in labeling structure using commonly-accepted names, such as river, forest or automobile. While identification may be achieved in a number of ways, such as statistical discriminant functions and rule-based and model-based matching, all require extracting representative features as an intermediate step. Extracted features may be spectral reflectance-based, texture-based, or shape-based.
Statistical pattern recognition exploits standard multivariate statistical methods. Rule-based recognition schemes use conventional artificial intelligence (AI). Shape analysis employs a model-based approach that requires extraction of features from the boundary contour or a set of depth contours. Sophisticated features that may be extracted include Fourier descriptors and moments. Structure is identified when a match is found between observed structure and a calibration sample. A set of calibration samples constitutes a calibration library. A conventional library is both feature and full-shape based.
Feature extraction utilizes a few, but effective, representative attributes to characterize structure. While it capitalizes on economy of computation, it may select incorrect features and apply incomplete information sets in the recognition process. A full-shape model assumes that structure is not contaminated by noise, obscured by ground clutter, or both. In general, this assumption does not correspond to the operation of actual sensors.
Depth contours match three-dimensional (3-D) structure generated from a sensor with 3-D models generated from wire frames. In general, all actual images are 3-D because the intensity values of the image constitute the third dimension, although all are not created equal. For example, a LADAR image has a well-defined third dimension and a general spectral-based image does not. However, most objective discrimination comes from the boundary contour, not the depth contour.
Detection, classification (segmentation), and identification techniques applied to hyperspectral imagery are inherently either full-pixel or mixed-pixel techniques in which each pixel vector in the image records the spectral information. Full-pixel techniques operate on the assumption that each pixel vector measures the response of one predominant underlying material, or signal, at each site in a scene. However, the underlying assumption for mixed-pixel techniques is that each pixel vector measures the response of multiple underlying materials, or signals, at each site. In actuality, an image may be represented best by a combination of the two. Although some sites represent a single material, others are mixtures of multiple materials. Rand, Robert S. and Daniel M. Keenan, A Spectral Mixture Process Conditioned by Gibbs-Based Partitioning, IEEE Transactions on Geoscience and Remote Sensing, Vol. 39, No. 7, pp. 1421-1434, July 2001.
The simplest full-pixel technique involves spectral matching. Spectra of interest in an image are matched to training spectra obtained from a library or the image itself. Metrics for determining the degree of match include Euclidean distance, derivative difference, and spectral angle. If the relative number of mixed pixels in a scene is significant, then spectral matching of this type is not employed. Class label assignments generated by spectral matching algorithms are not affected by spatial neighborhoods; however, consistency of class labels in localized spatial neighborhoods, termed “spatial localization,” is important in mapping applications.
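Spectral matching against a library with two of the named metrics may be sketched as follows. The library entries, their reflectance values, and the function names are hypothetical and serve only to illustrate the matching step.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two spectra."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def spectral_angle(a, b):
    """Angle in radians between two spectra treated as vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, dot / (norm_a * norm_b))))


def best_match(pixel, library, metric):
    """Return the library name whose spectrum is closest under `metric`."""
    return min(library, key=lambda name: metric(pixel, library[name]))


# Hypothetical four-band training spectra (three visible bands, one near infrared).
library = {
    "grass": [0.05, 0.08, 0.04, 0.50],
    "soil": [0.10, 0.15, 0.20, 0.30],
}
# A shaded grass pixel: the grass spectrum at 40% brightness.
pixel = [0.4 * v for v in library["grass"]]

by_angle = best_match(pixel, library, spectral_angle)
by_distance = best_match(pixel, library, euclidean_distance)
print(by_angle, by_distance)
```

In this example the spectral angle, which ignores overall brightness, matches the pixel to grass, whereas Euclidean distance matches it to soil because the dimmed spectrum lies numerically closer to the soil entry. Neither metric consults neighboring pixels.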
Other full-pixel methods include various supervised and unsupervised segmentation techniques. These are based on statistical and pattern recognition methods normally applied to multispectral image processing. The training is also done using data from libraries or the scene imagery itself. Specific techniques include: statistical linear discrimination, e.g., Fisher's linear discriminant; quadratic multivariate classifiers, e.g., Mahalanobis and Bayesian maximum likelihood (ML) classifiers; and neural networks.
The quadratic methods require low-dimensional pixel vectors, and thus are preceded by a data reduction operation to reduce the number of spectral bands addressed. Effective neural networks, such as the multilayer feedforward neural network (MLFN), may be built to model quadratic and higher order decision surfaces without data reduction. Although the MLFN may be trained to identify materials perturbed by a limited amount of mixing, usually it does not include any spatial localization in the decision process.
The most common unsupervised algorithms for clustering imagery are k-means and ISODATA, in which the metric used in determining cluster membership is Euclidean distance. Euclidean distance is overly sensitive to intensity changes and therefore does not provide an adequate response when fine structure or shapes are present in high-resolution spectra. Additionally, these methods do not include any spatial localization in the clustering operation.
Spectral mixture analysis (SMA) techniques used with mixed-pixel approaches address some of the shortcomings of full-pixel techniques. SMA employs linear statistical modeling, signal processing techniques, or both. SMA techniques are governed by the relationship:

$$X_s = H\beta_s + \eta_s \qquad (1)$$

where:
$X_s$ = observed reflected energy from site $s$
$\beta_s$ = modeling parameter vector associated with mixture proportions at site $s$
$\eta_s$ = random variable for model error at site $s$
$H$ = the matrix containing the spectra of pure materials of interest
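Relationship (1) may be illustrated with a worked example in which the columns of H are two pure-material spectra and the mixture proportions are recovered by ordinary least squares. The restriction to two materials, the use of unconstrained least squares (no sum-to-one or non-negativity constraint), and all spectra are assumptions made to keep the sketch short.

```python
def unmix_two_endmembers(x, h1, h2):
    """Least-squares estimate of (beta1, beta2) in x = beta1*h1 + beta2*h2 + error.

    Solves the 2x2 normal equations (H^T H) beta = H^T x by Cramer's rule.
    """
    a11 = sum(v * v for v in h1)
    a12 = sum(u * v for u, v in zip(h1, h2))
    a22 = sum(v * v for v in h2)
    b1 = sum(u * v for u, v in zip(h1, x))
    b2 = sum(u * v for u, v in zip(h2, x))
    det = a11 * a22 - a12 * a12
    if det == 0:
        raise ValueError("endmember spectra are linearly dependent")
    return ((a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)


# Hypothetical pure-material spectra forming the columns of H.
h_grass = [0.05, 0.08, 0.04, 0.50]
h_soil = [0.10, 0.15, 0.20, 0.30]

# Noise-free mixed pixel: 70% grass and 30% soil.
x = [0.7 * g + 0.3 * s for g, s in zip(h_grass, h_soil)]
beta = unmix_two_endmembers(x, h_grass, h_soil)
print(beta)
```

With noise-free data the estimate recovers the proportions 0.7 and 0.3 to within rounding error. With actual data the error term is non-zero and the estimate depends on how well the fixed spectra in H represent the materials in the scene.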
The matrix, H, is presumed known and fixed, although for most actual materials there exists no single fixed spectral signature to represent each pure material.
The basic SMA may be modified to partition the H matrix into desired and undesired signatures. Subspace projections orthogonal or oblique to the undesired signatures and noise components are computed. Orthogonal subspace projection (OSP) is applied to hyperspectral imagery to suppress undesirable signatures and detect signatures of interest. This is shown in the relationship:

$$H = [D, U] \qquad (2)$$

where:
$D$ = the matrix of known spectra for a target of interest
$U$ = the matrix of undesired, but known, spectra

The matrix, U, may be unknown if D is a minor component of the scene.
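The partition in relationship (2) may be illustrated for the simplest case, in which D and U each hold a single spectrum. Under that assumption the orthogonal projector reduces to P = I − u uᵀ/(uᵀu). The normalization of the detector output to an abundance estimate, the function names, and the spectra are likewise assumptions made for this sketch.

```python
def dot(a, b):
    """Inner product of two equal-length spectra."""
    return sum(x * y for x, y in zip(a, b))


def project_out(x, u):
    """Apply P = I - u u^T / (u^T u) to x, removing the component along u."""
    scale = dot(u, x) / dot(u, u)
    return [xi - scale * ui for xi, ui in zip(x, u)]


def osp_abundance(x, d, u):
    """Return d^T P x / (d^T P d), the target abundance after suppressing u.

    Because P is symmetric, d^T P x equals (P d)^T x.
    """
    projected_target = project_out(d, u)
    return dot(projected_target, x) / dot(projected_target, d)


# Hypothetical spectra: d is the target of interest, u the undesired signature.
d = [0.30, 0.25, 0.20, 0.10]
u = [0.05, 0.08, 0.04, 0.50]

# Pixel dominated by the undesired signature with a minor target component.
x = [0.2 * di + 0.8 * ui for di, ui in zip(d, u)]
abundance = osp_abundance(x, d, u)
print(abundance)
```

The projection removes the undesired signature entirely, so the detector responds only to the 20% target contribution even though that target is a minor component of the pixel.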
The above modifications are best suited to targeting applications rather than mapping.
From the early days of multi-spectral remote sensing to the present, earth scientists have been thoroughly and meticulously measuring the wavelengths and intensity of visible and near-infrared light reflected by the land surface back up into space. In some instances, they have used either a “Vegetation Index” (VI) or a “Normalized Difference Vegetation Index” (NDVI) to quantify the concentrations of green leaf vegetation around the planet. Weier, J. and D. Herring, Measuring Vegetation, NDVI and EVI, Earth Observatory, September 1999. These indices may be described mathematically as:
$$VI = \frac{b_1}{b_2} \qquad (3)$$

and NDVI as:

$$NDVI = \frac{b_1 - b_2}{b_1 + b_2} \qquad (4)$$
where
VI is the Vegetation Index,
NDVI is the Normalized Difference Vegetation Index,
b1 is a near infrared spectral band, and
b2 is a visible spectral band.
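Equations (3) and (4) may be evaluated as in the following sketch. The reflectance values are hypothetical, chosen to resemble healthy vegetation (high near infrared, low visible reflectance).

```python
def vegetation_index(b1, b2):
    """Equation (3): ratio of the near infrared band to the visible band."""
    return b1 / b2


def ndvi(b1, b2):
    """Equation (4): difference of the two bands divided by their sum."""
    return (b1 - b2) / (b1 + b2)


# Hypothetical reflectances for one pixel.
near_infrared = 0.50
visible = 0.08

vi_value = vegetation_index(near_infrared, visible)
ndvi_value = ndvi(near_infrared, visible)
print(vi_value, ndvi_value)
```

The two indices carry the same information in different ranges, since NDVI = (VI − 1)/(VI + 1); for non-negative reflectances NDVI is bounded between −1 and 1 while VI is unbounded.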
VI and NDVI involve mathematical operations of a combination of two or more bands aimed at enhancing vegetation features. VI and NDVI yield estimates of vegetation health, provide a means of monitoring changes in vegetation relative to biomass and color, and serve as indicators of drought, climate change, precipitation, and the like. Kidwell, K. B., Global Vegetation Index User's Guide, U.S. Department of Commerce/National Oceanic and Atmospheric Administration, July 1997 (estimate of health); Boone, R., K. Galvin, N. Smith, and S. Lynn, Generalizing El Nino Effects upon Maasai Livestock Using Hierarchical Clusters of Vegetation Patterns, Photogrammetric Engineering & Remote Sensing, Vol. 66(6): pages 737-744, June 2000 (monitoring changes); Kassa, A., Drought Risk Monitoring for the Sudan, Master of Science Dissertation, University College, London, UK, August 1999 (indicator of climate).
The usefulness of the VI and NDVI is well documented and it is clear that these techniques have contributed substantial information to vegetation studies and other investigations using remote sensing. VI and NDVI have been suggested as means for identifying features other than vegetation, but these suggestions have not been aggressively investigated. Deer, P. J., Digital Detection Techniques: Civilian and Military Applications, International Symposium on Spectral Sensing Research, Melbourne, Australia, November, 1995.
Analysis of all possible ratio combinations in hyperspectral data approaches mathematical chaos. Thus, it was postulated that other difference-sum band ratios may provide unexpected relationships yielding useful information about terrain features, both in multi-spectral and hyperspectral data. Embodiments of the present invention address reducing the number of ratio combinations to identify multiple object classes, not just vegetation.
Advances in hyperspectral sensor technology provide high quality data for the accurate generation of terrain categorization/classification (TERCAT) maps. The generation of TERCAT maps from hyperspectral imagery can be accomplished using a variety of spectral pattern analysis algorithms; however, the algorithms are sometimes complex, and the training of such algorithms can be tedious. Further, hyperspectral imagery implies large data files, and contiguous spectral bands are highly correlated. This correlation implies redundancy in classification/feature extraction computations.
The use of wavelets to generate a set of “Generalized Difference Feature Indices” (GDFI) transforms a hyperspectral image cube into a derived set of GDFI bands. Each index is a “derived band” that is a generalized ratio of the originally available bands. Thus, select embodiments of the present invention generate a set of derived bands. For example, an index may be generated with a Daubechies wavelet with two (2), four (4), eight (8) or more “vanishing moments.”
Vanishing moments, filter and smoothing coefficients, and low and high frequency coefficients are all related to the order of the wavelet, and these terms are sometimes used interchangeably, e.g., a wavelet of order four (4) may be referred to as “a wavelet with four (4) vanishing moments.” A wavelet with x vanishing moments, i.e., order x, is one whose first x moments, starting from the zeroth moment, are equal to zero (0). Such a wavelet suppresses signals that are polynomials of degree less than or equal to x−1.
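The relationship between vanishing moments and polynomial suppression may be checked numerically as in the following sketch. It uses the standard two-coefficient (Haar) and four-coefficient Daubechies low-pass filters and derives each high-pass filter by the usual quadrature-mirror rule. The function names and the test ramp are assumptions made for illustration.

```python
import math


def high_pass(low_pass):
    """Quadrature mirror filter: g[k] = (-1)^k * h[N-1-k]."""
    n = len(low_pass)
    return [(-1) ** k * low_pass[n - 1 - k] for k in range(n)]


def vanishing_moments(g, tol=1e-12):
    """Count consecutive discrete moments sum_k k^p g[k] that vanish, from p = 0."""
    p = 0
    while p < len(g) and abs(sum((k ** p) * c for k, c in enumerate(g))) < tol:
        p += 1
    return p


s2, s3 = math.sqrt(2), math.sqrt(3)
haar_low = [1 / s2, 1 / s2]
daub4_low = [(1 + s3) / (4 * s2), (3 + s3) / (4 * s2),
             (3 - s3) / (4 * s2), (1 - s3) / (4 * s2)]

haar_moments = vanishing_moments(high_pass(haar_low))
daub4_moments = vanishing_moments(high_pass(daub4_low))

# A degree-1 polynomial (ramp) is annihilated by the four-coefficient filter.
ramp = [2 + 3 * n for n in range(8)]
g4 = high_pass(daub4_low)
responses = [sum(c * ramp[n + k] for k, c in enumerate(g4))
             for n in range(len(ramp) - len(g4) + 1)]
print(haar_moments, daub4_moments, responses)
```

Under this counting the two-coefficient filter has one vanishing moment and suppresses constants, while the four-coefficient filter has two and also suppresses linear trends.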
The number of filter coefficients is chosen when the order of the wavelet is established. For example, in research into data mining for select embodiments of the present invention, the initial effort started with [Daubechies 2, lag 3], i.e., the Haar wavelet, and increased both Daubechies order and lag to perform efficient data mining. The collection of these derived bands becomes the indices for the specific feature of interest. For example, if a difference-sum ratio computed using a Daubechies wavelet with one vanishing moment, i.e., two (2) filtering coefficients, and a lag of 3 “identifies” or “highlights” roads, and roads may also be identified using two vanishing moments, i.e., four (4) filtering coefficients, and a lag of 5, then the indices for roads may be described as [Daubechies 2, lag 3] and [Daubechies 4, lag 5].
A commonly known special case of a GDFI is a Limited Difference Feature Index (LDFI) approach as described above for the Normalized Difference Vegetation Index (NDVI). Numerous other limited band-ratio indices readily identifying individual specific scene features are LDFIs, i.e., single purpose special cases of the GDFI. Generating a set of GDFI bands is fast and simple. However, there are a large number of possible bands and only a few “generalized ratios (indices)” prove useful. Judicious data mining of the large set of GDFI bands produces a small subset of GDFI bands suitable to identify specific TERCAT features.
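One way a wavelet-based generalized ratio could reduce to an NDVI-like form is sketched below. The background does not state the GDFI formula, so the definition used here is an assumption made purely for illustration: the high-pass filter response divided by the low-pass filter response, taken over bands spaced `lag` apart. The function names and the spectrum are likewise hypothetical.

```python
import math

# Two-coefficient (Haar) low-pass and high-pass filters.
HAAR_LOW = [1 / math.sqrt(2), 1 / math.sqrt(2)]
HAAR_HIGH = [1 / math.sqrt(2), -1 / math.sqrt(2)]


def difference_sum_index(spectrum, start, lag, low_pass, high_pass):
    """Assumed generalized ratio: high-pass response over low-pass response,
    evaluated on bands start, start + lag, start + 2*lag, ..."""
    bands = [spectrum[start + k * lag] for k in range(len(low_pass))]
    numerator = sum(g * b for g, b in zip(high_pass, bands))
    denominator = sum(h * b for h, b in zip(low_pass, bands))
    return numerator / denominator


def all_indices(spectrum, lag, low_pass, high_pass):
    """One derived band value per admissible starting band."""
    span = (len(low_pass) - 1) * lag
    return [difference_sum_index(spectrum, s, lag, low_pass, high_pass)
            for s in range(len(spectrum) - span)]


# Hypothetical six-band pixel spectrum ordered by wavelength.
spectrum = [0.04, 0.06, 0.08, 0.30, 0.50, 0.52]
derived = all_indices(spectrum, lag=3, low_pass=HAAR_LOW, high_pass=HAAR_HIGH)
print(derived)
```

With the two-coefficient filter the common factor cancels, and each derived value is (b_i − b_{i+lag})/(b_i + b_{i+lag}), the same difference-sum form as NDVI up to sign and choice of bands. Longer filters would combine more than two bands per index under the same assumed definition.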
In select embodiments of the present invention, a wavelet-based difference-sum band ratio method reduces the computation cost of classification and feature extraction (identification) tasks. A Generalized Difference Feature Index (GDFI), computed using wavelets such as Daubechies wavelets, is employed in an embodiment of the present invention as a method to automatically generate a large sequence of generalized band ratio images. Other wavelets, such as Vaidyanathan, Coiflet, Beylkin, and Symmlet and the like may be employed in select embodiments of the present invention. Selection of the optimum wavelet is important for computational efficiency. Simental, E., and T. Evans, Wavelet De-noising of Hyperspectral Data, International Symposium for Spectral Sensing Research, San Diego, Calif., June 1997.
A description of a method for analyzing a signal by wavelets is provided in U.S. Pat. No. 5,124,930, Method for Analyzing a Signal by Wavelets, to Nicolas et al., Jun. 23, 1992, incorporated herein by reference.
The classification and feature extraction performance of a band ratio method of the present invention was comparable to results obtained with the same data sets using much more sophisticated methods such as discriminants and neural net classification and endmember Gibbs-based partitioning. Rand, R. S., and E. H. Bosch, The Effect of Wavelet-based Dimension Reduction on Neural Network Classification and Subpixel Targeting Algorithms, SPIE Defense and Security Symposium, Orlando, Fla., April 2004 (discriminants and neural net). Rand and Keenan (2001) (endmember partitioning). The performance of an embodiment of the present invention was comparable to results obtained from a similar data set using genetic algorithms. Simental, E., D. Ragsdale, E. Bosch, R. Dodge Jr., and R. Pazak, Hyperspectral Dimension Reduction and Elevation Data for Supervised Image Classification, American Society for Photogrammetry and Remote Sensing Conference, Anchorage, Ak., May 2003.
Select embodiments of the present invention extract (identify) features from hyperspectral imagery rapidly and reliably, inexpensively permitting ready identification of pre-specified features.