For many years, linear mixing, for example as described in J. W. Boardman, “Automating spectral unmixing of AVIRIS data using convex geometry concepts,” in Proc. 4th Annu. JPL Airborne Geosciences Workshop, vol. 1, Pasadena, Calif., 1993, JPL Pub. 93-26, pp. 11-4. and “best band” combinations have been de facto standards in the analysis of spectral data, especially hyperspectral. The “best band” approach, for example as described in R. N. Clark, G. A. Swayze, C. Koch, A. Gallagher, and C. Ager, “Mapping vegetation types with the multiple spectral feature mapping algorithm in both emission and absorption,” in Summaries of the 3rd Annual JPL Airborne Geosciences Workshop, vol. 1, 1992, AVIRIS Workshop, JPL Pub. 92-14, pp. 60-62, relies on the presence of narrowband features that may be characteristic of a particular category of interest or on known physical characteristics of broad classes of data, e.g., vegetation indices, as described in “Designing optimal spectral indexes for remote sensing applications,” M. M. Verstraete and B. Pinty, IEEE Trans. Geosci. Remote Sens., vol. 34, no. 5, pp. 1254-1264, September 1996. On the other hand, the underlying assumption of linear mixing is that each pixel in a scene may be decomposed into a finite number of constituent endmembers, which represent the purest pixels in the scene. A number of algorithms have been developed and have become standards; these include the pixel purity index (PPI), described in “Mapping target signatures via partial unmixing of AVIRIS data,” J. W. Boardman, F. A. Kruse, and R. O. Green, Summaries 5th JPL Airborne Earth Science Workshop, vol. 1, 1995, JPL Pub. 95-1, pp. 23-26, ORASIS, described in J. Bowles, P. Palmadesso, J. Antoniades, M. Baumback, and L. J. Rickard, “Use of filter vectors in hyperspectral data analysis,” Proc. SPIE, vol. 2553, pp. 148-157, 1995., N-Finder, described in “Fast autonomous spectral endmember determination in hyperspectral data,” M. E. Winter, Proc. 13th Int. Conf. Applied Geologic Remote Sensing, vol. 11, Vancouver, BC, Canada, 1999, pp. 337-344, and iterative spectral unmixing, described in “Iterative spectral unmixing,” F. Van Der Meer, Int. J. Remote Sens., vol. 20, no. 17, pp. 3431-3436, 1999.
Although the use of endmembers and indexes based on narrowband features have yielded very useful results, these approaches largely ignore the inherent nonlinear characteristics of hyperspectral data. There are multiple sources of nonlinearity. One of the more significant sources, especially in land-cover classification applications, stems from the nonlinear nature of scattering as described in the bidirectional reflectance distribution function (BRDF) (S. R. Sandmeier, E. M. Middleton, D. W. Deering, and W. Qin, “The potential of hyperspectral bidirectional reflectance distribution function data for grass canopy characterization,” J. Geophys. Res., vol. 104, no. D8, pp. 9547-9560, April 1999.). In land-cover applications, BRDF effects lead to variations in the spectral reflectance of a particular category as a function of position in the landscape, depending on the local geometry. Factors that play a role in determining BRDF effects include the optical characteristics of the canopy, canopy gap function, leaf area index (LAI), and leaf angle distribution (LAD), described in “The potential of hyperspectral bidirectional reflectance distribution function data for grass canopy characterization,” S. R. Sandmeier, E. M. Middleton, D. W. Deering, and W. Qin, J. Geophys. Res., vol. 104, no. D8, pp. 9547-9560, April 1999, in which it has also been observed that wavelengths with the smallest reflectance exhibit the largest nonlinear variations. Another source of nonlinearity, especially in coastal environments such as coastal wetlands, arises from the variable presence of water in pixels as a function of position in the landscape. Water is an inherently nonlinear attenuating medium. Other effects that contribute to nonlinearities include multiple scattering within a pixel and the heterogeneity of subpixel constituents (D. A. Roberts, M. O. Smith, and J. B. Adams, “Green vegetation, non-photosynthetic vegetation, and soils in AVIRIS data,” Remote Sens. Environ., vol. 44, pp. 255-269, 1993.). In some instances, nonlinear interactions have been modeled explicitly (K. Guilfoyle, M. L. Althouse, and C.-I. Chang, “Further investigations into the use of linear and nonlinear mixing models for hyperspectral image analysis,” Proc. SPIE, vol. 4725, pp. 157-167, 2002). Recently, a number of papers that address the problem of modeling nonlinear data structure have appeared in the statistical pattern recognition literature. These approaches represent an attempt to derive a coordinate system that resides on (parameterizes) the nonlinear data manifold itself and are all data-driven algorithms, not physical or phenomenological models. Nevertheless, they are a very powerful new class of algorithms that can be brought to bear on many high-dimensional applications that exhibit nonlinear structure, e.g., the analysis of remote sensing imagery (T. L. Ainsworth and J. S. Lee, “Optimal image classification employing optimal polarimetric variables,” in Proc. IGARSS, vol. 2, Toulouse, France, 2003, pp. 696-698). One of the first of these approaches to appear—isometric mapping (ISOMAP), described in “A global geometric framework for nonlinear dimensionality reduction,” J. B. Tenenbaum, V. de Silva, and J. C. Langford, Science, vol. 290, pp. 2319-2323, December 2000, (“Tenenbaum et al.”), incorporated herein by reference determines a globally optimal coordinate system for the nonlinear data manifold, but it is only practical for small datasets because the dominant computation is based on a determination of all pairwise distances and minimal path distances between all points, which scales as N3. Equivalently, we will subsequently say that the scaling is “of order N3” which we will denote by the symbol O(N3) here and in similar expressions throughout this patent application. Here N is the number of data samples (in our case, the number of pixels in a hyperspectral scene or subset). A subsequent paper, “The ISOMAP algorithm and topological stability,” M. Balasubramanian, E. L. Schwartz, J. B. Tenenbaum, V. de Silva, and J. C. Langford, Science, vol. 295, no. 5552, p. 7a, January 2002 (“The ISOMAP algorithm”), showed that the computational scaling can be improved to O(N2 log(N)), using Dijkstra's algorithm (R. Sedgewick, Algorithms in C++, Part 5: Graph Algorithms. Reading, Mass.: Addison-Wesley, 2002) incorporated herein by reference. Another significant computational challenge is that ISOMAP memory requirements scale as O(N2), because ISOMAP requires the extraction of dominant eigenvectors of an N×N geodesic distance matrix.
An alternative approach, local linear embedding (LLE) (S. T. Roweis and L. K. Saul, “Nonlinear dimensionality reduction by locally linear embedding,” Science, vol. 290, no. 5500, pp. 2323-2326, December 2000) describes the manifold by modeling each data point as a linear combination of its neighbors; this approach exploits the fundamental property that a manifold is locally linear. Like ISOMAP, it defines a neighborhood in terms of an absolute distance scale, or in terms of number of neighbors, where linearity is expected to be true. An embedding is determined by noting that the same geometric properties of neighborhood reconstruction should apply equally well to an embedded lower dimensional description up to an affine transformation (translation, rotation, and resealing). Like the Dijkstra implementation of ISOMAP, the largest computational operations in LLE also scale as O(N2 log(N)), however, LLE is not guaranteed to discover the optimal global coordinate system and appears to be more vulnerable to noise (see response of Tenenbaum, de Silva, and Langford in “The ISOMAP algorithm”). The fastest available approach for estimating manifold coordinates is stochastic proximity embedding (SPE) (D. K. Agrafiotis and H. Xu, “A self-organizing principle for learning nonlinear manifolds,” Proc. Nat. Acad. Sci., vol. 99, no. 25, pp. 15 869-15 872, December 2002.) because its fundamental computational burden scales as O(N); however, the simplifying assumptions that appear in estimating geodesic distances and the embedded approximation appear to be too weak and have lead to degenerate solutions when applied to hyperspectral data and even in some very simple, artificial problems.
Manifold Coordinate Systems and Hyperspectral Data
There are a number of sources of nonlinearities in hyperspectral data: BRDF, nonlinear media such as water, multiple scattering, and the heterogeneity of pixels. In a hyperspectral scene containing natural vegetation, the nonlinear characteristics are immediately apparent in many three-channel scatterplots. FIG. 1 illustrates this point, showing three-channel scatter plots from two subsets of a 124-channel PROBE2 hyperspectral scene. Choosing any three channels will provide a different view of the high-dimensional data manifold. It would therefore be desirable to derive a coordinate system that resides on (parameterizes) the data manifold itself, following its intricate and convoluted structure with the hope of achieving a better data representation for classification and/or compression purposes. FIG. 1 also illustrates this. When estimating geodesic distances, the distance metric is only applied in a small, locally linear neighborhood. Distances to samples outside the local neighborhood of a particular sample are calculated by linking the shortest paths through points common to more than one neighborhood. Therefore, one other commonly used metric, the Mahalanobis distance, which is frequently applied in classification and detection problems in hyperspectral imagery analysis, described in “Anomaly detection and classification for hyperspectral imagery,” C.-I Chang and S. Chiang, IEEE Trans. Geosci. and Remote Sens., vol. 40, no. 6, pp. 1314-1325, Jun. 2002, cannot be applied in the context that is often used. Because the Mahalanobis distance implies the calculation of an associated scene covariance matrix, its use would require aggregation of samples that are not within the linear region of the manifold. The only way to apply it within the context of geodesic distance estimation would be to restrict the covariance matrix to those pixels lying within the locally linear neighborhood of a given pixel and producing a neighborhood covariance matrix for each point. While this is possible, it is certainly more computationally expensive. Since the distance metric is only evaluated locally, and far away geodesic distances are calculated using graph algorithms, the specific choice of local metric is less important.
Owing to the computational scaling of ISOMAP with image size that is at best, O(N2 log(N)), there is a need for either faster processing hardware, or a better, more computationally efficient set of algorithms. In particular, using the prior art algorithms, it is not be possible to develop a general set of manifold coordinates for entire scenes of size O(106) samples or greater, regardless of improvements in processor hardware. Manifold coordinate representations of large hyperspectral images, O(106) pixels, is not computationally feasible with present techniques.