It is known to collect remote sensing data to provide images of scenes that aid in the broad-scale discrimination of various features of the land scanned, including the identification of mineral deposits and vegetation. Two examples of hyperspectral scanners are NASA's 224-band AVIRIS, which has bands spaced about every 10 nanometers over a range from 400 to 2500 nanometers, and the 128-band Australian commercial scanner, HyMap, which covers a similar wavelength range with about 16 nanometer resolution.
A goal is therefore to identify the components of each pixel in the hyperspectral image. This can be done by comparison with a library of spectra of “pure” materials. “Pure” materials in a hyperspectral image are often termed endmembers.
Depending on the resolution of the image obtained from the spectral scanner, an individual pixel may represent an area ranging in size from 5 to 10 meters across in images from an aircraft scan or 10 to 30 meters across from a satellite scan. Each pixel therefore will relate to a portion of a scene which will usually include a mixture of material components. It is not uncommon to find that not all of the pure spectral representations of endmembers are present in a scene.
Images are also subject to distortion due to noise from various sources including instruments, atmospheric interference, viewing geometry and topography of the area scanned. Corrections for these distortions are still not sufficiently accurate to allow for reliable comparisons to reference libraries. Also, many remotely sensed scenes contain materials not in libraries. Therefore, there are problems with matching spectra with ground-based libraries. There is consequently interest in identifying the component materials represented in a scanned scene, without reference to a library.
Similar problems occur in other fields where it is desired to determine endmembers from multispectral, hyperspectral or other data in which a signal is detected on a number of channels or bands. For example, a similar problem occurs in the analysis of proteomics and genomics array data, where the signal represents cell or organism response across a range of proteins, cDNAs or oligonucleotides. In this context, each protein, cDNA or oligonucleotide is regarded as being equivalent to a wavelength or band in the hyperspectral or multispectral context. Similar problems also occur in fluorescence imaging, such as fluorescence microscopy.
In the art the terms multispectral and hyperspectral, multidimensional and hyperdimensional etc. are used, with “hyper” generally meaning more than “multi”. This distinction is not relevant for the purposes of this invention. For convenience, throughout the rest of the specification the term “multispectral” will be used to refer to both multispectral and hyperspectral data. The term “multidimensional” and other “multi” terms will likewise be used to mean more than one dimension.
Current solutions for finding endmembers often involve “whitening” or “sphering” the data and then fitting to the data a multidimensional simplex having a number of vertices equal to the number of endmembers.
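The reason a simplex is fitted can be illustrated with a minimal sketch. It assumes a linear mixing model, which the text above does not state explicitly: each pixel is taken to be a weighted average of the endmember spectra, with non-negative weights (abundances) summing to one, so that every pixel lies inside the simplex whose vertices are the endmembers. The endmember values, the random abundances and the helper name `barycentric` are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical endmember spectra: 3 materials observed in 2 (already compressed) bands.
endmembers = np.array([[0.0, 0.0],
                       [1.0, 0.0],
                       [0.0, 1.0]])

# Assumed linear mixing: abundances are non-negative and sum to one per pixel.
rng = np.random.default_rng(1)
abundances = rng.dirichlet(np.ones(3), size=100)   # shape (100, 3)
pixels = abundances @ endmembers                   # shape (100, 2)

def barycentric(points, vertices):
    """Return the simplex (barycentric) coordinates of each point.

    points   : (n, M-1) array of pixel values
    vertices : (M, M-1) array of simplex vertices
    """
    m = vertices.shape[0]
    # Append a row of ones so that the solution is forced to sum to one.
    a = np.vstack([vertices.T, np.ones(m)])                 # (M, M)
    b = np.vstack([points.T, np.ones(points.shape[0])])     # (M, n)
    return np.linalg.solve(a, b).T                          # (n, M)

recovered = barycentric(pixels, endmembers)

# A pixel lies inside the simplex exactly when all of its coordinates are >= 0.
inside = np.all(recovered >= -1e-9, axis=1)
print(inside.all(), np.allclose(recovered, abundances))
```

Under this assumption, finding the endmembers amounts to finding the vertices of a simplex that describes the data cloud, which is what the methods discussed below attempt in different ways.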
The bands of a multispectral image are usually highly correlated. “Whitening” involves transforming the data to be uncorrelated with a constant variance and preferably an approximately Normal distribution of errors. It is also desirable to compress the dimensionality of the data to reduce calculation time.
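As a minimal sketch of what whitening does, the following uses an ordinary eigendecomposition of the sample covariance of the data. This is not the MNF transform cited below, which is defined with respect to a noise model; it only shows the basic idea of producing uncorrelated bands of constant variance. The function name `whiten`, the tolerance and the synthetic three-band data are assumptions made for illustration.

```python
import numpy as np

def whiten(data, tol=1e-12):
    """Whiten an (n_pixels, n_bands) array.

    The result has zero mean and (sample) identity covariance. Directions
    with negligible variance are dropped, which also reduces dimensionality.
    """
    centered = data - data.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)      # symmetric eigendecomposition
    keep = eigvals > tol * eigvals.max()        # discard near-degenerate directions
    transform = eigvecs[:, keep] / np.sqrt(eigvals[keep])
    return centered @ transform

# Hypothetical, strongly correlated 3-band data.
rng = np.random.default_rng(0)
mixing = np.array([[1.0, 0.8, 0.6],
                   [0.0, 0.5, 0.4],
                   [0.0, 0.0, 0.3]])
pixels = rng.normal(size=(5000, 3)) @ mixing

white = whiten(pixels)
print(np.round(np.cov(white, rowvar=False), 6))  # approximately the identity matrix
```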
A widely used approach to “whitening” the data, which also compresses the information into a smaller number of bands, is the Minimum Noise Fraction (MNF) transform. This is disclosed in Green, A., Berman, M., Switzer, P., and Craig, M. (1988). A transformation for ordering multispectral data in terms of image quality with implications for noise removal. IEEE Transactions on Geoscience and Remote Sensing, 26:65-74.
Simplex fitting using the pixel purity index (PPI) method is disclosed in Boardman, J., Kruse, F., and Green, R. (1995). Mapping target signatures via partial unmixing of AVIRIS data. In Gram, R. (editor), Summaries of the Fifth Annual JPL Airborne Earth Science Workshop, volume 1, AVIRIS Workshop, pp. 23-26. JPL Publ. 95-1, NASA, Pasadena, Calif.
One of the main disadvantages of Boardman's method is that it requires considerable manual intervention in processing.
An alternative to Boardman's method is the N-FINDR algorithm, disclosed in Winter, M. (1999). Fast autonomous spectral endmember determination in hyperspectral data. In Proceedings of the 13th International Conference on Applied Geologic Remote Sensing, Vancouver, vol. 2, pp. 337-344. This process is fully automated. After transformation to an (M-1)-dimensional subspace, the algorithm finds the simplex with M vertices of maximum volume constrained to lie within the data cloud. Another alternative is to construct the minimum-volume simplex enclosing the data cloud, as disclosed in Craig, M. (1994). Minimum-volume transforms for remotely sensed data. IEEE Transactions on Geoscience and Remote Sensing, 32:542-552.
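The quantity that both approaches optimise, the volume of a simplex with M vertices in an (M-1)-dimensional space, can be computed from a determinant. The sketch below also includes an exhaustive search for the maximum-volume simplex whose vertices are data points. The exhaustive search is an illustrative stand-in only and is not the N-FINDR procedure itself; the function names and the five sample points are hypothetical.

```python
import math
from itertools import combinations

import numpy as np

def simplex_volume(vertices):
    """Volume of the simplex whose M vertices are the rows of an (M, M-1) array."""
    vertices = np.asarray(vertices, dtype=float)
    m, dim = vertices.shape
    if dim != m - 1:
        raise ValueError("expected M vertices in (M-1) dimensions")
    edges = vertices[1:] - vertices[0]          # (M-1, M-1) matrix of edge vectors
    return abs(np.linalg.det(edges)) / math.factorial(dim)

def max_volume_vertices(points, m):
    """Indices of the m data points spanning the largest simplex (brute force)."""
    points = np.asarray(points, dtype=float)
    return max(combinations(range(len(points)), m),
               key=lambda idx: simplex_volume(points[list(idx)]))

# Hypothetical 2-D data: three extreme points and two mixed (interior) points.
points = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.3, 0.3], [0.2, 0.5]]
best = max_volume_vertices(points, 3)
print(best, simplex_volume(np.asarray(points)[list(best)]))
```

Because the candidate vertices in this sketch are restricted to observed data points, the search can only succeed when pure or nearly pure pixels are present in the data.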
These solutions cannot satisfactorily deal with the common situation where pure or almost pure endmembers are absent from the scene. Furthermore, they do not deal well with noise in the data.