This invention relates generally to image processing systems and more particularly to an image anomaly detector for target identification.
Hyperspectral sensors are a new class of optical sensor that collects a spectrum from each point in a scene. They differ from multispectral sensors in that the number of bands is much higher (twenty or more), and the spectral bands are contiguous. For remote sensing applications, they are typically deployed on either aircraft or satellites. The data product from a hyperspectral sensor is a three-dimensional array or “cube” of data with the width and length of the array corresponding to spatial dimensions and the spectrum of each point as the third dimension. Hyperspectral sensors have a wide range of remote sensing applications, including terrain classification, environmental monitoring, agricultural monitoring, geological exploration, and surveillance. They have also been used to create spectral images of biological material for the detection of disease and other applications.
With the introduction of sensors capable of high spatial and spectral resolution, there has been an increasing interest in using spectral imagery to detect small objects or features of interest. Anomaly detection algorithms are used to distinguish observations from the background when target models are not available or are unreliable.
Anomalies are defined with reference to a model of the background. Background models are developed adaptively using reference data from either a local neighborhood of the test pixel or a large section of the image.
An older method of detecting anomalies from multispectral and hyperspectral imagery is to represent the background imagery using Gaussian mixture models and to use detection statistics derived from this model by applying various principles of detection theory. This approach models each datum as a realization of a random vector having one of several possible multivariate Gaussian distributions. If each observation, $y \in \mathbb{R}^n$, arises from one of $d$ normal classes, then the data have a normal or Gaussian mixture pdf:
$$p(y) = \sum_{k=1}^{d} \omega_k \, N(\mu_k, \Gamma_k)(y), \qquad \omega_k \ge 0, \qquad \sum_{k=1}^{d} \omega_k = 1,$$ [Eqn. 1]
where $\omega_k$ is the probability of class $k$ and
$$N(\mu_k, \Gamma_k)(y) = \frac{1}{(2\pi)^{n/2} \, |\Gamma_k|^{1/2}} \exp\!\left( -\frac{1}{2} (y - \mu_k)^t \, \Gamma_k^{-1} (y - \mu_k) \right)$$
is the normal probability density function having mean $\mu_k$ and covariance $\Gamma_k$. The parameters $\{(\omega_k, \mu_k, \Gamma_k) \mid 1 \le k \le d\}$ are typically estimated from the imagery using defined clusters, the expectation maximization algorithm or related algorithms such as the stochastic expectation maximization algorithm. Anomaly detection may then proceed by the application of the generalized likelihood ratio test (GLRT) for an unknown target.
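By way of illustration only, the mixture density of Eqn. 1 may be evaluated as in the following Python sketch. The function names (`cholesky`, `normal_pdf`, `mixture_pdf`) and the use of a Cholesky factorization to obtain the determinant and the quadratic form are choices made for this example, not part of the cited methods; the class parameters are assumed to have been estimated already.

```python
import math

def cholesky(cov):
    """Lower-triangular L with L L^t = cov (cov must be symmetric positive definite)."""
    n = len(cov)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = cov[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def normal_pdf(y, mean, cov):
    """Multivariate normal density N(mean, cov)(y) for y in R^n."""
    n = len(y)
    L = cholesky(cov)
    r = [yi - mi for yi, mi in zip(y, mean)]
    # Forward substitution: solve L z = (y - mean), so z.z is the quadratic form.
    z = []
    for i in range(n):
        z.append((r[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i])
    quad = sum(v * v for v in z)
    log_det = 2.0 * sum(math.log(L[i][i]) for i in range(n))
    return math.exp(-0.5 * (n * math.log(2.0 * math.pi) + log_det + quad))

def mixture_pdf(y, weights, means, covs):
    """Eqn. 1: weighted sum of the d class densities evaluated at y."""
    return sum(w * normal_pdf(y, m, c) for w, m, c in zip(weights, means, covs))
```

For example, `mixture_pdf(y, [0.5, 0.5], [mu1, mu2], [G1, G2])` evaluates a two-class background model at a pixel spectrum `y`, where `mu1`, `mu2`, `G1` and `G2` are placeholder names for estimated class parameters.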
Anomaly detection is also accomplished by classifying each pixel as emanating from one of the d classes—the maximum a posteriori (MAP) principle is one approach to classification—and applying the GLRT for an unknown target to the classified data. See, for example, D. W. J. Stein, S. G. Beaven, L. E. Hoff, E. M. Winter, A. P. Schaum, A. D. Stocker, “Anomaly Detection From Hyperspectral Imagery,” IEEE Signal Processing Magazine, January 2001.
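The classify-then-detect approach may be sketched as follows. This is a simplified illustration under stated assumptions: the class covariances are taken to be diagonal (so each class is described by a vector of per-band variances), the class weights are strictly positive, and the detection statistic is the squared Mahalanobis distance to the assigned class, which is one common form the GLRT for an unknown additive target takes under a known Gaussian background. The function names are hypothetical.

```python
import math

def log_normal_diag(y, mean, var):
    """Log density of N(mean, diag(var)) at y (diagonal covariance is an assumption)."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (yi - m) ** 2 / v
        for yi, m, v in zip(y, mean, var)
    )

def map_class(y, weights, means, variances):
    """MAP rule: index k maximizing log(w_k) + log N(mu_k, Gamma_k)(y)."""
    scores = [
        math.log(w) + log_normal_diag(y, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    return max(range(len(scores)), key=scores.__getitem__)

def anomaly_score(y, weights, means, variances):
    """Classify y, then return (class index, squared Mahalanobis distance to it)."""
    k = map_class(y, weights, means, variances)
    d2 = sum((yi - m) ** 2 / v for yi, m, v in zip(y, means[k], variances[k]))
    return k, d2
```

A pixel would be declared anomalous when the returned distance exceeds a threshold chosen for the desired false alarm rate; the threshold selection is outside the scope of this sketch.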
Another older approach to anomaly detection is based on the application of the linear mixture model. This model accounts for the fact that pixels in an image are often overlaid with multiple materials so that an observation may not belong to a class that can be identified with a particular substance. The linear mixture model represents the observations, $y_i \in \mathbb{R}^n$, by
$$y_i = \eta + \sum_{k=1}^{d} a_{ki} \, \varepsilon_k \quad \text{such that} \quad \text{c.1) } 0 \le a_{ki}, \quad \text{and} \quad \text{c.2) } \sum_{k=1}^{d} a_{ki} = 1,$$ [Eqn. 2]
where $d$ is the number of classes, $\varepsilon_k \in \mathbb{R}^n$ is the signature or endmember of class $k$, $a_{ki}$ is the abundance of class $k$ in observation $y_i$, and $\eta \sim N(\mu_0, \Gamma_0)$ is an additive noise term with normal probability density function (pdf) of mean $\mu_0$ and covariance $\Gamma_0$. Techniques have been developed for estimating the endmembers from the imagery. Given the endmembers, the abundance values are obtained as the solution to a constrained least squares or a quadratic programming problem. Anomaly detection statistics have been based on the unmixing residual or the identification of endmembers that represent anomalous classes (see Stein et al., supra).
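The constrained least squares problem of Eqn. 2 may be illustrated with the following sketch. It is not the solver used in the cited work: projected gradient descent onto the probability simplex is substituted here because it needs only the standard library, and the noise mean is assumed to be zero. The names `project_to_simplex`, `unmix` and `residual_norm` are hypothetical.

```python
import math

def project_to_simplex(v):
    """Euclidean projection of v onto {a : a_k >= 0, sum(a) = 1} (sort-based method)."""
    u = sorted(v, reverse=True)
    running, theta = 0.0, 0.0
    for j, uj in enumerate(u, start=1):
        running += uj
        t = (running - 1.0) / j
        if uj - t > 0:
            theta = t  # keep the shift from the last index that stays positive
    return [max(x - theta, 0.0) for x in v]

def unmix(y, endmembers, iterations=500):
    """Minimize 0.5 * ||y - sum_k a_k e_k||^2 subject to constraints c.1 and c.2."""
    d = len(endmembers)
    gram = [[sum(p * q for p, q in zip(ej, ek)) for ek in endmembers] for ej in endmembers]
    corr = [sum(p * q for p, q in zip(ej, y)) for ej in endmembers]
    # trace(gram) bounds the largest eigenvalue, so this step size is safe.
    step = 1.0 / sum(gram[j][j] for j in range(d))
    a = [1.0 / d] * d
    for _ in range(iterations):
        grad = [sum(gram[j][k] * a[k] for k in range(d)) - corr[j] for j in range(d)]
        a = project_to_simplex([a[j] - step * grad[j] for j in range(d)])
    return a

def residual_norm(y, endmembers, a):
    """Norm of the unmixing residual y - sum_k a_k e_k."""
    fit = [sum(a[k] * endmembers[k][i] for k in range(len(a))) for i in range(len(y))]
    return math.sqrt(sum((yi - fi) ** 2 for yi, fi in zip(y, fit)))
```

A large value of `residual_norm` for a pixel indicates that no admissible mixture of the endmembers explains the observation, which is the sense in which the unmixing residual serves as an anomaly statistic.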
Spectra from a class of material are often better modeled as random rather than as fixed vectors. This may be due to biochemical and biophysical variability of materials in a scene. For such data, neither the linear mixture model nor the normal mixture model is adequate, and better classification and detection results may accrue from using more accurate methods. Stocker et al. [A. D. Stocker and A. P. Schaum, “Application of stochastic mixing models to hyperspectral detection problems,” SPIE Proceedings 3071, Algorithms for Multispectral and Hyperspectral Imagery III, S. S. Shen and A. E. Iverson eds. August 1997.] propose a stochastic mixture model in which each fundamental class is identified with a normally distributed random variable, i.e.
$$y_i = \sum_{k=1}^{d} a_{ik} \, \varepsilon_k \quad \text{such that} \quad \varepsilon_k \sim N(\mu_k, \Gamma_k), \quad a_{ik} > 0, \quad \text{and} \quad \sum_{k=1}^{d} a_{ik} = 1.$$ [Eqn. 3]
They estimate the parameters of the model by quantizing the set of allowed abundance values, and fitting a discrete normal mixture density to the data. More precisely, let $\Delta = 1/M$ denote the resolution of the quantization. Then the set of allowed coefficient sequences is
$$A = \left\{ (a_1, \ldots, a_d) \;\middle|\; \sum_{j=1}^{d} a_j = 1 \text{ and } a_j \in \{0, \Delta, \ldots, (M-1)\Delta, 1\} \right\}.$$
For each $\vec{a} = (a_1, \ldots, a_d) \in A$ define
$$\mu(\vec{a}) = \sum_{j=1}^{d} a_j \, \mu_j \quad \text{and} \quad \Gamma(\vec{a}) = \sum_{j=1}^{d} a_j^2 \, \Gamma_j.$$ [Eqn. 4]
Then the observations are fit to the mixture model
$$p(y) = \sum_{\vec{a} \in A} \rho_{\vec{a}} \, N\big(\mu(\vec{a}), \Gamma(\vec{a})\big)(y).$$ [Eqn. 5]
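As an illustration of Eqn. 4, the mean and covariance associated with one abundance vector may be computed as sketched below. The squared weights on the covariances are what one obtains for a weighted sum of independent random vectors; that independence is the reading assumed here rather than something stated explicitly above. The function name `mixed_moments` is hypothetical.

```python
def mixed_moments(a, means, covs):
    """Eqn. 4: mu(a) = sum_j a_j mu_j and Gamma(a) = sum_j a_j^2 Gamma_j.

    a     -- abundance vector (a_1, ..., a_d)
    means -- list of d class mean vectors, each of length n
    covs  -- list of d class covariance matrices, each n x n (nested lists)
    """
    n = len(means[0])
    mu = [sum(aj * m[i] for aj, m in zip(a, means)) for i in range(n)]
    gamma = [
        [sum(aj ** 2 * c[r][s] for aj, c in zip(a, covs)) for s in range(n)]
        for r in range(n)
    ]
    return mu, gamma
```

For two classes with means (0, 0) and (2, 4) and an equal mixture a = (0.5, 0.5), the mixed mean is (1, 2) and each class covariance enters with weight 0.25.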
The fitting is accomplished using a variation of the stochastic expectation maximization algorithm such that Eqn. 4 is satisfied in a least squares sense. The authors demonstrate improved classification in comparison with clustering methods using three classes, and they demonstrate detection algorithms using this model. They note, however, that the method is impractical if the data are comprised of a large number of classes or if Δ is small, as the number of elements of A, which is given by:
$$|A| = \frac{(M+1) \cdots (M+d-1)}{(d-1)!}$$ [Eqn. 6]
becomes very large. Furthermore, quantizing the allowed abundance values leads to modeling and estimation error.
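The growth described by Eqn. 6 may be checked numerically with the following sketch, which enumerates the quantized abundance vectors by brute force and compares the count with the closed form. Eqn. 6 equals the binomial coefficient C(M+d-1, d-1); the function names are hypothetical.

```python
from itertools import product
from math import comb

def enumerate_abundances(d, M):
    """Yield every (a_1, ..., a_d) with a_j in {0, 1/M, ..., 1} and sum equal to 1."""
    for head in product(range(M + 1), repeat=d - 1):
        last = M - sum(head)  # the final coefficient is fixed by the sum constraint
        if last >= 0:
            yield tuple(m / M for m in head + (last,))

def count_abundances(d, M):
    """Eqn. 6 in closed form: (M+1)...(M+d-1) / (d-1)! = C(M+d-1, d-1)."""
    return comb(M + d - 1, d - 1)
```

With d = 3 classes and M = 4 the enumeration yields 15 vectors, whereas d = 10 and M = 20 already gives on the order of ten million mixture components, each with its own weight to be estimated.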
These unresolved problems and deficiencies are clearly felt in the art and are solved by this invention in the manner described below.