Field of the Art
The disclosure relates to the field of image processing, and more particularly to the field of extracting and modeling features from satellite-based multispectral imagery.
Discussion of the State of the Art
By using historic events, or training points, to teach an algorithm to identify the patterns and relationships associated with the explanatory variables (factors) included in an analysis, one can determine where those patterns are present within a broader, user-defined environment. Historically, the training factors used have been limited to vector-based formats, with limited application for raster-based factors, or they have incorporated overly simplistic methods of representing the dimensionality of highly noisy image data. The vector-based factors represent portions of the human, physical, and built environments that define the study area. By examining the correlations (both spatial and attribute) between factors and events, one is able to shed light on latent relationships and patterns that might otherwise have gone unnoticed when relying on traditional geostatistical procedures.
The premise of using raster data to generate a highly dimensional space as a factor source was previously made challenging (and was therefore ignored) by a lack of imagery sources that had been standardized based on spectral returns. That is, imagery that was not radiometrically calibrated across different times and areas could not be used to generate predictions. However, the inventor has a working implementation of Atmospheric Compensation (ACOMP), which mitigates these otherwise unavoidable problems. Moreover, material selection is a critical aspect of this algorithm, because a training point is only truly represented in imagery if the actual material that the point represents is physically present in the image. For instance, in attempting a “more like this” (MLT) classifier to discover a certain type of vegetation, the training point is only relevant if the underlying image from which the model is created actually contains that vegetation type in bloom at the time of image capture.
Through a series of case studies, the value of this new analytics opportunity has been quantified. A first case examines relationships between reflectance values and training events. A spatial event with high visibility and a frequent collection posture was selected, namely marijuana fields in a particular US region, in which 8 training points were utilized and over 40 million pixels were classified. The process used is somewhat similar to supervised classification, with spatial proximity to training events automatically driving the cataloging of significant reflectance values rather than a user doing so manually, although the generation of these features within this invention is done using minimally parametric cascading unsupervised pixel band ratio clustering. Those reflectance values that do not demonstrate any relationship to the training events are ignored in subsequent phases of the process, while those areas exhibiting significant relationships with the events are highlighted just as a vector-based factor would be. In this way, it is possible to leverage imagery to help tell a story about the surrounding geography and the spatiotemporal conditions that may be acting as drivers of a particular activity. The spectral signatures themselves at the training point are also a significant factor, depending on the size of the neighborhood of pixels. Furthermore, the specific implementation of this invention will respond well to cross-seasonal data in some cases, as long as enough training points can be provided, because the algorithm does not merge and average training samples; rather, the final SPADACC is actually a list of cube classifiers in which the best score is returned from completely separate comparisons.
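By way of a non-limiting illustration, the difference between returning the best score from separate per-training-point comparisons and scoring against a single merged, averaged sample can be sketched as follows. The function names, the use of cosine similarity as the score, and the four-band reflectance values are all assumptions made for this sketch; they are not taken from the disclosure and do not represent the SPADACC implementation.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length band vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_score(pixel, training_signatures):
    """Compare the pixel with every training signature separately.

    Returns (best score, index of the training signature that produced it).
    No training signatures are merged or averaged.
    """
    scores = [cosine_similarity(pixel, sig) for sig in training_signatures]
    best_index = max(range(len(scores)), key=scores.__getitem__)
    return scores[best_index], best_index


def averaged_score(pixel, training_signatures):
    """Contrast case: score against the mean of all training signatures."""
    count = len(training_signatures)
    mean_signature = [
        sum(sig[band] for sig in training_signatures) / count
        for band in range(len(pixel))
    ]
    return cosine_similarity(pixel, mean_signature)


# Hypothetical reflectance values for the same material in two seasons.
signatures = [
    [0.05, 0.08, 0.04, 0.50],  # assumed "in bloom" training point
    [0.10, 0.12, 0.14, 0.16],  # assumed "dormant" training point
]
pixel = [0.05, 0.08, 0.04, 0.50]

score, index = best_score(pixel, signatures)
print("best score:", score, "from training point", index)
print("score against averaged sample:", averaged_score(pixel, signatures))
```

In this sketch a pixel that closely matches one seasonal training point keeps its high score, whereas averaging the two seasons produces a blended signature that matches neither season well.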
Once proven, this identified reflectance signature can then be utilized to search a standardized historical archive or a new standardized collection for a signature match. This would allow for spectrally based object, area, or environment discovery. Once generated and cataloged, these reflectance signatures could be automatically run against all incoming imagery, providing a “tipper” for the event in question and quickly narrowing the search space by enabling more directed and productive analysis of the imagery. As this analytics technique becomes more pervasive, the library of these signatures will continue to grow and in turn further expand our understanding of geographic influence on behavior.
A second case study used imagery as a standalone factor for analyzing environmental phenomena. By looking at the relationship between surface reflectance values and environmental occurrences (e.g., sinkholes, subsidence, landslides), it was determined that one can leverage imagery as a standalone factor to predict environmental conditions such as potential mining locations, habitat suitability, or substrate stability. This technique offers benefits to traditional analytical workflows that are being conducted in regions with little vector data to act as factors (for example, emerging areas of interest for which little geospatial statistical data about demographics and economics is available, but for which satellite imagery is available). Filling these data gaps with imagery will drastically reduce the limitations associated with lean-data areas of interest and significantly expand the geography within which one can effectively conduct predictive analysis. Moreover, with this invention, the exact spectral characteristics are combined with the characteristics of surrounding features, as described by pixel clustering, depending on the configuration of the buffer of pixels around each training point on which the models are based.
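As a hedged illustration of a buffer of pixels around a training point, the following sketch extracts a square window from a small multiband image and summarizes it with pairwise band ratios. The square window shape, the `radius` parameter, the mean summary, and all function names are assumptions made for illustration. The clustering step described above is deliberately omitted, so this shows only how a neighborhood might feed band-ratio features.

```python
from itertools import combinations


def band_ratios(pixel, eps=1e-9):
    """Pairwise ratios band_i / band_j for every pair i < j.

    eps guards against division by zero in dark pixels (an assumption).
    """
    return [
        pixel[i] / (pixel[j] + eps)
        for i, j in combinations(range(len(pixel)), 2)
    ]


def neighborhood(image, row, col, radius):
    """Pixels in a square buffer around (row, col), clipped to the image."""
    n_rows, n_cols = len(image), len(image[0])
    window = []
    for r in range(max(0, row - radius), min(n_rows, row + radius + 1)):
        for c in range(max(0, col - radius), min(n_cols, col + radius + 1)):
            window.append(image[r][c])
    return window


def neighborhood_features(image, row, col, radius):
    """Mean band-ratio vector over the buffer around a training point."""
    ratios = [band_ratios(p) for p in neighborhood(image, row, col, radius)]
    count = len(ratios)
    return [sum(values) / count for values in zip(*ratios)]


# Hypothetical 3 x 3 image with three bands per pixel.
image = [
    [[0.10, 0.20, 0.40], [0.10, 0.20, 0.40], [0.30, 0.30, 0.30]],
    [[0.10, 0.20, 0.40], [0.10, 0.20, 0.40], [0.30, 0.30, 0.30]],
    [[0.30, 0.30, 0.30], [0.30, 0.30, 0.30], [0.30, 0.30, 0.30]],
]

# radius=0 uses only the training pixel; radius=1 adds its neighbors.
print(neighborhood_features(image, 0, 0, 0))
print(neighborhood_features(image, 0, 0, 1))
```

A larger `radius` weights the surrounding features more heavily relative to the exact spectral characteristics of the training pixel itself, which is the configuration trade-off the paragraph above describes.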
At the core of this analytical benefit is the relationship between the locations of events and the surface reflectance values across an image with standard characteristics. This application does not require that spectral signatures matching ground surface features be determined in advance in order to classify the surface in relation to the events. What it does put forward is that the normalized pixel values can be utilized to determine whether the events have certain coincident pixel characteristics in common across all the pixels within an area of interest (or even globally). These methods are not possible unless the imagery utilized is standardized based on surface reflectance in a way that ensures that signatures match across temporal and geographic variations in collects. Furthermore, the current implementation of this invention does not use simple NND; instead, it uses N dimensions of distance and direction as features.
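One possible reading of the contrast between a single nearest-distance value and N dimensions of distance and direction is sketched below. This is an interpretation offered for illustration only. It assumes that NND denotes a nearest-neighbor distance, and the planar coordinates, the pairing of one distance and one bearing per reference feature, and the function names are likewise assumptions not stated in the disclosure.

```python
import math


def nearest_neighbor_distance(point, references):
    """Single scalar: distance from the point to its closest reference."""
    return min(math.dist(point, ref) for ref in references)


def distance_direction_features(point, references):
    """One (distance, bearing) pair per reference feature.

    For N references this yields 2 * N values instead of one scalar.
    Bearing is the angle in radians from the positive x axis.
    """
    features = []
    for ref in references:
        dx = ref[0] - point[0]
        dy = ref[1] - point[1]
        features.append(math.hypot(dx, dy))
        features.append(math.atan2(dy, dx))
    return features


# Hypothetical planar coordinates for a pixel and two reference features.
pixel_location = (0.0, 0.0)
reference_features = [(3.0, 4.0), (0.0, -2.0)]

print(nearest_neighbor_distance(pixel_location, reference_features))
print(distance_direction_features(pixel_location, reference_features))
```

The single-scalar version discards both which reference is nearby and where it lies relative to the pixel, whereas the per-reference features retain that information.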
To address the problem of normalization of pixel values across image captures from the satellite, the invention uses a process called “Atmospheric Compensation” (ACOMP), as this provides the best data source for the invented algorithm due to its spectral normalization.