The present invention relates generally to methods and systems for detection of agents such as pathogens or toxic substances and, in particular, to methods and systems for determining the most important background constituents to suppress in a bulk aerosol sample in order to reduce the probability of false alarms and improve the level of detection of potentially harmful airborne agents.
The detection of bio-aerosol warfare agents in the presence of either indoor or outdoor backgrounds is a difficult problem. Natural backgrounds are variable and can simultaneously include mixtures of multiple constituents. The variation of each constituent may be larger than the concentration level of an agent whose detection is desired. The detection problems can be further exacerbated by the presence of unpredictable spikes in measurement data of a naturally-occurring background, which may be an order of magnitude larger than the contribution of the normal quiescent background. Such spikes may last for minutes and may exhibit large variations in particle count. The “spike” problem means that temporal filters using recent particle count history to set a detection threshold will not work.
A high false alarm rate creates problems for a bio-aerosol detection system. Repeated false alarms will cause people to panic or begin to ignore warnings. High regret actions, such as building evacuation or administering antibiotics, are expensive and create logistics problems if they occur often.
Some bio-aerosol detection systems comprise a trigger plus a confirmation sensor. The trigger is a low-cost, non-specific detection system which runs continuously. The confirmation sensor has high specificity to identify specific bio-agents, and runs only when it is triggered. Typically, confirmation sensors are expensive to operate relative to trigger sensors, and may have logistics requirements for reagents, fluid consumption, etc. A high trigger false alarm rate will drive up the confirmation sensor operating cost. Typically, confirmation sensors will also take longer to provide a result than a trigger sensor. Thus, a trigger sensor with a low false alarm rate may be used for low regret actions that need to be taken quickly to be effective, such as temporary shutdown of a building heat/ventilation/air conditioning system.
One approach to a trigger sensor is to collect a bulk sample, immobilize it, and make high-dimensional measurements of some property of the sample. For example, the high-dimensional space may be the spectrum of reflected or transmitted radiation or the emission spectrum of fluorescence induced by short wavelength illumination. The high-dimensional space may also be the result of concatenated spectra from separate measurements, such as the fluorescence excited by different illumination wavelengths.
Principal component (PC) analysis is a method of reducing the dimensionality of data so that it may be more easily visualized or analyzed. This well-known method uses a data set to determine the direction in the high-dimensional space with the largest variance, the orthogonal direction with the next largest variance, etc., until the remaining dimensions contain only random noise. Each orthogonal direction becomes a component in PC space. Converting additional measurements in the high-dimensional space into PC space is simply a matter of a matrix multiplication once the PC directions are known.
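The conversion into PC space described above can be illustrated with a minimal sketch. It assumes NumPy, synthetic 50-channel "spectra" generated only for illustration, and hypothetical helper names `fit_pca` and `to_pc_space`; it is not taken from any particular detection system.

```python
import numpy as np

def fit_pca(data, n_components):
    """Return (mean, directions); directions has shape (n_features, n_components)."""
    mean = data.mean(axis=0)
    centered = data - mean
    # Rows of vt are orthonormal directions sorted by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components].T

def to_pc_space(measurements, mean, directions):
    """Project measurements into PC space with a single matrix multiplication."""
    return (measurements - mean) @ directions

# Synthetic data (assumption): 200 spectra, 50 channels, two dominant sources
# of variation plus a small amount of random noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2)) * np.array([5.0, 2.0])
mixing = rng.normal(size=(2, 50))
spectra = latent @ mixing + 0.01 * rng.normal(size=(200, 50))

mean, directions = fit_pca(spectra, n_components=3)
scores = to_pc_space(spectra, mean, directions)

# Additional measurements reuse the same mean and directions.
new_scores = to_pc_space(spectra[:5], mean, directions)
```

Once `mean` and `directions` are known, later measurements need only the subtraction and matrix product in `to_pc_space`; the decomposition itself is not repeated.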
In many cases, there are more than three meaningful principal components. Visualization becomes difficult because at most three principal components can be shown at one time. Viewing multiple graphs provides some indication of the separation of two principal component vectors, but a quantitative measure of the separation is also very useful. One measure, borrowed from hyperspectral imaging, is the spectral angle between two vectors. This angle is defined as the inverse cosine of the normalized dot product of the two vectors. For two vectors $M_i$, $M_j$, the spectral angle between them is given by:
$$SA_{i,j} = \cos^{-1}\left(\frac{M_i \cdot M_j}{\lVert M_i \rVert \, \lVert M_j \rVert}\right).$$

In hyperspectral imaging work, the components of M typically represent raw spectral measurements. Spectral angles can be used to measure separation of two vectors in principal component space. FIG. 5 shows an example of the spectral angles between pairs of interferents and simulants (or agents). This matrix is an example of a Euclidean distance matrix. This matrix can describe distances between vectors which can be plotted on a two-dimensional surface (like the mileage chart on a road map of a small region), or the distances may require a three- or higher-dimensional surface for consistent plotting of the vectors. For example, the mileage chart between cities all over the world would require a spherical surface to place the cities such that all distances were consistent with the mileage chart. The two, three, or higher dimensional space defined by the Euclidean distance matrix of spectral angles is one example of a Simplex, a convex shape defined by corners and edges in a multi-dimensional space.
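The spectral angle and the matrix of pairwise angles can be computed as in the following sketch. It assumes NumPy; the three-entry `library` and the function names are hypothetical and chosen only so the angles are easy to check by hand.

```python
import numpy as np

def spectral_angle(m_i, m_j):
    """Inverse cosine of the normalized dot product, in radians."""
    cos_angle = np.dot(m_i, m_j) / (np.linalg.norm(m_i) * np.linalg.norm(m_j))
    # Clip to guard against round-off pushing the value just outside [-1, 1].
    return float(np.arccos(np.clip(cos_angle, -1.0, 1.0)))

def spectral_angle_matrix(vectors):
    """Symmetric matrix of pairwise spectral angles with a zero diagonal."""
    n = len(vectors)
    angles = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            angles[i, j] = angles[j, i] = spectral_angle(vectors[i], vectors[j])
    return angles

# Hypothetical PC-space vectors for three library entries.
library = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [1.0, 1.0, 0.0],
])
angles_deg = np.degrees(spectral_angle_matrix(library))
```

For this toy library the first two entries are 90 degrees apart, and the third lies 45 degrees from each of them. The angle depends only on direction, so scaling a vector (for example, a higher particle count of the same constituent) leaves it unchanged.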
A linear mixing model provides an appropriate description for the principal components of a typical bio-aerosol, either in-situ or collected and concentrated into a bulk sample. This model also applies to mixtures found on surfaces. The linear mixing model has been used extensively in hyperspectral imaging, where it has been used to describe the measured spectral values directly. The PC values derived from measured spectral values are given by
$$M_i = \sum_j a_j E_{ij} + N$$

where $a_j$ is the abundance coefficient of the jth constituent, $E_{ij}$ is the ith principal component of the jth constituent, and $N$ is a matrix of noise components.
In the model, the values of E for the jth constituent are often referred to as endmembers. These endmembers can be either background constituents, such as pollen, fungal spores, diesel particulates, etc, or they can be chemical or biological agents that we wish to detect. In some cases, simulants can take the place of agents. These simulants are chosen to have signatures which are very similar to the agents that we wish to detect but which are too dangerous to be used in tests. Background constituents which are not agents are often referred to as interferents.
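The linear mixing model can be illustrated numerically as follows. This is a sketch under stated assumptions: NumPy, made-up endmember values in `E`, made-up abundances in `a_true`, and an ordinary unconstrained least-squares fit to recover the abundances. The least-squares step is an illustrative choice and is not described in the text above.

```python
import numpy as np

rng = np.random.default_rng(1)

# E[i, j] is the i-th principal component of the j-th constituent
# (hypothetical values: 4 PCs, 3 endmembers).
E = np.array([
    [1.0, 0.2, 0.0],
    [0.1, 1.0, 0.3],
    [0.0, 0.1, 1.0],
    [0.5, 0.5, 0.5],
])
a_true = np.array([3.0, 0.5, 1.5])          # abundance coefficients a_j
noise = 0.01 * rng.normal(size=E.shape[0])  # noise term N

# Forward model: M_i = sum_j a_j * E_ij + N
M = E @ a_true + noise

# Illustrative inversion (assumption): unconstrained least squares.
a_est, *_ = np.linalg.lstsq(E, M, rcond=None)
```

With known endmembers and low noise, `a_est` lands close to `a_true`. A practical system might also require the abundances to be non-negative, which this sketch does not enforce.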
Libraries can be created for agents, simulants, and interferents. These libraries can be created by making measurements of pure substances or by making measurements of real backgrounds. Measurements of pure substances can be made at a high signal-to-noise ratio, under laboratory conditions, with no other background interferents to corrupt the measurements. Pure agents and simulants may be easy to obtain, but pure samples of background constituents must be collected and isolated. Measurement of real backgrounds will not require collection and isolation of individual background constituents, but the signatures of the individual constituents must be separated after detection. This separation of measured data into signatures for individual constituents is one of the important aspects of our invention.
Rotate and suppress (RAS) is a technique to solve the mixture and spike problems. For further details on RAS techniques, see P. C. Trepagnier and P. D. Henshaw, “Principal Component Analysis Incorporating Excitation, Emission, and Lifetime Data of Fluorescent Bio-Aerosols,” PhAST Conference, Long Beach Calif., May 22-25, 2006; P. D. Henshaw and P. C. Trepagnier, “Background Suppression and Agent Detection in Multi-Dimensional Spaces,” PhAST Conference, Long Beach Calif., May 22-25, 2006; P. C. Trepagnier, P. D. Henshaw, R. F. Dillon, and D. P. McCampbell, “A Fluorescent Bio-Aerosol Point Detector Incorporating Excitation, Emission, And Lifetime Data,” SPIE Photonics East, Boston Mass. Oct. 1-4, 2006; P. D. Henshaw and P. C. Trepagnier, “Real-time Determination and Suppression of Bio-Aerosol Constituents,” SPIE Photonics East, Boston Mass. Oct. 1-4, 2006; P. D. Henshaw and P. C. Trepagnier, “False Alarm Reduction Algorithms for Standoff Detection,” Williamsburg Standoff Detection Conference, Williamsburg Va., Oct. 23-27, 2006 and U.S. patent application Ser. No. 11/541,935, Filed Oct. 2, 2006, entitled “Agent Detection in the Presence of Background Clutter,” by P. D. Henshaw and P. C. Trepagnier, all of which are incorporated herein in their entirety.
To suppress a single background constituent which may have large, unpredictable variations in particle count, we rotate the PC space so that the background constituent is aligned with one of the PC axes. We then drop that axis, eliminating the effect of large particle counts and variations of particle count of that background constituent. If we have multiple background constituents that we wish to eliminate, this process can be repeated. The result is that we trade one PC dimension for each background constituent that we wish to suppress. Because the number of PCs is limited, this means we must choose a subset of the possible interferents to suppress because we cannot suppress an unlimited number of them. The suppression list contains the list of constituents to suppress using RAS. The suppression list can be derived from recent measurements, selected from a library, or a combination of the two. A key aspect of our invention is the strategy of selecting members of the suppression list. In the remainder of our teaching, we will often refer informally to the members of the suppression list as {X} and the maximum length of the suppression list as X.
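The rotate-and-drop step can be sketched as follows. This is an illustration under stated assumptions, not the implementation from the cited references: it uses NumPy, builds the orthogonal transform from a QR factorization (one convenient way to align the listed constituents with the leading axes), and uses made-up four-dimensional PC vectors. The name `rotate_and_suppress` is hypothetical.

```python
import numpy as np

def rotate_and_suppress(measurements, backgrounds):
    """Align the background constituents with the first k axes, then drop those axes.

    measurements: (n_measurements, n_pcs) or (n_pcs,)
    backgrounds:  (k, n_pcs) or (n_pcs,), assumed linearly independent
    Returns an array of shape (n_measurements, n_pcs - k).
    """
    backgrounds = np.atleast_2d(backgrounds)
    k, n = backgrounds.shape
    # QR of [backgrounds^T | I]: q is an n x n orthogonal matrix whose first
    # k columns span the background constituents.
    q, _ = np.linalg.qr(np.column_stack([backgrounds.T, np.eye(n)]))
    rotated = np.atleast_2d(measurements) @ q
    # One PC dimension is traded away per suppressed constituent.
    return rotated[:, k:]

background = np.array([1.0, 2.0, 0.0, 1.0])  # hypothetical interferent
agent = np.array([0.0, 1.0, 1.0, 0.0])       # hypothetical agent signature

measurements = np.array([
    1.0 * background,                 # quiescent background
    50.0 * background,                # background spike
    50.0 * background + 1.0 * agent,  # spike with agent present
])
suppressed = rotate_and_suppress(measurements, background)
```

In this toy case the quiescent and spike measurements both collapse to the origin of the reduced space, while the measurement containing the agent keeps the part of the agent signature that is not parallel to the suppressed constituent. Passing several rows as `backgrounds` removes one dimension per row, which is why the suppression list cannot be longer than the number of available PCs.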
The “mixture problem” refers to the fact that a spectral measurement M resulting from a mixture of constituents will not be in any of the libraries, and thus will not be directly identifiable as either an agent, a simulant, or an interferent.
An agent detection system must deal with the background environment under different conditions. The system must work very quickly after setup in uncharacterized locations and seasons, for example in battlefield conditions. Performance should be acceptable even without a priori knowledge of the background. Because false alarm rate is a very important parameter for an agent detection system, the system must be able to incorporate limited a priori knowledge of background to improve false alarm performance. This knowledge might include a background library created from measurements in a similar environment, or knowledge that one important background constituent is always present. The system should be able to select constituents to suppress from the background library based on a small number of background measurements. Finally, the agent detection system should be able to improve its false alarm rate over time by learning the background.
Substances known to be present in the background in certain regions of the country are available in pure form from chemical suppliers. These substances include “Arizona road dust,” from Powder Technology, Inc., fungal spores (“Alternaria alternata”), tree pollen (“Sycamore Eastern Defatted”), grass pollen (“Kentucky Blue Defatted”), “House Dust,” and “Upholstery Dust,” for example, all available from Greer Source Materials, Lenoir, N.C.
A Government-funded program known as “Bug Trap” collects individual particles, determines which fluoresce, and identifies these as potential background interferents. (Further details can be found on the DARPA website.) The program does not determine the principal components of the fluorescence, but does determine the type of particle if possible. Once the particle type is identified, pure substances obtained from chemical suppliers could be measured to determine their spectra and resulting principal components.
Hyperspectral imaging (HSI) of the earth's surface has many similarities to agent detection systems. These similarities include the form of the raw data (spectra), background interferents, and the mixture problem. There are important differences between HSI and agent detection, however. First, the images obtained using HSI systems typically have a very large number of pixels (measurements). Our method must work with a smaller number of measurements (tens to hundreds rather than 10,000+). Also, HSI must deal with shade problems and atmospheric transmission problems which are not issues for bio-aerosols. Finally, HSI analysis typically includes the time to do field work to identify and measure pure substances (ground truth). (For further details, see N. Keshava, “A Survey of Spectral Unmixing Algorithms,” Lincoln Laboratory Journal 14 (2003) p. 55.)
Mathematical approaches to determining endmembers developed for HSI include a shrink wrap approach and a simplex approach. In general, these methods tend to underestimate the extent of the distribution, resulting in endmembers which are still mixtures.
Accordingly, there is a need for determination of the members of a suppression list to be used with the RAS background suppression method from a limited number of measured values, with or without a priori information, where the suppression list members will be the most important endmembers of the local, current background mixture.