Details about electromagnetic properties of material are needed to establish those material characteristics suited to particular purposes such as camouflage, concealment and deception. One way of obtaining these details is to energize material with select forms of electromagnetic energy and observe the results using spectral analysis.
Several well known data handling techniques are used for spectral analysis in the optical portion of the electromagnetic spectrum including: Classical Least Squares, Partial Least Squares, Beer's Approximation and Principal Component Regression. Infrared Quantitative Analysis, Nicolet Instrument Corporation, Madison, Wis. 1997. Straightforward use of any of these techniques requires optimization of instrumentation for the particular technique. One may wish to apply two or more of the techniques to the same data. The optimum instrumentation configuration may vary with the desired result, e.g., to obtain the optimum signal-to-noise ratio (SNR). This often leads to a unique, yet expensive and labor-intensive, solution, e.g., taking the same data with different instrumentation configurations or settings.
It is known that signals can be encoded and decoded efficiently to reduce the amount of data handled, such as number of bits, bandwidth, or storage, while retaining salient characteristics of the decoded signal. Examples include audio and video bandwidth compression schemes. Jack, Keith, Video Demystified: a Handbook for the Digital Engineer, High Text Interactive, Inc., San Diego, Calif., 1996.
A characteristic frequency response is often used to analyze materials, as is coupling thereto an associated temporal or spatial interval, or both. Such analyses may be done using a windowed Fourier Transform such that different sized windows correspond to the scale of the desired transient feature. This method correlates the signal to all windowed exponentials and checks for significant correlations. Because measurements may not be independent, information thus obtained may be redundant.
One method of reducing computation costs while maintaining accuracy is described in U.S. Pat. No. 5,526,299, Method and Apparatus for Encoding and Decoding Using Wavelet-Packets, to Coifman et al., Jun. 11, 1996, incorporated herein by reference. This method uses a library of modulated wavelet packets (combinations of dilations and translations) to extract features by correlating this library with a signal of interest while maintaining orthogonality of the set of waveforms thus selected.
Wavelets are mathematical functions that separate data into different frequency components, allowing each component to be analyzed with a resolution matched to the component's scale. They have advantages over traditional Fourier methods in analyzing physical situations where the signal contains discontinuities and sharp spikes, such as with spectral data collected using an interferometer.
Wavelet algorithms process data at different scales or resolutions, e.g., a signal viewed from a large “data window” exhibits but gross features, whereas a signal viewed from a small data window allows detailed features to be seen. Wavelet analysis enables use of approximating functions with non-zero mean values in finite domains and zero mean value elsewhere.
The wavelet analysis procedure adopts a wavelet prototype function, termed an “analyzing wavelet” or “mother wavelet.” Temporal analysis (translation) is performed with a contracted, high frequency version of the mother wavelet, while frequency analysis (dilation) is performed with a dilated, low frequency version of the same wavelet. The combination of dilation and contraction, or translation, is done in what is termed a wavelet packet. Wavelets are zero mean value orthogonal basis functions that are non-zero within a defined space and time. They are used to transform an operator by applying to the operator a finite number of scales (dilations) and positions (translations), yielding transform coefficients to populate a matrix. Feature extraction is enabled by correlating a library of waveforms, or waveforms taken from a known source, with the signal of interest, while maintaining orthogonality of the selected set of waveforms.
Because the original signal or function can be represented in terms of a wavelet expansion (using coefficients in a linear combination of the wavelet functions), data operations can be performed using just the corresponding wavelet coefficients. Further, by choosing the best wavelets adapted to the data, as in a “best tree” approach, or by truncating the coefficients below a threshold, data may be sparsely represented. This sparse coding makes wavelets an excellent tool in the field of data compression or when economy of computational resources is desired.
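The truncation idea can be illustrated with a short sketch. It assumes the Haar wavelet, a signal length that is a power of two, and an arbitrary threshold of 0.1; the function names are hypothetical and chosen only for this illustration.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_forward(signal):
    """Orthonormal Haar transform of a list whose length is a power of two."""
    smooth, details = list(signal), []
    while len(smooth) > 1:
        pairs = [(smooth[i], smooth[i + 1]) for i in range(0, len(smooth), 2)]
        details = [(a - b) / SQRT2 for a, b in pairs] + details  # coarser levels go in front
        smooth = [(a + b) / SQRT2 for a, b in pairs]
    return smooth + details

def haar_inverse(coeffs):
    """Invert haar_forward."""
    smooth, pos = list(coeffs[:1]), 1
    while pos < len(coeffs):
        detail = coeffs[pos:pos + len(smooth)]
        pos += len(smooth)
        smooth = [v for s, d in zip(smooth, detail)
                  for v in ((s + d) / SQRT2, (s - d) / SQRT2)]
    return smooth

def truncate(coeffs, threshold):
    """Zero every coefficient whose magnitude is below the threshold."""
    return [c if abs(c) >= threshold else 0.0 for c in coeffs]

# A step from 4 to 1 with a small ripple of +/-0.01 superimposed.
signal = ([4 + 0.01 * (-1) ** i for i in range(8)] +
          [1 + 0.01 * (-1) ** i for i in range(8)])
sparse = truncate(haar_forward(signal), threshold=0.1)
kept = sum(1 for c in sparse if c != 0.0)
approx = haar_inverse(sparse)
max_error = max(abs(a - b) for a, b in zip(signal, approx))
print(kept, "of", len(signal), "coefficients kept; max error", round(max_error, 4))
```

In this example the step survives truncation while the ripple is discarded, so the sixteen samples are represented by very few non-zero coefficients.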
Generically speaking, wavelets are produced by constructing a basis function, shifting it by some amount, and changing its scale. Then that structure is applied in approximating a signal. The procedure is repeated by again taking the basic structure, shifting it, and scaling it. Applying this to the same signal yields a new approximation. This procedure is repeated until a desired result is achieved. An inherent advantage of this “scaled analysis” is its relative insensitivity to noise because it measures the average fluctuations of the signal at different, yet appropriate, scales.
A basic wavelet is the Haar wavelet, a property of which is “compact support,” meaning that it vanishes outside of a finite interval. Haar wavelets are not continuously differentiable, which somewhat limits their applications.
A basis function may be explained by reference to digital analysis and vectors. Every two-dimensional vector (x, y) is a combination of the vector (1,0) and (0,1). These two vectors are the basis vectors for (x, y) since x multiplied by (1,0) is the vector (x, 0), and y multiplied by (0,1) is the vector (0,y). The sum is (x, y). These basis vectors have the inherent valuable property of orthogonality.
These concepts may be related to basis functions. Instead of the vector (x, y), we have a function ƒ(x). Imagine that ƒ(x) is a spectral response, say the frequency A of a particular material's response. A may be constructed by adding sines and cosines using combinations of amplitudes and frequencies. The sines and cosines are the basis functions in this example (and also the elements of Fourier synthesis). An additional requirement may be imposed in that these sines and cosines be orthogonal. This is accomplished by choosing the appropriate combination of sine and cosine terms whose inner products add to zero. Thus, the particular set of functions that are orthogonal and that construct ƒ(x) constitute appropriate orthogonal basis functions.
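The orthogonality requirement can be checked numerically. The following sketch assumes the interval [0, 2π] and a simple midpoint-rule integral; the helper name `inner_product` is hypothetical.

```python
import math

def inner_product(f, g, n=1000):
    """Midpoint-rule approximation of the integral of f*g over [0, 2*pi]."""
    h = 2 * math.pi / n
    return sum(f((i + 0.5) * h) * g((i + 0.5) * h) for i in range(n)) * h

def sin2(x):
    return math.sin(2 * x)

print(inner_product(math.sin, math.cos))  # near zero: sine and cosine are orthogonal
print(inner_product(math.sin, sin2))      # near zero: different frequencies are orthogonal
print(inner_product(math.sin, math.sin))  # near pi: a function is not orthogonal to itself
```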
Windowing can be understood by what is done to reduce the number of calculations and increase the accuracy in Fourier transforms. If ƒ(t) is a non-periodic signal, the summation of the periodic functions, sine and cosine, does not accurately represent the signal. The signal may be artificially extended to make it periodic, but this would require additional continuity at the endpoints. The windowed Fourier transform (WFT) is one solution to representing a non-periodic signal. The WFT separates an input signal ƒ(t) into sections. Each section is analyzed for its frequency content separately. If the signal has sharp transitions, the input data is “windowed” so that the sections converge to zero at the endpoints. This windowing is accomplished via a weight function that places less emphasis near the interval's endpoints than in the middle. The effect of the window is to localize the signal in time.
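A minimal sketch of the windowed Fourier transform follows. It assumes non-overlapping sections, a Hann weight function (one common choice that falls to zero at the section endpoints), and a test signal whose frequency changes halfway through; none of these choices comes from the text above.

```python
import cmath
import math

def hann(n):
    """Weight function that is zero at both endpoints and largest in the middle."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def dft_magnitudes(x):
    """Magnitudes of the non-negative-frequency DFT bins of x."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
            for k in range(n // 2 + 1)]

def windowed_fourier(signal, width):
    """Split the signal into sections, weight each one, and analyze it separately."""
    w = hann(width)
    rows = []
    for start in range(0, len(signal) - width + 1, width):
        section = [signal[start + i] * w[i] for i in range(width)]
        rows.append(dft_magnitudes(section))
    return rows

width = 32
# 4 cycles per section in the first half, 8 cycles per section in the second half.
signal = ([math.sin(2 * math.pi * 4 * t / width) for t in range(width)] +
          [math.sin(2 * math.pi * 8 * t / width) for t in range(width)])
rows = windowed_fourier(signal, width)
dominant = [row.index(max(row)) for row in rows]
print(dominant)  # strongest frequency bin in each section
```

Each section reports its own dominant frequency, which is the sense in which the window localizes the signal in time.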
To approximate a function by samples, and to approximate the Fourier integral by the discrete Fourier transform, requires applying a matrix whose order is the number of sample points, n. Since multiplying an n×n matrix by a vector requires on the order of n² arithmetic operations, the problem worsens as the number of sample points increases. However, if the samples are uniformly spaced, then the Fourier matrix may be factored into a product of just a few sparse matrices. The resulting factors may be applied to a vector in a total of order n log n arithmetic operations, i.e., the Fast Fourier Transform (FFT). By analogy to the FFT, wavelets may be packaged as “packets” and analysis may continue in a manner similar to the FFT while taking advantage of the unique capabilities of wavelet analysis.
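The difference between the direct matrix product and the factored form can be sketched as follows. The recursive radix-2 scheme shown is one standard way to realize the factorization and assumes the number of samples is a power of two; the sample values are arbitrary.

```python
import cmath
import math

def dft(x):
    """Direct matrix-vector product: on the order of n*n multiplications."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def fft(x):
    """Recursive radix-2 FFT: on the order of n log n multiplications."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

samples = [0.0, 1.0, 0.0, -1.0, 0.5, 0.25, -0.5, 2.0]
max_diff = max(abs(a - b) for a, b in zip(dft(samples), fft(samples)))
print(max_diff)  # the two methods agree to within rounding error
```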
A basis function varies in scale by “dissecting” the same function or data space using different scale sizes. For example, a signal in the domain from 0 to 1 may be represented using two step functions from 0 to ½ and ½ to 1. The original signal may be divided again using four step functions from 0 to ¼, ¼ to ½, ½ to ¾, and ¾ to 1. And so on, each set of representations coding the original signal with a particular resolution or scale. There are other similarities between Fourier and wavelet transforms.
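The effect of dissecting the same domain at successively finer scales can be sketched numerically. The example assumes the signal x² sampled on 1024 points of the interval from 0 to 1 and represents it by block averages; both choices are illustrative only.

```python
def step_approximation(samples, pieces):
    """Replace each of `pieces` equal blocks of samples by its mean value."""
    block = len(samples) // pieces
    out = []
    for p in range(pieces):
        chunk = samples[p * block:(p + 1) * block]
        out.extend([sum(chunk) / block] * block)
    return out

n = 1024
xs = [(i + 0.5) / n for i in range(n)]
signal = [x * x for x in xs]

errors = {}
for pieces in (2, 4, 8, 16):
    approx = step_approximation(signal, pieces)
    errors[pieces] = max(abs(a - b) for a, b in zip(signal, approx))
print(errors)  # worst-case error shrinks as the step functions get narrower
```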
The fast Fourier transform (FFT) and the discrete wavelet transform (DWT) are both linear operations that generate a data structure that contains log₂ n segments of various lengths, usually filling and transforming the data structure into a different data vector of length 2ⁿ. The mathematical properties of the matrices involved in the transforms are similar as well. The inverse transform matrix for both the FFT and the DWT is the transpose of the original. As a result, both transforms can be viewed as a rotation in function space to a different domain. For the FFT, this new domain contains basis functions that are sines and cosines. For the wavelet transform, this new domain contains more complicated basis functions called wavelets, mother wavelets, or analyzing wavelets.
Both transforms have another similarity. The basis functions are localized in frequency, making mathematical tools such as power spectra (how much power is contained in a frequency interval) and “scalegrams” useful at picking out frequencies and calculating power distributions.
A scalegram of a time series is the average of the squares of the wavelet coefficients at a given scale. Plotted as a function of scale, it depicts much of the same information as does the Fourier power spectrum plotted as a function of frequency. Implementing the scalegram involves summing the product of the data with a wavelet function, while implementing the Fourier power spectrum involves summing the data with a sine or cosine function. The formulation of the scalegram makes it a more convenient tool than the Fourier transform because certain relationships between the different time scales become easier to see and correct, such as seeing and correcting for photon noise.
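A scalegram can be computed in a few lines. This sketch assumes the Haar wavelet and a time series whose length is a power of two; the function names are hypothetical.

```python
import math

def haar_details_by_scale(signal):
    """Return Haar detail coefficients grouped by level (index 0 = finest scale)."""
    smooth, levels = list(signal), []
    while len(smooth) > 1:
        pairs = [(smooth[i], smooth[i + 1]) for i in range(0, len(smooth), 2)]
        levels.append([(a - b) / math.sqrt(2) for a, b in pairs])
        smooth = [(a + b) / math.sqrt(2) for a, b in pairs]
    return levels

def scalegram(signal):
    """Average of the squared wavelet coefficients at each scale."""
    return [sum(c * c for c in level) / len(level)
            for level in haar_details_by_scale(signal)]

fast = [1.0, -1.0] * 4             # fluctuates at the finest scale
slow = [1.0, 1.0, -1.0, -1.0] * 2  # fluctuates at the next coarser scale
print(scalegram(fast))
print(scalegram(slow))
```

The power of each test series shows up at the scale where it fluctuates, much as a Fourier power spectrum would show a peak at the corresponding frequency.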
There are basic dissimilarities between Fourier and wavelet transforms that lead to a fuller understanding of the benefits of using wavelet packets in an embodiment of the present invention.
A basic dissimilarity is that individual wavelet functions are localized in space. Fourier sine and cosine functions are not. This localization feature, along with a wavelet's localization of frequency, makes many functions and operators using wavelets “sparse” when transformed into the wavelet domain. This sparseness, in turn, results in a number of useful applications such as data compression, practical detection of features in images, and noise removal from a time series.
One way to see the time-frequency resolution differences between the Fourier transform and the wavelet transform is to look at the basis function coverage of the time-frequency plane. FIG. 1 shows a windowed Fourier transform, where the window is simply a square wave. It shows Fourier basis functions, time-frequency tiles, and coverage within the time-frequency plane. The square wave window truncates the sine or cosine function to fit a window of a particular width. Because a single window is used for all frequencies in the WFT, the resolution of the analysis is the same at all locations in the time-frequency plane.
An advantage of wavelet transforms is that the windows vary. To isolate signal discontinuities, very short basis functions are desirable. Conversely, to obtain detailed frequency analysis, some very long basis functions are desirable. A way to achieve this is to have short high-frequency basis functions and long low-frequency ones, exactly what wavelet transforms provide. FIG. 2 shows, for one wavelet function (Daubechies wavelet basis functions), the time-frequency tiles and the coverage within the time-frequency plane.
Note that wavelet transforms do not have a single set of basis functions like the Fourier transform that utilizes just the sine and cosine functions. Instead, wavelet transforms have an infinite set of possible basis functions. Thus wavelet analysis provides immediate access to information that can be obscured by other time-frequency methods such as Fourier analysis. The different wavelet families make trade-offs between how compactly the basis functions are localized in space and how smooth they are. Within each family of wavelets (such as the Daubechies family) are wavelet subclasses distinguished by the number of coefficients and by the level of iteration. Wavelets are classified within a family most often by the number of vanishing moments. This is an extra set of mathematical relationships for the coefficients that must be satisfied, and is directly related to the number of coefficients. For example, within the Coiflet wavelet family are Coiflets with two vanishing moments and Coiflets with three vanishing moments. FIG. 3 illustrates several different wavelet families.
Dilations and translations of the mother wavelet, or analyzing wavelet, Φ(x), define an orthogonal basis, or wavelet basis:

$$\Phi_{(s,l)}(x) = 2^{-s/2}\,\Phi\left(2^{-s}x - l\right) \qquad (1)$$
The variables s and l are integers that scale and translate, respectively, the mother function Φ(x) to generate wavelets, such as a Daubechies wavelet family. The scale index, s, indicates the wavelet's width, and the location index, l, gives its position. Notice that the mother wavelet functions are re-scaled, or “dilated,” by powers of two ($2^{\pm s}$), and “translated” by integers (l). Once the mother wavelet functions are known, everything is known about the basis.
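Equation (1) can be exercised directly. The sketch below assumes the Haar wavelet as the mother function (any other mother wavelet could be substituted) and checks orthonormality with a midpoint-rule integral over an interval wide enough to hold the wavelets tested; the function names are hypothetical.

```python
def haar(x):
    """Haar mother wavelet: +1 on [0, 0.5), -1 on [0.5, 1), 0 elsewhere."""
    if 0.0 <= x < 0.5:
        return 1.0
    if 0.5 <= x < 1.0:
        return -1.0
    return 0.0

def wavelet(s, l, x, mother=haar):
    """Equation (1): 2**(-s/2) * mother(2**(-s) * x - l)."""
    return 2.0 ** (-s / 2.0) * mother(2.0 ** (-s) * x - l)

def inner(a, b, lo=-4.0, hi=8.0, n=12288):
    """Midpoint-rule inner product of the wavelets indexed by a=(s, l) and b=(s, l)."""
    h = (hi - lo) / n
    xs = (lo + (i + 0.5) * h for i in range(n))
    return sum(wavelet(a[0], a[1], x) * wavelet(b[0], b[1], x) for x in xs) * h

print(inner((0, 0), (0, 0)))   # same wavelet: unit norm
print(inner((0, 0), (0, 1)))   # same scale, shifted by one: orthogonal
print(inner((0, 0), (-1, 0)))  # different scale, overlapping support: orthogonal
```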
To span the data domain at different resolutions, the analyzing (mother) wavelet is used in a scaling equation:

$$W(x) = \sum_{k=1}^{n-2} (-1)^{k}\, C_{k+1}\, \Phi(2x + k) \qquad (2)$$

where W(x) is the scaling function for the mother function Φ(x), and $C_k$ represents the wavelet coefficients.
The wavelet coefficients satisfy linear and quadratic constraints of the form

$$\sum_{k=0}^{N-1} C_k = 2, \qquad \sum_{k=0}^{N-1} C_k\, C_{k+2l} = 2\,\delta_{l,0} \qquad (3)$$

where δ is the delta function and l is the location index.
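The constraints of equation (3) can be verified for a concrete coefficient set. The sketch below assumes the well-known four-coefficient Daubechies filter, scaled so that the coefficients sum to 2; these values are supplied for illustration and are not taken from the text above. Coefficients with an index outside the filter are treated as zero.

```python
import math

r3 = math.sqrt(3.0)
# Four-coefficient Daubechies filter, normalized so the coefficients sum to 2.
C = [(1 + r3) / 4, (3 + r3) / 4, (3 - r3) / 4, (1 - r3) / 4]

def linear_constraint(c):
    """Left side of the first constraint: the sum of the coefficients."""
    return sum(c)

def quadratic_constraint(c, l):
    """Left side of the second constraint: sum over k of C_k * C_(k+2l)."""
    n = len(c)
    return sum(c[k] * c[k + 2 * l] for k in range(n) if 0 <= k + 2 * l < n)

print(linear_constraint(C))        # expected to be close to 2
print(quadratic_constraint(C, 0))  # expected to be close to 2 (l = 0)
print(quadratic_constraint(C, 1))  # expected to be close to 0 (l != 0)
```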
One of the most useful features of wavelets is the ease with which one may choose the defining coefficients for a given wavelet system to be adapted for a given problem. It is helpful to think of the coefficients $\{C_0, \ldots, C_k\}$ as a filter. The filter, or coefficients, are placed in a transformation matrix that is applied to a raw data vector. The coefficients are ordered using two dominant patterns, one that works as a smoothing filter (like a moving average), and one pattern that works to bring out the data's “detail” information.
The matrix of the DWT may be applied in a hierarchical algorithm, sometimes termed a pyramidal algorithm. The wavelet coefficients are arranged so that odd rows contain an ordering of wavelet coefficients that act as the smoothing filter, and the even rows contain an ordering of wavelet coefficients with different signs that act to bring out the data's detail. The matrix is first applied to the original, full-length vector. Then the vector is smoothed and “decimated” by half and the matrix is applied again. Then the smoothed, halved vector is smoothed, and halved again, and the matrix applied once more. This process continues until a trivial number of “smooth-smooth-smooth . . . ” data remain. That is, each matrix application brings out a higher resolution of the data while at the same time smoothing the remaining data. The output of the DWT consists of the remaining “smooth” components, and all of the accumulated “detail” components.
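The pyramidal algorithm can be sketched as follows. The example assumes the four-coefficient Daubechies filter normalized so that its coefficients sum to √2 (which makes each matrix application orthogonal), periodic wrap-around at the ends of the vector, and an input length that is a power of two. For convenience the smooth and detail outputs are gathered into separate halves rather than interleaved on alternating rows; the function names are hypothetical.

```python
import math

r3 = math.sqrt(3.0)
norm = 4.0 * math.sqrt(2.0)
H = [(1 + r3) / norm, (3 + r3) / norm, (3 - r3) / norm, (1 - r3) / norm]  # smoothing filter
G = [H[3], -H[2], H[1], -H[0]]  # same coefficients, reordered with alternating signs: detail filter

def filter_step(x):
    """One matrix application: returns (smooth half, detail half), decimated by two."""
    n = len(x)
    smooth = [sum(H[k] * x[(i + k) % n] for k in range(4)) for i in range(0, n, 2)]
    detail = [sum(G[k] * x[(i + k) % n] for k in range(4)) for i in range(0, n, 2)]
    return smooth, detail

def pyramid_dwt(x):
    """Re-apply filter_step to the smooth half until fewer than four smooth values remain."""
    smooth, details = list(x), []
    while len(smooth) >= 4:
        smooth, d = filter_step(smooth)
        details = d + details  # accumulate detail, coarsest first
    return smooth + details

data = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]
coeffs = pyramid_dwt(data)
print(coeffs)
print(sum(v * v for v in data), sum(c * c for c in coeffs))  # energy is preserved
```

The output holds the remaining smooth values first, followed by the accumulated detail values from coarsest to finest.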
In general, the DWT matrix is not sparse, so it has complexity similar to a discrete Fourier transform. As for the FFT, complexity is addressed by factoring the DWT into a product of a few sparse matrices using self-similarity properties. The result is an algorithm that requires only an order of n operations to transform an n-sample vector. This is the “fast” DWT of Mallat and Daubechies.
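The order-n claim can be illustrated by counting multiplications. This sketch assumes a filter with a fixed number of coefficients and a vector that is halved at every level down to two values; it only counts operations and does not perform the transform.

```python
def pyramid_multiplies(n, filter_len):
    """Multiplications used by a pyramidal DWT on n samples (n a power of two)."""
    total, length = 0, n
    while length >= 2:
        total += length * filter_len  # each of `length` outputs uses filter_len products
        length //= 2                  # only the smooth half is processed again
    return total

def dense_multiplies(n):
    """Multiplications used when a full n-by-n matrix is applied to a vector."""
    return n * n

n = 1024
print(pyramid_multiplies(n, 4), "versus", dense_multiplies(n))
```

Because the vector lengths form a geometric series, the total stays below twice the filter length times n, however many levels are used.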
The wavelet transform is a subset of a far more versatile transform, the wavelet packet transform. Wavelet packets, identical to nodes in the trees of the '299 patent, are particular linear combinations of wavelets. They form bases that retain many of the properties of their parent wavelets such as orthogonality, smoothness, and localization. Their coefficients are computed by a recursive algorithm, making each newly computed wavelet packet coefficient sequence the root of its own analysis tree.
Because there is a choice among an infinite set of basis functions, one desires to find the best basis function for a given representation of a signal. A “basis of adapted waveform” is the “best basis” function for a given signal representation. The chosen basis carries substantial information about the signal, and if the basis description is efficient (that is, very few terms in the expansion are needed to represent the signal), then that signal information has been compressed. Some desirable properties for adapted wavelet bases (using the basis of adapted waveform) are:
    fast computation of inner products with the other basis functions;
    fast superposition of the basis functions;
    good spatial localization, so one may identify the position of a signal that is contributing a large component;
    good frequency localization, so one may identify signal oscillations; and
    independence, so that not too many basis elements match the same portion of the signal; i.e., minimal overlap or redundancy.
For adapted waveform analysis, one seeks a basis in which the coefficients, when rearranged in decreasing order, decrease as rapidly as possible. To measure rates of decrease, one uses tools from classical harmonic analysis including calculation of information cost functions. An information cost function is defined as the expense of storing the chosen representation. Examples of such functions include the number of coefficients above a threshold, concentration, Shannon's entropy, logarithm of energy, Gauss-Markov calculations, and the theoretical dimension of a sequence. An embodiment of the present invention uses Shannon's entropy.
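One common form of a Shannon entropy cost is sketched below. It assumes the cost is computed on the squared coefficients normalized by the total energy; this normalization is an assumption made for illustration, since the text above specifies only that Shannon's entropy is used.

```python
import math

def shannon_cost(coeffs):
    """Entropy of the energy distribution p_i = c_i**2 / sum(c**2)."""
    energy = sum(c * c for c in coeffs)
    if energy == 0.0:
        return 0.0
    cost = 0.0
    for c in coeffs:
        p = c * c / energy
        if p > 0.0:
            cost -= p * math.log(p)
    return cost

concentrated = [1.0, 0.0, 0.0, 0.0]  # all energy in one coefficient
spread = [0.5, 0.5, 0.5, 0.5]        # same energy shared equally
print(shannon_cost(concentrated), shannon_cost(spread))
```

A representation that concentrates the energy in few coefficients has the lower cost, which is why a low entropy indicates an efficient basis.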
The '299 patent uses a library of modulated wavelet packets, i.e., combinations of dilations (as related to time) and translations (as related to space) of a wavelet, that are efficient in providing both temporal and spatial localization.
Steps include: applying combinations of dilations and translations to the input signal to obtain processed values; computing the information costs of the processed values; selecting, as encoded signals, an orthogonal group of processed values, the selection being dependent on the computed information costs; and decoding the encoded signals to obtain an output signal. Ideally, the wavelets selected have a reasonable number of vanishing moments. The step of applying combinations of dilations and translations of the wavelet, i.e., wavelet packets, to the input signal comprises correlating combinations of dilations and translations of the wavelet with the signal of interest.
Applying wavelet packets to a signal of interest to obtain processed values includes generating a tree of processed values. The tree has successive levels obtained by applying to the signal of interest, for a given level, wavelet packets that are combinations of the wavelet packets applied at a previous level. The steps of computing information costs and selecting an orthogonal group of processed values include computing at a number of different levels of the tree, and selecting from among the different levels of the tree to obtain an orthogonal group having a minimal information cost, i.e., the “best basis” or “best tree” solution. The step of selecting an orthogonal group of processed values includes generating encoded signals that represent the processed values associated with their respective locations in the tree. These techniques may be adapted to any number of applications including detection of small amounts of elements in material.
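The tree search described above can be sketched in simplified form. The example assumes Haar filters, an additive entropy-type cost applied to a signal scaled to unit energy, and a recursive bottom-up comparison of each node with its two children. It is an illustration of the general “best basis” idea under these assumptions, not the implementation of the '299 patent, and all names are hypothetical.

```python
import math

def split(v):
    """Apply the Haar smooth ('s') and detail ('d') filters to a node of the tree."""
    s = [(v[i] + v[i + 1]) / math.sqrt(2) for i in range(0, len(v), 2)]
    d = [(v[i] - v[i + 1]) / math.sqrt(2) for i in range(0, len(v), 2)]
    return s, d

def cost(v):
    """Additive entropy-type information cost: -sum(c^2 * log(c^2))."""
    return -sum(c * c * math.log(c * c) for c in v if c != 0.0)

def best_basis(v, max_depth):
    """Return (cost, [(path, coefficients), ...]) for the cheapest orthogonal cover."""
    own = cost(v)
    if max_depth == 0 or len(v) < 2:
        return own, [("", v)]
    s, d = split(v)
    cost_s, basis_s = best_basis(s, max_depth - 1)
    cost_d, basis_d = best_basis(d, max_depth - 1)
    if cost_s + cost_d < own:  # keep the children only when they are cheaper
        children = ([("s" + p, c) for p, c in basis_s] +
                    [("d" + p, c) for p, c in basis_d])
        return cost_s + cost_d, children
    return own, [("", v)]

raw = [1.0, -1.0] * 4
scale = math.sqrt(sum(x * x for x in raw))
signal = [x / scale for x in raw]  # unit energy, so costs are comparable

total_cost, basis = best_basis(signal, max_depth=3)
paths = [p for p, _ in basis]
print(round(total_cost, 6), paths)
```

Each path records the location of a selected node in the tree, so the encoded signal consists of the coefficients together with their tree positions.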
If a minute amount of foreign material, e.g., a contaminant, is present in a material, a conventionally generated optical spectrum of the contaminated version of the material may appear very similar to that of the non-contaminated version. Thus, without an analytic method yielding very precise measurements, dissembled change is not identified or even detected. Notably, these analyses will be greatly compromised if the available spectral data are noisy. Another drawback inherent in the straightforward use of non-wavelet packet data handling techniques is the requirement for use of a specific resolution based on the material being analyzed. More importantly, even when these techniques are able to detect the presence of contamination, they are not able to localize it, i.e., provide a spatial measure. Accordingly, there is a need for an efficient, low cost technique that is somewhat independent of detector resolution, requires no updating to optimize parameters, and provides a reliable spatial measure along with spectral detection and classification.