Spectral data consists of multiple interrelated data points, such as an optical spectrum or a chromatogram, which carries information related to the components and characteristics of the specimen from which the data was derived, as well as to performance of the analytical instrument and to the general experimental conditions. In spectroscopy, for example, this specimen is a material and the spectral data comprises the results of related measurements made on the specimen as a function of a variable, such as the frequency or wavelength of the energy used for measurement. In chromatography, the spectral variable may be time or distance. In thermal analysis, the variable is usually temperature or time. In mechanical vibration/acoustics analysis the variable is usually frequency. In rheology the variable can be position, shear rate or time. In electrophoresis and thin layer chromatography the variable is relative distance in one or two dimensions. In many different analyses, e.g. kinetic measurements, time is either the primary variable or an additional variable that adds to the dimensionality of the data.
In image analysis the fundamental variable is usually distance in one or two dimensions, although the two-dimensional Fourier transform, also known as the Wiener transform, and the Wiener spectrum, which express the information in a two-dimensional spatial frequency domain, are also prevalent. Multivariate images, such as three-color video signals and many satellite images where each picture element is characterized by a multichannel "spectrum", and also images constituting a time sequence of information, provide additional dimensionality in the data. Alternatively, in image analysis, the images can be summarized into histograms showing distributions of various picture elements, where the variable is then a vector of categories, each representing a class of picture elements, e.g. various gray levels of pixels, or contextual classes based on local image geometry. For multivariate images, the additional multichannel information may be included in the contextual classification. Time information can likewise be included in the definition of the categories in the variable. The above descriptions of two-way images also apply to three-way tomographic images, e.g. in MRI and X-ray tomography.
It should also be noted that it is possible to express spectral data more or less equivalently in several different domains, i.e. with respect to several different variables, for example by a Fourier, Wiener, or Hadamard transformation, and using different metrics, such as Euclidean and Mahalanobis distances.
For all of the above types of spectral data, information from several specimens may be related to each other statistically to derive analytical information. In order to derive specific desired analytical information, such as the concentration of a constituent, the magnitude of a physical property, or the identification of the specimen or its components, one form or another of additive multivariate approximation or modeling is typically employed. For example, a desired parameter may be modeled as the suitably weighted sum of the measurements at selected data points within the spectrum or a weighted sum of previously determined reference spectra. The weighting coefficients, sometimes referred to as the calibration coefficients, are statistically determined based on spectral data obtained from a set of calibration specimens for which the values for the parameter of interest are known.
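By way of illustration only, the weighted-sum model described above may be sketched as an ordinary least-squares fit. The function names `fit_calibration` and `predict`, the synthetic calibration set, and the choice of plain least squares are assumptions made for this sketch; they are not taken from any of the cited works.

```python
import numpy as np

def fit_calibration(spectra, values):
    """Estimate calibration coefficients (intercept first) by least squares."""
    # Prepend a column of ones so the model includes an additive intercept.
    design = np.column_stack([np.ones(len(spectra)), spectra])
    coef, *_ = np.linalg.lstsq(design, values, rcond=None)
    return coef

def predict(coef, spectra):
    """Model the parameter as a weighted sum of the selected data points."""
    spectra = np.atleast_2d(spectra)
    return coef[0] + spectra @ coef[1:]

# Synthetic calibration set (hypothetical): 6 specimens, 3 selected data points.
rng = np.random.default_rng(0)
cal_spectra = rng.uniform(0.1, 1.0, size=(6, 3))
true_weights = np.array([2.0, -1.0, 0.5])
cal_values = 0.3 + cal_spectra @ true_weights  # known reference values

coef = fit_calibration(cal_spectra, cal_values)
predicted = predict(coef, cal_spectra)
```

Because the synthetic values are noise-free and exactly additive, the fitted coefficients reproduce the weights used to generate them; real calibration data would not behave so cleanly.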
This additive multivariate calibration may be considered as a general interference subtraction, whereby each input spectrum is resolved as a sum of underlying structures, each with a known or estimated spectrum. The structures can be known or directly measured spectra of various individual phenomena affecting the input spectrum, or estimated "loading" vectors (e.g. bilinear factors) that span their variability statistically. The resolution yields estimates of the level or score of each such phenomenon or factor in the input spectrum. Then the additive modeling performs the equivalent of a weighted subtraction of the various interferants' spectral effects, thereby providing selectivity enhancement.
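The interference subtraction just described may be sketched, under simplifying assumptions, as follows. The Gaussian band shapes, the constant baseline, and the variable names are hypothetical, and all component spectra are assumed to be known exactly.

```python
import numpy as np

# Hypothetical component spectra on a common abscissa.
x = np.linspace(0.0, 1.0, 50)
analyte = np.exp(-((x - 0.3) / 0.05) ** 2)
interferent = np.exp(-((x - 0.6) / 0.10) ** 2)
baseline = np.ones_like(x)

# Columns are the known or estimated spectra of the underlying structures.
components = np.column_stack([analyte, interferent, baseline])

# Input spectrum: a purely additive mixture of the structures.
measured = 0.8 * analyte + 0.5 * interferent + 0.1 * baseline

# Resolve the input spectrum into a level (score) for each structure.
scores, *_ = np.linalg.lstsq(components, measured, rcond=None)

# Weighted subtraction of the interferents' estimated contributions.
corrected = measured - components[:, 1:] @ scores[1:]
```

In this idealized case the corrected spectrum is the analyte contribution alone. The sketch contains no multiplicative effect, which is the condition under which additive modeling performs well.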
Additive multivariate approximation is appropriate for purely additive structures or, by taking the logarithm of the data values, for purely multiplicative structures. The modeling is much less accurate and robust for mixed additive and multiplicative structures. Unfortunately, real measured spectral data usually has some degree of such mixed structures, including multiplicative effects that affect the analytical sensitivity. In diffuse reflectance spectroscopy, for example, the scatter coefficient varies due to particle size. Even when grinding is appropriate, the resulting particles have a range of sizes, with a mean and distribution that is variable depending on both physical and chemical factors, and a range of optical properties that vary with the wavelength itself as well as the particle composition and physical shape. In transmission spectroscopy, the effective optical pathlength may be affected by changes in geometry, scattering, temperature, density of the material, and related physical parameters. Variation in the amount of material added to the column produces multiplicative effects in chromatography, as does the intensity of the dye added to the gel in electrophoresis. In image analysis, variations in the total area of pixels counted and the pixel intensity can contribute multiplicative factors to otherwise additive structures. Finally, instrumental and other experimental effects, e.g. a nonlinear instrument response, may appear as multiplicative factors, particularly when a logarithmic data transformation is applied.
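The distinction between purely multiplicative and mixed structures can be shown with a small numerical sketch. The signal values, the gain of 3.0, and the offset of 0.1 are arbitrary assumptions chosen only for illustration.

```python
import numpy as np

signal = np.array([0.2, 0.5, 0.9])  # hypothetical underlying data values
gain = 3.0                          # hypothetical multiplicative factor
offset = 0.1                        # hypothetical additive term

# Purely multiplicative structure: the logarithm turns the gain into a constant shift.
pure = gain * signal
shift_pure = np.log(pure) - np.log(signal)

# Mixed additive and multiplicative structure: the shift now depends on the signal.
mixed = offset + gain * signal
shift_mixed = np.log(mixed) - np.log(signal)

spread_pure = np.ptp(shift_pure)    # essentially zero
spread_mixed = np.ptp(shift_mixed)  # clearly non-zero
```

For the purely multiplicative case the shift equals log(gain) at every data point, so an additive model can absorb it. For the mixed case the shift varies with the signal, which is why the logarithm alone does not restore additivity.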
Much of the effort to overcome these effects has resulted from the increased use of near-infrared diffuse reflectance spectroscopy, in which multiplicative effects are quite large although not necessarily obvious on first examination of the spectral data. Near-infrared spectra tend to decrease in absorbance with decreasing wavelength because the absorption bands are based on several orders of overtones and combinations of mid-infrared vibrational frequencies. Band strength decreases as the order of the harmonic involved increases, i.e. as the frequency increases. A multiplicative effect on such a tilted spectrum appears similar to the addition of a tilted baseline. Therefore, Norris and other early workers used the first or second derivative of the absorbance spectrum with respect to wavelength in their models. The derivatives explicitly remove any additive constant and, in the case of the second derivative, any linear sloped additive baseline. Unfortunately, the true multiplicative effects remain after taking the derivative of the data.
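The behavior of the derivatives may be illustrated with finite differences on a uniformly spaced wavelength grid. The band shape, the baseline coefficients, and the multiplicative factor of 1.5 are assumptions made for this sketch.

```python
import numpy as np

# Hypothetical wavelength axis in nanometers, uniformly spaced at 10 nm.
wl = np.linspace(1100.0, 2500.0, 141)
ideal = np.exp(-((wl - 1900.0) / 60.0) ** 2)  # hypothetical absorption band

# Measured spectrum: constant offset + linear sloped baseline + scaled band.
measured = 0.2 + 1e-4 * wl + 1.5 * ideal

# First difference: removes the constant, but the linear baseline leaves a constant residue.
d1_residual = np.diff(measured) - 1.5 * np.diff(ideal)

# Second difference: removes both the constant and the linear baseline.
d2_ideal = np.diff(ideal, n=2)
d2_measured = np.diff(measured, n=2)

# The multiplicative factor survives differentiation unchanged.
peak = np.argmax(np.abs(d2_ideal))
surviving_factor = d2_measured[peak] / d2_ideal[peak]
```

The second difference of the measured spectrum equals 1.5 times that of the ideal band, confirming that differentiation removes the additive terms but leaves the multiplicative factor in place.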
Removal of a multiplicative factor implies division of the data by an appropriate value. Norris (K. H. Norris and P. C. Williams, Optimization of Mathematical Treatments of Raw Near-Infrared Signal in the Measurement of Protein in Hard Red Spring Wheat. I. Influence of Particle Size, Cereal Chem. 61(2):158 and K. H. Norris, Extracting Information from Spectrophotometric Curves Predicting Chemical Composition from Visible and Near-Infrared Spectra, Food Research and Data Analysis, H. Martens and H. Russwurm, Ed. Applied Science Publishers, Ltd. 1983 Essex, England, copies of which being annexed hereto) introduced the use of derivative ratios by 1981. In their approach, first or second derivative spectra are used so that any baseline offsets are eliminated. The absence of baseline offset in the divisor is a requirement to maintain linearity when removing a multiplicative factor. Their method involves selecting a wavelength for the first numerator by examination of the correlation of the data at each wavelength with the values of the parameter of interest. A denominator wavelength is then selected by similar examination of the correlation of the ratio to the parameter of interest. Iteration involving changes to the data point spacing and smoothing used in the finite difference computation of the derivative is then performed to optimize the approximation. Additional terms may then be added to the model in a stepwise procedure. This method has been useful; however, it is limited to a specific calibration using data at a few selected wavelengths.
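A minimal sketch of a second-derivative ratio follows. It illustrates only why the ratio cancels a multiplicative factor; it does not reproduce the correlation-based wavelength selection or the iterative optimization of spacing and smoothing. The function name `derivative_ratio`, the band positions, and the fixed numerator and denominator indices are hypothetical.

```python
import numpy as np

def derivative_ratio(spectrum, num_idx, den_idx):
    """Ratio of second finite differences centered on two spectrum indices."""
    d2 = np.diff(spectrum, n=2)
    # d2[k] is centered on spectrum index k + 1, hence the shift by one.
    return d2[num_idx - 1] / d2[den_idx - 1]

wl = np.linspace(1100.0, 2500.0, 141)                # hypothetical axis, 10 nm steps
analyte_band = np.exp(-((wl - 1500.0) / 60.0) ** 2)  # band at index 40
reference_band = np.exp(-((wl - 2100.0) / 60.0) ** 2)  # band at index 100

concentration = 0.6  # hypothetical analyte level
base = concentration * analyte_band + reference_band

# Two measurements of the same material with different offset and scatter factor.
specimen_a = 0.1 + 1.0 * base
specimen_b = 0.4 + 1.7 * base

ratio_a = derivative_ratio(specimen_a, 40, 100)
ratio_b = derivative_ratio(specimen_b, 40, 100)
```

Both ratios agree and, because the two bands in this sketch have the same shape, each equals the assumed analyte level. The offset is removed by the derivative and the scatter factor by the division.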
Murray and Jessiman (I. Murray and C. S. Jessiman, unpublished work (1982) quoted in Animal Feed Evaluation by Near Infrared Reflectance (NIR) Spectrocomputer paper presented at the Royal Society of Chemistry Symposium at the University of East Anglia, Norwich UK 23 Mar. 1982, a copy of which being annexed hereto) developed a technique termed "mathematical ball milling" which provided a correction to the whole spectrum. In their technique, simple linear least squares regression (estimation of a multiplicative slope and additive offset parameter) is used to find the best linear fit of each spectrum, as well as of the average of many spectra (ordinates or regressands), to a vector representing the actual wavelength, e.g., nanometers (common abscissa or regressor). Each individual spectrum is then modified with respect to offset and slope such that the simple linear regression line of the modified spectrum is coincident with the regression line initially obtained for the average spectrum.
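One possible reading of this technique is sketched below. The exact form of the modification is not given above, so the scaling-and-shifting step used here is an assumption chosen because it makes the regression lines coincide; the function names `fit_line` and `ball_mill` and the synthetic spectra are likewise hypothetical.

```python
import numpy as np

def fit_line(wl, spectrum):
    """Simple linear regression of a spectrum on wavelength: (offset, slope)."""
    slope, offset = np.polyfit(wl, spectrum, 1)
    return offset, slope

def ball_mill(wl, spectra):
    """Adjust each spectrum so its regression line matches that of the average."""
    mean_offset, mean_slope = fit_line(wl, spectra.mean(axis=0))
    corrected = np.empty_like(spectra, dtype=float)
    for i, spectrum in enumerate(spectra):
        offset, slope = fit_line(wl, spectrum)
        # Assumed modification: rescale to the mean slope, then shift to the mean offset.
        corrected[i] = (mean_slope / slope) * (spectrum - offset) + mean_offset
    return corrected

# Hypothetical tilted spectra differing in offset and multiplicative factor.
wl = np.linspace(1100.0, 2500.0, 71)
base = 0.2 + 3e-4 * (wl - 1100.0) + 0.1 * np.exp(-((wl - 1900.0) / 80.0) ** 2)
spectra = np.stack([0.05 + 0.8 * base, 0.10 + 1.3 * base, 0.00 + 1.0 * base])

corrected = ball_mill(wl, spectra)
mean_line = np.polyfit(wl, spectra.mean(axis=0), 1)
```

After correction, regressing any modified spectrum on wavelength returns the same slope and offset as the regression of the average spectrum.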
Martens, Jensen and Geladi (H. Martens, S. A. Jensen, and P. Geladi, Multivariate Linearity Transformations for Near-Infrared Reflectance Spectrometry, Proceedings, Nordic Symposium on Applied Statistics, Stavanger, June 1983, Stokkand Forlag Publishers, Stavanger, Norway pp. 205-234, a copy of which being annexed hereto) developed the method of "Multiplicative Scatter Correction" that is the forerunner of the present invention. They utilize a previously known reference spectrum representative of the "ideal specimen". In practice this is usually based on the average of the spectra contained in the calibration data set. Each spectrum, whether used for calibration, validation, or determination, is then projected on this average spectrum by simple linear regression over selected wavelengths and its offset and slope relative to the average spectrum thereby determined. Corrected spectra are then obtained by subtracting the appropriate offset coefficient from each spectrum and then dividing the resulting spectral data by the slope coefficient. The estimated slope coefficient is sometimes modified somewhat at different wavelengths in order to correct for wavelength dependency of the scatter coefficient. The resulting corrected spectral values equal the average spectral values plus residuals that contain the desired analytical information normalized to the average measurement conditions. This method, however, is subject to errors caused by the non-random nature and potentially large magnitude of these residuals.
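The regression, subtraction, and division steps of Multiplicative Scatter Correction described above may be sketched as follows. The function name `msc`, the optional `select` argument standing in for the "selected wavelengths", and the synthetic spectra are assumptions; the wavelength-dependent modification of the slope is not included.

```python
import numpy as np

def msc(spectra, reference=None, select=None):
    """Sketch of multiplicative scatter correction against a reference spectrum."""
    spectra = np.asarray(spectra, dtype=float)
    if reference is None:
        # In practice the reference is usually the calibration-set average.
        reference = spectra.mean(axis=0)
    if select is None:
        select = slice(None)  # use every wavelength in the regression
    corrected = np.empty_like(spectra)
    for i, spectrum in enumerate(spectra):
        # Regress the spectrum on the reference over the selected wavelengths.
        slope, offset = np.polyfit(reference[select], spectrum[select], 1)
        # Subtract the offset, then divide by the slope.
        corrected[i] = (spectrum - offset) / slope
    return corrected

# Hypothetical spectra that differ only by an offset and a scatter factor.
wl = np.linspace(1100.0, 2500.0, 71)
base = 0.3 + 0.2 * np.exp(-((wl - 1450.0) / 50.0) ** 2) \
           + 0.4 * np.exp(-((wl - 1940.0) / 60.0) ** 2)
spectra = np.stack([0.05 + 0.8 * base, 0.10 + 1.3 * base, 0.02 + 1.0 * base])

reference = spectra.mean(axis=0)
corrected = msc(spectra, reference)
```

Since these synthetic spectra carry no analyte variation, every corrected spectrum coincides with the reference and the residuals vanish. With real specimens the residuals hold the analytical information, and their non-random structure biases the estimated offset and slope, which is the weakness noted above.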
A prior approach to minimizing these errors has been to omit those portions of the spectrum having large variability from the data used in the regression. This approach is sometimes difficult to apply, because it may require many trials and operator judgments, and it is only partially successful at best. In a variation of this approach, the range of the spectral data included in the average spectrum used to determine the offset and slope coefficients is restricted to the vicinity of a strong isolated spectral feature, such as a solvent absorption band, thereby limiting the magnitude of the residuals and improving the accuracy of the correction. This variation has been applied to correction of the effects of scattering within the specimen in transmission spectroscopy. In many cases, however, there is no strong isolated band available for determination of the multiplicative correction. A related problem occurs in measuring one material through another with the pathlength through each material unknown and variable. In either case, better means are needed to accurately separate additive and multiplicative effects.
Varying levels of known or unknown additive interferences also characterize the above forms of spectral data. In spectroscopy it is common to have a background spectrum added to the desired data from sources such as absorption by the solvent used to dissolve the specimen for analysis, absorption by the reference used to determine the incident energy, nonspecific emission or fluorescence from the specimen or instrumentation, and stray light, specular reflections, and other measurement artifacts. The other technologies discussed above have similar problems of additive interferences.
Specimen stability is often a cause of such a problem. For example, in near-infrared diffuse reflectance spectroscopy powdered specimens are common. The water content of many powdered specimens tends to equilibrate with the environmental humidity. In many cases, it is extremely difficult to maintain an adequate set of calibration and validation specimens with a sufficient range of water content to allow accurate calibration. Temperature also affects the spectra, particularly in the case of hydrogen-bonded species such as water. A small fraction of one degree Celsius temperature change can be readily detected in aqueous specimens. Adequate control of specimen temperature during measurement is difficult in the laboratory and often impossible in a processing plant environment. Other measurement technologies are subject to similar difficult-to-control variables. A method to accurately remove the spectral effects of such variables without disturbing the analyte information prior to use of the spectra for calibration, validation, and determination would improve the utility and performance of multivariate data analysis techniques.
Manual subtraction of one or more background spectra from an unknown spectrum by graphically oriented trial-and-error is well known in several disciplines, e.g. in UV, VIS, and IR spectroscopy. This type of interference subtraction has the advantage of letting the user interactively apply his or her knowledge of the structures involved. Automated methods have been developed, but these are subject to significant errors, particularly where not all the constituent spectra are known, where constituent spectra are influenced by changes in the environment, and where the background or interference spectra are correlated with the analyte spectra.
In general, the previous spectral correction techniques described above have been based on assumptions that the data structures are linear in the parameters. Various linearization techniques are applied to the data, most commonly the logarithmic transformation to convert purely multiplicative structures to additive form and, in diffuse reflectance spectroscopy, the Kubelka-Munk function. While useful, these data transformations are based on the assumption that the structure is intrinsically linear. Physical and instrumental effects often add intrinsically nonlinear elements to measured data structures, even if the underlying phenomena are linear.
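For reference, the two linearizing transformations named above may be written as short functions. The function names are hypothetical, and the Kubelka-Munk function is given in its standard form F(R) = (1 - R)^2 / (2R) for a reflectance R between 0 and 1, which is assumed here rather than stated in the text.

```python
import numpy as np

def log_absorbance(reflectance):
    """Logarithmic transformation: apparent absorbance log10(1/R)."""
    r = np.asarray(reflectance, dtype=float)
    return -np.log10(r)

def kubelka_munk(reflectance):
    """Standard Kubelka-Munk function F(R) = (1 - R)^2 / (2 R)."""
    r = np.asarray(reflectance, dtype=float)
    return (1.0 - r) ** 2 / (2.0 * r)

# Hypothetical diffuse reflectance values (fractions, not percent).
reflectance = np.array([0.1, 0.5, 1.0])
absorbance = log_absorbance(reflectance)
km_values = kubelka_munk(reflectance)
```

Both functions map a perfectly reflecting specimen (R = 1) to zero. Neither can remove a nonlinearity that arises from the instrument or from the physical measurement conditions, since each assumes an intrinsically linear structure.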