The infrared (12500-400 cm⁻¹) spectrum of a substance contains absorption features due to the molecular vibrations of the constituent molecules. The absorptions arise from both fundamentals (single-quantum transitions occurring in the mid-infrared region from 4000-400 cm⁻¹) and combination bands and overtones (multiple-quantum transitions occurring in the mid- and the near-infrared region from 12500-4000 cm⁻¹). The positions (frequencies or wavelengths) of these absorptions contain information as to the types of molecular structures that are present in the material, and the intensities of the absorptions contain information about the amounts of the molecular types that are present. Using the information in the spectra to identify and quantify either components or properties requires that a calibration be performed to establish the relationship between the absorbances and the component or property that is to be estimated. For complex mixtures, where considerable overlap between the absorptions of individual constituents occurs, such calibrations must be accomplished using multivariate data analysis methods.
In complex mixtures, each constituent generally gives rise to multiple absorption features corresponding to different vibrational motions. The intensities of these absorptions will all vary together in a linear fashion as the concentration of the constituent varies. Such features are said to have intensities which are correlated in the frequency (or wavelength) domain. This correlation allows these absorptions to be mathematically distinguished from random spectral measurement noise which shows no such correlation. The linear algebra computations which separate the correlated absorbance signals from the spectral noise form the basis for techniques such as Principal Components Regression (PCR) and Partial Least Squares (PLS). As is well known in the art, PCR is essentially the analytical mathematical procedure of Principal Components Analysis (PCA) followed by regression analysis. Reference is directed to "An Introduction to Multivariate Calibration and Analysis", Analytical Chemistry, Vol. 59, No. 17, Sep. 1, 1987, pages 1007 to 1017, for an introduction to multiple linear regression (MLR), PCR and PLS.
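The separation of correlated absorbance signals from uncorrelated noise can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two constituents, their Gaussian band positions and widths, the noise level, and the use of NumPy's singular value decomposition as the PCA step. It is not taken from the reference cited above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical frequency axis (cm^-1); all band parameters are illustrative.
freq = np.linspace(400.0, 4000.0, 300)

def band(center, width):
    """Gaussian absorption band on the frequency axis."""
    return np.exp(-0.5 * ((freq - center) / width) ** 2)

# Two synthetic constituents, each with two absorption features.
pure = np.vstack([
    band(1100, 40) + 0.6 * band(2900, 60),   # constituent A
    band(1700, 50) + 0.8 * band(3400, 80),   # constituent B
])

# 30 mixtures: absorbance varies linearly with concentration, plus random noise.
conc = rng.uniform(0.1, 1.0, size=(30, 2))
spectra = conc @ pure + rng.normal(0.0, 0.005, size=(30, freq.size))

# PCA via singular value decomposition of the spectral matrix.
U, s, Vt = np.linalg.svd(spectra, full_matrices=False)

# Fraction of total variation carried by each component.
explained = s**2 / np.sum(s**2)
print("leading singular values:", np.round(s[:4], 3))
print("fraction explained:     ", np.round(explained[:4], 5))

# How well the first two components reproduce the pure-constituent spectra.
residual = pure - (pure @ Vt[:2].T) @ Vt[:2]
rel_residual = np.linalg.norm(residual) / np.linalg.norm(pure)
```

In this synthetic case the first two singular values stand well clear of the rest, because only two independent concentrations vary; the remaining components carry essentially noise. With real spectra the break is seldom as clean, and choosing the number of components to retain is a judgment call.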
PCR and PLS have been used to estimate elemental and chemical compositions and, to a lesser extent, physical or thermodynamic properties of solids and liquids based on their mid- or near-infrared spectra. These methods involve: [1] the collection of mid- or near-infrared spectra of a set of representative samples; [2] mathematical treatment of the spectral data to extract the Principal Components or latent variables (e.g., the correlated absorbance signals described above); and [3] regression of these spectral variables against composition and/or property data to build a multivariate model. The analysis of new samples then involves the collection of their spectra, the decomposition of the spectra in terms of the spectral variables, and the application of the regression equation to calculate the composition and/or properties.
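Steps [1] to [3] and the subsequent analysis of new samples can be sketched as follows for PCR. This is a minimal illustration under stated assumptions: the spectra are simulated rather than measured, the helper names (`band`, `measure`, `predict`) are hypothetical, the number of retained components is taken as known, and mean-centring before decomposition is a common choice rather than a requirement stated above.

```python
import numpy as np

rng = np.random.default_rng(1)
freq = np.linspace(400.0, 4000.0, 300)   # hypothetical frequency axis (cm^-1)

def band(center, width):
    return np.exp(-0.5 * ((freq - center) / width) ** 2)

# Synthetic pure-constituent spectra (illustrative values only).
pure = np.vstack([band(1100, 40) + 0.6 * band(2900, 60),
                  band(1700, 50) + 0.8 * band(3400, 80)])

def measure(conc):
    """Simulate noisy spectra for the given concentrations (rows = samples)."""
    conc = np.atleast_2d(conc)
    noise = rng.normal(0.0, 0.005, size=(conc.shape[0], freq.size))
    return conc @ pure + noise

# [1] Collect spectra of a set of representative calibration samples.
c_cal = rng.uniform(0.1, 1.0, size=(30, 2))
X_cal = measure(c_cal)

# [2] Extract the principal components from the mean-centred spectra.
x_mean = X_cal.mean(axis=0)
c_mean = c_cal.mean(axis=0)
k = 2                                    # components retained (assumed known)
_, _, Vt = np.linalg.svd(X_cal - x_mean, full_matrices=False)
loadings = Vt[:k]                        # shape (k, n_frequencies)
scores = (X_cal - x_mean) @ loadings.T   # shape (n_samples, k)

# [3] Regress the composition data against the scores.
coef, *_ = np.linalg.lstsq(scores, c_cal - c_mean, rcond=None)

def predict(X_new):
    """Decompose new spectra on the loadings, then apply the regression."""
    new_scores = (np.atleast_2d(X_new) - x_mean) @ loadings.T
    return new_scores @ coef + c_mean

# Analysis of new samples whose constituents were all in the calibration set.
c_test = np.array([[0.3, 0.7], [0.8, 0.2]])
c_pred = predict(measure(c_test))
print(np.round(c_pred, 3))
```

The same structure applies when the regressed quantity is a physical property rather than a concentration; only the right-hand side of the regression in step [3] changes.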
Provided that the components of the sample under test are included in the calibration samples used to build the predictive model, an accurate estimate of the property and/or composition data of the test sample will be obtained from its measured spectrum, within the limits of the inherent accuracy of the predictions obtainable from the model. However, if one or more of the components of the test sample are not included in the calibration samples on which the model is based, then the prediction of the property and/or composition data will be inaccurate, because the predictive model produces a "best fit" of the calibration data to the test sample even though some of the calibration data is inappropriate for that test sample. The present invention addresses, and seeks to overcome, this problem.
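The failure described above can be reproduced with a small synthetic example. It is a sketch under assumptions, not a description of the invention: a PCR model is calibrated on mixtures of two hypothetical constituents, A and B, and then applied to a test spectrum that also contains a third hypothetical constituent, C, whose band overlaps one of the bands of A. All band positions, concentrations, and noise levels are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
freq = np.linspace(400.0, 4000.0, 300)   # hypothetical frequency axis (cm^-1)

def band(center, width):
    return np.exp(-0.5 * ((freq - center) / width) ** 2)

pure_a = band(1100, 40) + 0.6 * band(2900, 60)
pure_b = band(1700, 50) + 0.8 * band(3400, 80)
pure_c = band(1130, 40)   # hypothetical constituent absent from the calibration set
pure_ab = np.vstack([pure_a, pure_b])

def noisy(spectrum):
    """Add random measurement noise to a spectrum or set of spectra."""
    return spectrum + rng.normal(0.0, 0.005, size=spectrum.shape)

# Calibration set contains only constituents A and B.
c_cal = rng.uniform(0.1, 1.0, size=(30, 2))
X_cal = noisy(c_cal @ pure_ab)

# PCR model: mean-centre, keep two components, regress on the scores.
x_mean, c_mean = X_cal.mean(axis=0), c_cal.mean(axis=0)
_, _, Vt = np.linalg.svd(X_cal - x_mean, full_matrices=False)
loadings = Vt[:2]
coef, *_ = np.linalg.lstsq((X_cal - x_mean) @ loadings.T,
                           c_cal - c_mean, rcond=None)

def predict(x):
    return ((x - x_mean) @ loadings.T) @ coef + c_mean

# Two test samples with the same true A and B content.
true_ab = np.array([0.5, 0.5])
x_known = noisy(true_ab @ pure_ab)                    # only A and B present
x_unknown = noisy(true_ab @ pure_ab + 0.5 * pure_c)   # C also present

err_known = np.abs(predict(x_known) - true_ab)
err_unknown = np.abs(predict(x_unknown) - true_ab)
print("error, all components calibrated:     ", np.round(err_known, 3))
print("error, uncalibrated component present:", np.round(err_unknown, 3))
```

Because the model can only describe a spectrum in terms of the calibrated components, the absorption of C near 1130 cm⁻¹ is fitted as additional A, and the estimate for A is biased upward while the model itself gives no indication that anything is wrong.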