The present invention relates generally to methods for multivariate calibration and prediction and their application to the non-invasive or non-destructive measurement of selected properties utilizing spectroscopy methods. A specific implementation of the invention relates to the situation where the multivariate calibration and prediction methods are utilized in a situation wherein biological tissue is irradiated with infrared energy having at least several wavelengths and differential absorption by the biological tissue sample is measured to determine an analyte concentration or other attribute of the tissue by application of the calibration model to the resulting spectral information.
The need and demand for an accurate, non-invasive method for determining attributes of tissue, other biological samples or analyte concentrations in tissue or blood are well documented. For example, accurate non-invasive measurement of blood glucose levels in patients, particularly diabetics, would greatly improve treatment. Barnes et al. (U.S. Pat. No. 5,379,764) disclose the necessity for diabetics to frequently monitor glucose levels in their blood. It is further recognized that the more frequent the analysis, the less likely there will be large swings in glucose levels. These large swings are associated with the symptoms and complications of the disease, whose long-term effects can include heart disease, arteriosclerosis, blindness, stroke, hypertension, kidney failure, and premature death. As described below, several systems have been proposed for the non-invasive measurement of glucose in blood. However, despite these efforts, a lancet cut into the finger is still necessary for all presently commercially available forms of home glucose monitoring. This is believed so compromising to the diabetic patient that the most effective use of any form of diabetic management is rarely achieved.
The various proposed non-invasive methods for determining blood glucose level generally utilize quantitative infrared spectroscopy as a theoretical basis for analysis. In general, these methods involve probing glucose-containing tissue using infrared radiation in absorption or attenuated total reflectance mode. Infrared spectroscopy measures the electromagnetic radiation (0.7-25 μm) a substance absorbs at various wavelengths. Molecules do not maintain fixed positions with respect to each other, but vibrate back and forth about an average distance. Absorption of light at the appropriate energy causes the molecules to become excited to a higher vibration level. The excitation of the molecules to an excited state occurs only at certain discrete energy levels, which are characteristic for that particular molecule. The most primary vibrational states occur in the mid-infrared frequency region (i.e., 2.5-25 μm). However, non-invasive analyte determination in blood in this region is problematic, if not impossible, due to the absorption of the light by water. The problem is overcome through the use of shorter wavelengths of light which are not as attenuated by water. Overtones of the primary vibrational states exist at shorter wavelengths and enable quantitative determinations at these wavelengths.
It is known that glucose absorbs at multiple frequencies in both the mid- and near-infrared range. There are, however, other infrared active analytes in the tissue and blood that also absorb at similar frequencies. Due to the overlapping nature of these absorption bands, no single or specific frequency can be used for reliable non-invasive glucose measurement. Analysis of spectral data for glucose measurement thus requires evaluation of many spectral intensities over a wide spectral range to achieve the sensitivity, precision, accuracy, and reliability necessary for quantitative determination. In addition to overlapping absorption bands, measurement of glucose is further complicated by the fact that glucose is a minor component by weight in blood and tissue, and that the resulting spectral data may exhibit a non-linear response due to both the properties of the substance being examined and/or inherent non-linearities in optical instrumentation.
A further common element to non-invasive glucose measuring techniques is the necessity for an optical interface between the body portion at the point of measurement and the sensor element of the analytical instrument. Generally, the sensor element must include an input element or means for irradiating the sample point with the infrared energy. The sensor element must further include an output element or means for measuring transmitted or reflected energy at various wavelengths resulting from irradiation through the input element. The optical interface also introduces variability into the non-invasive measurement.
Robinson et al. (U.S. Pat. No. 4,975,581) disclose a method and apparatus for measuring a characteristic of unknown value in a biological sample using infrared spectroscopy in conjunction with a multivariate model that is empirically derived from a set of spectra of biological samples of known characteristic values. The above-mentioned characteristic is generally the concentration of an analyte, such as glucose, but also may be any chemical or physical property of the sample. The method of Robinson et al. involves a two-step process that includes both calibration and prediction steps. In the calibration step, the infrared light is coupled to calibration samples of known characteristic values so that there is differential attenuation of at least several wavelengths of the infrared radiation as a function of the various components and analytes comprising the sample with known characteristic value. The infrared light is coupled to the sample by passing the light through the sample or by reflecting the light from the sample. Absorption of the infrared light by the sample causes intensity variations of the light that are a function of the wavelength of the light. The resulting intensity variations at the at least several wavelengths are measured for the set of calibration samples of known characteristic values. Original or transformed intensity variations are then empirically related to the known characteristic of the calibration samples using a multivariate algorithm to obtain a multivariate calibration model. In the prediction step, the infrared light is coupled to a sample of unknown characteristic value, and the calibration model is applied to the original or transformed intensity variations of the appropriate wavelengths of light measured from this unknown sample. The result of the prediction step is the estimated value of the characteristic of the unknown sample. The disclosure of Robinson et al. is incorporated herein by reference.
Barnes et al. (U.S. Pat. No. 5,379,764) disclose a spectrographic method for analyzing glucose concentration wherein near infrared radiation is projected on a portion of the body, the radiation including a plurality of wavelengths, followed by sensing the resulting radiation emitted from the portion of the body as affected by the absorption of the body. The method disclosed includes pretreating the resulting data to minimize influences of offset and drift to obtain an expression of the magnitude of the sensed radiation as modified.
Dähne et al. (U.S. Pat. No. 4,655,225) disclose the employment of near infrared spectroscopy for non-invasively transmitting optical energy in the near infrared spectrum through a finger or earlobe of a subject. Also discussed is the use of near infrared energy diffusely reflected from deep within the tissues. Responses are derived at two different wavelengths to quantify glucose in the subject. One of the wavelengths is used to determine background absorption, while the other wavelength is used to determine glucose absorption.
Caro (U.S. Pat. No. 5,348,003) discloses the use of temporally modulated electromagnetic energy at multiple wavelengths as the irradiating light energy. The derived wavelength dependence of the optical absorption per unit path length is compared with a calibration model to derive concentrations of an analyte in the medium.
Wu et al. (U.S. Pat. No. 5,452,723) disclose a method of spectrographic analysis of a tissue sample which includes measuring the diffuse reflectance spectrum, as well as a second selected spectrum, such as fluorescence, and adjusting the spectrum with the reflectance spectrum. Wu et al. assert that this procedure reduces the sample-to-sample variability.
The intended benefit of using models such as those disclosed above, including multivariate analysis as disclosed by Robinson, is that direct measurements that are important but costly, time consuming, or difficult to obtain, may be replaced by other indirect measurements that are cheaper and easier to get. However, none of the prior art modeling methods, as disclosed, has proven to be sufficiently robust or accurate to be used as a surrogate or replacement for direct measurement of an analyte such as glucose.
Of particular importance to the present invention is the use of multivariate analysis. Measurement by multivariate analysis involves a two-step process. In the first step, calibration, a model is constructed utilizing a dataset obtained by concurrently making indirect measurements and direct measurements (e.g., by invasively drawing or taking and analyzing a biological sample such as blood for glucose levels) in a number of situations spanning a variety of physiological and instrumental conditions. A general form for the relationship between the direct (blood-glucose concentration) and the indirect (optical) measurements is Ĝ=ƒ(y1, y2, . . . , yq), where Ĝ is the desired estimated value of the direct measurement (glucose), ƒ is some function (model), and y1, y2, . . . , yq (the arguments of ƒ) represent the indirect (optical) measurements, or transformed optical measurements, at q wavelengths. The goal of this first step is to develop a useful function, ƒ. In the second step, prediction, this function is evaluated at a measured set of indirect (optical) measurements {y1, y2, . . . , yq} in order to obtain an estimate of the direct measurement (blood-glucose concentration) at some time in the future when optical measurements will be made without a corresponding direct or invasive measurement.
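The two-step process described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that ƒ is linear in the q optical measurements and is fitted by ordinary least squares; the text only requires "some function". The names calibrate and predict, and the synthetic data, are hypothetical and use NumPy.

```python
import numpy as np

def calibrate(Y, g):
    """Calibration step: fit g ~ b0 + Y @ b by ordinary least squares.

    Y: (n_samples, q) indirect (optical) measurements.
    g: (n_samples,) direct measurements taken concurrently.
    A linear model is an assumption made for this sketch only.
    """
    Y = np.asarray(Y, dtype=float)
    g = np.asarray(g, dtype=float)
    X = np.column_stack([np.ones(len(Y)), Y])  # prepend an intercept column
    coef, *_ = np.linalg.lstsq(X, g, rcond=None)
    return coef  # coef[0] is the intercept, coef[1:] the per-wavelength weights

def predict(coef, y):
    """Prediction step: evaluate the fitted model at one set {y1, ..., yq}."""
    y = np.asarray(y, dtype=float)
    return float(coef[0] + y @ coef[1:])

# Synthetic, noise-free calibration set with q = 3 wavelengths (made-up numbers).
Y_cal = np.array([[0.1, 0.5, 0.2],
                  [0.4, 0.1, 0.3],
                  [0.3, 0.3, 0.9],
                  [0.8, 0.2, 0.1],
                  [0.5, 0.7, 0.6]])
g_cal = 100.0 + Y_cal @ np.array([2.0, -1.0, 0.5])

model = calibrate(Y_cal, g_cal)
estimate = predict(model, [0.2, 0.2, 0.2])  # no direct measurement needed here
```

In practice the calibration set would span many physiological and instrumental conditions, and the model form would be chosen to suit the data.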
Ideally, one would prefer to develop a calibration model that is applicable across all subjects. Many such systems have been proposed as discussed above. However, it has been shown that for many applications the variability of the items being measured makes it difficult to develop such a universal calibration model. For the glucose application, the variability is across subjects with respect to the optical appearance of tissue and, possibly, across the analyte within the tissue.
FIG. 1 indicates the levels of spectral variation observed both among and within subjects during an experiment in which 84 measurements were obtained from each of 8 subjects. Sources of spectral variation within a subject include: spatial effects across the tissue, physiological changes within the tissue during the course of the experiment, sampling effects related to the interaction between the instrument and the tissue, and instrumental/environmental effects. The spectral variation across subjects is substantially larger than the sum of all effects within a subject. In this case the subjects were from a relatively homogeneous population. In the broader population it is expected that spectral variation across subjects will be substantially increased. Thus, the task of building a universal calibration model is a daunting one.
In order to avoid the issue of variability across subjects, one approach involves building a completely new model for each subject. Such a method involves a substantial period of observation for each subject, as taught by R. Marbach et al., “Noninvasive Blood Glucose Assay by Near-Infrared Diffuse Reflectance Spectroscopy of the Human Inner Lip,” Applied Spectroscopy, 1993, 47, 875-881. This method would be inefficient and impractical for commercial glucose applications due to the intensive optical sampling that would be needed for each subject.
Another approach taught by K. Ward et al., “Post-Prandial Blood Glucose Determination by Quantitative Mid-Infrared Spectroscopy,” Applied Spectroscopy, 1992, 46, 959-965, utilizes partial least-squares multivariate calibration models based on whole blood glucose levels. When the models were based on in vitro measurements using whole blood, a subject-dependent concentration bias was retrospectively observed, indicating that additional calibration would be necessary.
In an article by Haaland et al., “Reagentless Near-Infrared Determination of Glucose in Whole Blood Using Multivariate Calibration,” Applied Spectroscopy, 1992, 46, 1575-1578, the authors suggest the use of derivative spectra for reducing subject-to-subject (or inter-subject) spectral differences. The method was not found to be effective on the data presented in the paper. First derivatives are an example of a general set of processing methods that are commonly used for spectral pretreatment. A general but incomplete list of these pretreatment methods would include trimming, wavelength selection, centering, scaling, normalization, taking first or higher derivatives, smoothing, Fourier transforming, principal component selection, linearization, and transformation. This general class of processing methods has been examined by the inventors and has not been found to effectively reduce the spectral variance to the level desired for clinical prediction results.
In an article by Lorber et al., “Local Centering in Multivariate Calibration,” Journal of Chemometrics, 1996, 10, 215-220, a method of locally centering the calibration data by using a single spectrum is described. For each unknown sample, the spectrum used for centering the calibration data set is selected to be that spectrum that is the closest match (with respect to Mahalanobis distance) to the spectrum of the unknown. A separate partial least-squares model is then constructed for each unknown. The method does not reduce the overall spectroscopic variation in the calibration data set.
Accordingly, the need exists for a method and apparatus for non-invasively measuring attributes of biological tissue, such as glucose concentrations in blood, which incorporates a model that is sufficiently robust to act as an accurate surrogate for direct measurement. The model would preferably account for variability both between subjects and within the subject on which the indirect measurement is being used as a predictor. In order to be commercially successful, applicants believe, the model should not require extensive sampling of the specific subject on which the model is to be applied in order to accurately predict a biological attribute such as glucose. Extensive calibration of each subject is currently being proposed by BioControl Inc. In a recent press release the company defines a 60-day calibration procedure followed by a 30-day evaluation period.
The present invention addresses these needs as well as other problems associated with existing models and calibrations used in methods for non-invasively measuring an attribute of a biological sample such as glucose concentration in blood. The present invention also offers further advantages over the prior art and solves problems associated therewith.
The present invention is a method that reduces the level of interfering spectral variation that a multivariate calibration model needs to compensate for. An important application of the invention is the non-invasive measurement of an attribute of a biological sample such as an analyte, particularly glucose, in human tissue. The invention utilizes spectroscopic techniques in conjunction with improved protocols and methods for acquiring and processing spectral data. The essence of the invention consists of protocols and data-analytic methods that enable a clear definition of intra-subject spectral effects while reducing inter-subject spectral effects. The resulting data, which have reduced inter-subject spectroscopic variation, can be utilized in a prediction method that is specific for a given subject or tailored (or adapted) for use on the specific subject. The prediction method uses a minimal set of reference samples from that subject for generation of valid prediction results.
A preferred method for non-invasively measuring a tissue attribute, such as the concentration of glucose in blood, includes first providing an apparatus for measuring infrared absorption by a biological sample such as an analyte containing tissue. The apparatus preferably includes generally three elements, an energy source, a sensor element, and a spectrum analyzer. The sensor element includes an input element and an output element. The input element is operatively connected to the energy source by a first means for transmitting infrared energy. The output element is operatively connected to the spectrum analyzer by a second means for transmitting infrared energy.
In practicing a preferred method of the present invention, an analyte-containing tissue area is selected as the point of analysis. This area can include the skin surface on the finger, earlobe, forearm, or any other skin surface. A preferred sample location is the underside of the forearm. The sensor element, which includes the input element and the output element, is then placed in contact with the skin. In this way, the input element and output element are coupled to the analyte-containing tissue or skin surface.
In analyzing for a biological attribute, such as the concentration of glucose in the analyte containing tissue, light energy from the energy source is transmitted via a first means for transmitting infrared energy into the input element. The light energy is transmitted from the input element to the skin surface. Some of the light energy contacting the analyte-containing sample is differentially absorbed by the various components and analytes contained therein at various depths within the sample. A quantity of light energy is reflected back to the output element. The non-absorbed reflected light energy is then transmitted via the second means for transmitting infrared energy to the spectrum analyzer. As detailed below, the spectrum analyzer preferably utilizes a computer and associated memory to generate a prediction result utilizing the measured intensities and a calibration model from which a multivariate algorithm is derived.
The viability of the present invention to act as an accurate and robust surrogate for direct measurement of biological attributes in a sample, such as glucose in tissue, resides in the ability to generate accurate predictions of the direct measurement (e.g., glucose level) via the indirect measurements (spectra). Applicants have found that, in the case of the noninvasive prediction of glucose by spectroscopic means, application of known multivariate techniques to spectral data will not produce a predictive model that yields sufficiently accurate predictions for future use. In order to obtain useful predictions, the spectral contribution from the particular analyte or attribute of interest must be extracted from a complex and varying background of interfering signals. The interfering signals vary across and within subjects and can be broadly partitioned into “intra-subject” and “inter-subject” sources. Some of these interfering signals arise from other substances that vary in concentration. The net effect of the cumulative interfering signals is such that the application of known multivariate analysis methods does not generate prediction results with an accuracy that satisfies clinical needs.
The present invention involves a prediction process that reduces the impact of subject-specific effects on prediction through a tailoring process, while concurrently facilitating the modeling of intra-subject effects. The tailoring process is used to adapt the model so that it predicts accurately for a given subject. An essential experimental observation is that intra-subject spectral effects are consistent across subjects. Thus, intra-subject spectral variation observed from a set of subjects can be used to enhance or strengthen the calibration for subsequent use on an individual not included in the set. This results in a prediction process that is specific for use on a given subject, but where intra-subject information from other subjects is used to enhance the performance of the monitoring device.
Spectroscopic data that have been acquired and processed in a manner that reduces inter-subject spectroscopic variation while maintaining intra-subject variation are herein referred to as generic calibration data. These generic data, which comprise a library of intra-subject variation, are representative of the likely variation that might be observed over time for any particular subject. In order to be effective, the intra-subject spectral variation manifested in the generic calibration data must be representative of future intra-subject spectral effects such as those effects due to physiological variation, changes in the instrument status, sampling techniques, and spectroscopic effects associated with the analyte of interest. Thus, it is important to use an appropriate experimental protocol to provide representation of these intra-subject spectral effects.
In each prediction embodiment of the present invention, multivariate techniques are applied to the generic calibration data to derive a subject-specific predictor of the direct measurement. Each prediction embodiment uses the generic calibration data in some raw or altered condition in conjunction with at most a few reference spectra from a specific subject to achieve a tailored prediction method that is an accurate predictor of a desired direct measurement for that particular subject. Reference spectra are spectroscopic measurements from a specific subject that are used in the development of a tailored prediction model. Reference analyte values quantify the concentration of the analyte (via direct methods) and can be used in the development of a tailored prediction model. Applicants have developed several embodiments that incorporate the above concepts.
Each tailored prediction method described herein utilizes generic calibration data. Generic calibration data can be created by a variety of data acquisition and processing methods. In a first preferred processing method, the generic calibration data are obtained by acquiring a series of indirect measurements from one or more subjects and a direct measurement for each subject corresponding to each indirect measurement. An appropriate experimental protocol is needed to provide adequate representation of intra-subject effects that are expected in the future (including those associated with the analyte of interest). The mean indirect measurement and the mean direct measurement for each subject based on the number of measurements from that subject are then formed. The indirect measurements are mean centered by subtracting the mean indirect measurement of each subject from each of that subject's indirect measurements. The direct measurements are mean centered by subtracting the mean direct measurement of each subject from each of that subject's direct measurements. That is, the subject-specific mean indirect measurements and subject-specific mean direct measurements act as subject-specific subtrahends. The sets of mean-centered measurements (indirect and direct) comprise the generic calibration data.
There are a number of other related ways for creating generic calibration data with a subject-specific subtrahend. For example, the subject-specific subtrahends for the indirect and direct measurements could be some linear combination of each subject's indirect and direct measurements, respectively.
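The mean-centering step of the first preferred processing method can be sketched as follows. This is a minimal illustration in plain Python; the function name mean_center_by_subject, the data layout (one list per spectrum), and the numbers are hypothetical.

```python
def mean_center_by_subject(spectra, references, subject_ids):
    """Subtract each subject's own mean spectrum and mean reference value.

    spectra:     list of spectra (each a list of q intensities), the indirect data.
    references:  list of direct measurements, one per spectrum.
    subject_ids: list of subject labels, one per spectrum.
    Returns (centered_spectra, centered_references) in the input order.
    """
    # Collect the row indices belonging to each subject.
    groups = {}
    for i, subject in enumerate(subject_ids):
        groups.setdefault(subject, []).append(i)

    q = len(spectra[0])
    centered_spectra = [None] * len(spectra)
    centered_refs = [None] * len(references)
    for rows in groups.values():
        n = len(rows)
        # Subject-specific subtrahends: mean spectrum and mean direct value.
        mean_spec = [sum(spectra[i][k] for i in rows) / n for k in range(q)]
        mean_ref = sum(references[i] for i in rows) / n
        for i in rows:
            centered_spectra[i] = [spectra[i][k] - mean_spec[k] for k in range(q)]
            centered_refs[i] = references[i] - mean_ref
    return centered_spectra, centered_refs

# Two subjects, two measurements each, q = 2 wavelengths (made-up numbers).
spectra = [[1.0, 2.0], [3.0, 4.0], [10.0, 20.0], [14.0, 22.0]]
glucose = [100.0, 120.0, 200.0, 180.0]
subjects = ["A", "A", "B", "B"]
generic_spectra, generic_glucose = mean_center_by_subject(spectra, glucose, subjects)
```

After centering, the large offset between subjects A and B is gone, while the variation within each subject is retained.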
In one other specific method for creating generic calibration data, the subject-specific subtrahends for the indirect and direct measurements consist of the mean of the first S indirect measurements of each subject and the mean of the first S direct measurements of each subject, respectively. Alternately, a moving window reference technique could be utilized wherein the subtrahends are the subject-specific means of the S nearest (in time) indirect and direct measurements, where S is less than the total number of reference measurements made on a particular subject. The value of S can be chosen to fit the constraints of the particular application, neglecting effects due to random noise and reference error.
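The two subtrahend choices just described can be sketched for a single subject as follows. The sketch makes two assumptions that the text does not settle: the moving window for a measurement includes that measurement itself, and ties in time are broken in favor of the earlier measurement. The function names are hypothetical, and direct measurements can be passed as one-element rows.

```python
def _mean(rows):
    """Column-wise mean of a list of equal-length rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def _minus(row, subtrahend):
    return [a - b for a, b in zip(row, subtrahend)]

def center_on_first_s(rows, s):
    """Subtract the mean of the subject's first s measurements from every row."""
    subtrahend = _mean(rows[:s])
    return [_minus(r, subtrahend) for r in rows]

def center_on_moving_window(rows, times, s):
    """Subtract, from each row, the mean of the s rows nearest to it in time.

    Assumption: the window includes the row itself (time distance zero).
    """
    out = []
    for i, row in enumerate(rows):
        # sorted() is stable, so equal distances keep the earlier index first.
        nearest = sorted(range(len(rows)), key=lambda j: abs(times[j] - times[i]))[:s]
        out.append(_minus(row, _mean([rows[j] for j in nearest])))
    return out

# One subject, four time-ordered measurements (made-up numbers).
rows = [[1.0], [2.0], [4.0], [8.0]]
times = [0, 1, 2, 3]
first_s = center_on_first_s(rows, 2)
windowed = center_on_moving_window(rows, times, 2)
```

The same function would be applied separately to the indirect and the direct measurements of each subject.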
In another alternative processing method, the generic calibration data can be produced in a round-robin reference manner wherein each of the patient's reference data is subtracted from every other reference measurement made on that subject in a round-robin fashion.
In a further alternative processing method which is particularly useful when a spectral library associated with a large number of subjects exists, the generic calibration data are created by subtracting some linear combination of spectral library data in order to minimize inter-subject spectral features. Subject-specific attributes can be reduced by subtracting some linear combination of similar spectra. That is, the subject-specific subtrahend for a given subject consists of a linear combination of spectra obtained from one or more subjects each of whom is different from the given subject. In one embodiment, the spectrum of a given subject would be matched with a combination of similarly appearing spectra from other subjects. In another embodiment, one would match the spectrum of a given subject with a combination of spectra from other subjects where the matching criteria involve measurable parameters such as age, gender, skin thickness, etc.
In a final alternative processing method, the generic calibration data are created through simulation in a manner that minimizes subject-specific spectral attributes. This methodology requires accurate simulations of patient spectra, as well as accurate modeling of the optical system, the sampler-tissue interface, and the tissue optical properties which all contribute to such spectral variation. Generic calibration data can be simulated directly or subject data can be simulated. The simulated subject spectra can subsequently be processed by any of the preceding five processing methods. In an additional embodiment, the simulated data can be combined with real patient data for the creation of hybrid generic calibration data.
Once the generic calibration data have been created, such data are then utilized to create a tailored prediction process specific for a particular subject for use in future predictions of the biological attribute. The tailored prediction process can be accomplished in several ways.
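The round-robin approach can be sketched for one subject as follows. The sketch assumes that every ordered pair of distinct measurements contributes one difference, so n measurements yield n(n-1) rows; the function name and data are hypothetical.

```python
from itertools import permutations

def round_robin_differences(spectra, references):
    """Form spectrum_i - spectrum_j and reference_i - reference_j for all i != j.

    spectra and references hold the measurements of a single subject.
    Assumption: both orderings (i, j) and (j, i) are kept.
    """
    diff_spectra, diff_refs = [], []
    for i, j in permutations(range(len(spectra)), 2):
        diff_spectra.append([a - b for a, b in zip(spectra[i], spectra[j])])
        diff_refs.append(references[i] - references[j])
    return diff_spectra, diff_refs

# One subject, three measurements at q = 2 wavelengths (made-up numbers).
spectra = [[1.0, 2.0], [2.0, 4.0], [4.0, 3.0]]
glucose = [100.0, 110.0, 130.0]
generic_spectra, generic_glucose = round_robin_differences(spectra, glucose)
```

Because each difference is taken within one subject, the subject's constant spectral signature cancels in every row.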
The most straightforward and direct way to tailor the prediction process to a given subject is as follows and will be denoted as direct tailoring. First, the generic calibration data are used to develop an intra-subject calibration model for the analyte of interest. This model herein is referred to as a generic model. By design, the generic model will produce predictions that are essentially unaffected by intra-subject spectral variation that is represented in the generic calibration data and not associated with the analyte of interest. On the other hand, the generic model will produce predictions that are appropriately sensitive to the analyte of interest. The generic model is applied directly to at least one indirect measurement from a target subject for whom there are corresponding direct measurements. The resulting predictions of the generic model are averaged. The difference between the average of the direct measurements and average prediction is computed. This subject-specific difference is added to the subsequent predictions of the generic model as applied directly to the future indirect measurements from the target subject. The resultant sums comprise the net predictions of the direct measurement corresponding to the future indirect measurements from the target subject. It is important to note that a single generic model can be used in the tailoring process for a number of target subjects.
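The direct tailoring steps above can be sketched as follows. The sketch assumes, for illustration only, that the generic model is a linear model without intercept (plausible for a model built on mean-centered data, but not stated in the text); the coefficient values and function names are hypothetical.

```python
def generic_predict(coef, spectrum):
    """Apply a (hypothetical) linear generic model to one spectrum."""
    return sum(b * y for b, y in zip(coef, spectrum))

def tailoring_offset(coef, ref_spectra, ref_values):
    """Average direct measurement minus average generic-model prediction."""
    preds = [generic_predict(coef, s) for s in ref_spectra]
    return sum(ref_values) / len(ref_values) - sum(preds) / len(preds)

def tailored_predict(coef, offset, spectrum):
    """Net prediction: generic-model prediction plus the subject-specific offset."""
    return generic_predict(coef, spectrum) + offset

# Made-up generic model and reference data for one target subject.
coef = [2.0, -1.0]
ref_spectra = [[1.0, 1.0], [2.0, 1.0]]   # spectra with matching direct values
ref_values = [101.0, 103.0]
offset = tailoring_offset(coef, ref_spectra, ref_values)
future = tailored_predict(coef, offset, [3.0, 1.0])
```

Only the offset is subject-specific, which is why a single generic model can serve many target subjects.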
A second tailored prediction embodiment uses a combination of at least two subject reference spectra, reference analyte values and the generic calibration data to create a prediction model that is specific for use on the particular subject. The technique by which the calibration data and reference spectra are combined uses a linear combination of the data in absorbance units. The combinations of calibration data and reference data can be done in a structured or random way. It is the applicant's observation that random associations work effectively and are easily implemented. The process of creating these composite data is referred to as robustification. The resulting calibration spectra contain the reference spectra from the particular patient combined with spectral data that contains sources of spectroscopic variation associated with physiological variations, variations associated with sampling techniques, instrument variation and spectroscopic effects associated with the analyte of interest. The composite calibration data can be processed to develop a calibration model. The resulting model will be referred to hereafter as a composite calibration model. The resulting composite calibration model is specific for a particular patient and can be used to generate analyte prediction results for the particular subject.
In the use of either tailored prediction process, reference spectra and reference analyte values are utilized. The reference information is used in combination with the generic calibration data to create a tailored prediction process for use on the particular subject. In general terms the subject reference information is used to tailor a general processing method for use on a particular subject. In an additional embodiment, the subject reference spectra can be replaced by the use of a subject-matched spectrum or a set of matched spectra. Matched spectra are spectra from another subject or a combined spectrum that interacts with the calibration model in a manner similar to the subject to be predicted upon. In use, a never-before-seen subject is tested and at least one spectrum is obtained. The resulting spectrum is used for generating a prediction result and as a reference spectrum. In use, and in contrast to the two prior embodiments, no reference analyte value is used or needed. The implementation of this method requires the following:
1. Identification or creation of matched spectra through use of the reference spectra.
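The robustification step can be sketched as follows. The sketch assumes the simplest linear combination, a plain sum of one reference spectrum and one generic calibration spectrum in absorbance units, with the analyte values summed in the same way, and it uses a random association as described above. The function name, the seed parameter, and the data are hypothetical.

```python
import random

def build_composite_data(ref_spectra, ref_values, generic_spectra, generic_values, seed=0):
    """Pair each generic calibration row with a randomly chosen subject reference.

    Assumption: the composite is reference + generic, for spectra and for
    analyte values alike. Other linear combinations are equally possible.
    """
    rng = random.Random(seed)  # seeded so the random association is repeatable
    composite_spectra, composite_values = [], []
    for g_spec, g_val in zip(generic_spectra, generic_values):
        k = rng.randrange(len(ref_spectra))  # random association
        composite_spectra.append([r + g for r, g in zip(ref_spectra[k], g_spec)])
        composite_values.append(ref_values[k] + g_val)
    return composite_spectra, composite_values

# Two reference measurements from the target subject (made-up numbers).
ref_spectra = [[1.0, 2.0], [1.5, 2.5]]
ref_values = [100.0, 140.0]
# Generic calibration data, i.e. intra-subject variation from other subjects.
generic_spectra = [[-0.5, 0.25], [0.5, -0.25], [0.0, 0.0]]
generic_values = [-10.0, 10.0, 0.0]
comp_spectra, comp_values = build_composite_data(
    ref_spectra, ref_values, generic_spectra, generic_values)
```

The composite rows would then be passed to a multivariate calibration routine to obtain the composite calibration model.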
2. Replacement of the reference spectra with the corresponding matched spectra.
3. Although reference analyte values are not obtained from the never-before-seen patient, matched analyte values from the corresponding matched spectra are used in the processing method in a manner consistent with the prior uses of reference analyte values.
4. Use of either tailored prediction process.
In practice, the spectral data from the never-before-seen subject is compared with spectral data that has corresponding biological attribute reference values in a spectral library to identify the best match or several matched spectra. Matched spectra are spectra from another subject that appear similar when processed by the calibration model. Applicants have observed that identical twins are well matched from a spectroscopic model perspective.
As stated previously, the application of known multivariate analysis techniques has not resulted in glucose prediction results at a clinically relevant level. The processing method described overcomes these known limitations by using a matched spectrum. Thus, the subject tailoring with this method is accomplished without an actual reference analyte value from the individual. The matched spectrum method in conjunction with either tailored prediction process requires a large spectral library to facilitate the appropriate matching between the subject to be predicted upon and at least one library spectrum. In implementation of this matching method, applicants have identified matched spectra by finding those spectra that are most consistent with the calibration model as reflected by such parameters as Mahalanobis distance and spectral residual metrics. Other methods of spectral match would also have applicability for determination of matched spectra.
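One way to rank library spectra by Mahalanobis distance can be sketched as follows. The sketch assumes the distance is computed directly in spectral space using the covariance of the library itself; the text refers to consistency with the calibration model, so a practical implementation might instead work in the model's score space and add spectral residual metrics. The function name and data are hypothetical and use NumPy.

```python
import numpy as np

def best_matches(new_spectrum, library_spectra, n_matches=1):
    """Return indices and distances of the library spectra nearest to new_spectrum.

    Distance is Mahalanobis distance with the library covariance (an
    assumption of this sketch). A pseudo-inverse is used so that a
    rank-deficient covariance does not raise an error.
    """
    lib = np.asarray(library_spectra, dtype=float)
    x = np.asarray(new_spectrum, dtype=float)
    cov = np.atleast_2d(np.cov(lib, rowvar=False))
    cov_inv = np.linalg.pinv(cov)
    d = lib - x                                    # one difference row per library entry
    sq = np.einsum("ij,jk,ik->i", d, cov_inv, d)   # quadratic form d' C^-1 d per row
    dist = np.sqrt(np.maximum(sq, 0.0))            # guard against tiny negative round-off
    order = np.argsort(dist)[:n_matches]
    return order.tolist(), dist[order].tolist()

# Made-up library of five spectra at q = 2 wavelengths.
library = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 2.0]]
indices, distances = best_matches([0.9, 0.1], library, n_matches=2)
```

The reference analyte values stored with the selected library spectra would then stand in for the subject's own reference values.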