1. Field of the Invention
The present invention relates to a method of determining a weighted regression model and a method of predicting a component concentration of a mixture using the weighted regression model. More particularly, the present invention relates to a method of determining a leverage weighted regression model and a method of predicting a component concentration of a mixture using the leverage weighted regression model.
2. Description of the Related Art
Conventionally, a linear regression method is used to predict concentrations of specific components dissolved in a mixture from a spectrum of the mixture. This method is different from a method of measuring the concentrations using reagents reacting with the components to be measured, and can be applied to non-reagent measurement of blood components or non-invasive measurement of blood components.
A simple regression model, which is used to predict a dependent variable from an independent variable, is generally based on the following assumptions. First, it is assumed that the relationship between an independent variable x, e.g., measured spectrum data, and a dependent variable y, e.g., a component concentration to be measured, can be expressed by the following equation (1), in which $\mu_{y \cdot x}$ indicates the expected value of y at a given value of x:
$$\mu_{y \cdot x} = \beta_0 + \beta_1 x_i \quad (1),$$
where $x_i$ is the i-th measured value of x, and $\beta_0$ and $\beta_1$ are the regression coefficients of the population.
Second, it is assumed that y is normally distributed at a given value of x, and that the mean of y varies with x, but the variance of y is constant regardless of the value of x.
Under these assumptions, the simple regression model may be expressed by the following equation (2):
$$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i \quad (2),$$
where $y_i$ is the i-th measured value of y, and $\varepsilon_i$ is the error term of the i-th measured y.
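As an illustration of equation (2), the following minimal sketch estimates the regression coefficients by ordinary least squares. The function name `fit_simple_regression` and the synthetic data (true coefficients 2.0 and 0.5, constant noise level 0.1) are assumptions introduced only for this example and do not come from the specification.

```python
import numpy as np


def fit_simple_regression(x, y):
    """Return least squares estimates (b0, b1) for y = b0 + b1 * x + error."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    # Sums of squares and cross products about the means.
    sxx = np.sum((x - x_mean) ** 2)
    sxy = np.sum((x - x_mean) * (y - y_mean))
    b1 = sxy / sxx                 # estimated slope
    b0 = y_mean - b1 * x_mean      # estimated intercept
    return b0, b1


# Hypothetical training data with constant error variance (second assumption).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
y = 2.0 + 0.5 * x + rng.normal(0.0, 0.1, size=x.size)

b0, b1 = fit_simple_regression(x, y)
print(f"estimated intercept = {b0:.3f}, estimated slope = {b1:.3f}")
```

Every observation point enters the two sums with the same weight, which is the equal-contribution behavior of the conventional model.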
In the second assumption, the variance of the error term $\varepsilon_i$ is assumed to be equal regardless of the value of y, i.e., $\mathrm{Var}(\varepsilon_i) = \sigma^2$, where $\sigma^2$ is the variance. However, this assumption of equal variance may not hold. In particular, in a reference measurement device used to obtain a training set from which a regression vector is derived, the measurement error increases as the value of y increases. Further, the least squares method usually used for forming a regression model may be greatly influenced by abnormal measured values, so that a small number of outliers can distort the regression model, which may in turn cause a normal measured value to be judged as an outlier. FIGS. 1A and 1B are graphs illustrating predicted values using a regression model, where FIG. 1A shows a normal regression model 2 in accordance with normal measured data 1, and FIG. 1B shows a distorted regression model 4 due to outliers 3.
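The distortion described above can be reproduced with a small numerical sketch. The data are hypothetical: ten points lying exactly on the line y = 1 + 2x, one of which is then shifted by 30 to act as an abnormal measured value. `np.polyfit` with degree 1 performs an ordinary least squares line fit.

```python
import numpy as np

# Hypothetical data: ten points lying exactly on y = 1 + 2x.
x = np.arange(10, dtype=float)
y_clean = 1.0 + 2.0 * x

# Same data with a single abnormal measured value at the last point.
y_outlier = y_clean.copy()
y_outlier[-1] += 30.0

# Ordinary least squares line fits (polyfit returns the slope first).
slope_clean, intercept_clean = np.polyfit(x, y_clean, 1)
slope_out, intercept_out = np.polyfit(x, y_outlier, 1)

# Residuals of every point with respect to the distorted model.
resid = y_outlier - (intercept_out + slope_out * x)

print("slope without outlier:", slope_clean)
print("slope with outlier:   ", slope_out)
print("residuals w.r.t. distorted model:", np.round(resid, 2))
```

A single abnormal value is enough to change the slope noticeably, and the unmodified points, which lie exactly on the true line, acquire large residuals with respect to the distorted model. This is how a normal measured value can come to look like an outlier.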
Generally, a training set is required for forming a regression model, and all observation points belonging to the training set contribute equally to the regression model. However, it cannot be said that all observation points provide data of equal quantity and quality for forming the regression model. For example, since the error of the reference measurement device decreases as y decreases, an observation point having a smaller y is more reliable than an observation point having a larger y. Further, because they are highly likely to contain wrong data, observation points whose independent variable x is an outlier should have relatively less importance in the regression equation than the other observation points.
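The idea of giving less importance to observation points whose x is an outlier can be sketched with a weighted least squares fit. This is only an illustration under stated assumptions: the weight `1 - h`, where `h` is the standard leverage of a point in a simple regression, is a hypothetical choice made for this example and is not the weighting defined by the invention. The function names and the data are likewise hypothetical.

```python
import numpy as np


def leverage(x):
    """Leverage of each point in a simple regression with an intercept."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    return 1.0 / x.size + d ** 2 / np.sum(d ** 2)


def weighted_fit(x, y, w):
    """Weighted least squares estimates (b0, b1) for y = b0 + b1 * x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    sw = np.sqrt(np.asarray(w, dtype=float))
    # Scaling rows by sqrt(w) turns the weighted problem into an ordinary one.
    design = np.column_stack([np.ones_like(x), x]) * sw[:, None]
    coef, *_ = np.linalg.lstsq(design, y * sw, rcond=None)
    return coef[0], coef[1]


# Hypothetical data on y = 1 + 2x; the last point is an outlier in x
# and also carries a wrong y value (20 instead of 41).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 20.0])
y = 1.0 + 2.0 * x
y[-1] = 20.0

h = leverage(x)
w = 1.0 - h  # hypothetical weight: high-leverage points count less

b0_ols, b1_ols = weighted_fit(x, y, np.ones_like(x))  # equal contribution
b0_w, b1_w = weighted_fit(x, y, w)                    # leverage down-weighted

print("leverages:", np.round(h, 3))
print("equal-weight slope:", b1_ols)
print("down-weighted slope:", b1_w)
```

In this example the down-weighted slope lies closer to the true value of 2 than the equal-weight slope, although it is still affected by the wrong point. A reliability factor that decreases with y could be multiplied into the same weight vector to reflect the larger reference measurement error at larger y.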