Discriminant function analysis is used to determine which variables discriminate between two or more naturally occurring groups. Computationally, it is very similar to analysis of variance. The basic idea is to determine whether groups differ with regard to the mean of a variable, and then to use that variable to predict group membership (e.g., of new cases). Stated in this manner, the discriminant function problem can be rephrased as a one-way analysis of variance (ANOVA) problem: one asks whether two or more groups differ significantly from each other with respect to the mean of a particular variable. If the means of a variable differ significantly between groups, the variable discriminates between the groups. In the case of a single variable, the final significance test of whether the variable discriminates between groups is the F test. F is essentially computed as the ratio of the between-groups variance to the pooled (average) within-group variance. If the between-groups variance is significantly larger than the within-group variance, there must be significant differences between the means.
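The F ratio described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `one_way_f`, the group names, and the marker values are made up for the example and do not come from the study data.

```python
def one_way_f(groups):
    """Return the one-way ANOVA F statistic for a list of groups of values."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Between-groups sum of squares: group size times squared deviation
    # of each group mean from the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Pooled within-group sum of squares: deviations from each group's own mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    ms_between = ss_between / (k - 1)        # between-groups variance
    ms_within = ss_within / (n_total - k)    # pooled within-group variance
    return ms_between / ms_within


# Made-up values of a single variable in two hypothetical groups.
mild = [1.0, 2.0, 3.0]
severe = [4.0, 5.0, 6.0]
f_value = one_way_f([mild, severe])
print(f_value)
```

A large F value indicates that the group means are far apart relative to the spread within each group. Judging significance additionally requires the F distribution with (k − 1, N − k) degrees of freedom, which is omitted here.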
Usually, one includes several variables in a study in order to see which one(s) contribute to the discrimination between groups. In that case, we have a matrix of total variances and covariances; likewise, we have a matrix of pooled within-group variances and covariances. We can compare those two matrices via multivariate F tests in order to determine whether there are any significant differences (with regard to all variables) between groups. This procedure is identical to multivariate analysis of variance (MANOVA). As in MANOVA, one could first perform the multivariate test and, if it is statistically significant, proceed to see which of the variables have significantly different means across the groups. Thus, even though the computations with multiple variables are more complex, the principal reasoning still applies: we are looking for variables that discriminate between groups, as evident in observed mean differences.
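One common way to compare the within-group and total matrices is Wilks' lambda, the ratio of their determinants. The text does not name a specific statistic, so this choice is an assumption. The sketch below is restricted to two variables, uses sums-of-squares-and-cross-products matrices, and runs on made-up data; all function names are illustrative.

```python
def mean_vector(rows):
    """Mean of each of the two variables over a list of (x, y) cases."""
    n = len(rows)
    return [sum(r[0] for r in rows) / n, sum(r[1] for r in rows) / n]


def scatter_matrix(rows, center):
    """2x2 matrix of sums of squares and cross-products around `center`."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for row in rows:
        d = [row[0] - center[0], row[1] - center[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def wilks_lambda(groups):
    """det(W) / det(T): values near 0 suggest well-separated group means."""
    all_rows = [r for g in groups for r in g]
    total = scatter_matrix(all_rows, mean_vector(all_rows))
    within = [[0.0, 0.0], [0.0, 0.0]]
    for g in groups:
        s = scatter_matrix(g, mean_vector(g))  # deviations from the group's own mean
        for i in range(2):
            for j in range(2):
                within[i][j] += s[i][j]
    return det2(within) / det2(total)


# Made-up two-variable data for two hypothetical groups.
group_a = [(1.0, 2.0), (2.0, 1.0), (3.0, 3.0)]
group_b = [(6.0, 5.0), (7.0, 7.0), (8.0, 6.0)]
lam = wilks_lambda([group_a, group_b])
print(lam)
```

The multivariate F tests mentioned above are typically obtained by transforming such a statistic into an approximate F value. That step is left out of this sketch.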
For a set of observations containing one or more quantitative variables and a classification variable defining groups of observations, the discrimination procedure develops a discriminant criterion to classify each observation into one of the groups. Post hoc prediction of what has already happened is not difficult: it is not uncommon to obtain very good classification when one classifies the same cases from which the discriminant criterion was computed. To get an idea of how well the current discriminant criterion "performs", one must classify (a priori) different cases, that is, cases that were not used to estimate the discriminant criterion. Only the classification of new cases allows us to assess the predictive validity of the discriminant criterion. To validate the derived criterion, the classification can therefore be applied to other data sets. The data set used to derive the discriminant criterion is called the training or calibration data set.
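The difference between classifying the training cases again and classifying new cases can be illustrated with a deliberately simple stand-in criterion: assign each case to the group with the nearest mean. The criterion, the labels, and all numbers below are assumptions made for the example, not the procedure or data of the study.

```python
def group_means(cases):
    """cases: list of (value, label) pairs. Return {label: mean value}."""
    sums, counts = {}, {}
    for value, label in cases:
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def classify(value, means):
    """Assign a value to the group whose mean is closest."""
    return min(means, key=lambda label: abs(value - means[label]))


def accuracy(cases, means):
    """Fraction of cases whose predicted label matches the true label."""
    hits = sum(1 for value, label in cases if classify(value, means) == label)
    return hits / len(cases)


# Made-up training (calibration) cases and separate held-out cases.
training = [(1.0, "mild"), (2.0, "mild"), (3.0, "mild"),
            (6.0, "severe"), (7.0, "severe"), (8.0, "severe")]
held_out = [(2.5, "mild"), (5.5, "severe"), (4.0, "severe"), (3.5, "mild")]

means = group_means(training)          # criterion estimated from training data only
resub = accuracy(training, means)      # accuracy on the cases used for estimation
holdout = accuracy(held_out, means)    # accuracy on cases not used for estimation
print(resub, holdout)
```

With these made-up numbers the training cases are all classified correctly while one held-out case is not, which is the optimism the paragraph above warns about.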
The discriminant criterion (function(s) or algorithm) is determined by a measure of generalized squared distance. It can be based on the pooled covariance matrix, which yields a linear function. Either Mahalanobis or Euclidean distance can be used to determine proximity.
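A minimal sketch of classification by squared distance follows. It assumes equal prior probabilities, so that the generalized squared distance reduces to the squared Mahalanobis distance based on the pooled covariance matrix. It is restricted to two variables so the matrix inverse can be written out by hand, and the data and names are made up for illustration.

```python
def mean_vector(rows):
    """Mean of each of the two variables over a list of (x, y) cases."""
    n = len(rows)
    return [sum(r[0] for r in rows) / n, sum(r[1] for r in rows) / n]


def pooled_covariance(groups):
    """Pooled within-group 2x2 covariance matrix for {label: rows}."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    n_total = 0
    for rows in groups.values():
        m = mean_vector(rows)
        for r in rows:
            d = (r[0] - m[0], r[1] - m[1])
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
        n_total += len(rows)
    dof = n_total - len(groups)  # degrees of freedom: N minus number of groups
    return [[v / dof for v in row] for row in s]


def inverse2(m):
    """Inverse of a 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def squared_distance(x, mean, inv_cov):
    """Squared Mahalanobis distance (x - mean)' S^-1 (x - mean)."""
    d = (x[0] - mean[0], x[1] - mean[1])
    return sum(d[i] * inv_cov[i][j] * d[j] for i in range(2) for j in range(2))


def classify(x, groups):
    """Assign x to the group with the smallest squared distance."""
    inv_cov = inverse2(pooled_covariance(groups))
    distances = {label: squared_distance(x, mean_vector(rows), inv_cov)
                 for label, rows in groups.items()}
    return min(distances, key=distances.get), distances


# Made-up two-variable training data for two hypothetical groups.
groups = {
    "mild": [(1.0, 2.0), (2.0, 1.0), (3.0, 3.0)],
    "severe": [(6.0, 5.0), (7.0, 7.0), (8.0, 6.0)],
}
label, distances = classify((3.0, 3.0), groups)
print(label, distances)
```

Replacing `inv_cov` with the identity matrix gives the squared Euclidean distance instead, which ignores the variances and covariances of the variables.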
For the development of a discriminant algorithm, data from a group of subjects in an observational liver fibrosis study were analyzed. The liver fibrosis scoring systems considered were the Scheuer Score (0–4); the Modified Ishak Score (HAI) A - Interface Hepatitis (0–4); the Modified Ishak Score (HAI) B - Confluent Necrosis (0–6); the Modified Ishak Score (HAI) C - Spotty Necrosis (0–4); the Modified Ishak Score (HAI) D - Portal Inflammation (0–4); and the Modified HAI Score (Ishak Score) (0–6).
In a stepwise discriminant analysis, the following functions of serum parameters, for example, were shown to have a major impact on the corresponding scoring type.
Scoring Type: Surrogate Parameters
Scheuer Score: ln(TIMP-1); ln(Collagen VI/Hyaluronan); ln(Hyaluronan/Laminin)
Modified Ishak Score A - Interface Hepatitis: ln(TIMP-1); ln(Collagen VI/Hyaluronan); ln(Collagen VI/Tenascin)
Modified Ishak Score B - Confluent Necrosis: ln(Hyaluronan); ln(Collagen VI/MMP-2)
Modified Ishak Score C - Spotty Necrosis: ln(Hyaluronan); ln(MMP-9/TIMP-1 complex / Tenascin)
Modified Ishak Score D - Portal Inflammation: ln(Laminin); ln(Collagen VI/TIMP-1)
Modified Ishak Score - Stage: ln(TIMP-1); ln(Collagen VI/Hyaluronan); ln(Hyaluronan/Laminin)
A corresponding discriminant analysis yielded linear discriminating functions that can be used to calculate and predict the biopsy score. The algorithms can be applied to every known scoring system (e.g., Scheuer Score, Ishak Score, Metavir Score, Ludwig Score, HAI Score).
Algorithms can be used to predict the biopsy score of a patient (e.g., score 0, 1, 2, 3, ...) or to predict the group of scores (category) to which a patient belongs (e.g., mild fibrosis: score 0 to 1).
The discriminating functions used include combinations of markers from the list of N-terminal procollagen III propeptide (PIIINP), Collagen IV, Collagen VI, Tenascin, Laminin, Hyaluronan, MMP-2, TIMP-1 and MMP-9/TIMP-1 complex, together with numerical factors between −1000 and +1000.
Different scores require different algorithms built from the list of markers and factors.
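How such an algorithm might be applied can be sketched as follows, using the three surrogate parameters listed for the Scheuer Score. Everything numeric here is hypothetical: the factors, the constants, the two categories, and the patient concentrations are placeholders chosen only to make the example run, and the rule of assigning the category with the largest function value is a standard convention for linear classification functions that is assumed rather than taken from the text.

```python
import math

# HYPOTHETICAL factors, one linear function per score category.
# Real factors would come from a discriminant analysis of training data.
FUNCTIONS = {
    "score 0-1": {"const": -5.0, "ln_timp1": 1.0, "ln_col6_ha": 2.0, "ln_ha_lam": 0.5},
    "score 2-4": {"const": -12.0, "ln_timp1": 2.5, "ln_col6_ha": 1.0, "ln_ha_lam": 1.5},
}


def surrogate_parameters(timp1, collagen6, hyaluronan, laminin):
    """Log-transformed markers and marker ratios used as predictors."""
    return {
        "ln_timp1": math.log(timp1),
        "ln_col6_ha": math.log(collagen6 / hyaluronan),
        "ln_ha_lam": math.log(hyaluronan / laminin),
    }


def predict_category(params, functions=FUNCTIONS):
    """Evaluate each linear function and return the category with the largest value."""
    values = {}
    for category, factors in functions.items():
        values[category] = factors["const"] + sum(
            factors[name] * params[name] for name in params
        )
    return max(values, key=values.get), values


# Made-up serum concentrations for one hypothetical patient (units not specified).
patient = surrogate_parameters(timp1=800.0, collagen6=6.0, hyaluronan=60.0, laminin=30.0)
category, values = predict_category(patient)
print(category, values)
```

Predicting a different scoring system would mean swapping in that system's own surrogate parameters and factors, in line with the statement above that different scores require different algorithms.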