The present invention relates to automated forms of data processing, more particularly to implementing forms of least-squares analysis, inversion-conforming data sets processing, and alternate forms of regression analysis and maximum likelihood estimating, including the appropriate handling of linear and nonlinear data in correspondence with homogeneous and heterogeneous sample precision, with added provision for handling unquantifiable dependent variable representations and for representing multivariate observations as related to two-dimensional segment inversions.
As empirical relationships are often required to describe system behavior, data analysts continue to rely upon least-squares and maximum likelihood approximation methods to fit both linear and nonlinear functions to experimental data. Fundamental concepts related to both maximum likelihood estimating and least-squares curve fitting stem from the early practice referred to in 1766 by Euler as the calculus of variations. The related concepts were developed in the mid-1700s, primarily through the efforts of Lagrange and Euler, utilizing the operations of calculus to locate maximum and minimum function values. The maximum and minimum values, and certain inflection points, of a function occur at coordinates which correspond to points of zero slope along the curve. To determine the point where a minimum or maximum occurs, one derives an expression for the derivative (or slope) of the function and equates that expression to zero. By merely equating the derivative of the function to zero, the local parameters which establish the maximum or minimum function values can be determined.
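The derivative-equated-to-zero procedure described above can be illustrated with a minimal numeric sketch (the quadratic function chosen here is purely illustrative, not from the original text):

```python
# Locate the minimum of f(x) = (x - 3)^2 + 2 by equating its derivative
# f'(x) = 2(x - 3) to zero, which is satisfied only at x = 3.

def f(x):
    return (x - 3.0) ** 2 + 2.0

def f_prime(x):
    # Derivative (slope) of f; zero slope marks the extremum.
    return 2.0 * (x - 3.0)

# Solving f'(x) = 0 analytically: 2(x - 3) = 0  ->  x = 3.
x_min = 3.0
print(x_min, f(x_min))  # the minimizing coordinate and the minimum value
```

Neighboring points such as x = 2.5 or x = 3.5 yield larger function values, confirming that the zero-slope coordinate is indeed a minimum.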
The process of least-squares analysis utilizes a form of the calculus of variations in statistical application to determine fitting parameters which establish a minimum value for the sum of squared single-component residual deviations from a parametric fitting function. The process was first publicized in 1805 by Legendre. Actual invention of the least-squares method is clearly credited to Gauss, who as a teenage prodigy first developed and utilized it prior to his entrance into the University of Göttingen.
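The minimization of summed squared residuals described above can be sketched for the simplest case, a straight line y = a + b·x fitted to dependent-variable measurements (the function and data names here are illustrative assumptions, not from the original text):

```python
# Ordinary least squares for a line y = a + b*x: the parameters a, b are
# chosen to minimize sum((Y_i - a - b*x_i)^2) over the observations.

def fit_line(xs, ys):
    """Closed-form least-squares intercept a and slope b for y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Equating the derivatives of the summed squared residuals with respect
    # to a and b to zero yields the familiar normal equations, whose
    # solution is:
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
        sum((x - mean_x) ** 2 for x in xs)
    a = mean_y - b * mean_x
    return a, b

xs = [0.0, 1.0, 2.0, 3.0]          # independent variable samples
ys = [1.1, 2.9, 5.2, 6.8]          # dependent variable measurements
a, b = fit_line(xs, ys)
```

Note that only deviations in the dependent variable enter the squared residuals here, consistent with the single-component residual deviations described above.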
Maximum likelihood estimating is of somewhat more general application than that of least-squares analysis. It is traditionally based upon the concept of maximizing a likelihood which may be defined either as the product of discrete sample probabilities or, for the current analogy, as the product of measurement sample probability densities. By far, the most commonly considered form for representing a probability density function is referred to as the normal probability density distribution function (or Gaussian distribution). The respective Gaussian probability density function as formulated for a mean square deviation of ⟨δY²⟩ in the measurement of y will take the form of Equation 1:
    D(Y - y) = \frac{1}{\sqrt{2\pi\langle\delta Y^2\rangle}}\, e^{-\frac{(Y-y)^2}{2\langle\delta Y^2\rangle}},    (1)

wherein D represents a probability density, Y represents an observation or dependent variable measurement, and y represents the expected or true value for the dependent variable. The formula for the Gaussian distribution was apparently derived by Abraham de Moivre in about 1733. The distribution function is dubbed the Gaussian distribution due to the extensive efforts of Gauss related to distributions of observable errors. Consistent with the concept of a probability density distribution function, the actual probability of occurrence is considered as the integral or sum of the probability density, taken (or summed) over a range of possible samples.
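Equation 1 can be transcribed directly into a short computation (the function name and the notation msd for the mean square deviation ⟨δY²⟩ are illustrative assumptions):

```python
import math

def gaussian_density(Y, y, msd):
    """Gaussian probability density D(Y - y) per Equation 1, where msd is
    the mean square deviation in the measurement of y."""
    return math.exp(-(Y - y) ** 2 / (2.0 * msd)) / math.sqrt(2.0 * math.pi * msd)

# The density peaks at Y = y, where it equals 1/sqrt(2*pi*msd), and falls
# off symmetrically for deviations of either sign.
d = gaussian_density(0.0, 0.0, 1.0)
```

The symmetry of the density about Y = y reflects the assumption that positive and negative measurement errors of equal magnitude are equally probable.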
A characteristic of probability distribution functions over all possible observations is that the area under the curve, considered between minus and plus infinity or over the restricted range of possible dependent variable measurements, will always be equal to unity. Thus, the probability of an arbitrary sample lying somewhere within the entire range of the distribution function is one, i.e.,
    \int_{-\infty}^{+\infty} D(Y - y)\, dY = 1.    (2)

The probability of occurrence corresponding to any specific sample value, as considered in the limit as the range of integration approaches zero, would of course be zero.
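The unit-area property of Equation 2 can be checked numerically; the sketch below (with an illustrative trapezoidal quadrature, not from the original text) approximates the integral of the Gaussian density of Equation 1 over a finite interval wide enough that the omitted tails are negligible:

```python
import math

def gaussian_density(Y, y, msd):
    """Gaussian probability density D(Y - y) per Equation 1."""
    return math.exp(-(Y - y) ** 2 / (2.0 * msd)) / math.sqrt(2.0 * math.pi * msd)

# Trapezoidal approximation of Equation 2 over [-10, 10]; for a mean square
# deviation of 1 the tails beyond +/-10 contribute on the order of 1e-23.
n, lo, hi = 20000, -10.0, 10.0
h = (hi - lo) / n
area = sum(gaussian_density(lo + i * h, 0.0, 1.0) for i in range(n + 1))
area -= 0.5 * (gaussian_density(lo, 0.0, 1.0) + gaussian_density(hi, 0.0, 1.0))
area *= h   # area under the density curve, which should be unity
```

The same computation with any other positive mean square deviation likewise returns an area of one, since Equation 2 holds independently of the precision.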
In accordance with the present invention, for homogeneous uncertainty associated with a set of observations, or homogeneous precision in the measurements of a variable, the precision and form of the probability density function associated with said observations or measurements are independent of coordinate location over the range and domain of the measurements. For line regression analysis, lateral translations of a linear fitting function are indistinguishable from a corresponding change in the mean values of the dependent variable measurements. Assuming normal error distributions, samples corresponding to any variety of coordinates along the fitting function can therefore be independently included and combined in representing a likelihood estimator, without consideration of effects that might be associated with lateral fitting function translations or fitting function distortions. By restricting the discussion in the following example to the concept of maximum likelihood with homogeneous precision and with errors limited to the dependent variable, a typical linear Gaussian likelihood estimator can be represented for variations in the measurement of the dependent variable.
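The linear Gaussian likelihood estimator described above can be sketched as follows (function names and sample values are illustrative assumptions, not from the original text); with homogeneous precision ⟨δY²⟩, the log-likelihood of the observations about a line a + b·x is a sum of log Gaussian densities, so maximizing it over (a, b) is equivalent to minimizing the sum of squared dependent-variable residuals:

```python
import math

def log_likelihood(a, b, xs, Ys, msd):
    """Log of the product of Gaussian densities (Equation 1) of the
    observations Ys about the line a + b*x, for homogeneous mean square
    deviation msd in the dependent variable."""
    total = 0.0
    for x, Y in zip(xs, Ys):
        r = Y - (a + b * x)   # dependent-variable residual only
        total += -0.5 * math.log(2.0 * math.pi * msd) - r * r / (2.0 * msd)
    return total

xs = [0.0, 1.0, 2.0]
Ys = [1.0, 3.1, 4.9]
# Parameters near the least-squares solution yield a higher likelihood than
# an arbitrarily perturbed pair:
near_optimal = log_likelihood(1.05, 1.95, xs, Ys, 1.0)
perturbed = log_likelihood(0.0, 1.0, xs, Ys, 1.0)
```

Because the msd-dependent term is the same for every sample under homogeneous precision, it shifts the log-likelihood by a constant, and only the summed squared residuals distinguish one (a, b) pair from another.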