The present embodiments relate to computer-assisted analysis of a data record from observations.
In a number of areas of application, it is desirable to use a data record of observations to derive a connection between input variables and a target variable within the observations. In this case, the data record contains for each observation a data vector that includes the values of input variables and an assigned value of a target variable.
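For illustration only, the structure of such a data record may be sketched as follows. This is a minimal sketch, not a prescribed implementation: the names `Observation` and `split_record` and all numerical values are hypothetical, and each observation is assumed to hold the same number of input variables.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Observation:
    """One observation: the values of the input variables and the assigned target value."""
    inputs: Tuple[float, ...]
    target: float


def split_record(record: List[Observation]) -> Tuple[List[Tuple[float, ...]], List[float]]:
    """Split a data record into input rows (one per observation) and a list of target values."""
    if not record:
        raise ValueError("the data record contains no observations")
    n_inputs = len(record[0].inputs)
    for obs in record:
        # Every data vector must cover the same input variables.
        if len(obs.inputs) != n_inputs:
            raise ValueError("all observations must have the same number of input variables")
    inputs = [obs.inputs for obs in record]
    targets = [obs.target for obs in record]
    return inputs, targets


# Hypothetical values, chosen only to show the layout of a data record.
record = [
    Observation(inputs=(1050.0, 0.82, 12.5), target=24.1),
    Observation(inputs=(1075.0, 0.85, 13.0), target=26.7),
    Observation(inputs=(1020.0, 0.79, 11.8), target=21.9),
]

inputs, targets = split_record(record)
print(len(inputs), "observations with", len(inputs[0]), "input variables each")
```

Keeping the input variables and the target variable together in one record per observation, and separating them only for analysis, mirrors the description above, in which the target value is assigned to the data vector of the respective observation.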
In the field of the regulation of technical systems, there is frequently a need to recognize the influence of state variables of the technical system on a target variable of the technical system, and/or the relevance of the state variables to the target variable, in order, for example, to learn on the basis thereof a suitable data-driven model that predicts the target variable as a function of the relevant input variables. The regulation of the technical system may then be suitably specified based on the prediction by the data-driven model. For example, the technical system may be a gas turbine with state variables that may include various temperatures, fuel amounts, fuel mixtures, positions of turbine blades, and the like. For such a gas turbine, the target variable may be, for example, the emission of nitrogen oxides or combustion chamber humming (e.g., increased vibrations in the combustion chamber). By suitable modeling of the gas turbine based on the input variables that have the greatest effect on the target variable, nitrogen oxide emissions and/or combustion chamber humming may be forecast, and a high level of nitrogen oxide emissions and/or combustion chamber humming may thus be counteracted by suitably changing manipulated variables.
A further field of application is the analysis of production charges (i.e., production batches). In this case, each observation contains the parameters of the production of the production charge under consideration, and the target variable corresponds to a quality parameter of the charge produced. To the extent that the production charge refers to the fabrication of technical units, the quality parameter may be represented, for example, by the number of failures of the technical units produced for a charge within a time period after startup of the respective unit. By determining which production parameters have a particularly large influence on the quality of the production charge, the production processes may be analyzed, and the quality of the fabricated products may be improved by changing the input variables that have a particularly large influence on the quality.
There are known statistical tests that may be used to analyze a data record from observations with regard to the relevance of input variables to a target variable. However, these tests may fail to recognize nonlinear relationships and are not suitable for high-dimensional data vectors with a large number of input variables.
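The limitation with regard to nonlinear relationships may be illustrated with a minimal sketch. The Pearson correlation coefficient is used here merely as one assumed example of a linear relevance measure; the source does not name the statistical tests it refers to, and the helper `pearson` and the synthetic data are hypothetical.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient, a measure of linear dependence only."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        raise ValueError("correlation is undefined for a constant variable")
    return cov / (norm_x * norm_y)


# Synthetic input variable, sampled symmetrically around zero.
x = [i / 10 for i in range(-20, 21)]

# Target that depends linearly on x.
y_linear = [2.0 * v + 1.0 for v in x]

# Target that is fully determined by x, but in a nonlinear (quadratic) way.
y_quadratic = [v * v for v in x]

r_linear = pearson(x, y_linear)        # close to 1: the linear dependence is detected
r_quadratic = pearson(x, y_quadratic)  # close to 0: the quadratic dependence is missed

print(f"linear target:    r = {r_linear:.3f}")
print(f"quadratic target: r = {r_quadratic:.3f}")
```

In this sketch, the quadratic target is completely determined by the input variable, yet the linear measure rates the input variable as irrelevant, because positive and negative deviations cancel over the symmetric sampling range.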