In the control of complex technical systems it is often desirable to select the changes to be made to the technical system such that an advantageous dynamic behavior of the technical system is obtained. In complex technical systems, however, the dynamic behavior often cannot be predicted in a simple manner, so that appropriate computer-aided prediction methods are needed in order to estimate the future behavior of the technical system and to select suitable actions for regulation or control of the technical system accordingly.
Often the states of a technical system are not simple to measure and can only be described statistically on the basis of stochastic components of the system behavior. Thus, in the regulation of such technical systems, there are often no setpoint values or guide variables for a corresponding regulation, nor any corresponding target values onto which an appropriate simulation model, such as a neural network, can be trained to map. Since possible dependencies between different measured values and setpoint values of complex technical systems are often unclear, an optimum automatic regulation for such a system can be developed only with difficulty or after extremely long periods of observation of the dynamic behavior of the technical system.
Different methods for regulation or control and for optimization of operating points of technical systems are known from the prior art. These methods either use an analytical model for description of the technical system, or they are based on preceding measurement data of the technical system or on a modeling of the technical system based on knowledge about the system, with the modeling being undertaken, for example, with the aid of Bayesian networks or neuro-fuzzy networks.
The known methods of regulation or control of a technical system have the disadvantage that the methods for modeling the technical system often need a large quantity of measurement data, and also that it is not clear in advance whether the methods are suitable for the specific technical system in question.
Publication EP 1 016 981 A1 shows an apparatus for learning of an agent, with actions to be carried out on a technical system being learned with a plurality of learning modules based on reinforcement learning. Depending on the prediction errors determined, the actions of the individual learning modules are weighted and combined with each other accordingly.
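The general idea of weighting the actions of several learning modules by their prediction errors and combining them can be illustrated by the following minimal sketch. It is not taken from EP 1 016 981 A1. The function name `combine_actions`, the softmax weighting over negative squared errors, and the `temperature` parameter are assumptions chosen purely for illustration.

```python
import math


def combine_actions(actions, prediction_errors, temperature=1.0):
    """Blend the action proposals of several modules into one action.

    Illustrative assumption: a module's weight is a softmax over its
    negative squared prediction error, so that modules which predict
    the system well contribute more to the combined action.
    """
    if not actions or len(actions) != len(prediction_errors):
        raise ValueError("need exactly one prediction error per action")
    if temperature <= 0:
        raise ValueError("temperature must be positive")

    # A lower prediction error gives a higher (less negative) score.
    scores = [-(err * err) / temperature for err in prediction_errors]

    # Subtract the maximum score before exponentiating for numerical stability.
    top = max(scores)
    exp_scores = [math.exp(s - top) for s in scores]
    total = sum(exp_scores)
    weights = [x / total for x in exp_scores]

    # Weighted sum of the action vectors, component by component.
    dim = len(actions[0])
    combined = [
        sum(w * action[i] for w, action in zip(weights, actions))
        for i in range(dim)
    ]
    return combined, weights
```

With equal prediction errors the modules contribute equally. As the error of one module grows relative to the others, its weight tends toward zero and the combined action approaches the proposals of the better-predicting modules.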
A method for control of a technical system, in which the control is learned on the basis of recurrent neural networks, is known from document U.S. Pat. No. 5,485,545 A. Control of the voltage of a power supply system is described as a practical application.