In the control of complex technical systems, it is often desirable to select the actions to be carried out on the technical system such that an advantageous, desired dynamic behavior of the technical system is obtained. In complex technical systems, however, the dynamic behavior often cannot be predicted in a simple manner, so that appropriate computer-aided prediction methods are needed in order to estimate the future behavior of the technical system and to select suitable actions for regulation or control of the technical system accordingly.
Nowadays the control of technical systems is often based on expert knowledge, i.e. the automatic regulation of the system is created on the basis of this expert knowledge. However, approaches are also known in which technical systems are controlled with the aid of known methods of what is referred to as reinforcement learning (see document [2]). These known methods, however, are not generally applicable to arbitrary technical systems and often do not supply sufficiently good results.