When controlling complex technical systems, it is often desirable to select the actions to be carried out on the technical system such that an advantageous dynamic behavior of the technical system is achieved. However, the dynamic behavior of complex technical systems is often difficult to predict, so corresponding computer-aided prediction methods are required in order to estimate the future behavior of the technical system and to select appropriate actions for regulating or controlling the technical system accordingly.
Today, the control of technical systems is frequently based on expert knowledge; in other words, automatic regulation of the system is established on the basis of such expert knowledge. However, approaches are also known in which technical systems are controlled with the aid of known methods of reinforcement learning (see document [2]). The known methods cannot, however, be applied generally to arbitrary technical systems and often do not furnish sufficiently good results.
A method for computer-aided control and/or regulation of a technical system is known from German patent application DE 10 2007 001 025.9, in which an optimal action selection rule is learned in a computer-aided manner with the aid of neural networks. In this method, the dynamics of the technical system are modeled with the aid of a recurrent neural network, which in turn is coupled to a further feed-forward network that is used to learn the action selection rule.
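The coupling of a recurrent network to a feed-forward network described above may be illustrated by the following minimal sketch. It is not the implementation of the cited application. The dimensions, the weight matrices A, B, C, D, W1 and W2, the tanh activations and the function names are all assumptions chosen for illustration, and the weights are random and untrained. The sketch shows only the forward pass: the feed-forward network proposes an action from the hidden state, and the recurrent network predicts the resulting next state.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
STATE_DIM, ACTION_DIM, HIDDEN_DIM = 3, 1, 8

# Recurrent network (assumed form): models the dynamics of the system.
A = rng.normal(scale=0.1, size=(HIDDEN_DIM, HIDDEN_DIM))  # hidden -> hidden
B = rng.normal(scale=0.1, size=(HIDDEN_DIM, STATE_DIM))   # state -> hidden
C = rng.normal(scale=0.1, size=(HIDDEN_DIM, ACTION_DIM))  # action -> hidden
D = rng.normal(scale=0.1, size=(STATE_DIM, HIDDEN_DIM))   # hidden -> predicted state

# Feed-forward network (assumed form): represents the action selection rule.
W1 = rng.normal(scale=0.1, size=(HIDDEN_DIM, HIDDEN_DIM))
W2 = rng.normal(scale=0.1, size=(ACTION_DIM, HIDDEN_DIM))


def select_action(hidden):
    """Feed-forward pass: map the hidden state to an action in [-1, 1]."""
    return np.tanh(W2 @ np.tanh(W1 @ hidden))


def dynamics_step(hidden, state, action):
    """Recurrent pass: update the hidden state and predict the next state."""
    new_hidden = np.tanh(A @ hidden + B @ state + C @ action)
    predicted_state = D @ new_hidden
    return new_hidden, predicted_state


def rollout(initial_state, steps):
    """Alternate action selection and dynamics prediction for a few steps."""
    hidden = np.tanh(B @ initial_state)  # encode the initial observation
    state = initial_state
    trajectory = []
    for _ in range(steps):
        action = select_action(hidden)
        hidden, state = dynamics_step(hidden, state, action)
        trajectory.append((action, state))
    return trajectory


if __name__ == "__main__":
    for action, state in rollout(np.array([0.1, -0.2, 0.3]), steps=3):
        print("action:", action, "predicted state:", state)
```

In such an arrangement, the recurrent weights would typically be fitted to recorded system data first, and the feed-forward weights would then be adapted with the recurrent weights held fixed. That training order is an assumption of this sketch and is not taken from the text above.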