The service life of a rotor blade of a wind turbine depends to a large extent on the magnitude of the cyclical stresses acting on the rotor blade, caused for example by wind shear, turbulence and initial conditions. High stresses, in the form of deflections of the rotor blades, are undesirable here, particularly if states of high stress alternate rapidly with states of no stress or even of inverse stress (dynamic alternating stresses). For instance, alternating deflections due to excitation at a resonance frequency, or on account of wind speeds that vary with the position of the rotor blade, result in particularly severe wear of the rotor blade.
A rotor blade of a wind turbine must typically withstand millions of stress cycles, which result in gradual wear of the rotor blade and reduce its remaining service life. It is disadvantageous here if the aforementioned particularly strong stresses drastically shorten the service life of the wind turbine.
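The relation between accumulated stress cycles and remaining service life indicated above is commonly quantified with Palmgren-Miner linear damage accumulation. The following is a minimal sketch; the S-N curve parameters and cycle counts are illustrative assumptions, not values from the text.

```python
M_EXP = 10.0   # assumed S-N curve slope exponent (steep, as for composites)
N_REF = 1e7    # assumed cycles to failure at the reference stress amplitude
S_REF = 50.0   # assumed reference stress amplitude (MPa)

def cycles_to_failure(s_amplitude):
    """Assumed S-N curve: N(S) = N_REF * (S_REF / S) ** M_EXP."""
    return N_REF * (S_REF / s_amplitude) ** M_EXP

def miner_damage(cycle_counts):
    """Palmgren-Miner damage D = sum(n_i / N(S_i)).

    cycle_counts maps stress amplitude (MPa) -> number of cycles n_i.
    Failure is predicted when D reaches 1.
    """
    return sum(n / cycles_to_failure(s) for s, n in cycle_counts.items())

# With a steep S-N curve, a few high-amplitude cycles consume as much
# service life as millions of small ones:
d_small = miner_damage({50.0: 1e6})    # one million moderate cycles
d_large = miner_damage({100.0: 1e3})   # only a thousand strong cycles
```

This illustrates why the particularly strong alternating stresses described above dominate the fatigue budget of the blade, even though they are far rarer than ordinary load cycles.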
It is known to change a pitch of the rotor blade of the wind turbine. This results in a change in the output produced by the wind turbine. In particular, the rotor blades of the wind turbine may be tilted individually. However, the imminent stress on the rotor blade is not known in advance, and tilting a rotor blade, which may weigh 10 tonnes or more, requires energy and time. In addition, it is disadvantageous that such an adjustment of the rotor blade itself places considerable stress, and the wear associated therewith, on the bearing of the rotor blade and on the pitch actuators, which in turn has a negative effect on the service life of the system and on the necessary maintenance of the wind turbine.
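The statement that pitching a multi-tonne blade requires energy and time can be made concrete with a rate-limited actuator sketch. All numbers below are illustrative assumptions; real pitch systems are considerably more complex.

```python
MAX_RATE_DEG_S = 5.0   # assumed maximum pitch slew rate (degrees/s)
DT = 0.1               # simulation time step (s)

def step_pitch(current, setpoint):
    """Advance the pitch angle one time step toward the setpoint,
    limited to the maximum slew rate of the actuator."""
    delta = setpoint - current
    max_step = MAX_RATE_DEG_S * DT
    return current + max(-max_step, min(max_step, delta))

def time_to_reach(start, setpoint, tol=1e-6):
    """Simulated time until the pitch angle reaches the setpoint."""
    t, pitch = 0.0, start
    while abs(pitch - setpoint) > tol:
        pitch = step_pitch(pitch, setpoint)
        t += DT
    return t

# A 20 degree pitch step cannot be executed instantaneously:
t_response = time_to_reach(0.0, 20.0)
```

Because the response time is bounded by the actuator, a stress event that is only detected as it occurs cannot be fully mitigated by pitching, which motivates predicting the imminent stress in advance.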
It is furthermore known to use one of the following methods as a learning algorithm, for instance:
- an NFQ method ("Neural Fitted Q Iteration", see: M. Riedmiller: Neural Fitted Q Iteration - First Experiences with a Data Efficient Neural Reinforcement Learning Method. In Proc. of the European Conf. on Machine Learning, 2005),
- an RCNN ("Recurrent Control Neural Network", see: A. M. Schaefer, S. Udluft, and H.-G. Zimmermann: A Recurrent Control Neural Network for Data Efficient Reinforcement Learning. In Proc. of the IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning, 2007; or A. M. Schäfer, D. Schneegaß, V. Sterzing, and S. Udluft: A Neural Reinforcement Learning Approach to Gas Turbine Control. International Joint Conference on Neural Networks, 2007), and/or
- a PGNRR method ("Policy Gradient Neural Rewards Regression", see: D. Schneegaß, S. Udluft, and Th. Martinetz: Improving Optimality of Neural Rewards Regression for Data-Efficient Batch Near-Optimal Policy Identification. In Proc. of the International Conf. on Artificial Neural Networks, 2007).
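The NFQ method cited above rests on batch fitted Q iteration: a regressor is repeatedly fitted to bootstrapped targets computed from a fixed set of recorded transitions. The following sketch shows this iteration structure on an assumed toy problem; a tabular Q-function stands in here for the neural network of NFQ, and the transitions, discount factor and chain are illustrative assumptions only.

```python
GAMMA = 0.9  # assumed discount factor

# Fixed batch of recorded transitions (state, action, reward,
# next_state, done) from an assumed 3-state chain; state 2 is
# terminal and is reached with reward 1 by "advancing" from state 1.
batch = [
    (0, 0, 0.0, 0, False),   # action 0: stay in state 0
    (0, 1, 0.0, 1, False),   # action 1: advance to state 1
    (1, 0, 0.0, 1, False),   # stay in state 1
    (1, 1, 1.0, 2, True),    # advance to terminal state 2
]

def fitted_q_iteration(batch, n_iters=50):
    """Batch fitted Q iteration with a tabular regressor."""
    q = {}  # Q estimate, missing entries read as 0
    for _ in range(n_iters):
        # Build regression targets r + gamma * max_a' Q(s', a')
        # from the current Q estimate over the whole batch ...
        targets = {}
        for s, a, r, s2, done in batch:
            bootstrap = 0.0 if done else GAMMA * max(
                q.get((s2, b), 0.0) for b in (0, 1))
            targets[(s, a)] = r + bootstrap
        # ... then fit Q to them (exact for a table; NFQ instead
        # trains a neural network on these (input, target) pairs).
        q = targets
    return q

q = fitted_q_iteration(batch)
```

The greedy policy with respect to the converged Q-values advances in both states, since the "advance" action has the higher value everywhere; the same batch scheme carries over when the table is replaced by a function approximator.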