The invention relates to a control system, to which a state vector representing the states of a controlled system is applied, and which provides a correcting variables vector of optimized correcting variables, the relation between the state vector and the correcting variables vector being defined by a matrix of weights, the weights being derived from an algorithmic solution of an optimization equation.
In control systems presently in use, all measured states of a controlled system are weighted and applied to all correcting variables. The states of the controlled system are combined in a state vector x. The correcting variables are combined in a correcting variables vector u. The weights are represented by a matrix. In order to achieve optimal control behavior, these weights have to be selected appropriately. The control system is optimized. The weights depend on the solution of an optimization equation such as the “state dependent Riccati equation” (SDRE). This solution is represented by a matrix P(x) depending on the state vector.
In accordance with the prior art, the optimization equation, for example the state dependent Riccati equation, is computed off-line. From the solution P(x) for the time-dependent state vector, a vector of optimal correcting variables u(x) is computed. As x is time-dependent, also a time-dependent correcting variables vector results. A neural network is trained in a learning process, various state vectors x being applied to the neural network during the learning process. Then the neural network provides the associated correcting variables vectors. These correcting variables vectors u(t) are compared with the optimized correcting variables vectors uopt(t) which result from the off-line solution of, for example, the state dependent Riccati equation. The difference represents the learning signal for training the neural network. Then, the neural network thus trained represents the control system trained off-line, for example a guidance controller for missiles, which provides associated optimal correcting variables vectors u(t), when time-dependent state vectors x(t) are applied thereto.
Training of a neural network is cumbersome and requires the processing of large quantities of data. A control system obtained in this way is inflexible.
It is an object of the invention to improve a control system of the type defined in the beginning.
To this end, equation-solving means for algorithmically solving the optimization equation in real time are provided. The state vector is applied to these equation-solving means. The solution P(x) of the optimization equation is applied to the control system to determine the weights.
It has been found that the optimization equation, such as the state dependent Riccati equation, can be solved, at least substantially, in real time. This yields a solution P(x) for each state vector x(t). This solution is applied to a control system and, therein, determines the weights of the state variables of the state vector x(t) applied also to the control system. The control system generates, therefrom, the optimal correcting variables of the optimal correcting variables vector uopt(t). With this design of the control system, the data quantities to be processed are smaller. Instead, high computing capacity is required.
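By way of illustration only, the real-time computation described above may be sketched as follows. The sketch assumes a state-dependent linear model with hypothetical matrices F(x) and G(x), a quadratic cost with assumed weighting matrices Q and R, and the customary feedback law u = -R^-1 G^T P x; none of these specifics is prescribed by the present description. The Riccati equation is solved here through the stable eigenvectors of the Hamiltonian matrix, which is one standard numerical method and not necessarily the one used by the equation-solving means.

```python
import numpy as np

Q = np.eye(2)            # assumed state weighting (not from the description)
R = np.array([[1.0]])    # assumed correcting-variable weighting

def F(x):
    """Hypothetical state-dependent system matrix (pendulum-like example)."""
    # np.sinc(z) = sin(pi z)/(pi z), so np.sinc(x1/pi) = sin(x1)/x1, finite at 0.
    return np.array([[0.0, 1.0],
                     [-np.sinc(x[0] / np.pi), -0.1]])

def G(x):
    """Hypothetical input matrix."""
    return np.array([[0.0], [1.0]])

def solve_riccati(A, B, Q, R):
    """Solve A'P + PA - P B R^-1 B' P + Q = 0 via the Hamiltonian matrix."""
    n = A.shape[0]
    S = B @ np.linalg.solve(R, B.T)
    H = np.block([[A, -S], [-Q, -A.T]])
    eigvals, eigvecs = np.linalg.eig(H)
    stable = eigvecs[:, eigvals.real < 0]      # n stable eigenvectors
    X1, X2 = stable[:n, :], stable[n:, :]
    P = np.real(X2 @ np.linalg.inv(X1))
    return 0.5 * (P + P.T)                     # remove rounding asymmetry

def sdre_control(x):
    """Return the correcting variables, the weight matrix K(x) and P(x)."""
    A, B = F(x), G(x)
    P = solve_riccati(A, B, Q, R)
    K = np.linalg.solve(R, B.T @ P)            # weights applied to the states
    return -K @ x, K, P

x = np.array([0.7, -0.3])
u_opt, K, P = sdre_control(x)
```

In such a sketch, `sdre_control` would be called once per sampling instant with the current state vector, so that the weight matrix K(x) follows the state.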
The solution of the optimization equation requires a model of the controlled system. This model of the controlled system can be described by an equation
$\dot{x} = g(x, u, t).$
For analytically solving, for example, the state dependent Riccati equation, this function is “factorized”, i.e. is replaced by a form
$\dot{x} = F(x)\,x + G(x)\,u,$
wherein F and G are matrices depending on the state vector x. This “factorizing” permits only a simplified model of the controlled system. This model may considerably deviate from reality. The real controlled system nearly always contains uncertainties of the parameters and/or non-linearities, which cannot be modeled in this form or which may, sometimes, not even be known.
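The factorizing step may be illustrated by a small example. The plant function g below is hypothetical (a damped, pendulum-like system chosen only for illustration), as are the names F and G in the code; the sketch merely checks that the factorized form reproduces g for this particular choice.

```python
import numpy as np

def g(x, u):
    """Hypothetical nonlinear plant: x1' = x2, x2' = -sin(x1) - 0.1*x2 + u."""
    x1, x2 = x
    return np.array([x2, -np.sin(x1) - 0.1 * x2 + u[0]])

def F(x):
    """State-dependent matrix obtained by writing sin(x1) as (sin(x1)/x1) * x1."""
    # np.sinc(z) = sin(pi z)/(pi z), so np.sinc(x1/pi) = sin(x1)/x1 (equal to 1 at 0).
    return np.array([[0.0, 1.0],
                     [-np.sinc(x[0] / np.pi), -0.1]])

def G(x):
    """Input matrix; constant for this example."""
    return np.array([[0.0], [1.0]])

x = np.array([0.7, -0.3])
u = np.array([0.2])
factorized = F(x) @ x + G(x) @ u   # F(x) x + G(x) u
direct = g(x, u)                   # g(x, u)
```

For this example the factorization is exact. For a real controlled system, as noted above, terms that do not fit this form would be lost.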
In order to deal with these problems, an adaptive model of the controlled system is provided, to which the state vector and the correcting variables vector are applied and which provides an estimated value of the state vector. A first vectorial training signal for the adaptive model is represented by the difference of the actual state vector and the estimated value of the state vector. A second vectorial training signal for an adaptive network which is provided on the control system side and to which the state vector and the correcting variables vector are applied is derived from the trained model of the controlled system. This network on the side of the control system provides a correction quantity for correcting the optimal correcting variables vector resulting from the solution of the optimization equation at the control system, whereby an actual correcting variables vector to be applied to the controlled system is formed.
Preferably, to this end, the model of the controlled system has a first matrix (F(x)) which is multiplied by the state vector x, and a second matrix (G(x)) which is multiplied by the correcting variables vector (u). The sum of the state and correcting variables vectors multiplied by the respective matrices, representing the time derivative of the state vector, is integrated to provide a model value of the state vector. An adaptive structure is provided, on the side of the controlled system, which provides a correcting value for correcting the model value of the state variable, this adaptive structure being trained by the first vectorial training signal.
According to another solution, an adaptive structure trained off-line has an input to which the state vector (x) of the controlled system is applied, and an output which provides a correcting variables vector (u), the correcting variables of the correcting variables vector being applied to the controlled system. The state vector (x) is applied on-line to the equation solving means, whereby the equation solving means provide an optimal correcting variables vector (uopt(t)). The difference of the correcting variables vector (u(t)) provided by the adaptive structure and of the optimal correcting variables vector (uopt(t)) is applied on-line to the adaptive structure as a training signal.
The control system has an adaptive structure such as a neural network, which is trained off-line to represent an optimal control system. This structure provides stable control, which may, however, not be optimal in some ranges of the state space. In addition thereto, the optimization equation such as the state dependent Riccati equation is solved in real time. The correcting variables vector obtained thereby for the respective time-dependent state vector is compared to the correcting variables vector provided by the adaptive structure and, thereby, serves to continuously further train the adaptive structure “on-line”.
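The on-line further training may be sketched as follows. All specifics are assumptions made for illustration: the function `u_opt_online` is a hypothetical stand-in for the equation-solving means, the adaptive structure is a single layer of weights W acting on a hypothetical feature vector rather than a full neural network, the controlled system is a double integrator, and the update is a normalized gradient step.

```python
import numpy as np

def u_opt_online(x):
    """Stand-in for the equation-solving means (hypothetical state-dependent gain)."""
    K = np.array([[1.0 + 0.5 * x[0] ** 2, 2.0]])
    return -K @ x

def features(x):
    """Hypothetical inputs of the adaptive structure."""
    return np.array([x[0], x[1], x[0] ** 3])

W_IDEAL = np.array([[-1.0, -2.0, -0.5]])   # weights that reproduce u_opt_online exactly
W = np.array([[-0.8, -1.5, 0.0]])          # result of off-line training: stable, not optimal
initial_error = np.linalg.norm(W - W_IDEAL)

A = np.array([[0.0, 1.0], [0.0, 0.0]])     # double-integrator plant (assumed)
B = np.array([[0.0], [1.0]])
x = np.array([1.0, 0.0])
dt, rate = 0.01, 0.5

for _ in range(2000):
    phi = features(x)
    u = W @ phi                            # correcting variable from the adaptive structure
    e = u - u_opt_online(x)                # training signal: difference to the optimal value
    W = W - rate * np.outer(e, phi) / (1.0 + phi @ phi)   # on-line training step
    x = x + dt * (A @ x + B @ u)           # the adaptive structure drives the plant

final_error = np.linalg.norm(W - W_IDEAL)
```

In this sketch the controlled system is driven throughout by the adaptive structure, while the real-time solution serves only as the reference for training.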
The weight factors for providing the optimal correcting variables vector may be determined, instead of by the state dependent Riccati equation, also by other optimization equations. The control system may, for example, be an LQ controller. The described procedure of correcting a model of the controlled system by means of an adaptive structure such as a neural network and of correcting the optimal correcting variables vector provided by the control system by means of a second adaptive structure trained by the corrected model of the controlled system may, if required, also be used with an already available control system operating with proportional navigation or extended proportional navigation, in order to improve the control behavior thereof.
The adaptive structure may be a neural network or a fuzzy-neural network or an adaptive fuzzy logic unit. The adaptive structure may also be trained off-line with knowledge about the dynamic behavior of the controlled system in a simulation process.
Embodiments of the invention are described hereinbelow with reference to the accompanying drawings: