1. Field of the Invention
In general, the present invention relates to techniques for training neural networks employed in control systems for improved controller performance. More particularly, the invention relates to a new feedback control system employing a neural network (NN) initially trained either on-the-fly (on-line) or off-line, using a rich set of input (real or simulated data), to emulate steady-state (s-s) in the system that includes a controller connected in parallel with the neural network. Unlike prior attempts to apply neural network techniques to train, and later control, proportional-plus-integral (PI) controllers by conventionally adding the output of the neural network directly to the output of the PI controller, the invention utilizes a unique integral term stuffing technique that uses the learned s-s NN output data to reset the integral term of a PI control loop to an expected value. As described herein, the control system and method of the invention use a novel technique that provides for more rapid response within the system to changes in set-point and to other disturbances in the input parameters being measured, without requiring constant, tiresome manual monitoring, dial tweaking, and system control intervention by a technician.
In earlier work, the applicants performed experiments on a control system configured to use a PI controller in parallel with a “reinforcement learning agent” into which the temperature set point and other variables (Tai, Twi, Two, fa, and fw) were input, the output of the reinforcement learning agent being added directly to the PI controller output to control a heating coil. For reference, see: Anderson, C. W., et al., “Synthesis of Reinforcement Learning, Neural Networks, and PI Control Applied to a Simulated Heating Coil” (1998); Anderson, C. W., et al., “Synthesis of Reinforcement Learning, Neural Networks, and PI Control Applied to a Simulated Heating Coil” (1997); and Anderson, C. W., et al., “Reinforcement Learning, Neural Networks and PI Control Applied to a Heating Coil” (1996), from “Solving Engineering Problems with Neural Networks: Proceedings of the Conference on Engineering Applications in Neural Networks”. As explained in the second-listed reference, Anderson, C. W., et al. (1997) (see FIG. 8 as labeled therein), the applicants trained the reinforcement learning agent (here, by way of example, a NN) off-line for 1,000 repetitions, called “trials”, of a 500 time-step interaction between the simulated heating coil and the combination of the reinforcement learning agent and the PI controller, to gather data set(s) for augmenting (by direct addition, at point C) the output of the PI controller during periods of actual use to control the heating coil.
In analyzing problems related to timely-responsive control of a system comprising a feedback PI controller using trained-NN output, it was not until later that the applicants identified and applied the unique technique of the instant invention, which allows for successful recovery from perturbations in system parameters and/or changes to set-point in a manner that returns the system to s-s operation in a more timely fashion. Accordingly, response and settling times can be decreased significantly, especially when the technique of the invention is applied to a representative experimental system wherein the NN has been trained with real or simulated data. As will be appreciated (see, especially, FIGS. 1 and 12), when a triggering event (as defined) is detected, the trained-NN output is ‘stuffed’ in place of the integral term of the PI controller equation. Applicants' novel technique can be characterized using the following progression of expressions, written in discrete form, governing the system and method of the invention. As used in Eqn. A, ONN and NN are interchangeable (ONN ⇄ NN), either of which represents the learned neural network output for the set of inputs that includes the information about the disturbance. Governing the PI controller:
\[
\begin{aligned}
O_{\tau} &= K_p e_{\tau} + K_i \sum_{j=0}^{\tau} e_j \,\Delta t \\
O_{\tau-1} &= K_p e_{\tau-1} + K_i \sum_{j=0}^{\tau-1} e_j \,\Delta t \\
O_{\tau} &= O_{\tau-1} + K_p \left( e_{\tau} - e_{\tau-1} \right) + K_i e_{\tau} \,\Delta t
\end{aligned}
\]

Governing the neural network operating in parallel with a PI controller according to the invention are the following:

\[
\begin{aligned}
O_{\tau} &= K_p e_{\tau} + K_i \sum_{j=0}^{\tau} e_j \,\Delta t \\
O_{\tau-1} &= K_p e_{\tau-1} + K_i \sum_{j=0}^{\tau-1} e_j \,\Delta t \\
O_{\tau} &= O_{\tau-1} + K_p \left( e_{\tau} - e_{\tau-1} \right) + K_i e_{\tau} \,\Delta t \\
O_{\tau} &= \mathrm{NN} + K_p e_{\tau} + K_i e_{\tau} \,\Delta t \qquad \text{(Eqn. A)}
\end{aligned}
\]
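One way to read Eqn. A against the PI expressions is sketched here; it is offered as an interpretation, not as the applicants' own derivation. Subtracting the expression for the output at step τ-1 from the expression for the output at step τ yields the incremental form, and splitting the newest term out of the sum shows which quantity the NN output appears to stand in for:

\[
\begin{aligned}
O_{\tau} - O_{\tau-1} &= K_p \left( e_{\tau} - e_{\tau-1} \right) + K_i \left( \sum_{j=0}^{\tau} e_j \,\Delta t - \sum_{j=0}^{\tau-1} e_j \,\Delta t \right) = K_p \left( e_{\tau} - e_{\tau-1} \right) + K_i e_{\tau} \,\Delta t \\
O_{\tau} &= K_p e_{\tau} + \underbrace{K_i \sum_{j=0}^{\tau-1} e_j \,\Delta t}_{\text{replaced by NN in Eqn. A}} + K_i e_{\tau} \,\Delta t
\end{aligned}
\]

Under this reading, stuffing replaces the integral contribution accumulated through step τ-1 with the learned s-s output, while the proportional term and the newest integral increment are left unchanged.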
As one will readily appreciate, the improvements made by the applicants to their earlier work include a more efficient use of a trained NN (which can be an off-the-shelf learning agent component): the trained NN sits ‘dormant’, contributing nothing to the PI controller, until a pre-defined change greater than a preselected amount or magnitude is detected in one or more process condition signals, at which time a switch allows the trained-NN output, ONN, to be ‘stuffed’ into the PI controller, causing a detectably more rapid return of the system to its desired steady state (s-s). The output term, ONN (equivalently, NN), is drawn from a set of data learned by the NN during its training period (using a rich set of input, real or simulated), whether before on-line control of the process/plant, on-the-fly while the process/plant is being controlled, or some combination thereof. Once a significant-enough change triggers action, ONN for that combination of inputs, as disturbed/changed, is stuffed into a discrete form of the PI's integral expression (Eqn. A). Here, the proportional gain constant, Kp, and integral gain constant, Ki, can be those determined prior to the disturbance, so that no manual tweaking is required once the control system has been set up and implemented.
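The stuffing step described above can be illustrated with a minimal Python sketch. It is not the applicants' implementation: the class name `StuffedPI`, its method names, and the choice to store the integral contribution already scaled by Ki are all assumptions made for illustration. With that convention, the first controller step after a call to `stuff` evaluates to NN + Kp·e + Ki·e·Δt, which matches the form of Eqn. A.

```python
from dataclasses import dataclass


@dataclass
class StuffedPI:
    """Hypothetical discrete PI controller with integral-term stuffing."""

    kp: float  # proportional gain constant, Kp
    ki: float  # integral gain constant, Ki
    dt: float  # sample period, delta t
    integral: float = 0.0  # accumulated Ki * sum(e_j * dt); an assumed convention

    def step(self, error: float) -> float:
        """Advance one time step and return the controller output O_tau."""
        # Add the newest increment Ki * e_tau * dt to the running integral term.
        self.integral += self.ki * error * self.dt
        # Position form: O_tau = Kp * e_tau + Ki * sum_{j<=tau}(e_j * dt).
        return self.kp * error + self.integral

    def stuff(self, nn_output: float) -> None:
        """Overwrite the accumulated integral term with the learned s-s output."""
        # After this call the next step() yields NN + Kp*e + Ki*e*dt (Eqn. A form).
        self.integral = nn_output
```

In use, `step` would be called once per sample period with the current error, and `stuff` would be called only when a triggering event is detected, with the gains left at their previously tuned values.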
The process condition signals are derived from measured system variables, for example signals from one or more sensors (e.g., in HVAC, that is, heating, ventilation, and air conditioning, this can include one or more sensors/meters to measure airflow, temperature of air and water, etc.), and from one or more set-points. In operation, a significant change may include a disturbance (a percentage of an initial value) to one of the sensed inputs or a manual change made to a set-point. A range of acceptable change, outside of which a change is considered ‘significant’ enough to represent a triggering event for NN action (see note in FIG. 1), can be pre-defined according to the environment being controlled, the sensitivity of the measurement sensors being used, and so on.
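A triggering-event check of the kind described above might be sketched as follows. The function name `is_triggering_event`, the dictionary layout, the 5% default threshold, and the fallback scale used when a previous value is zero are all hypothetical choices for illustration; the text leaves the acceptable range to be pre-defined for the environment and sensors in question.

```python
def is_triggering_event(previous, current, threshold_fraction=0.05):
    """Return True if any process condition signal changed 'significantly'.

    previous, current: dicts mapping a signal name (a sensed input or a
    set-point) to its value at the reference time and at the present time.
    threshold_fraction: assumed acceptable relative change (0.05 = 5%).
    """
    for name, old in previous.items():
        new = current[name]
        # Measure the change relative to the earlier value; fall back to an
        # absolute comparison when the earlier value is zero (an assumption).
        scale = abs(old) if old != 0 else 1.0
        if abs(new - old) / scale > threshold_fraction:
            return True
    return False
```

For example, with the default threshold, a set-point moved from 40.0 to 45.0 (a 12.5% change) would count as a triggering event, whereas an air temperature drifting from 20.0 to 20.5 (a 2.5% change) would not.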
Whereas in their earlier work the applicants had simply added a learned output of a reinforcement learning agent, NN, to an output of the PI controller, the unique system and method of the instant invention use a distinguishable technique. The NN and PI controller pair according to the invention lowers coil (or any other process/plant) response time and minimizes the effect of the sluggish control experienced when a PI controller, operating alone, encounters a gain state different from the one at which it had been tuned. The dynamic heating coil PDE (partial differential equation) model has been presented herein by way of example only, as this dynamic coil model allows process predictions to be made where several parameters are simultaneously varied in any SISO (single-input, single-output), SIMO (single-input, multiple-output), or MIMO (multiple-input, multiple-output) control environment. While an HVAC implementation has been showcased here, the NN and PI controller pair is handily retrofitted to control a wide variety of processes/plants (whole systems, subsystems, individual components from separate systems, components of a system, and so on), especially those where an s-s controller value can be predicted by a neural network.