Control systems are generally used to automate one or more given functions or operations. For example, there are a number of self-tuning control systems, such as robot manipulator controllers that adapt to gravity loads and engine controllers that change the fuel-air mixture as the temperature varies, among others. One problem with current control systems is that they generally require real-time feedback on task performance in order to adjust their behavior toward the optimal regime. In many tasks, such ground-truth feedback is not available at run time.
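The dependence on run-time feedback can be illustrated with a minimal sketch. It assumes a hypothetical plant whose output is simply an unknown gain multiplied by the command; the function name `run_self_tuning`, its parameters, and the update rule are illustrative assumptions and are not taken from any particular controller described above.

```python
def run_self_tuning(true_gain, target, steps=50, rate=0.5, feedback_available=True):
    """Simulate a controller that tunes its estimate of an unknown plant gain.

    Hypothetical plant model (an assumption for illustration):
        output = true_gain * command
    """
    gain_estimate = 1.0  # initial guess, deliberately different from the true gain
    output = 0.0
    for _ in range(steps):
        # Choose the command that would hit the target if the estimate were correct.
        command = target / gain_estimate
        # The plant responds according to its true (unknown) gain.
        output = true_gain * command
        if feedback_available:
            # Run-time feedback: the measured output reveals the effective gain,
            # and the estimate is moved part of the way toward it.
            measured_gain = output / command
            gain_estimate += rate * (measured_gain - gain_estimate)
    return output


if __name__ == "__main__":
    print("with feedback:   ", run_self_tuning(2.0, 10.0))
    print("without feedback:", run_self_tuning(2.0, 10.0, feedback_available=False))
```

In this sketch, the controller with feedback settles near the target, whereas the controller without feedback keeps its initial gain estimate and its output error never shrinks.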
There are also a number of supervised learning techniques that can be used to train control systems. Examples of such techniques include Support Vector Machines (SVM), Bayesian Networks (BN), Artificial Neural Networks (ANN), Classification and Regression Trees (CART), perceptrons, and C4.5, among others. However, these approaches are typically applied in a tabula rasa (blank slate) configuration and as a single black box. Therefore, when these conventional supervised learning techniques are used to train control systems, it is difficult to bias the systems into a known semi-optimal starting state. It is also difficult to impart useful knowledge of the system's structure, even when that structure is known in advance. As a result of these drawbacks, such systems generally require large amounts of data to learn the underlying function well. Acquiring this data, as well as the ground truth needed to train the system, can be time-consuming and expensive.