Computer-executable systems have been built to undertake actions in a particular domain. Pursuant to an example, computer-executable systems have been developed that receive instructions in a particular language (e.g., English), interpret such instructions, and then intelligently perform some action as a function of the received instructions. Oftentimes, such computer-executable systems are learned in an automated fashion through utilization of machine-learning technologies. For instance, one or more machine-learning algorithms can be provided with instructions for performing a task or collection of tasks, and can further be provided with a format for such instructions. In accordance with a particular example, in the game of chess, an instruction may be to “move the queen from a first position g5 to a second position g1.” The format for such an instruction can also be provided (the movement of the queen from g5 to g1 on the chessboard). The learner then produces a learned mapping that is employed in a system for interpreting chess instructions, such that, desirably, if the instruction “move the king from position c1 to position c2” is received by the system, the system can determine that the king is to be moved from c1 to c2, even if the system has not received such an instruction previously.
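The idea of learning a mapping from an instruction and its format can be illustrated with a minimal sketch. It is not the learner of any particular system: it assumes a single training pair with simplified wording, and it induces a regular-expression template by replacing the known slot values with wildcards. The names `learn_template`, `interpret`, and the slot names `piece`, `source`, and `target` are hypothetical.

```python
import re


def learn_template(instruction, move):
    """Induce a regex from one (instruction, move) pair.

    Each slot value found in the instruction is replaced by a named
    wildcard group, so the template can match unseen instructions.
    """
    pattern = re.escape(instruction.lower())
    for slot, value in move.items():
        pattern = pattern.replace(re.escape(value.lower()), f"(?P<{slot}>\\w+)", 1)
    return re.compile(pattern)


def interpret(template, instruction):
    """Return the structured move for an instruction, or None if it does not fit."""
    match = template.fullmatch(instruction.lower().strip().rstrip("."))
    return match.groupdict() if match else None


# One training example: the instruction text plus its structured format (assumed wording).
template = learn_template(
    "move the queen from position g5 to position g1",
    {"piece": "queen", "source": "g5", "target": "g1"},
)

# An instruction that was never seen during training.
print(interpret(template, "move the king from position c1 to position c2"))
# → {'piece': 'king', 'source': 'c1', 'target': 'c2'}
```

A template induced from one example generalizes only to instructions with the same wording, which is one reason practical systems are trained on many examples.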
The example set forth above pertains to gameplay. It is to be understood, however, that in a broader context, systems have been trained to automatically and intelligently perform various types of functions. Such systems include but are not limited to speech recognition systems, navigation systems, control systems, recommender systems, classification systems, etc.
In certain situations, reinforcement learning techniques can be employed in connection with training the system. In reinforcement learning, feedback is given to the system, such that the system can determine whether an output result was desirable or undesirable. For instance, continuing with the example pertaining to chess, the system can be provided with policies for gameplay or can choose such policies randomly. At the end of a game, the system can be provided with information indicating whether the system won or lost the game, and based on that feedback the system can refine the policies for gameplay.
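The feedback loop described above can be sketched on a game much smaller than chess. The example below is an illustration under stated assumptions, not a description of any particular system: it substitutes a simple take-away game (players alternately remove one to three stones, and whoever takes the last stone wins), plays against a random opponent, and refines action-value estimates using only the win-or-loss signal received at the end of each game. All function names and parameter values are hypothetical.

```python
import random
from collections import defaultdict


def legal(pile):
    """Moves available with `pile` stones left: take 1, 2, or 3."""
    return [n for n in (1, 2, 3) if n <= pile]


def choose(q, pile, epsilon, rng):
    """Epsilon-greedy policy: usually the best-scoring move, sometimes a random one."""
    actions = legal(pile)
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(pile, a)])


def play_episode(q, epsilon, rng):
    """Play one game; return the agent's (state, action) history and the final outcome."""
    pile = rng.randint(1, 10)
    history = []
    while True:
        action = choose(q, pile, epsilon, rng)
        history.append((pile, action))
        pile -= action
        if pile == 0:
            return history, 1.0  # agent took the last stone: win
        pile -= rng.choice(legal(pile))  # random opponent moves
        if pile == 0:
            return history, 0.0  # opponent took the last stone: loss


def train(episodes=5000, epsilon=0.2, seed=0):
    """Refine value estimates from end-of-game feedback only (running averages)."""
    rng = random.Random(seed)
    q = defaultdict(float)
    counts = defaultdict(int)
    for _ in range(episodes):
        history, outcome = play_episode(q, epsilon, rng)
        for key in history:
            counts[key] += 1
            q[key] += (outcome - q[key]) / counts[key]
    return q


def best_action(q, pile):
    """Greedy move under the learned estimates."""
    return max(legal(pile), key=lambda a: q[(pile, a)])


q = train()
print({pile: best_action(q, pile) for pile in (1, 2, 3)})
```

No move is ever labeled good or bad directly; the policy improves only because moves that tend to appear in won games accumulate higher average outcomes.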
Due to the training data employed, the policies employed, or the like, the system may be brittle or over-specified, may require extra input information, or may have other negative characteristics. For instance, overfitting is a common concern in the field of machine learning, in which the output of a system describes random error or noise rather than an appropriate underlying relationship.
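Overfitting can be made concrete with a small synthetic experiment. The sketch below assumes a hypothetical linear relationship with added Gaussian noise and compares two models: one that memorizes the noisy training points (returning the label of the nearest training input) and one that fits a straight line by least squares. The data, names, and parameter values are illustrative assumptions only.

```python
import random


def true_relationship(x):
    """The assumed underlying relationship that the models should recover."""
    return 2.0 * x


def make_data(n, rng, noise=1.0):
    """Sample inputs and noisy observations of the true relationship."""
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [true_relationship(x) + rng.gauss(0.0, noise) for x in xs]
    return xs, ys


def fit_line(xs, ys):
    """Ordinary least-squares fit of a straight line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def fit_memorizer(xs, ys):
    """Overfit model: return the label of the nearest training input, noise included."""
    points = list(zip(xs, ys))
    return lambda x: min(points, key=lambda p: abs(p[0] - x))[1]


def mse(model, xs, targets):
    """Mean squared error of a model against target values."""
    return sum((model(x) - t) ** 2 for x, t in zip(xs, targets)) / len(xs)


rng = random.Random(0)
train_x, train_y = make_data(50, rng)
test_x = [rng.uniform(0.0, 10.0) for _ in range(500)]
test_truth = [true_relationship(x) for x in test_x]

line = fit_line(train_x, train_y)
memorizer = fit_memorizer(train_x, train_y)

print("memorizer train error:", mse(memorizer, train_x, train_y))
print("memorizer test error: ", mse(memorizer, test_x, test_truth))
print("line test error:      ", mse(line, test_x, test_truth))
```

The memorizer reproduces its training data exactly, yet on new inputs it returns the noise attached to nearby training points, so its error against the underlying relationship is larger than that of the simpler line.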