Programming a dynamical system such as a robot to perform a large number of tasks under every possible set of constraints is unrealistic. Inspired by humans' remarkable imitation capability, researchers introduced Learning from Demonstration (LfD), also known as Imitation Learning (IL), which learns the policies of a dynamical system performing a task from demonstrations performed by experts. The problem is similar to that targeted by Reinforcement Learning (RL), in which policies are learned through trial and error, i.e., by letting the dynamical system execute candidate policies in the environment under certain constraints and observing the corresponding outcomes. The two problems differ in that LfD does not assume it is possible to execute a policy and observe its outcome during learning.
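As a minimal illustration of this distinction, the simplest LfD formulation, behavioral cloning, treats policy learning as supervised regression on recorded (state, action) pairs, with no environment interaction at all. The sketch below assumes a hypothetical expert following a linear feedback law on a 1-D point mass; the specific states, gains, and least-squares fit are illustrative, not a method from this work.

```python
import numpy as np

# Hypothetical demonstrations: (state, action) pairs recorded from an
# expert driving a 1-D point mass toward the origin.
rng = np.random.default_rng(0)
states = rng.uniform(-1.0, 1.0, size=(200, 2))  # [position, velocity]
expert_gains = np.array([-2.0, -0.5])           # expert's (unknown) feedback law
actions = states @ expert_gains                 # expert actions u = K s

# Behavioral cloning: fit a policy to the demonstrations by ordinary
# least squares -- purely supervised, with no policy rollouts, in
# contrast to RL's trial-and-error interaction with the environment.
learned_gains, *_ = np.linalg.lstsq(states, actions, rcond=None)
```

Because the fit uses only the logged demonstrations, it inherits their limitation: the learned policy is reliable only on states similar to those the expert visited, which motivates the generalization concern discussed next.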
LfD approaches can replace manual programming for a learned task under the constraints present in the demonstrations. However, the learned policy may fail to transfer when the constraints change. Since it is unrealistic to demonstrate a task under every possible set of constraints, it is desirable to devise LfD approaches that learn generalizable policies.