Human-computer dialog is an interactive process in which a computer system attempts to collect information from a user and respond appropriately. Spoken dialog systems are important for two reasons. First, these systems can save companies money by reducing the need to hire people to answer phone calls. For example, a travel agency can set up a dialog system to determine the specifics of a customer's desired trip, without the need for a human to collect that information. Second, spoken dialog systems can serve as an important interface to software systems where hands-on interaction is either not feasible (e.g., due to a physical disability) or less convenient than voice.
Recently, researchers have investigated the use of reinforcement learning for optimal decision-making in spoken dialog systems. The goal of reinforcement learning algorithms is to learn a policy, a mapping from states to actions, which tells a system what it should do in any represented state of the dialog. To use these algorithms, dialog designers have had to explicitly specify a reward function mapping states of the dialog to numeric values, or conduct usability studies after a base system has been deployed to obtain numeric values for various dialog states from users' subjective evaluations, or both.
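To make the terms policy and reward function concrete, the following is a minimal sketch for a toy travel-booking dialog. Everything in it is assumed for illustration: the state and action names, the deterministic transitions, and the reward values are hypothetical, and value iteration on a known model stands in for whichever reinforcement learning algorithm a real system would use.

```python
# Hypothetical dialog states and system actions (illustrative only).
STATES = ["no_info", "has_destination", "has_destination_and_date", "booked"]
ACTIONS = ["ask_destination", "ask_date", "confirm_booking"]

# Assumed deterministic transitions: (state, action) -> next state.
TRANSITIONS = {
    ("no_info", "ask_destination"): "has_destination",
    ("has_destination", "ask_date"): "has_destination_and_date",
    ("has_destination_and_date", "confirm_booking"): "booked",
}


def next_state(state, action):
    # An action that does not advance the dialog leaves the state unchanged.
    return TRANSITIONS.get((state, action), state)


def reward(state):
    """Hand-specified reward function: dialog state -> numeric value."""
    # Assumed values: a completed booking is good, every extra turn costs a little.
    return 10.0 if state == "booked" else -1.0


def action_value(state, action, values, gamma):
    # Reward for the state reached, plus the discounted value of that state.
    nxt = next_state(state, action)
    return reward(nxt) + gamma * values[nxt]


def learn_policy(gamma=0.9, sweeps=50):
    """Value iteration; returns a policy as a dict from state to action."""
    values = {s: 0.0 for s in STATES}
    non_terminal = [s for s in STATES if s != "booked"]
    for _ in range(sweeps):
        for s in non_terminal:
            values[s] = max(action_value(s, a, values, gamma) for a in ACTIONS)
    # The policy picks, in each state, the action with the highest value.
    return {
        s: max(ACTIONS, key=lambda a: action_value(s, a, values, gamma))
        for s in non_terminal
    }


policy = learn_policy()
for state, action in policy.items():
    print(f"{state} -> {action}")
```

The resulting policy is just a lookup table from dialog state to system action. Its content depends entirely on the numbers returned by `reward`, which is why the source of those numbers (a designer's specification or users' subjective evaluations) matters.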