A dialog system is a computer system that is designed to converse with a human using a coherent structure and text, speech, graphics, or other modes of communication on both the input and output channels. Dialog systems that employ speech are referred to as spoken dialog systems and generally represent the most natural type of man-machine interface. With the ever-greater reliance on electronic devices, spoken dialog systems are increasingly being implemented in many aspects of daily life. The increasing demands of such systems require shorter system development cycles and better automatic system development techniques. As a result, machine learning techniques, such as reinforcement learning and supervised learning, are applied to learn dialog strategies automatically. These techniques require a significant amount of training data for the automatic learners to sufficiently explore the vast space of possible dialog states and strategies. However, it is often difficult to obtain training corpora that are large enough to ensure that the learned strategies are reliable. One approach to solving this problem is to generate synthetic training corpora using computer-simulated users. The simulated users are built to explore unseen but still possible user behaviors. These simulated users can interact with the dialog systems to generate large amounts of training data in a low-cost and time-efficient manner. Previous studies have shown that the dialog strategies learned from the simulated training data often outperform hand-crafted strategies. There are also studies that use user simulation to train speech recognition and understanding components.
While user simulation is generally useful in dialog system training, it has not been extensively used in the system testing phase, except in very simple cases, such as testing speech recognition components. However, realistic user behaviors are critical in the testing phase because the systems are evaluated and adjusted based on the analysis of the dialogs generated in this phase. Therefore, it is important that the simulated user input used to test the system be as close as possible to actual human input.
In general, present simulated users have a rather limited ability to mimic actual human users' behaviors and typically over-generate possible dialog behaviors. While this is not a major problem in training systems, it is a significant disadvantage in testing systems, where improper test results may be due to the over-generated dialog behavior in the inputs, rather than improper operation of the dialog system. Furthermore, present simulated users cannot provide subjective user satisfaction feedback, which is also important for improving the dialog systems under test.
What is needed, therefore, is a simulated user component that replaces at least some of the human subjects in the test phase of dialog system development to accelerate system development while still obtaining useful feedback from the system evaluation.
What is further needed is a set of comprehensive evaluation measures that can be used to automatically assess the dialog system.