The exemplary embodiment relates to a system and method employing a model for proactively proposing an action, given a sequence of observations. The system and method find particular application in connection with dialog management systems, recommender systems, and other systems where a set of past observations is useful in predicting actions.
Automated dialog systems interact with users in a natural language, often to help them achieve a goal. As an example, a user may be interested in finding a restaurant and may have a set of constraints, such as geographic location, date, and time. The system, acting as a virtual, conversational agent, offers the name of a restaurant that satisfies the constraints. The user may then request additional information about the restaurant. The dialog continues until the user's questions are answered. Dialog systems may find application in many other settings, such as customer call centers.
Current task-oriented dialog systems are generally designed to be reactive, with users initiating conversations. See, for example, Jason D. Williams, et al., “Partially observable Markov decision processes for spoken dialog systems,” Computer Speech & Language 21(2):393-422, 2007. Conventional dialog systems maintain a distribution over latent variables composing the state of the current dialog. There are three basic choices faced by the agent: a) let the user continue to speak, b) repeat a term said by the user for implicit confirmation, and c) ask the user to repeat for disambiguation or explicit confirmation. Rule-based systems have been developed to make such choices. See, Timo Baumann, et al., “Evaluation and optimization of incremental processors,” Dialogue and Discourse 2(1):113-141, 2011. Alternatively, these choices can be formalized as a delayed-reward control task that can be solved using reinforcement learning. See, Hatim Khouzaimi, et al., “Reinforcement learning for turn-taking management in incremental spoken dialogue systems,” Proc. 25th Int'l Joint Conf. on Artificial Intelligence (IJCAI), pp. 2831-2837, 2016.
Incremental approaches to conversation have involved studying the usage of spontaneous dialog act emission as part of an active comprehension mechanism. See, Gudny Ragna Jonsdottir, et al., “Learning smooth, human-like turntaking in realtime dialogue,” Int'l Workshop on Intelligent Virtual Agents, pp. 162-175, 2008.
To enhance the usability of conversational agents, it would be desirable for them to be more proactive. Proactive interaction is defined as the faculty of a conversational agent to spontaneously address the user, independently of user interactions. Such agents could initiate conversations on their own. In such proactive dialog systems, the agent could infer, given a set of observed variables, the pertinence of a given suggestion or piece of conversation, and could also learn from user feedback. For example, a conversational assistant in a vehicle could use a voice interface to warn a driver proactively about a potential traffic jam ahead. Similarly, a personal assistant agent could suggest a venue based on the current location of its user and the user's interests.
One problem in designing a proactive conversation agent is the absence of full feedback. The present system and method enable a conversation agent to infer the quality of its proactive decisions from the partial feedback given by the user in prior interactions.