Spoken language processing systems include various modules and components for receiving speech input from a user, determining what the user said, and acting upon what the user said. In some implementations, a spoken language processing system includes an automatic speech recognition (“ASR”) module that receives audio input of a user utterance and generates one or more likely transcriptions of the utterance. Spoken language processing systems may also include a natural language understanding (“NLU”) module that receives textual input, such as a transcription of a user utterance, and determines the meaning of the text in a way that can be acted upon, such as by a computer application. Spoken language processing systems may also include a dialog manager (“DM”) that manages the interaction of a user with the system, prompts the user for information that may be required to execute various applications or perform various functions, provides feedback to the user, etc. For example, a user of a mobile phone may issue a spoken command to initiate a phone call. Audio of the spoken command can be transcribed by the ASR module, and the NLU module can determine the user's intent (e.g., that the user wants to initiate a phone call) from the transcription. The dialog manager can then prompt the user for any additional information required to initiate the phone call (e.g., what number the user would like to call).
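The flow described above, from audio to transcription to intent to prompt, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the names `recognize`, `understand`, `manage_dialog`, `process_utterance`, and `Interpretation` are hypothetical, the recognizer simply decodes bytes as text rather than processing real audio, and the keyword rule stands in for a real NLU model.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Interpretation:
    """Hypothetical NLU output: an intent plus any slots found in the text."""
    intent: str
    slots: dict = field(default_factory=dict)


def recognize(audio: bytes) -> list[str]:
    """Stand-in for an ASR module: returns likely transcriptions, best first.

    A real recognizer would decode audio; here the bytes are treated as
    UTF-8 text so that the sketch stays runnable.
    """
    return [audio.decode("utf-8")]


def understand(transcription: str) -> Interpretation:
    """Stand-in for an NLU module, using a trivial keyword rule."""
    words = transcription.lower().split()
    if "call" in words:
        # Treat the first all-digit token, if any, as the number to call.
        digits = [w for w in words if w.isdigit()]
        slots = {"number": digits[0]} if digits else {}
        return Interpretation("initiate_call", slots)
    return Interpretation("unknown")


def manage_dialog(interp: Interpretation) -> str:
    """Stand-in for a dialog manager: prompt for missing information or act."""
    if interp.intent == "initiate_call":
        if "number" not in interp.slots:
            return "What number would you like to call?"
        return f"Calling {interp.slots['number']}."
    return "Sorry, I did not understand that."


def process_utterance(audio: bytes) -> str:
    """Run the three stages in order on a single utterance."""
    best = recognize(audio)[0]
    return manage_dialog(understand(best))
```

Under these assumptions, `process_utterance(b"make a phone call")` yields a prompt for the missing number, while `process_utterance(b"call 5551234")` proceeds directly to the action.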
In prompting users for information required to perform various functions, dialog managers can follow pre-determined scripts, follow rules regarding what information is required, etc. Some dialog managers may initiate a series of questions, with each question designed to obtain a particular type of information from the user. For example, when a user would like to book a flight, the dialog manager may prompt the user regarding what the user would like to do (DM: “What would you like to do?” User: “Book a flight.”), where the user will be travelling from (DM: “Where would you like to depart?” User: “Los Angeles.”), where the user will be travelling to (DM: “Where would you like to go?” User: “Chicago.”), etc. Other dialog managers allow a user to volunteer information up front, and then prompt the user only for any additional required information (User: “I want to book a flight to Chicago.” DM: “Where would you like to depart?”). The user experience and perceived performance of dialog managers and other spoken language processing system features may be measured in terms of the number of prompts to which a user must respond, the accuracy of the system in interpreting user answers to the prompts, and the total amount of time and effort that must be expended to complete a spoken command or query.
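The two prompting styles described above can be compared with a small slot-filling sketch. It is an illustration under stated assumptions rather than a description of any particular dialog manager: the names `REQUIRED_SLOTS`, `next_missing`, and `run_dialog` are hypothetical, user answers are supplied as a pre-recorded list, and each answer is assumed to be interpreted correctly.

```python
from __future__ import annotations

# Hypothetical rule set: each required slot and the prompt that requests it.
# Insertion order determines the order in which prompts are issued.
REQUIRED_SLOTS = {
    "intent": "What would you like to do?",
    "origin": "Where would you like to depart?",
    "destination": "Where would you like to go?",
}


def next_missing(filled: dict) -> str | None:
    """Return the first required slot that has no value yet, or None."""
    for slot in REQUIRED_SLOTS:
        if slot not in filled:
            return slot
    return None


def run_dialog(initial: dict, answers: list[str]) -> tuple[dict, list[str]]:
    """Prompt for each missing slot in turn, consuming one answer per prompt.

    Returns the filled slots and the list of prompts that were issued.
    The loop stops early if the supplied answers run out.
    """
    filled = dict(initial)
    remaining = iter(answers)
    prompts: list[str] = []
    while (slot := next_missing(filled)) is not None:
        answer = next(remaining, None)
        if answer is None:
            break
        prompts.append(REQUIRED_SLOTS[slot])
        filled[slot] = answer
    return filled, prompts


# Fully scripted dialog: nothing is known at the start.
scripted_slots, scripted_prompts = run_dialog(
    {}, ["Book a flight", "Los Angeles", "Chicago"]
)

# The user volunteers the intent and destination in the first utterance.
volunteered_slots, volunteered_prompts = run_dialog(
    {"intent": "Book a flight", "destination": "Chicago"}, ["Los Angeles"]
)
```

In this sketch the scripted dialog issues three prompts and the volunteered-information dialog issues one, which corresponds to the prompt-count measure mentioned above.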