The present invention relates generally to the field of spoken dialog systems, and more particularly to improving spoken dialog systems by applying text clustering to select portions of unlabeled spoken text as candidates for labeling.
Spoken dialog systems (SDSes) are computer systems that are able to converse with humans using a voice interface. SDSes are rapidly becoming a popular form of human-computer interaction, offering users an efficient means of satisfying information gathering or transactional objectives using natural language. While the details are often highly application-dependent, it is common for a dialog system to employ a spoken language understanding (SLU) module to infer the intent behind a given user utterance. For example, the utterance “How much do I have in my account?” might be mapped by an SLU module to an “account_balance” intent.
SLU modules are commonly built on complex statistical classifiers that require large amounts of labeled training data (i.e., known mappings from utterance→intent). Such labeled data is typically obtained by presenting utterances to a subject matter expert (SME) for manual annotation.
Active learning is a known strategy for adaptively choosing which unlabeled examples should be presented to an SME for manual labeling. In contrast to passive learning, wherein training examples are randomly selected from the unlabeled collection, active learning selects the unlabeled examples whose manual labeling is expected to most improve the module's underlying model.
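By way of illustration only, the contrast between passive and active selection may be sketched as follows. The sketch assumes a least-confidence criterion, which is one common active learning strategy and is not asserted here to be the criterion used by any particular system. The function names (`passive_select`, `active_select`, `toy_predict_proba`), the example utterances, the intent names other than "account_balance", and the probability values are all hypothetical and stand in for the output of an SLU classifier.

```python
import random

def passive_select(unlabeled, k, seed=0):
    """Passive learning: pick k utterances uniformly at random."""
    rng = random.Random(seed)
    return rng.sample(unlabeled, k)

def active_select(unlabeled, predict_proba, k):
    """Active learning (least-confidence variant, an assumption of this sketch):
    pick the k utterances whose most likely intent has the lowest probability."""
    def confidence(utterance):
        return max(predict_proba(utterance).values())
    # Ascending sort puts the least confident predictions first.
    return sorted(unlabeled, key=confidence)[:k]

# Hypothetical intent probabilities standing in for a trained SLU classifier.
TOY_PROBS = {
    "how much do I have in my account": {"account_balance": 0.95, "transfer": 0.05},
    "move money to savings": {"account_balance": 0.10, "transfer": 0.90},
    "I need help with my card thing": {
        "account_balance": 0.40, "transfer": 0.35, "card_issue": 0.25},
    "what about the other one": {
        "account_balance": 0.34, "transfer": 0.33, "card_issue": 0.33},
}

def toy_predict_proba(utterance):
    """Return the hypothetical intent distribution for an utterance."""
    return TOY_PROBS[utterance]

if __name__ == "__main__":
    pool = list(TOY_PROBS)
    print("passive:", passive_select(pool, 2))
    print("active: ", active_select(pool, toy_predict_proba, 2))
```

In this sketch, the active selection returns the two utterances on which the hypothetical classifier is least certain, which are the ones an SME would be asked to label first, whereas the passive selection ignores the classifier's predictions entirely.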