1. Field of the Invention
The present invention relates to spoken dialog systems and more specifically to a system and method of extracting clauses from received speech to improve spoken language understanding.
2. Discussion of Related Art
Spoken language understanding in human-computer dialog systems must accommodate the characteristic features of human verbal communications. Most notable of such features are: (a) ungrammaticality, (b) presence of dysfluencies like repeats, restarts, and explicit/implied repairs, (c) absence of essential punctuation marks, e.g., end of sentence and comma-separated enumerations, and (d) unpredictable word errors introduced by speech recognizers. These features make the word strings generated by the recognizers, or even generated by literal transcription of speech, syntactically and semantically incoherent.
Current spoken dialog systems circumvent these problems by classifying the entire input directly into a limited number of actions that the dialog system can perform. Such techniques work well when there is only a small number of actions, such as in the case of call routing systems. However, such systems do not scale well for tasks that require a very large number of classes (e.g., problem-solving tasks) or when fine-grained analysis of the user's utterance is needed.
The tasks of identifying sentence boundaries, speech repairs and dysfluencies have been a focus of speech parsing research for several years. Most of the previous approaches cope with dysfluencies and speech repairs in the parser by providing ways for the parser to skip over syntactically ill-formed parts of an utterance. In more recent work, the problem of speech parsing is viewed as a two-step process. A preprocessing step is used to identify speech repairs before parsing begins.
What is needed in the art is an improved clausifier that does not constrain speech edits and restarts to conform to a particular structure. What is further needed in the art is an improved clausifier that processes text more efficiently to generate a set of clauses for spoken language understanding.