Field of Invention
The present invention relates to systems and methods for assisting a user in retrieving information using a conversational interface and, more specifically, to techniques for using speech disfluencies during speech input to assist in interpreting the input.
Description of Related Art and Context of the Invention
Almost all languages (at least in their modern-era forms) have a repertoire of punctuation marks to disambiguate the meaning of sentences and to imbue them with emotion (the emergence of emoticons adds further to this repertoire). Expression of intent for information retrieval by written text can tap into this repertoire, particularly for demarcating title and phrase boundaries. It is not uncommon for even an average search engine user to demarcate titles or phrases with quotation marks in search inputs, in order to disambiguate the input and retrieve the desired results. Punctuation marks serve, in many instances, to completely change the semantic interpretation of a sentence. For instance, as FIGS. 1 and 2 show, a parser outputs different parse trees, and ascribes different meanings to terms, depending on the presence of quotation marks.
Expression of intent for information retrieval by speech, however, has just a couple of choices available to augment speech: intonation and pauses (facial expressions and gesticulations are only meaningful when the listener can also see the speaker). While intonation is very effective for expressing mood or the user's subjective state, it may not be an effective speech augmentation for information retrieval when the listener is a machine and not a human. Although some responsive IVR (interactive voice response) systems may detect drastic changes in intonation, pitch increase in particular, as a cue to promptly delegate further interaction with the irked customer to a human representative, this technique does not involve inferring the user's expression of intent.
One remaining analog for punctuation in speech input for information retrieval is the pause, a short period of silence within a sentence. Perhaps it is the very paucity of “punctuation equivalent” options in speech that has led humans, in languages such as English, to devise additions to a pause, where the additive words/phrases accompanying the pause are interspersed within the sentence. For instance, a journalist reporting on a speech by a politician might say, “Mr. X responded to the mounting accusations and I quote <pause> these accusations have no basis . . . <pause> end quote”. If the same sentence were reported in writing, the journalist would simply have written: in response to the mounting accusations Mr. X said, “These accusations have no basis . . . ”.
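The correspondence between the spoken and written forms in the journalist example can be illustrated with a minimal sketch. It is offered only as an illustration and is not the method of the invention. It assumes that a speech recognizer emits a literal `<pause>` token for each detected silence; that token, the function name `spoken_quote_to_text`, and the regular expression are all hypothetical.

```python
import re

# Assumption: the recognizer marks each detected silence with the literal
# token "<pause>". The token and this function name are hypothetical.
_SPOKEN_QUOTE = re.compile(
    r"(?:and\s+)?I\s+quote\s*<pause>\s*(.*?)\s*<pause>\s*end\s+quote",
    flags=re.IGNORECASE | re.DOTALL,
)


def spoken_quote_to_text(transcript: str) -> str:
    """Replace 'and I quote <pause> X <pause> end quote' with '"X"'.

    Text outside the spoken markers is returned unchanged.
    """
    return _SPOKEN_QUOTE.sub(lambda m: f'"{m.group(1)}"', transcript)


if __name__ == "__main__":
    spoken = (
        "Mr. X responded to the mounting accusations and I quote<pause> "
        "these accusations have no basis . . . <pause>end quote"
    )
    print(spoken_quote_to_text(spoken))
```

In this sketch, the pause together with its additive phrase ("and I quote", "end quote") plays the role that the quotation marks play in the written sentence: it marks where the quoted phrase begins and ends.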