As information technologies continue to develop, user interaction technologies are widely applied. Speech interaction is a new-generation user interaction mode following keyboard interaction, mouse interaction, and touchscreen interaction. By virtue of its convenience and speed, it is gradually being accepted by the majority of users and has the potential to be adopted on a large scale. For example, there is a growing number of speech-related applications on smart mobile terminals, and smart television manufacturers are replacing traditional handheld remote controllers by introducing speech interaction technology.
In the prior art, speech interaction is based on speech recognition technology. That is, after receiving a speech segment, a speech interaction system first performs content recognition on the speech data to obtain a content recognition result, and learns a user intent according to the content recognition result. Then, according to the user intent, the speech interaction system either performs an operation corresponding to the speech or returns information corresponding to the speech to an end user.
In a process of implementing the present invention, the inventor finds that at least the following problems exist in the prior art:
In the prior art, whenever the speech content is the same, the operation performed by the speech interaction system, or the result it returns, is also the same. The system therefore has relatively few forms of responding to given speech content, and its flexibility is limited.
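For illustration only, the prior-art flow described above (content recognition, then intent, then response) may be sketched as follows. This is a minimal sketch under stated assumptions, not an implementation taken from any actual system: the names SpeechSegment, recognize_content, infer_intent, handle_speech, INTENT_KEYWORDS, and RESPONSES are all hypothetical, and the recognizer is a stub that reads a spoken_text field instead of decoding audio.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeechSegment:
    """Stand-in for a received speech segment (hypothetical type)."""
    samples: bytes     # placeholder for raw audio; unused by the stub recognizer
    spoken_text: str   # assumption: lets the stub "recognize" the content


def recognize_content(segment: SpeechSegment) -> str:
    """Stub content recognition: returns the normalized words of the segment."""
    return segment.spoken_text.strip().lower()


# Hypothetical keyword-to-intent table, for illustration only.
INTENT_KEYWORDS = {
    "weather": "query_weather",
    "volume up": "increase_volume",
}

# Hypothetical fixed response for each intent.
RESPONSES = {
    "query_weather": "Here is today's weather forecast.",
    "increase_volume": "Volume increased.",
    "unknown": "Sorry, I did not understand that.",
}


def infer_intent(content: str) -> str:
    """Learn the user intent from the content recognition result alone."""
    for keyword, intent in INTENT_KEYWORDS.items():
        if keyword in content:
            return intent
    return "unknown"


def handle_speech(segment: SpeechSegment) -> str:
    """Prior-art flow: content recognition, then intent, then response."""
    content = recognize_content(segment)
    intent = infer_intent(content)
    return RESPONSES[intent]


# Two segments with different audio but the same content get the same response.
first = SpeechSegment(samples=b"\x01\x02", spoken_text="What is the weather today")
second = SpeechSegment(samples=b"\x09\x08", spoken_text="what is the weather today")
print(handle_speech(first) == handle_speech(second))
```

Because handle_speech in this sketch consults only the recognized content, any two segments with the same content necessarily receive the same response, regardless of any other property of the audio.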