The present invention relates to access and rendering of information in a computer system. More particularly, the present invention relates to access of information using recognition and understanding.
Recently, technology has advanced to allow a user to access information on a computer system by providing speech commands. Upon receipt of a user command, the computer system performs speech recognition on the user input and further processes the input to ascertain the intent of the user, so that the computer system can perform the desired action.
In some situations, the input provided by the user is incomplete or indefinite, which requires the computer system to solicit further information from the user through visual or audible prompts. A dialog can thus be established between the user and the computer system, in which each takes turns providing questions, answers and/or acknowledgments until the intent of the user is ascertained and an action can be performed. In other situations, creating such a dialog is the preferred mode for interacting with the computer system.
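By way of illustration only, the turn-taking dialog described above can be sketched as a loop that prompts for each missing piece of information until the intent is complete. This is a minimal sketch under stated assumptions and not an implementation taken from any particular system: the function name run_dialog, the notion of named "slots", the ask callback and the max_attempts limit are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple


def run_dialog(
    required: List[str],
    known: Dict[str, str],
    ask: Callable[[str], str],
    max_attempts: int = 3,
) -> Tuple[Dict[str, str], List[str]]:
    """Alternate system prompts and user answers until every slot is filled.

    `required` lists the hypothetical pieces of information (slots) needed
    before an action can be performed; `known` holds what the initial,
    possibly incomplete, user input already supplied; `ask` stands in for
    presenting a visual or audible prompt and recognizing the reply.
    """
    filled = dict(known)
    transcript: List[str] = []
    for slot in required:
        attempts = 0
        # Keep soliciting this slot while it is missing or empty.
        while not filled.get(slot):
            if attempts == max_attempts:
                raise ValueError(f"could not obtain a value for {slot!r}")
            attempts += 1
            prompt = f"Please provide {slot}."
            transcript.append(f"system: {prompt}")   # system turn
            answer = ask(prompt).strip()
            transcript.append(f"user: {answer}")     # user turn
            if answer:
                filled[slot] = answer
    # Intent is now complete; the system acknowledges and can act.
    transcript.append("system: OK, performing the action.")
    return filled, transcript


if __name__ == "__main__":
    # Scripted replies stand in for recognized speech in this sketch.
    replies = iter(["", "Seattle"])
    slots, log = run_dialog(
        ["origin", "destination"], {"origin": "Boston"}, lambda _p: next(replies)
    )
    print(slots)
    print("\n".join(log))
```

In this sketch the system acts only after each user turn has ended, which is the strict alternation of turns that such dialogs conventionally follow.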
Speech Application Language Tags (SALT) has been introduced to facilitate speech as a viable input/output modality for modern user interface design. The design goal for SALT is to make common speech tasks simple to program, yet allow advanced capabilities with straightforward realization. SALT was designed for many applications, one being, for example, a telephone-based, speech-only application that interacts with users exclusively through spoken dialogue.
SALT includes speech input and output objects (“listen” and “prompt”), which have a mode design that incorporates technologies to detect the start and the end of the user's turn. Accordingly, many speech applications employ user interfaces that require the user to signal the start of a user turn. Some computer systems include wearable computers, speech-enabled modal or multimodal (speech input provided for fields selected by an input device such as a mouse) devices and other eyes-free applications. Nevertheless, in each of these environments, a clean-cut definition of the user's turn versus the computer system's turn in the dialog is still present.
Human conversation, however, does not generally follow a clean-cut, turn-taking dialog between participants. Rather, conversations can include acknowledgements, confirmations, questions, etc., offered by one participant while the other is providing information, and these may drastically affect, slightly affect or not affect at all the manner in which the speaker is providing information. Human speakers enjoy this natural form of conversation. Likewise, telephone systems employ full duplex technology in order to allow such conversations to take place.
In contrast, dialogue-based interfaces employ a rigid turn-taking mode of operation between a user and a computer system, which causes the computer system to wait for the end of the user's turn before processing the input and taking subsequent action. Simple feedback, such as a visual indication like a series of dots progressing across a computer screen, may provide the user some assurance that the computer system is at least processing something. However, until the user finishes his/her turn and the computer system responds, the extent of understanding by the computer system is not known.
Accordingly, there is a need for improvements in computer systems that are based on recognition and understanding. Such improvements would provide a system or method for accessing information that would be easier to use by being more natural for the user.