1. Field of the Invention
The present invention generally relates to a natural language input method and apparatus which allow computer usable data to be input by recognising a natural language input which includes pauses.
2. Description of Related Art
When inputting data in a natural language, a user may pause during the input, and this can adversely affect the recognition of the natural language input.
In particular, in speech recognition systems which use context free grammars, if a user inserts pauses other than at the places expected by the speech recognition engine, e.g. at the end of a sentence, the resultant speech recognition accuracy can be adversely affected.
There are many reasons why a speaker may insert pauses during speech input, e.g. when emphasising words, and such pauses may not be properly interpreted by the speech recogniser. Pauses may also occur in the speech input where actions are involved. One particular area in which this occurs is the field of multimodal data input.
In general, pauses or breaks may occur deliberately or inadvertently.
In order to increase the richness with which a user can interact with a machine, it has become common for the user to be able to interact with the machine using more than one type of input device, i.e. more than one modality. For example, it is common in speech recognition systems used on general purpose computers to allow a user to input data using a speech recognition engine, and to supplement the input of speech data with mouse data and keyboard data. Multimodal systems combine input modalities such as touch, pen, speech and gesture to allow more natural and powerful communication than any single modality would allow alone.
When one of the modalities comprises a channel by which natural language can be input, the fact that the user interacts with more than one modality at the same time means that the inputting of data using a second modality can affect the inputting of data using natural language, i.e. when a user is inputting data in a second modality, this can cause a delay in the input of natural language. For example, when a user uses a multimodal system for inputting speech and mouse events, the user may pause during speech in order to properly locate the pointer controlled by the mouse before generating the mouse event. This pause in the natural language input can in some instances cause errors in the recognition of the natural language input. The reason for this is that some speech recognition systems use context free grammars for the recognition process. A context free grammar defines a whole utterance (i.e. a portion of speech between pauses). Thus a pause appearing in the middle of what the recognition engine expects to be a single utterance causes the recognition engine to treat the input speech as two shorter utterances. The recognition engine will thus try to match the two utterances separately to the grammar rules, and since neither fragment on its own need satisfy a rule that defines the whole utterance, this causes misrecognition.
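The effect described above can be illustrated with a minimal sketch. It is not taken from any particular recognition engine: the toy grammar, the `<pause>` marker and the function names `match`, `accepts`, `split_on_pauses` and `recognise` are all assumptions made for illustration, and words stand in for recognised speech. The sketch splits the input into utterances at each pause and tries to match each utterance, as a whole, against a small context free grammar.

```python
from typing import Dict, List

# Hypothetical toy context free grammar (an assumption for illustration).
# Each non-terminal maps to a list of alternative expansions.
GRAMMAR: Dict[str, List[List[str]]] = {
    "COMMAND": [["VERB", "OBJECT", "to", "TARGET"]],
    "VERB": [["send"], ["fax"]],
    "OBJECT": [["this"], ["that"]],
    "TARGET": [["him"], ["her"]],
}

PAUSE = "<pause>"  # hypothetical marker for a detected pause in the speech


def match(symbol: str, words: List[str], pos: int) -> List[int]:
    """Return every end position reachable by expanding symbol at words[pos:]."""
    if symbol not in GRAMMAR:
        # Terminal symbol: it must equal the word at the current position.
        return [pos + 1] if pos < len(words) and words[pos] == symbol else []
    ends: List[int] = []
    for expansion in GRAMMAR[symbol]:
        positions = [pos]
        for part in expansion:
            positions = [e for p in positions for e in match(part, words, p)]
        ends.extend(positions)
    return ends


def accepts(words: List[str]) -> bool:
    """True only if the grammar derives the whole utterance, not a prefix."""
    return len(words) in match("COMMAND", words, 0)


def split_on_pauses(tokens: List[str]) -> List[List[str]]:
    """Treat each run of words between pauses as a separate utterance."""
    utterances: List[List[str]] = []
    current: List[str] = []
    for token in tokens:
        if token == PAUSE:
            if current:
                utterances.append(current)
            current = []
        else:
            current.append(token)
    if current:
        utterances.append(current)
    return utterances


def recognise(tokens: List[str]) -> List[bool]:
    """Match every pause-delimited utterance separately against the grammar."""
    return [accepts(utterance) for utterance in split_on_pauses(tokens)]


if __name__ == "__main__":
    print(recognise(["send", "this", "to", "him"]))
    print(recognise(["send", "this", PAUSE, "to", "him"]))
```

In this sketch the uninterrupted input is accepted as one utterance, whereas the same words with a pause after "this" (for instance while the user positions the mouse pointer) are split into two fragments, neither of which satisfies the rule for a complete command.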