1. Technical Field
Embodiments generally relate to automatic speech recognition systems. More particularly, embodiments relate to gesture-augmented speech recognition systems.
2. Discussion
Conventional dictation solutions may use automatic speech recognition (ASR) to identify words, spoken punctuation, and commands in speech input received from a microphone, wherein text is generated from the recognition/identification results. In such a case, the user might use a keyboard and/or mouse to correct the textual ASR output. While such an approach may be suitable under certain circumstances, there remains considerable room for improvement. For example, speaking punctuation and commands aloud can be awkward for the user, particularly in a dictation setting. Moreover, manipulating a keyboard and/or mouse to correct the ASR output may conflict with the user's desire to replace those devices with ASR functionality.