Current dialog systems often use speech as both an input and an output modality. A speech recognition function converts speech input to text, and a text-to-speech (TTS) function presents text as speech output. In many dialog systems, the TTS function is used primarily to provide audio feedback that confirms the speech input. For example, in handheld communication devices, a user can use speech input for name dialing. Reliability is improved when TTS is used to confirm the speech input. However, conventional confirmation functions that use TTS take a significant amount of time and resources to develop for each language and also consume significant amounts of memory in the handheld communication devices. This becomes a major problem for worldwide deployment of multilingual devices using such dialog systems.
Skilled artisans will appreciate that elements in the figures are illustrated for simplicity and clarity and have not necessarily been drawn to scale. For example, the dimensions of some of the elements in the figures may be exaggerated relative to other elements to help improve understanding of embodiments of the present invention.