The World Wide Telecom Web (also referred to as the Spoken Web) contains interconnected voice applications (called VoiceSites or voice sites) that can be accessed from any regular phone. With existing approaches, a voice site can support speech (spoken word) or dual-tone multi-frequency (DTMF) signaling as input modalities.
However, with DTMF, one is limited by the number of digits on the phone, and remembering the mapping (digits to commands) can become tedious. Also, pressing digits may not be a natural way to issue a command (for example, scrolling the scroll bar on a website is more natural than pressing ‘1’ to go down, ‘2’ to go up, etc.). Similarly, with a speech input modality, remembering the mapping (words to commands) can become tedious, such techniques are language dependent, and one is limited by speech recognition accuracy. Consequently, a need exists for improved means of controlling a voice site through all kinds of phones, independent of the platform.