In a telephony network, tones may be generated at different places, depending on the switching technology and on the nature or purpose of the tone. Most tone signals in use in the Public Switched Telephone Network (PSTN) are sequences of simple combinations of sine waves. There is no single international standard for all telephone tones. Third-party vendors provide the telephony equipment that is relied upon to detect telephony devices and telephony tones. Such telephony equipment includes Voice-over-Internet Protocol (VoIP) gateways. A voice platform in a telephony network may interface with the gateway that detects the telephony devices and telephony tones.
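As a rough illustration of a tone built as a "sequence of simple combinations of sine waves", the sketch below sums two sine waves and repeats the result with silent gaps. The function names `dual_tone` and `cadence` are hypothetical. The 8000 Hz sample rate and the frequencies and timings shown follow the North American tone plan and are assumptions chosen for the example; other regions use different values.

```python
import math


def dual_tone(freq_a_hz, freq_b_hz, duration_s, sample_rate_hz=8000):
    """Return samples in [-1, 1] for the sum of two equal-amplitude sine waves."""
    sample_count = int(duration_s * sample_rate_hz)
    return [
        0.5 * math.sin(2 * math.pi * freq_a_hz * i / sample_rate_hz)
        + 0.5 * math.sin(2 * math.pi * freq_b_hz * i / sample_rate_hz)
        for i in range(sample_count)
    ]


def cadence(tone, silence_s, repeats, sample_rate_hz=8000):
    """Repeat a tone burst followed by silence, giving an on/off sequence."""
    silence = [0.0] * int(silence_s * sample_rate_hz)
    samples = []
    for _ in range(repeats):
        samples.extend(tone)
        samples.extend(silence)
    return samples


# Assumed example values (North American plan):
# dial tone is a continuous 350 Hz + 440 Hz combination;
# busy tone is 480 Hz + 620 Hz, 0.5 s on and 0.5 s off.
dial_tone = dual_tone(350, 440, duration_s=0.5)
busy_tone = cadence(dual_tone(480, 620, duration_s=0.5), silence_s=0.5, repeats=2)
```

Because each component is scaled by 0.5, the summed signal cannot exceed the range [-1, 1], so it can be quantized without clipping.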
In a voice platform, a telephony session service may be responsible for general call processing and may interface with telephony gateways or telephony hardware boards. A speech server (e.g., a voice server) may be responsible for media processing (e.g., speech recognition and playback). A recognizer may be responsible for recognizing input against speech and Dual-Tone Multi-Frequency (DTMF) grammars.
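The DTMF digits that a recognizer matches against a grammar arrive as pairs of tones, one from a low-frequency group and one from a high-frequency group. The text does not say how the tones are decoded; one common technique is the Goertzel algorithm, sketched below under that assumption. The names `goertzel_power` and `detect_key` are hypothetical, and the sketch assumes 8000 Hz audio and a frame that is known to contain a key press (it applies no energy threshold, so it would still return a key for silence).

```python
import math

# Standard DTMF keypad frequencies in Hz.
LOW_GROUP = (697, 770, 852, 941)
HIGH_GROUP = (1209, 1336, 1477, 1633)
KEYPAD = ("123A", "456B", "789C", "*0#D")


def goertzel_power(samples, freq_hz, sample_rate_hz=8000):
    """Estimate signal power at one frequency using the Goertzel recurrence."""
    coeff = 2 * math.cos(2 * math.pi * freq_hz / sample_rate_hz)
    s1 = 0.0  # state from the previous sample
    s2 = 0.0  # state from two samples back
    for x in samples:
        s0 = x + coeff * s1 - s2
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2


def detect_key(samples, sample_rate_hz=8000):
    """Pick the strongest low-group and high-group frequency and map to a key."""
    row = max(range(4), key=lambda i: goertzel_power(samples, LOW_GROUP[i], sample_rate_hz))
    col = max(range(4), key=lambda i: goertzel_power(samples, HIGH_GROUP[i], sample_rate_hz))
    return KEYPAD[row][col]
```

A production detector would also check absolute energy, the balance between the two groups, and minimum tone duration before reporting a digit.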
A voice browser may be responsible for interpreting Voice Extensible Markup Language (VoiceXML, or VXML) documents and for directing the operation of the speech server. VoiceXML is a standard Extensible Markup Language (XML) format for specifying interactive voice dialogues between a human and a computer.
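To make the document format concrete, the sketch below embeds a small VoiceXML document and extracts its prompt text with Python's standard XML parser. The dialogue content, the constant `DOCUMENT`, and the helper `list_prompts` are invented for illustration. This is only a toy reading of the markup; an actual voice browser executes the dialogue rather than merely listing prompts.

```python
import xml.etree.ElementTree as ET

# Namespace used by VoiceXML 2.x documents.
VXML_NS = {"v": "http://www.w3.org/2001/vxml"}

# Hypothetical one-field dialogue, written only for this example.
DOCUMENT = """\
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
  <form id="greeting">
    <field name="choice" type="digits">
      <prompt>Press 1 for sales or 2 for support.</prompt>
    </field>
  </form>
</vxml>
"""


def list_prompts(vxml_text):
    """Return the text of every <prompt> element in document order."""
    root = ET.fromstring(vxml_text)
    return [
        "".join(prompt.itertext()).strip()
        for prompt in root.findall(".//v:prompt", VXML_NS)
    ]


prompts = list_prompts(DOCUMENT)
```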
An application may submit instructions to be processed by the voice browser. Audio data received from the telephony network may be packetized and communicated between a telephony gateway and the speech server via the Real-time Transport Protocol (RTP). In such an architecture, the normal circuit-switched trunk terminates at the gateway.
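The packetization step can be sketched as follows. The code splits a buffer of audio bytes into chunks and prefixes each with the 12-byte fixed RTP header defined in RFC 3550. The helper names `build_rtp_packet` and `packetize` are hypothetical, and the sketch assumes G.711 mu-law audio (payload type 0, one byte per sample at 8000 Hz, 160 samples per 20 ms packet). The text does not specify a codec or packet size, so these are illustrative choices only.

```python
import struct


def build_rtp_packet(payload, sequence, timestamp, ssrc, payload_type=0, marker=False):
    """Prefix a payload with the 12-byte fixed RTP header (no CSRC list, no extension)."""
    first_byte = 2 << 6  # version 2, padding = 0, extension = 0, CSRC count = 0
    second_byte = (int(marker) << 7) | (payload_type & 0x7F)
    header = struct.pack(
        "!BBHII",
        first_byte,
        second_byte,
        sequence & 0xFFFF,
        timestamp & 0xFFFFFFFF,
        ssrc & 0xFFFFFFFF,
    )
    return header + payload


def packetize(audio, ssrc, samples_per_packet=160, first_sequence=0, first_timestamp=0):
    """Split one-byte-per-sample audio into RTP packets with advancing counters."""
    packets = []
    for index, offset in enumerate(range(0, len(audio), samples_per_packet)):
        chunk = audio[offset:offset + samples_per_packet]
        # The timestamp advances by the number of samples already sent.
        packets.append(
            build_rtp_packet(chunk, first_sequence + index, first_timestamp + offset, ssrc)
        )
    return packets


# 400 bytes of placeholder audio yields two full packets and one partial packet.
packets = packetize(bytes(400), ssrc=0x1234)
```

A real sender would normally start the sequence number and timestamp at random values and transmit each packet over UDP; both are left out here to keep the header layout in focus.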