Modern communication systems facilitate human-to-human communication at a distance. Compared with other information-bearing signals, direct human/human interaction, such as face-to-face speech, is relatively lag-free, noise-free, even, and stable. Accordingly, typical communications systems that support human/human speech are configured to minimize, or at least reduce, signal distortions such as lag, noise, unevenness, and instability.
One popular approach to communication-at-a-distance is network telephony. Generally, network telephony systems exchange digital signals that represent speech. In some network telephony systems, the digital signals are the result of processing captured analog speech signals. In other network telephony systems, one or more of the digital signals are created by a machine as an original signal.
Network telephony has grown in popularity, in part because of the advent of a standard protocol, the Session Initiation Protocol (SIP). Generally, SIP systems use standardized signaling to establish and maintain communications links for human-interactive media. In some cases, the human-interactive media is provided by software systems that create synthesized speech for delivery to the human user. In other cases, the human-interactive media is speech generated by another human, digitized, and transmitted on the network. Thus, SIP systems support both human/human and human/machine communication.
These systems, however, suffer from numerous drawbacks with regard to intelligibility and naturalness. For example, humans are generally adept at filtering out signal distortion in ordinary face-to-face human/human communication, in part because visual information such as body language provides additional context. Face-to-face human/human communication is also essentially delay-free, allowing for a natural pace of conversation.
In human/machine communications, however, signal distortions can cause significant problems in both signal processing and content extraction. For example, unstable connections that suffer from frequent drop-outs can greatly degrade human/machine communication. Delays in the human/machine communication can also greatly degrade the quality of the interaction. Keeping the initial setup period short is key to a natural user experience. This requires efficient setup of the link between the human interface device and the machine hosting the target application for human/machine communication.
In typical prior art systems, the machine-side system sets up a communications link for human-interactive media by “polling by options.” In the polling by options approach, a central machine sends options queries to the machines that host the target applications, and each machine reports its state (e.g., available, busy, etc.). The central machine probes these machine states at regular intervals, which increases network traffic and introduces lag time on the network. Further, because a machine's state can change between polls, periodic polling does not always yield up-to-date machine state information. Generally, up-to-date machine state information is needed to quickly and effectively select appropriate resources to respond to a request. Without up-to-date machine state information, the call setup process for a human/machine communications link can be slow enough to cause unsatisfactory connection times and/or increased signal distortion, leading to impaired communications.
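The two costs of periodic polling described above, extra network traffic and stale machine state, can be illustrated with a small simulation. This is only a sketch under stated assumptions: it does not use SIP or any real telephony library, and the names Host, PollingRegistry, and simulate, the polling interval of 10 time steps, and the timing of the state change are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Host:
    """Hypothetical machine hosting a target application."""
    name: str
    state: str = "available"


class PollingRegistry:
    """Hypothetical central machine that caches host states.

    The cache is refreshed only at polling boundaries, so it can lag
    behind the true state of each host.
    """

    def __init__(self, hosts, interval):
        self.hosts = hosts
        self.interval = interval  # assumed polling period, in time steps
        self.cache = {}
        self.messages_sent = 0

    def tick(self, now):
        """Poll every host when `now` falls on a polling boundary."""
        if now % self.interval == 0:
            for host in self.hosts:
                self.cache[host.name] = host.state
                # Count one query plus one response per host.
                self.messages_sent += 2

    def pick_available(self):
        """Select a host using only the cached (possibly stale) states."""
        for name, state in self.cache.items():
            if state == "available":
                return name
        return None


def simulate():
    hosts = [Host("host-a"), Host("host-b")]
    registry = PollingRegistry(hosts, interval=10)
    stale_picks = []

    for now in range(30):
        registry.tick(now)
        if now == 3:
            # host-a becomes busy between two polls.
            hosts[0].state = "busy"
        if now == 5:
            # A call arrives; the registry still believes host-a is available.
            picked = registry.pick_available()
            actual = {h.name: h.state for h in hosts}[picked]
            stale_picks.append((picked, actual))

    return registry.messages_sent, stale_picks


if __name__ == "__main__":
    messages, stale = simulate()
    print("polling messages sent:", messages)
    print("host picked vs. actual state:", stale)
```

In this sketch the registry exchanges polling messages at every interval whether or not any call arrives, and the call arriving at step 5 is routed to a host that became busy at step 3. Shortening the interval narrows the staleness window but increases the message count proportionally, which is the trade-off the paragraph above identifies.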