The present invention is related to the field of user input to computer applications, and in particular to the use of a single input device such as a telephone keypad to provide user input to multiple concurrent computer applications.
Many communications and information services are enabled by telephone access to computer applications. A common mode for interacting with such computer applications is through a touch-tone telephone. Services often connect multiple applications to the user. For example, it is possible for a voice messaging service to be accessed through a pre-paid service. However, this creates a modality problem. The pre-paid platform might wish to know when the user presses and holds the “#” key for a relatively long time (referred to as either the “long pound” or the “long octothorpe”), while the voice messaging platform might wish to know when the user enters digits, such as for menu navigation. The modality problem is that all digits entered by the user today are sent to both applications, because the digits are carried in the bearer channel and both applications listen to that channel. Each application must be prepared to receive and discard notifications of key presses in which it has no interest, which complicates the design of the application and wastes processing and communications resources during operation.
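The modality problem described above can be illustrated with a minimal sketch. The class names, the `KeyEvent` record, and the 2000 ms long-press threshold are all hypothetical and chosen only for illustration; they do not come from any particular platform. Every key event is delivered to both applications, and each one counts the events it has to discard.

```python
from dataclasses import dataclass

LONG_PRESS_MS = 2000  # assumed threshold for a "long pound"; illustrative only


@dataclass
class KeyEvent:
    key: str          # one of 0-9, *, #
    duration_ms: int  # how long the key was held


class PrepaidApp:
    """Hypothetical pre-paid platform: only cares about the long pound."""

    def __init__(self):
        self.handled = []
        self.discarded = 0

    def on_key(self, event):
        if event.key == "#" and event.duration_ms >= LONG_PRESS_MS:
            self.handled.append(event)
        else:
            self.discarded += 1  # received, inspected, thrown away


class VoiceMessagingApp:
    """Hypothetical voice messaging platform: only cares about digits."""

    def __init__(self):
        self.handled = []
        self.discarded = 0

    def on_key(self, event):
        if event.key.isdigit():
            self.handled.append(event)
        else:
            self.discarded += 1


def broadcast(events, apps):
    """Deliver every event to every application, as a shared bearer channel does."""
    for event in events:
        for app in apps:
            app.on_key(event)


prepaid = PrepaidApp()
voicemail = VoiceMessagingApp()
events = [KeyEvent("1", 120), KeyEvent("2", 110), KeyEvent("#", 2500)]
broadcast(events, [prepaid, voicemail])
```

In this sketch three key presses produce six deliveries, of which three are discarded, which is the overhead the paragraph above attributes to the shared bearer channel.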
Numerous applications have been deployed for use in conjunction with the traditional time-division-multiplexed (TDM)-based telephone network. In many cases the applications simply receive the TDM-based “in-band” media stream, i.e., the voice channel, and the applications are responsible for continually decoding the media and monitoring for the presence of certain user input of a signaling nature, such as tones indicating that a particular key on the telephone keypad has been pressed. Certain improvements to the TDM network have been made, such as the Advanced Intelligent Network (AIN), which have the goal of separating signaling traffic from media traffic. However, in practice most of the application logic resides in an “intelligent peripheral” that is coupled along the media path, because of the low-level, device-control nature of the associated protocols. There are too many messages with too short a latency budget for a total separation of application logic from the intelligent peripheral. The result is that application developers write their applications for deployment on the intelligent peripheral, usually in proprietary intelligent-peripheral languages. Thus, AIN does not fulfill the promise of separating application logic from media processing.
The situation is more complex for packet-switched environments, such as Voice-over-IP. Existing approaches, namely H.248.1 (MEGACO), the Session Initiation Protocol (SIP), and an in-band technique described in RFC 2833, are discussed in turn below.
H.248.1 (MEGACO) has a provision for reporting key press digits detected or generated by an “endpoint”, which in H.248.1 is a media gateway (MG). The MG can be an IP phone, an access gateway, or a trunking gateway. An IP phone can transmit the key presses directly at the protocol level, whereas a gateway detects the key presses using DTMF detectors. Media Gateway Control Protocol (MGCP), published as an informational IETF RFC, operates in much the same manner as H.248.1. These protocols employ a master-slave approach in which a Media Gateway Controller (MGC) commands the MG (using a device control protocol signaling link) to connect a tone detector to an incoming circuit and wait for a digit map match. When the MG detects a key press pattern of interest, it notifies the MGC over the same signaling link, returning the actual digit string detected.
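The digit map matching performed by the MG can be sketched as follows. This is a simplified illustration, not the grammar of either protocol: it assumes a small subset of digit map syntax (`x` for any digit, bracketed ranges, `.` for repetition of the preceding element, and `|` between alternatives) and does not model timers, so the first complete match is reported immediately. The names `digit_map_to_regex` and `DigitCollector` are hypothetical.

```python
import re


def digit_map_to_regex(digit_map):
    """Translate a simplified digit map into a compiled regular expression.

    Supported subset (an assumption of this sketch):
      x       any digit 0-9
      [1-7]   a range or set of digits
      .       zero or more repetitions of the preceding element
      |       alternatives, with the whole map optionally wrapped in ( )
    Timer symbols are not modelled.
    """
    body = digit_map.strip()
    if body.startswith("(") and body.endswith(")"):
        body = body[1:-1]
    alternatives = []
    for alt in body.split("|"):
        parts = []
        i = 0
        while i < len(alt):
            ch = alt[i]
            if ch in "xX":
                parts.append(r"\d")
            elif ch == ".":
                if not parts:
                    raise ValueError("'.' must follow another element")
                parts.append("*")
            elif ch == "[":
                end = alt.index("]", i)      # copy the bracketed set verbatim
                parts.append(alt[i:end + 1])
                i = end
            else:
                parts.append(re.escape(ch))  # literal key such as 9, * or #
            i += 1
        alternatives.append("".join(parts))
    return re.compile("(?:" + "|".join(alternatives) + ")")


class DigitCollector:
    """Accumulates detected keys and reports a string once the map matches."""

    def __init__(self, digit_map):
        self._pattern = digit_map_to_regex(digit_map)
        self._buffer = ""

    def on_key(self, key):
        """Feed one detected key; return the matched digit string or None."""
        self._buffer += key
        if self._pattern.fullmatch(self._buffer):
            matched, self._buffer = self._buffer, ""
            return matched  # this is what the MG would report to the MGC
        return None


collector = DigitCollector("(0|[1-7]xxx|#xx)")
results = [collector.on_key(k) for k in "2345"]
```

Here the first three keys return `None` and the fourth returns the full string `"2345"`, mirroring the MG waiting for a digit map match before it notifies the MGC. A fuller implementation would also need inter-digit timers and a way to report input that can no longer match any alternative.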
In H.248.1, however, one and only one MGC may control the resources in an MG. Applications that have an interest in user signaling must be a part of the MGC application—there is no provision for independent, third-party applications to receive user signaling information. In MGCP, a first MGC may “pass off” control to a second MGC, but one and only one MGC may control a resource at any given time. The limitation of one and only one controller controlling a resource is a direct result of the master/slave nature of the MGCP and H.248.1 protocols. That is, the protocol requires the MG to be in an exclusive relationship to an MGC. Although these protocols also allow for “virtual MGs” within a physical MG, in which case there may be multiple MGCs serving as masters to the set of virtual MGs in a single physical MG, the virtual MGs are simply partitions of a physical MG. There is no provision for enabling multiple independent applications to selectively obtain user signaling information from a single stream of user input.
It has been proposed that a peer-to-peer protocol such as the Session Initiation Protocol (SIP) be used to transport key press signaling, such as via the SIP INFO method. The proposed mechanism closely follows the protocol of MGCP and H.248.1, including the use of MGCP and H.248.1 messages for specifying digit maps and notifications. However, the proposals have envisioned only a single application requesting notifications, which is a result of there being no mechanism for addressing endpoints of interest.
Cisco Systems has introduced a method for transporting DTMF digits in the SIP signaling path using the SIP NOTIFY method. However, this method has a number of disadvantages. First, notifications can only go to a single egress gateway; it is not possible for a third-party application to register for notifications. Second, the egress gateway receives notifications of every DTMF digit, whether it has an interest in them or not. Third, there is no provision for selectively passing through or clamping the DTMF tones from the media stream. If the ingress gateway passes DTMF, there is the risk of network elements interpreting both the in-band DTMF and the corresponding DTMF signaling received via the NOTIFY mechanism, potentially resulting in incorrect operation.
Another proposed method of transporting key press signaling is to use in-band representations for the keys. For example, RFC 2833 describes transporting key presses as named events, rather than as digital waveforms representing the key presses. While this approach uses less bandwidth and processing resources in the media path, it has serious drawbacks that limit its usability. First, a point-to-point media relationship between the endpoint and the application is generally assumed, leaving no provision for third-party involvement with collecting digits. Although in theory RFC 2833 could be used with third-party applications, rather complicated and unrealistic setup and operation are required. Additionally, because of the particular way that RFC 2833 handles redundancy, it does not meet the reliability requirements for signaling traffic. Moreover, RFC 2833 uses more bandwidth than is necessary, by sending multiple copies of the same packet for normal, lossless operation. Finally, applications receive all key presses, whether they have an interest in the key presses or not, making for inefficient use of communication and processing resources.
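For illustration, the named-event representation can be sketched as the four-byte payload that RFC 2833 defines for a telephone event: an 8-bit event code, an end bit, a reserved bit, a 6-bit volume, and a 16-bit duration in timestamp units. The function names below are hypothetical, and the `copies=3` default used to show end-of-event redundancy is an assumption of this sketch rather than a statement of what any given sender does.

```python
import struct

# Event codes for DTMF keys: digits 0-9, then * = 10, # = 11, A-D = 12-15.
DTMF_EVENT_CODES = {key: code for code, key in enumerate("0123456789*#ABCD")}
DTMF_KEYS = {code: key for key, code in DTMF_EVENT_CODES.items()}


def pack_telephone_event(key, end, volume, duration):
    """Build the 4-byte named-event payload for one DTMF key."""
    if not 0 <= volume <= 63:
        raise ValueError("volume must fit in 6 bits")
    if not 0 <= duration <= 0xFFFF:
        raise ValueError("duration must fit in 16 bits")
    event = DTMF_EVENT_CODES[key]
    flags = (0x80 if end else 0x00) | volume  # E bit, R bit left at 0, volume
    return struct.pack("!BBH", event, flags, duration)


def unpack_telephone_event(payload):
    """Parse a 4-byte payload back into (key, end, volume, duration)."""
    event, flags, duration = struct.unpack("!BBH", payload)
    return DTMF_KEYS[event], bool(flags & 0x80), flags & 0x3F, duration


def end_of_event_packets(key, volume, duration, copies=3):
    """Return identical end-of-event payloads, illustrating the redundancy.

    The repeat count is an assumption of this sketch.
    """
    payload = pack_telephone_event(key, True, volume, duration)
    return [payload] * copies


payload = pack_telephone_event("#", end=True, volume=10, duration=800)
decoded = unpack_telephone_event(payload)
repeats = end_of_event_packets("#", volume=10, duration=800)
```

Every receiver on the media path sees each of these payloads, including the repeated copies, which is the bandwidth and filtering cost discussed above.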
Thus, what is needed is an efficient system and method for detecting user input of a single mode, such as user key presses, and notifying applications of that input, where the user input signaling follows a signaling path distinct from the media path, and where multiple, independent applications can each receive the signaling independently.