1. Field of the Invention
The present invention relates generally to Internet telephony and, in particular, to automatic orchestration of dynamic multiple party, multiple media communications.
2. Description of the Related Art
Telephone companies are investing in, creating, and deploying voice over Internet Protocol (VoIP) infrastructure. Session Initiation Protocol (SIP) has emerged as the open, de facto standard for inter-device signaling. SIP provides a means for initiating and tearing down media sessions between two endpoints, as well as for negotiating the media types supported by the two endpoints. SIP also allows mid-call changes to the media sessions and to the network locations of the media packet sources and sinks.
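By way of illustration only, the media-type negotiation mentioned above can be reduced to finding the media types that both endpoints support. The following is a minimal sketch under that assumption. The function name `negotiate_media` and the endpoint capabilities are hypothetical and do not appear in any SIP specification; actual SIP endpoints carry out this negotiation by exchanging session descriptions rather than plain lists.

```python
# Simplified illustration only: real SIP endpoints exchange session
# descriptions under an offer/answer model, not plain Python lists.

def negotiate_media(offered, supported):
    """Return the offered media types that the answering endpoint also supports.

    The order of the offer is preserved so that the caller's preference
    ordering survives the negotiation.
    """
    supported_set = set(supported)
    return [medium for medium in offered if medium in supported_set]


# Hypothetical endpoints: the caller offers audio, video, and text,
# while the callee supports only text and audio.
agreed = negotiate_media(["audio", "video", "text"], ["text", "audio"])
print(agreed)
```

An empty result would indicate that the two endpoints share no media type, which is the situation that motivates the conversion services discussed later in this section.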
Because of the expected jump in the number of SIP-enabled devices, the increased functionality offered by SIP devices as compared to traditional public switched telephone network (PSTN) lines, and the blurring of traditional communications media by technologies such as video conferencing, speech-to-text, and text-to-speech, the VoIP landscape is rapidly growing in both size and complexity.
Rather than carrying circuit services on top of the Internet Protocol (IP), the IP Multimedia Subsystem (IMS) offers operators the opportunity to build an open, IP-based service infrastructure that enables easy deployment of new, rich multimedia communication services mixing telecom and data services. The Alcatel® IMS solution is based on 3GPP specifications but is also able to provide common IMS services for mobile and other access networks, including fixed ones. Before trying to emulate the circuit-switched domain, IMS first provides new services that are not too demanding for the underlying access network.
While single-media collaboration, such as voice conferencing or group instant messaging (IM) chat, can be done on the fly, orchestrating multiple-media collaboration among a dynamic set of participants remains largely unsolved, other than by manual, brute-force methods. For example, if a user wishes to have a voice conference with two or more other participants, then a simple single-media session may be initiated. However, if the participants have different media capabilities, then orchestrating the session requires the initiator to coordinate multiple “legs” for the session. For instance, one participant may have voice capabilities while a second participant may have text-only capabilities. The initiator must then set up text-to-voice and voice-to-text conversion so that the text-only device can participate in the session. All of these disparate services must be set up and coordinated in order for the session to take place.
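The coordination burden described above can be made concrete with a small sketch. It assumes a voice session and computes, for each participant, the leg that the initiator would have to set up manually, including any conversion services. All names here (`Participant`, `CONVERTERS`, `plan_legs`) are hypothetical and serve only as illustration; they do not describe any existing system or the invention itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Participant:
    name: str
    media: frozenset  # media the participant's device supports


# Hypothetical conversion services, keyed by (source medium, target medium).
CONVERTERS = {
    ("text", "voice"): "text-to-voice",
    ("voice", "text"): "voice-to-text",
}


def plan_legs(participants, session_medium="voice"):
    """Return one (name, medium, converters) leg per participant.

    A participant that supports the session medium needs no converters.
    Otherwise the leg uses a medium that can be converted in both
    directions, and lists the two conversion services required.
    """
    legs = []
    for p in participants:
        if session_medium in p.media:
            legs.append((p.name, session_medium, []))
            continue
        for medium in sorted(p.media):
            inbound = CONVERTERS.get((medium, session_medium))
            outbound = CONVERTERS.get((session_medium, medium))
            if inbound and outbound:
                legs.append((p.name, medium, [inbound, outbound]))
                break
        else:
            raise ValueError(f"no conversion path for {p.name}")
    return legs


# Hypothetical session: one voice-capable and one text-only participant.
legs = plan_legs([
    Participant("alice", frozenset({"voice"})),
    Participant("bob", frozenset({"text"})),
])
for leg in legs:
    print(leg)
```

Even in this two-party case the text-only leg requires two separate conversion services, and every additional participant or medium adds further legs that must be set up and kept in step by hand.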