1. Field of the Invention
The present invention relates to the field of next generation networking (NGN) and more particularly to the deployment and delivery of composite services over an NGN network.
2. Description of the Related Art
Next generation networking (NGN) refers to emerging computer networking technologies that natively support data, video and voice transmissions. In contrast to the circuit switched telephone networks of days gone by, NGN networks are packet switched and combine voice and data in a single network. Generally, NGN networks are characterized by a split between call control and transport. Also, in NGN networks, all information is transmitted via packets which can be labeled according to their respective type. Accordingly, individual packets are handled differently depending upon the type indicated by a corresponding label.
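By way of illustration only, the label-dependent handling of packets described above can be sketched as follows. The label names ("voice", "video", "data"), the priority values and the function name are hypothetical and are chosen solely for the example; they are not drawn from any NGN standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    label: str      # hypothetical type label: "voice", "video" or "data"
    payload: bytes


# Hypothetical priorities for the example: a lower value is served first.
PRIORITY = {"voice": 0, "video": 1, "data": 2}


def schedule(packets):
    """Order packets by the priority of their label.

    sorted() is stable, so packets that share a label keep their
    arrival order. Unknown labels are served last.
    """
    return sorted(packets, key=lambda p: PRIORITY.get(p.label, len(PRIORITY)))


arrivals = [
    Packet("data", b"a"),
    Packet("voice", b"b"),
    Packet("video", b"c"),
    Packet("voice", b"d"),
]
ordered = schedule(arrivals)
```

In this sketch the two voice packets are forwarded ahead of the video and data packets, which is one simple way in which handling can differ according to the type indicated by a label.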
The IP Multimedia Subsystem (IMS) is an open, standardized, operator friendly, NGN multimedia architecture for mobile and fixed services. IMS is a Voice over Internet Protocol (VoIP) implementation based upon a variant of the session initiation protocol (SIP), and runs over the standard Internet protocol (IP). Telecom operators in NGN networks offer network controlled multimedia services through the utilization of IMS. The aim of IMS is to provide new services to users of an NGN network in addition to currently available services. This broad aim of IMS is supported through the extensive use of underlying IP compatible protocols and corresponding IP compatible interfaces. In this way, IMS can merge the Internet with the wireless, cellular space so as to provide to cellular technologies ubiquitous access to useful services deployed on the Internet.
Multimedia services can be distributed within NGN networks and non-NGN networks alike through the use of markup specified documents. In the case of a service having a visual interface, visually oriented markup such as the extensible hypertext markup language (XHTML) and its many variants can specify the visual interface for a service when rendered in a visual content browser through a visual content channel, for instance a channel governed by the hypertext transfer protocol (HTTP). By comparison, an audio interface can be specified for a service by voice oriented markup such as the voice extensible markup language (VoiceXML). In the case of an audio interface, a separate voice channel can be established, for instance a channel governed according to SIP.
In many circumstances, it is preferred to configure services to be delivered across multiple, different channels of differing modalities, including the voice mode and the visual mode. In this regard, a service provider cannot always predict the interactive modality through which a service is to be accessed by a given end user. To accommodate this uncertainty, a service can be prepared for delivery through each anticipated modality, for instance by way of voice markup and visual markup. Generating multiple different markup documents to satisfy the different modalities of access, however, can be tedious. In consequence, merging technologies such as XHTML+VoiceXML (X+V) have been utilized to simplify the development process.
Specifically, X+V represents one technical effort to produce a multimodal application development environment. In X+V, XHTML and VoiceXML can be mixed in a single document. The XHTML portion of the document can manage visual interactions with an end user, while the VoiceXML portion of the document can manage voice interactions with the end user. In X+V, command, control and content navigation can be enabled while simultaneously rendering multimodal content. In this regard, the X+V profile specifies how to compute grammars based upon the visual hyperlinks present in a page.
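By way of illustration only, the notion of computing a grammar from the visual hyperlinks present in a page can be sketched as follows. The sketch does not reproduce the algorithm of the X+V profile; it merely assumes that the visible text of each hyperlink serves as a spoken phrase that maps to the target of the hyperlink. The class name, the function name and the sample page are hypothetical.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (spoken phrase, href) pairs from the anchors in a page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the anchor currently open, if any
        self._text = []     # text fragments seen inside that anchor

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Normalize whitespace and case so the phrase is easy to match.
            phrase = " ".join("".join(self._text).split()).lower()
            if phrase:
                self.links.append((phrase, self._href))
            self._href = None


def link_grammar(markup):
    """Map each hyperlink phrase to its target (illustrative assumption)."""
    collector = LinkCollector()
    collector.feed(markup)
    collector.close()
    return dict(collector.links)


page = '<p><a href="/news">Latest  News</a> and <a href="/mail">Mail</a></p>'
grammar = link_grammar(page)
```

Under these assumptions, a voice interaction that recognizes one of the phrases in the resulting mapping can navigate to the same target that selecting the corresponding visual hyperlink would reach.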
Processing X+V documents, however, requires the use of a proprietary browser in the client devices utilized by end users when accessing the content. Distributing multimedia services to a wide array of end user devices, including pervasive devices across NGN networks, can be difficult if one is to assume that all end user devices are proprietarily configured to handle X+V and other unifying technologies. Rather, at best, it can only be presumed that devices within an NGN network are equipped to process visual interactions within one, standard channel of communication, and voice interactions within a second, standard channel of communication.
Thus, despite the promise of X+V, to truly support multiple modalities of interaction with services distributed about an NGN network, or even a non-NGN network, different channels of communication must be established for each different modality of access. Moreover, each service must be separately specified for each different modality. Finally, once a session has been established across one modality of access to a service, one is not able to change mid-session to a different modality of access to the same service within the same session. As a result, the interactions across different channels accommodating different modalities of interaction remain unsynchronized and separate. Consequently, end users cannot freely switch between modalities of access for services in an NGN network.