Internet Protocol (IP) Multimedia services provide a dynamic combination of voice, video, messaging, data, etc. within the same session. As the number of basic applications and the media that can be combined grows, the number of services offered to end users will grow, and the inter-personal communication experience will be enriched. This will lead to a new generation of personalised, rich multimedia communication services, including so-called “combinational IP Multimedia” services.
The Universal Mobile Telecommunications System (UMTS) is a third generation wireless system designed to provide higher data rates and enhanced services to subscribers. UMTS is a successor to the Global System for Mobile Communications (GSM), with an important evolutionary step between GSM and UMTS being the General Packet Radio Service (GPRS). GPRS introduces packet switching into the GSM core network and allows direct access to packet data networks (PDNs). This enables high data rate packet switched transmissions well beyond the 64 kbps limit of ISDN through the GSM call network, which is a necessity for UMTS data transmission rates of up to 2 Mbps. UMTS is standardised by the 3rd Generation Partnership Project (3GPP), which is a conglomeration of regional standards bodies such as the European Telecommunications Standards Institute (ETSI), the Association of Radio Industries and Businesses (ARIB) and others. See 3GPP TS 23.002 for more details.
The so-called Long Term Evolution (LTE) is being developed as a successor to UMTS by 3GPP. It is hoped that LTE will increase data rates greatly, for example to 100 Mbps.
The 3G (UMTS/LTE) architectures include a subsystem known as the IP Multimedia Subsystem (IMS) for supporting traditional telephony as well as new IP multimedia services (3GPP TS 22.228, TS 23.228, TS 24.229, TS 29.228, TS 29.229, TS 29.328 and TS 29.329 Releases 5 to 9). Security functions for IMS are specified mainly in TS 33.203, but also in TS 33.178. IMS provides key features to enrich the end-user person-to-person communication experience through the use of standardised IMS Service Enablers, which facilitate new rich person-to-person (client-to-client) communication services as well as person-to-content (client-to-server) services over IP-based networks. The IMS is able to connect to both the PSTN/ISDN (Public Switched Telephone Network/Integrated Services Digital Network) and the Internet.
The IMS makes use of the Session Initiation Protocol (SIP) to set up and control calls or sessions between user terminals (or user terminals and application servers). The Session Description Protocol (SDP), carried by SIP signalling, is used to describe and negotiate the media components of the session. Whilst SIP was created as a user-to-user protocol, IMS allows operators and service providers to control user access to services and to charge users accordingly. The 3GPP has chosen SIP for signalling between a User Equipment (UE) and the IMS as well as between the components within the IMS.
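By way of illustration only, the following minimal Python sketch shows how the media components of a session appear in an SDP body of the kind carried by SIP signalling, and how they might be read out. The sample offer, its addresses, ports and payload types, and the helper name media_components are all assumptions made for this example and are not taken from the specifications cited above.

```python
# Illustrative SDP body as it might be carried in a SIP INVITE.
# All addresses, ports and payload types are made-up example values.
SDP_OFFER = "\r\n".join([
    "v=0",
    "o=alice 2890844526 2890844526 IN IP4 192.0.2.10",
    "s=-",
    "c=IN IP4 192.0.2.10",
    "t=0 0",
    "m=audio 49170 RTP/AVP 0 8",
    "m=video 51372 RTP/AVP 96",
    "a=rtpmap:96 H264/90000",
]) + "\r\n"


def media_components(sdp: str) -> list[dict]:
    """Return one dict per m= line: media type, port, transport, formats.

    Hypothetical helper for this sketch; it only looks at m= lines and
    ignores every other SDP field.
    """
    components = []
    for line in sdp.splitlines():
        if line.startswith("m="):
            media, port, proto, *formats = line[2:].split()
            components.append(
                {"media": media, "port": int(port), "proto": proto, "formats": formats}
            )
    return components


if __name__ == "__main__":
    for component in media_components(SDP_OFFER):
        print(component)
```

Each m= line describes one media component (here one audio and one video stream), which is what the two endpoints negotiate during session set-up.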
Whilst IMS has been established with UMTS/LTE access in mind, i.e. where users access the IMS services via UMTS/LTE cellular networks, IMS is intended to be used with a variety of access network technologies including technologies defined outside 3GPP. As such, a user can connect to an IMS network in a number of different ways, all of which use the Internet Protocol (IP). Terminals implementing IMS clients (such as mobile phones, personal digital assistants, computers, and Home IMS Gateways) can register directly on an IMS network, even when they are roaming in another network. The only requirement is that they can use IPv4/IPv6 and run SIP user agents. Fixed access (e.g. Digital Subscriber Line (DSL), cable modems, PON, Ethernet), mobile access (e.g. CDMA2000, GSM, GPRS, LTE) and wireless access (e.g. WLAN, WiMAX) are all supported. Other phone systems, such as plain old telephone service (POTS), H.323 and non-IMS-compatible Voice over IP (VoIP) systems, may be supported through gateways.
Considering security, IMS provides security for SIP signalling (subscriber authentication and SIP message integrity) built on ISIM-based AKA and IPsec as specified in TS 33.203. The 3GPP organisation is currently conducting a study to define a solution for IMS media security, see TR 33.828. Although there is currently no 3GPP standard to secure media/user plane traffic (e.g. the VoIP traffic itself, which is typically carried by the Real-Time Transport Protocol (RTP)), in the case where an IMS user uses a mobile access network (e.g. 3GPP WCDMA or LTE), it can be assumed that IMS traffic sent across the access network is reasonably well secured by the underlying access network security (e.g. the air interface security of WCDMA). However, this is not the case where the access network is a public access network such as a WLAN or DSL network. The security of user authentication procedures may also vary greatly between different access network types. For example, as discussed, strong ISIM-based authentication may be used in 3GPP access networks, whereas only weak, password-based (digest) authentication or even “bundled” authentication (relying on Layer 2 authentication, TS 33.178) may be used in other access network types.
In order to provide security for IMS users with minimum impact on user terminals, it is proposed to implement an end-to-access-edge (e2ae) media plane encryption solution. This is illustrated in FIG. 1 where an IMS session between two IMS users A and B is secured by encryption between A and a first edge node EA (via an access router AR) and between B and a second edge node EB (via a cellular network comprising a Base Station Transceiver BST). It is assumed that the media plane between EA and EB is secure as a result of the private nature of the operator network(s). Such an e2ae solution is typically preferred over an end-to-end (e2e) solution, as an e2e solution would require some agreement between user terminals (and possibly access networks) as to key negotiation mechanisms and would therefore be difficult, or even impossible, to implement in practice since A and B and/or their respective networks may not have interoperable security solutions. A further advantage of the e2ae approach is that an operator may easily perform transcoding, rate adaptation, and/or lawful intercept on session data, as the data is transported across its network without encryption (or at least in a form that can be decrypted by the operator).
The e2ae solution is often also preferred over an end-to-middle (e2m) approach, in which user terminals establish a secure connection to a common “middle-box” M, as such a solution requires that both ends have access to such a middle-box and are able to communicate with it.
One possible e2ae solution is to employ the IETF protocol known as Session Description Protocol Security Descriptions for Media Streams (SDES), IETF RFC 4568.
In this approach, the end users (A and B) randomly select respective keys, KA and KB, and include them in the SIP call set-up signalling (e.g. INVITE, 200 OK). Rather than using the keys to establish e2e media plane security (with at least some of the ensuing disadvantages outlined above), certain Call Session Control Functions (CSCFs) within the IMS “snoop”, i.e. intercept and extract, these keys in the SIP messages and distribute them to the respective edge-entities. Each edge-entity uses the snooped key to secure data exchanged between it and the attached end user (that is, entity EA uses key KA to secure data with user A, and entity EB uses key KB to secure data with user B). Note that in practice it may be desirable to use different keys when securing traffic originating at A (traffic from A to EA) and when securing traffic terminating at A (traffic forwarded by EA to A). However, as long as A and EA have at least one shared key, KA, it will be easy for them to derive two (or more) keys from KA by application of a cryptographic key derivation function. The same holds for B and EB. SDES is a candidate solution which is considered in the ongoing 3GPP study. While this approach is generic, it has a major drawback in that the SIP signalling itself may not be encrypted, and hence the keys KA/KB are available in the clear to any third party that can observe the signalling. This solution may therefore fail entirely to provide security.
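The key handling described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not an implementation of any 3GPP procedure: the function names make_offer_line, snoop_key and derive_directional_keys are hypothetical, the sample SDP is made up, and HMAC-SHA256 with fixed labels merely stands in for the unspecified "cryptographic key derivation function". The a=crypto attribute layout follows RFC 4568.

```python
import base64
import hashlib
import hmac
import os
import re
from typing import Optional, Tuple

# SDES (RFC 4568) carries key material in an SDP attribute of the form
#   a=crypto:<tag> <crypto-suite> inline:<base64(key || salt)>[|lifetime]...
CRYPTO_RE = re.compile(r"^a=crypto:(\d+) (\S+) inline:([A-Za-z0-9+/=]+)")


def make_offer_line(key_and_salt: bytes) -> str:
    """Build the a=crypto line user A would place in its SDP (illustrative)."""
    b64 = base64.b64encode(key_and_salt).decode("ascii")
    return f"a=crypto:1 AES_CM_128_HMAC_SHA1_80 inline:{b64}|2^20"


def snoop_key(sdp: str) -> Optional[bytes]:
    """Extract the first SDES key found in an SDP body.

    A CSCF could do this, but so could any third party able to read
    unencrypted SIP signalling.
    """
    for line in sdp.splitlines():
        match = CRYPTO_RE.match(line)
        if match:
            return base64.b64decode(match.group(3))
    return None


def derive_directional_keys(ka: bytes) -> Tuple[bytes, bytes]:
    """Derive separate A->EA and EA->A keys from the shared key KA.

    ASSUMPTION: HMAC-SHA256 with fixed labels is used here purely as an
    example key derivation function; the text does not name one.
    """
    uplink = hmac.new(ka, b"A->EA", hashlib.sha256).digest()
    downlink = hmac.new(ka, b"EA->A", hashlib.sha256).digest()
    return uplink, downlink


if __name__ == "__main__":
    # 16-byte master key + 14-byte salt for AES_CM_128_HMAC_SHA1_80
    ka = os.urandom(30)
    sdp = "v=0\r\nm=audio 49170 RTP/SAVP 0\r\n" + make_offer_line(ka) + "\r\n"

    snooped = snoop_key(sdp)
    print("snooped key equals KA:", snooped == ka)

    up, down = derive_directional_keys(snooped)
    print("directional keys differ:", up != down)
```

The sketch makes the drawback concrete: snoop_key needs nothing but read access to the SDP body, so it works equally well for the operator's CSCF and for an eavesdropper on an unprotected signalling path.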