An IP Multimedia Subsystem (IMS) is a new multimedia service architecture that is widely recognized as a core technology of the Next Generation Network (NGN). The IMS uses the Session Initiation Protocol (SIP) to support both fixed and mobile access, and relies on an all-IP network to converge mobile and fixed networks. It is a key means of converging full multimedia services such as voice, data, and video. The IMS supports diverse access terminals and access networks, so the access path between an IMS terminal and the IMS core network may be very complex. Intermediate network elements, such as a firewall (FW), a Network Address Translation (NAT) device, an application proxy server, and a service monitoring gateway, may process and control the packets that bear the IMS services, for example SIP and Real-time Transport Protocol (RTP) packets. As a result, the IMS terminal may be unable to access the network or communicate normally.
At present, secure access to IMS services is implemented mainly through tunnel encapsulation. As shown in FIG. 1, a Security Tunnel Gateway (STG) is deployed at the entrance of the IMS core network, and a Secure Sockets Layer (SSL) tunnel client is integrated in the IMS terminal. When the IMS terminal starts, the SSL tunnel client establishes an SSL tunnel with the STG, performs SSL tunnel encapsulation on each SIP/RTP packet according to the process illustrated in FIG. 2, and transmits the encapsulated packet (also called an SSL tunnel packet) to the STG over the SSL tunnel. When the SSL tunnel packet reaches the STG, the STG recovers the SIP/RTP packet from it by the inverse of the process illustrated in FIG. 2 (that is, tunnel decapsulation) and forwards the packet to a media server. Conversely, a packet returned by the media server undergoes the same process and is sent to the IMS terminal, which recovers the original packet. In this way, the thousands of ports bearing the IMS services are aggregated onto the standard Hypertext Transfer Protocol over Secure Socket Layer (HTTPS) port (443) by SSL tunnel encapsulation, intermediate network elements such as the FW, the NAT device, and the proxy are traversed easily, and the SIP/RTP packets can be encrypted. Port aggregation, data encryption, and information integrity protection are thereby provided for the communication data. Besides the SSL tunnel, the SIP/RTP packets may also be encapsulated in a Hypertext Transfer Protocol (HTTP) tunnel or a User Datagram Protocol (UDP) tunnel.
Specifically, as shown in FIG. 2, the process of SSL tunnel encapsulation of the RTP packet is as follows:
Step S202: Collect and encode media data (for example, voice or video data) of the IMS service at a packetization period of 20 ms (that is, at a rate of 50 packets per second).
Step S204: Encapsulate the media data into an RTP packet by using a virtual protocol stack.
Step S206: Add a tunneling protocol header (Encapsulation Header, abbreviated to Enc Hdr in FIG. 2 and also referred to as Enc Header) in front of the RTP packet.
Step S208: Add digest information and packet length padding information (together represented by HMAC-Tail in FIG. 2), perform SSL encryption on the entire packet, and add an SSL protocol header (SSL Header, abbreviated to SSL Hdr in FIG. 2) in front of the encrypted packet to form an SSL record (also called an SSL record unit).
Step S210: Finally, encapsulate the SSL record into a Transmission Control Protocol (TCP) packet (that is, the SSL tunnel packet) and transmit it over the network to the STG.
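The layering produced by steps S206 to S208 can be sketched as follows. This is a minimal Python illustration, not the actual tunnel client: the helper name `build_ssl_record` is hypothetical, the header and tail contents are zero-filled placeholders, no real encryption is performed, and the field sizes (16-byte Enc Hdr, 28-byte HMAC-Tail, 5-byte SSL header) are the ones used in the G.729 example of this section. The record header layout (content type, two version bytes, two length bytes) follows the standard SSL/TLS record format; the specific type and version values are assumptions chosen for illustration.

```python
import struct

# Illustrative sizes in bytes, taken from the G.729 example in this section.
ENC_HDR_LEN = 16    # tunneling protocol header (Enc Hdr)
HMAC_TAIL_LEN = 28  # digest plus packet length padding (HMAC-Tail)
SSL_HDR_LEN = 5     # SSL record header


def build_ssl_record(rtp_packet: bytes) -> bytes:
    """Hypothetical helper that mimics steps S206-S208 with placeholder bytes."""
    enc_hdr = bytes(ENC_HDR_LEN)        # S206: tunneling header in front of the RTP packet
    hmac_tail = bytes(HMAC_TAIL_LEN)    # S208: digest and padding appended
    body = enc_hdr + rtp_packet + hmac_tail
    # A real client would encrypt `body` here; this sketch leaves it unchanged.
    # Record header: content type 23 (application data), version 3.1, 2-byte length.
    ssl_hdr = struct.pack("!BBBH", 23, 3, 1, len(body))
    return ssl_hdr + body


# 60 bytes stands for IP (20) + UDP (8) + RTP (12) + payload (20).
rtp_packet = bytes(60)
record = build_ssl_record(rtp_packet)
print(len(record))  # length of the SSL record before the TCP and IP headers are added
```

In step S210 the record would additionally be carried in a TCP segment, which adds a TCP header and an outer IP header in front of it.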
During the SSL tunnel encapsulation described above, each RTP packet forms one SSL record, and a large amount of additional information is added to each RTP packet. As a result, the finally sent SSL tunnel packet is much longer than the RTP packet, and the bandwidth required to carry each packet increases sharply.
Take media data whose audio coding format is G.729 as an example. The length of an RTP packet is: IP (20)+UDP (8)+RTP (12)+Payload (20)=60 bytes, where IP indicates an IP header, UDP indicates a UDP header, RTP indicates an RTP header, and Payload indicates the payload (that is, media data of the IMS service). Therefore, when 50 packets are transmitted per second (that is, the packetization time of an RTP packet is 20 ms), the required bandwidth is 60*8*(1 s/20 ms)=24 kbit/s.
After the SSL tunnel encapsulation, the length of the finally sent SSL tunnel packet is: IP (20)+TCP (20)+SSL Header (5)+Enc Header (16)+RTP packet (60)+HMAC-Tail (28)=149 bytes, where IP indicates an IP header, TCP indicates a TCP header, SSL Header indicates an SSL header, Enc Header indicates a tunneling protocol header, and HMAC-Tail indicates the combination of digest information and packet length padding information. Therefore, when 50 packets are transmitted per second, the required bandwidth is 149*8*(1 s/20 ms)=59.6 kbit/s.
Comparing the packet lengths before and after the SSL tunnel encapsulation shows that the required bandwidth increases by 35.6 kbit/s (that is, the additional bandwidth is 35.6 kbit/s), an increase of about 148%. This sharp increase places a higher requirement on the user access bandwidth, which degrades user experience and service access capabilities.
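The arithmetic above can be reproduced with a short calculation. The following Python sketch only restates the figures given in this section; the function name `bandwidth_kbps` and the variable names are hypothetical and chosen for illustration.

```python
def bandwidth_kbps(packet_bytes: int, packets_per_second: int = 50) -> float:
    """Bandwidth in kbit/s for a fixed packet length and packet rate."""
    return packet_bytes * 8 * packets_per_second / 1000


# Packet lengths in bytes, as listed in the G.729 example.
rtp_len = 20 + 8 + 12 + 20                      # IP + UDP + RTP + payload
tunnel_len = 20 + 20 + 5 + 16 + rtp_len + 28    # IP + TCP + SSL Hdr + Enc Hdr + RTP packet + HMAC-Tail

plain = bandwidth_kbps(rtp_len)        # before SSL tunnel encapsulation
tunneled = bandwidth_kbps(tunnel_len)  # after SSL tunnel encapsulation
extra = tunneled - plain               # additional bandwidth
increase_pct = extra / plain * 100     # relative increase

print(f"{plain:.1f} kbit/s -> {tunneled:.1f} kbit/s "
      f"(+{extra:.1f} kbit/s, about {increase_pct:.0f}%)")
```

Because the per-packet overhead is fixed at 89 bytes, the relative increase is largest for short payloads such as the 20-byte G.729 payload used here.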