As the momentum for enabling voice over IP (VoIP) in 3G systems such as UMTS grows, operators will be concerned with maximizing the air interface capacity to maximize revenue. While VoIP simplifies core network design and adds new and valuable services compared to traditional circuit-switched (CS) voice, VoIP also inherently adds overhead in the form of large headers.
FIG. 1 illustrates a well-known VoIP protocol stack for transmission in UMTS and the associated data at each layer in the stack. As shown, an AMR speech vocoder 10 encodes speech for transmission. For example, an AMR 7.95 kbps vocoder delivers a 159 bit speech frame every 20 ms. To deliver this speech frame to an IP endpoint, the frame is encapsulated into a real-time transport protocol (RTP) packet at an RTP/RTCP layer, where RTCP refers to the real-time transport control protocol. The RTP layer 12 adds 12 bytes of header to the speech frame, and the header conveys information such as sequence number, time stamp, synchronization source ID, etc. In addition, some padding is done for octet alignment.
Next, a user datagram protocol (UDP) and version 6 internet protocol (IPv6) layer 14 adds, according to UDP, another 8 bytes of header to indicate, for example, source/destination port numbers and a checksum (mandatory for IPv6); and then adds 40 bytes of header (e.g., routing information for each packet) according to IPv6.
Therefore the original 159 bit speech packet becomes 656 bits, which is an overhead of over 300%. Fortunately, it is not necessary to transmit the enormous header for each voice packet over the air interface all the time. 3GPP Release 5 mandates that the robust header compression (RoHC) specified in RFC 3095 be supported in the packet data convergence protocol (PDCP) layer 16 of UMTS. Namely, the PDCP layer 16 in the protocol stack for transmission will include a compressor 17 operating according to RoHC, while the PDCP layer in the protocol stack for reception will include a decompressor operating according to RoHC. As is known, the protocol stack for reception is opposite and complementary to the protocol stack for transmission, and is shown in FIG. 2.
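The overhead figure above can be checked with a short calculation. The following is a minimal sketch using only the sizes stated in this description; the constant names are illustrative, and the 176 bits left after subtracting the headers are assumed to be the speech frame plus payload formatting and octet-alignment padding.

```python
# Header sizes stated above, in bytes.
RTP_HEADER_BYTES = 12
UDP_HEADER_BYTES = 8
IPV6_HEADER_BYTES = 40

SPEECH_BITS = 159   # AMR 7.95 kbps speech frame, one per 20 ms
PACKET_BITS = 656   # size of the uncompressed RTP/UDP/IPv6 packet

# Total RTP/UDP/IPv6 header size in bits.
header_bits = 8 * (RTP_HEADER_BYTES + UDP_HEADER_BYTES + IPV6_HEADER_BYTES)

# What remains is the speech frame plus padding (assumption: nothing else).
payload_bits = PACKET_BITS - header_bits

# Overhead relative to the raw speech frame.
overhead_ratio = (PACKET_BITS - SPEECH_BITS) / SPEECH_BITS

print(f"header bits:  {header_bits}")
print(f"payload bits: {payload_bits}")
print(f"overhead:     {overhead_ratio:.0%}")
```

The headers alone account for 480 of the 656 bits, which is why the overhead exceeds 300% of the 159 bit speech frame.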
The principle behind header compression is that most of the fields in the RTP/UDP/IPv6 header are static; hence they can be sent once uncompressed at call setup from the compressor on the transmission side to the decompressor at the reception side. Once the decompressor has reliably acquired the static information, the compressor starts sending compressed headers carrying information regarding the dynamic parts of the RTP/UDP/IPv6 header. From the compressed header the decompressor is able to fully reconstruct the RTP/UDP/IPv6 header and pass it on to the peer application. In this way, the large RTP/UDP/IPv6 headers are not transmitted over the UMTS air interface for each voice packet, leading to tremendous savings in capacity.
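The static/dynamic split can be illustrated with a toy model. This is a simplified sketch of the principle only, not the RoHC algorithm of RFC 3095: the class names, the dictionary packet representation, and the particular choice of static fields are hypothetical, and the sketch assumes a single full header is enough for the decompressor to acquire the static information.

```python
# Illustrative field split (assumption; FIG. 3 gives the actual split).
STATIC_FIELDS = ("src_addr", "dst_addr", "src_port", "dst_port", "ssrc")
DYNAMIC_FIELDS = ("seq", "timestamp", "marker", "udp_checksum")


class ToyCompressor:
    """Sends the full header once, then only the dynamic fields."""

    def __init__(self):
        self.context_sent = False

    def compress(self, header):
        if not self.context_sent:
            self.context_sent = True
            return {"type": "full", "fields": dict(header)}
        dynamic = {name: header[name] for name in DYNAMIC_FIELDS}
        return {"type": "compressed", "fields": dynamic}


class ToyDecompressor:
    """Stores the static fields and rebuilds full headers from them."""

    def __init__(self):
        self.static = None

    def decompress(self, packet):
        if packet["type"] == "full":
            fields = packet["fields"]
            self.static = {name: fields[name] for name in STATIC_FIELDS}
            return dict(fields)
        if self.static is None:
            raise ValueError("no static context acquired yet")
        # Merge the stored static fields with the received dynamic fields.
        return {**self.static, **packet["fields"]}
```

In this sketch, every header after the first crosses the link with only the four dynamic fields, and the decompressor restores the rest from its stored context.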
FIG. 3 illustrates the dynamic and static fields of the RTP, UDP, and IPv6 headers. The RTP, UDP and IPv6 protocols are well-known in the art as are the headers for these protocols and the fields comprising the headers. Accordingly, these protocols, headers and fields will not be described in detail. For RTP/UDP/IPv6, the compressed headers carry information regarding the sequence number, time stamp, M, and X fields in the RTP header, which are the dynamic RTP fields, and carry information regarding the UDP checksum, which is a dynamic field in the UDP header because it depends on the payload. During, for example, uninterrupted speech, the dynamic information in the RTP header can be further compressed in most situations down to a one byte R-0 header. The R-0 header is a well-known compressed header format set forth by RoHC, and includes a single byte of information. This single byte of information includes 2 bits for packet identification and the 6 least significant bits (LSB) of the RTP sequence number.
Unfortunately, because the 2 byte UDP checksum depends on the payload, as noted above, it cannot be compressed and is sent in every voice packet over the air. While slightly larger headers are sent sometimes to update certain header fields, the vast majority of the time RoHC will operate with just 3 bytes of compressed header (1 byte R-0 + 2 bytes UDP checksum).
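The one byte R-0 layout described above (2 identification bits followed by the 6 LSBs of the RTP sequence number) can be sketched as follows. The function names are hypothetical, the identification bits are assumed here to be `00`, and the decoder uses a simplified interpretation window of the 64 values starting at the last reconstructed sequence number, which is not the exact window rule of RFC 3095.

```python
R0_TYPE_BITS = 0b00   # assumption: value of the 2 packet identification bits
SN_LSB_MASK = 0x3F    # 6 least significant bits
SN_MODULUS = 1 << 16  # RTP sequence numbers are 16 bits wide


def build_r0(sequence_number):
    """Pack the 2 identification bits and the 6 SN LSBs into one byte."""
    return (R0_TYPE_BITS << 6) | (sequence_number & SN_LSB_MASK)


def decode_sn(r0_byte, reference_sn):
    """Recover the full 16 bit SN from its 6 LSBs.

    Simplification: the true SN is assumed to lie in the window
    [reference_sn, reference_sn + 63], modulo 2**16.
    """
    lsb = r0_byte & SN_LSB_MASK
    delta = (lsb - reference_sn) % 64
    return (reference_sn + delta) % SN_MODULUS


if __name__ == "__main__":
    r0 = build_r0(1000)
    print(f"R-0 byte: {r0:#04x}, decoded SN: {decode_sn(r0, 999)}")
```

Because only 6 bits are carried, reconstruction works only while the sequence number stays within 64 of the decompressor's reference, which is one reason larger headers are occasionally needed.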
The output of the PDCP layer 16 is sent to the radio link control (RLC) layer 18. The RLC layer 18 may operate in a transparent mode or unacknowledged mode (UM). The unacknowledged mode is used in the packet switched (PS) domain of UMTS, and in this mode, an additional 1 byte of RLC UM header is added to the voice packet. This results in a total overhead of 4 bytes.
Subsequently, the medium access control (MAC)-d layer 20 performs transport format selection and routes the appropriate number of RLC packet data units (PDUs) from the RLC layer to the physical (PHY) layer 24. Unless logical channel multiplexing is used (not considered here), the MAC-d layer 20 does not add additional header overhead. When the high speed downlink packet access (HSDPA) channel is used in the downlink direction, packets flow from the MAC-d layer 20 to the MAC-hs layer 22, which performs user scheduling, rate selection, and hybrid automatic repeat request (HARQ). When the enhanced dedicated channel (E-DCH) is used in the uplink direction, packets flow from the MAC-d layer 20 to the MAC-e layer 22, which performs rate selection, multiple flow multiplexing, and HARQ. The MAC-hs/MAC-e layer 22 header size is variable, but typically adds approximately 20 bits of header when carrying a single MAC-d PDU. Finally, the physical (PHY) layer 24 adds its own error detection mechanism by attaching CRC bits, and transmits the data packet over the air.
Focusing just on the RoHC overhead of 3 bytes, this results in a 15% overhead in the size of each voice packet for an AMR 7.95 kbps vocoder, and a 20% overhead for an AMR 5.9 kbps vocoder. Therefore, even with RoHC, there is a significant penalty to pay when carrying voice over IP.
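The percentages above follow directly from the frame sizes. The sketch below is illustrative only; the names are hypothetical, and the 118 bit frame size for AMR 5.9 kbps is derived here from 5.9 kbps over a 20 ms frame rather than stated in the description.

```python
ROHC_HEADER_BYTES = 3      # 1 byte R-0 + 2 bytes UDP checksum
RLC_UM_HEADER_BYTES = 1    # added in RLC unacknowledged mode

# Speech frame sizes in bits per 20 ms frame.
AMR_FRAME_BITS = {
    "AMR 7.95 kbps": 159,
    "AMR 5.9 kbps": 118,   # derived: 5900 bit/s * 0.020 s
}


def overhead_percent(header_bytes, frame_bits):
    """Header size as a percentage of the speech frame size."""
    return 100.0 * 8 * header_bytes / frame_bits


if __name__ == "__main__":
    for name, bits in AMR_FRAME_BITS.items():
        rohc = overhead_percent(ROHC_HEADER_BYTES, bits)
        total = overhead_percent(ROHC_HEADER_BYTES + RLC_UM_HEADER_BYTES, bits)
        print(f"{name}: RoHC {rohc:.1f}%, RoHC + RLC UM {total:.1f}%")
```

Including the 1 byte RLC UM header raises the per-packet overhead further, since the total then becomes 4 bytes per voice packet.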