1. Field of the Invention
The present invention pertains generally to telecommunications, and particularly to the compression of headers of packets such as media packets.
2. Related Art and Other Considerations
Due to the tremendous success of the Internet, it has become a challenging task to make use of the Internet Protocol (IP) over all kinds of links. However, because the headers of the IP protocols are rather large, this is not always simple to achieve over narrowband links such as cellular links. As an example, consider ordinary speech data transported by the protocols (IP, UDP, RTP) used for Voice-over-IP (VoIP), where the header may represent about 70% of the packet, resulting in very inefficient usage of the link.
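The 70% figure can be checked with simple arithmetic. The sketch below uses the fixed header sizes of IPv4, IPv6, UDP and RTP (without options or extensions); the 17-byte speech payload is an assumed value chosen only for illustration, and the function name is hypothetical.

```python
# Fixed header sizes in bytes (no IP options, no RTP extensions or CSRC list).
IPV4_HEADER = 20
IPV6_HEADER = 40
UDP_HEADER = 8
RTP_HEADER = 12


def header_fraction(payload_bytes: int, ip_header: int = IPV4_HEADER) -> float:
    """Return the share of an IP/UDP/RTP packet that is occupied by headers."""
    headers = ip_header + UDP_HEADER + RTP_HEADER
    return headers / (headers + payload_bytes)


# Assumed payload size: one 17-byte speech frame (illustrative only).
print(f"IPv4/UDP/RTP header share: {header_fraction(17):.1%}")
print(f"IPv6/UDP/RTP header share: {header_fraction(17, IPV6_HEADER):.1%}")
```

With IPv6 the header share is larger still, since the IP header alone doubles in size.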
The term “header compression” (HC) encompasses the art of minimizing the necessary bandwidth for information carried in headers on a per-hop basis over point-to-point links. Header compression techniques in general have a more than ten-year-old history within the Internet community. Several commonly used header compression protocols exist, such as the following: (1) Van Jacobson. Compressing TCP/IP Headers for Low-Speed Serial Links. IETF RFC 1144, IETF Network Working Group, February 1990; (2) Mikael Degermark, Björn Nordgren, Stephen Pink. IP Header Compression, IETF RFC 2507, IETF Network Working Group, February 1999; and (3) Steven Casner, Van Jacobson. Compressing IP/UDP/RTP Headers for Low-Speed Serial Links, IETF RFC 2508, IETF Network Working Group, February 1999, all of which are incorporated by reference herein in their entirety.
Header compression takes advantage of the fact that some fields in the headers are not changing within a flow, or change with small and/or predictable values. Header compression schemes make use of these characteristics and send static information only initially, while changing fields are sent with their absolute values or as differences from packet to packet. Completely random information has to be sent without any compression at all.
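As a rough illustration of this principle, the following toy sketch sends the full header once and afterwards sends only the differences for fields that changed. It is not ROHC or any of the cited protocols: the class names and field names are hypothetical, every header is assumed to carry the same set of integer fields, and no protection against packet loss is included.

```python
from typing import Dict, Optional

Header = Dict[str, int]


class ToyCompressor:
    """Sends the first header in full, then only per-field differences."""

    def __init__(self) -> None:
        self.context: Optional[Header] = None

    def compress(self, header: Header) -> Header:
        if self.context is None:
            # Initial packet: static and changing fields are all sent.
            self.context = dict(header)
            return dict(header)
        # Later packets: unchanged (static) fields are omitted entirely.
        delta = {
            name: value - self.context[name]
            for name, value in header.items()
            if value != self.context[name]
        }
        self.context = dict(header)
        return delta


class ToyDecompressor:
    """Rebuilds headers from the stored context plus received differences."""

    def __init__(self) -> None:
        self.context: Optional[Header] = None

    def decompress(self, packet: Header) -> Header:
        if self.context is None:
            self.context = dict(packet)
        else:
            for name, diff in packet.items():
                self.context[name] += diff
        return dict(self.context)


compressor, decompressor = ToyCompressor(), ToyDecompressor()
first = {"src_port": 5004, "seq": 100, "timestamp": 16000}
second = {"src_port": 5004, "seq": 101, "timestamp": 16160}
for header in (first, second):
    packet = compressor.compress(header)
    print(packet, "->", decompressor.decompress(packet))
```

Because each packet is encoded relative to the previous one, a single lost packet leaves this toy decompressor with a wrong context, which is the kind of problem that robust schemes are designed to address.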
Header compression is thus an important component to make IP services over wireless, such as voice and video services, economically feasible. Header compression solutions have been developed by the Robust Header Compression (ROHC) Working Group of the Internet Engineering Task Force (IETF) to improve the efficiency of such services.
Robust Header Compression (ROHC), as defined in RFC 3095 (Bormann, C., “RObust Header Compression (ROHC): Framework and four profiles: RTP, UDP, ESP, and uncompressed”, RFC 3095, Internet Engineering Task Force, July 2001), is an extensible framework for which profiles for compression of various protocols may be defined. For real-time multimedia services (e.g. voice, video), the application data is transported end-to-end within an IP/UDP/RTP stream. Header compression of IP/UDP/RTP is defined by the ROHC profile 0x0001 (ROHC RTP) and is applicable for Voice-over-IP (VoIP) services among others. The ROHC RTP header compression scheme has been designed to efficiently compress the IP/UDP/RTP headers over an arbitrary link layer.
A number of other ROHC profiles have also been defined for compression. Among these are (1) IP/UDP/RTP headers (described in: Jonsson, L. and G. Pelletier, RObust Header Compression (ROHC): A Link-Layer Assisted ROHC Profile for IP/UDP/RTP, IETF RFC 3242, April 2002; and Liu, Z and K. Le, Zero-byte Support for Bidirectional Reliable Mode (R-mode) in Extended Link-Layer Assisted RObust Header Compression (ROHC) Profile, IETF RFC 3408, December 2002); (2) IP only headers (described in: Jonsson, L. and G. Pelletier, RObust Header Compression (ROHC): A compression profile for IP, IETF RFC 3843, June 2004); (3) IP/TCP headers (described in: Pelletier, G., Jonsson, L., West, M. and R. Price, RObust Header Compression (ROHC): TCP/IP Profile (ROHC-TCP), Internet Draft (work in progress), <draft-ietf-rohc-tcp-08.txt>, October 2004); and (4) IP/UDP-Lite/RTP headers (described in: Pelletier, G., RObust Header Compression (ROHC): Profiles for UDP-Lite, Internet Draft (work in progress), <draft-ietf-rohc-udp-lite-04.txt>, June 2004). All RFCs cited herein are incorporated by reference herein in their entireties.
Except for negotiation (see also Bormann, C., Robust Header Compression (ROHC) over PPP, IETF RFC 3241, April 2002), ROHC profiles only require framing and error detection to be provided by the link layer, while all other functionality is handled by the ROHC scheme itself.
The ROHC profiles defined in RFC 3095, RFC 3242, RFC 3408, “IP-ONLY” (Jonsson, L. and G. Pelletier, RObust Header Compression (ROHC): A compression profile for IP, IETF RFC 3843, June 2004), and “ROHC-UDPLite” (Pelletier, G., RObust Header Compression (ROHC): Profiles for UDP-Lite, Internet Draft (work in progress), <draft-ietf-rohc-udp-lite-04.txt>, June 2004) all support three different modes of operation. In short, for a specific context, the mode of operation controls the actions and the logic to perform as well as the packet types to use during different states of the header compression operation. Packet types and formats that are allowed may vary from one mode to the other. The Unidirectional mode (U-mode) is used at the beginning of any ROHC compression before any transition to other modes may occur. The Bidirectional Optimistic mode (O-mode) seeks to maximize the compression efficiency and sparse usage of the feedback channel. The Bidirectional Reliable mode (R-mode) seeks to maximize robustness against loss propagation and context damage propagation.
When in U-mode, packets are sent from compressor to decompressor only. The U-mode is thus usable over links where a return path from decompressor to compressor is either not desired or not available. Periodical refreshes are used in U-mode. The U-mode is particularly applicable to broadcast or multicast channels.
The O-mode is similar to the U-mode with the difference that a feedback channel is used to send error recovery requests and (optionally) acknowledgements of significant context updates from the decompressor to compressor. For most ROHC profiles, the U-mode and the O-mode are often indistinctly referred to using the term U/O-mode, due to their rather similar characteristics, such as an identical set of packet formats for both modes.
The R-mode differs significantly from the two other modes, mainly by making a more extensive usage of the feedback channel and a stricter logic for performing context updates. The R-mode also uses a few different packet types only understood and useful in this mode.
Each mode of operation has different properties in terms of compression efficiency, robustness and processing complexity. Mode transitions may only be initiated by the decompressor. ROHC does not specify how and when each mode should be used (other than that ROHC compression must always start in U-mode). Therefore, the logic for mode transitions is an implementation decision and may be based on measurements of the link characteristics, link conditions, implementation optimizations for a specific mode or may be based on other algorithms. In particular, for Broadcast/Multicast type of services, header compression operates in the unidirectional mode (U-Mode) only, as normally for such services a feedback channel from decompressor to compressor is not available or desired.
A header compression scheme (such as a ROHC Profile) can be conceptualized and/or realized as a state machine. A challenging task is to keep the compressor and decompressor states, called contexts, consistent with each other, while keeping the header overhead as low as possible. There is one state machine for the compressor, and one state machine for the decompressor. The compressor state machine directly impacts the level of compression efficiency, as it is an important part of the logic controlling the choice of compressed packet type to be sent. The purpose of the decompressor state machine is mainly to provide the logic for feedback (if applicable) and to identify the packet types for which decompression may be attempted.
A compression context contains and maintains relevant information about past packets, and this information is used to compress and decompress subsequent packets. As explained in the ROHC documentation, the context of the compressor is the state it uses to compress a header. The context of the decompressor is the state it uses to decompress a header. Either of these or the two in combination are usually referred to as “context”, when it is clear which is intended. The context contains relevant information from previous headers in the packet stream, such as static fields and possible reference values for compression and decompression. Moreover, additional information describing the packet stream is also part of the context, for example information about how the IP Identifier field changes and the typical inter-packet increase in sequence numbers or timestamps.
For the ROHC profiles defined in RFC 3095, RFC 3242, RFC 3408, “IP-ONLY” (Jonsson, L. and G. Pelletier, RObust Header Compression (ROHC): A compression profile for IP, IETF RFC 3843, June 2004) and “ROHC-UDPLite” (Pelletier, G., RObust Header Compression (ROHC): Profiles for UDP-Lite, Internet Draft (work in progress), <draft-ietf-rohc-udp-lite-04.txt>, June 2004), FIG. 1 shows the compressor state machine. For ROHC compression, the three compressor states are the Initialization and Refresh (IR), First Order (FO), and Second Order (SO) states. The compressor starts in the lowest compression state (IR) and transits gradually to higher compression states. The compressor will always operate in the highest possible compression state, under the constraint that the compressor is sufficiently confident that the decompressor has the information necessary to decompress a header compressed according to that state. See, e.g., RFC 3095, section 4.3.1 (Carsten Bormann, et al. RObust Header Compression (ROHC): Framework and four profiles: RTP, UDP, ESP and uncompressed; IETF RFC 3095, July 2001). In particular, while operating in U-Mode, decisions about transitions between the various compression states are normally taken by the compressor on the basis of variations in packet headers and periodic timeouts.
RFC 3095 defines the Initialization and Refresh (IR) state in section 4.3.1. The purpose of the IR state is to initialize the static parts of the context at the decompressor or to recover after failure. In this state, the compressor sends complete header information. This includes all static and nonstatic fields in uncompressed form plus some additional information. The compressor stays in the IR state until it is fairly confident that the decompressor has received the static information correctly.
The IR state is thus the state where the compression level is the lowest. FIG. 2, taken from RFC 3095, section 5.3.1, describes the U-Mode state machine. In the U-mode state machine of FIG. 2, Timeout_1 typically corresponds to a periodic sending of the static (and possibly also dynamic) parameters of the decompressor context, while Timeout_2 typically corresponds to a periodic sending of only the dynamic parameters of the decompressor context.
In addition, the context replication (CR) mechanism for ROHC profiles introduces an additional state, the CR state. See, Pelletier, G., Robust Header Compression (ROHC): Context replication for ROHC profiles, Internet Draft (work in progress), <draft-ietf-rohc-context-replication-01.txt>, October 2003. To date, only the [ROHC-TCP] profile specifies support for context replication, but other profiles may also support it provided their corresponding standard is updated. The CR state may also be used by a profile operating in U-Mode. FIG. 3 shows the logic added to the previous state machine for the CR state. In U-Mode, downward transitions are performed according to the same logic as described above.
FIG. 4, taken from RFC 3095, section 5.3.2, illustrates an example U-Mode decompressor state machine. The state of the decompressor dictates what type of compressed packet may be decompressed. In the No Context (NC) state, only packets initializing the static part may be decompressed (e.g. ROHC IR packets). In the Static Context (SC) state, only packets containing sufficient information on the dynamic parameters (e.g. ROHC IR-DYN or UOR-2 packets) may be decompressed. In the Full Context (FC) state, any packet may be decompressed. Thus, depending on the condition of the channel and on the success rate of the decompression, the decompressor state machine will transit between the different states and will have to wait for the reception of a suitable packet for attempting decompression.
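The rule that the decompressor state dictates which packets may be decompressed can be sketched as a small lookup. This is a simplified illustration rather than the logic of RFC 3095: the packet type names are plain strings, the function name is hypothetical, and IR packets are assumed to be acceptable in the SC state as well because they may also carry the dynamic part of the context.

```python
from enum import Enum


class DecompressorState(Enum):
    NO_CONTEXT = "NC"
    STATIC_CONTEXT = "SC"
    FULL_CONTEXT = "FC"


# Packet types for which decompression may be attempted in each restricted
# state. FULL_CONTEXT is handled separately because any packet is acceptable.
ACCEPTED_TYPES = {
    DecompressorState.NO_CONTEXT: {"IR"},
    DecompressorState.STATIC_CONTEXT: {"IR", "IR-DYN", "UOR-2"},
}


def may_decompress(state: DecompressorState, packet_type: str) -> bool:
    """Return True if decompression may be attempted in the given state."""
    if state is DecompressorState.FULL_CONTEXT:
        return True
    return packet_type in ACCEPTED_TYPES[state]


for state in DecompressorState:
    for packet_type in ("IR", "IR-DYN", "UO-0"):
        print(state.value, packet_type, may_decompress(state, packet_type))
```

A packet for which `may_decompress` returns False has to be discarded, which is why the time spent outside the FC state translates directly into lost media.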
In unidirectional operation, there is no feedback sent back to the compressor. Therefore, in unidirectional operation, the decompressor may (in the worst cases) have up to Timeout_1 of waiting time without possibility to start decompression of the received packets, and up to Timeout_2 before it can re-start decompression after severe context damage to the dynamic information.
Broadcast and multicast services differ from unicast services in that they do not specifically target a single receiver, but are rather forms of transmission where multiple recipients will receive the service. Unicast transmits to an address (either network or link-layer address) corresponding to one and only one receiver. Broadcast and multicast, on the other hand, use addresses shared by a number, or a group, of receivers. A broadcast is generally a transmission that can be received by anyone who can tune to the channel, while multicast is a transmission between a sender and multiple specific receivers on a network.
Of particular interest for such services is the robustness characteristics of the header compression scheme over a channel with relatively high bit error rates, with no or limited link retransmissions and with no or limited feedback capability. With respect to this, ROHC U-mode has a clear advantage when compared to other existing header compression schemes such as RFC 2508 and RFC 3545 (Koren, T., Casner, S., Geevarghese, J., Thompson B. and P. Ruddy, Enhanced Compressed RTP (CRTP) for Links with High Delay, Packet Loss and Reordering, IETF RFC 3545, IETF Network Working Group, July 2003).
The 3GPP2 BCMCS Framework (“Broadcast-Multicast Service (BCMCS) Framework Draft Document”, Version 1.2, 3GPP2 BCMCS ad-hoc group, May 2003) provides an architectural overview and a framework description of the Broadcast-Multicast Service (BCMCS) for the cdma2000® networks (cdma2000® is a registered trademark of the Telecommunications Industry Association (TIA-USA)). Similarly to the BCMCS work item in 3GPP2, MBMS (Multimedia Broadcast Multicast Service) is currently part of the Release 6 “Introduction of the Multimedia Broadcast Multicast Service (MBMS) in GERAN” work item within 3GPP standardization. In the standard, an MBMS radio bearer is defined as either “point-to-point” (p-t-p) or “point-to-multipoint” (p-t-m). Also, an MBMS channel is defined to comprise the physical resources assigned to one or more MBMS services. MBMS channels carry either an MBMS p-t-p radio bearer or an MBMS p-t-m radio bearer. It is foreseeable that MBMS will also use IP header compression (with ROHC) in the same manner as 3GPP2's BCMCS. These standardization efforts strive to allow optimization of the use of the existing radio interfaces for delivery of broadcast/multicast content stream(s) to one or more terminals in one or more regions of an operator's network.
There are alternative proposals for improving BCMCS access [QC-ROHC]. For cdma2000® networks, a Qualcomm proposal, abbreviated as “QC-ROHC”, advocates the use of ROHC in unidirectional mode as the preferred header compression algorithm for BCMCS services, but with modifications to the ROHC unidirectional mode of operation for header compression in BCMCS. See, “Header Compression for BCMCS”, Haipeng Jin and Jun Wang, QUALCOMM Incorporated, contribution to 3GPP2 BCMCS ad-hoc group, October 2003. The QC-ROHC proposal claims that the existing unidirectional mode of operation in ROHC does not work efficiently enough when used over broadcast links with significant error rates and scarce bandwidth. QC-ROHC proposes that static context information be sent in advance to the decompressor via BCMCS information acquisition, on a separate channel. Thus, QC-ROHC proposes to entirely disable the ROHC IR state when operating in U-mode in BCMCS services, and to send the IR parameters out-of-band instead, only once, during channel information acquisition. If a decompressor requires the static part of the context, the mobile terminal should initiate a new registration to the service to trigger a new channel information acquisition exchange.
The QC-ROHC proposal thus requires significant changes to the state machine logic, as well as an unnecessarily complex interaction between the header compression algorithm and the underlying system. Also, QC-ROHC is limited to one IP multicast/broadcast flow per ROHC instance (ROHC channel). This can pose unnecessary constraints on the processing and memory usage required in the terminal. Moreover, QC-ROHC may not be suitable for 3GPP's MBMS framework and architecture, as access to the MBMS service may differ fundamentally. Simpler approaches would be preferable, in particular those approaches that may improve the access to the service without bringing additional constraints to the system.
A service called Push to Talk over Cellular or Instant-Talk-over-Cellular (PoC) will likely be applied in handsets for GSM, EDGE, UMTS and CDMA systems. Push to talk over Cellular (PoC) is currently being standardized and agreed upon in an industry consortium known as the Open Mobile Alliance (OMA) forum. See, e.g., http://www.openmobilealliance.com/tech/wg_committees/poc.htm. PoC is basically a voice chat for cellular telecommunication systems. PoC provides quick one-to-one or group communication, providing something like a short instant messaging service which feels like “walkie talkies”. PoC enabled handsets will most likely be equipped with a PoC button (hardware or software). When this button is pressed, the handset is connected directly to a friend, a family member, or even a whole group of people, one-to-one or one-to-many. Like a “walkie-talkie” the PoC service is half-duplex, although full duplex may be available at a later stage. It is important to have low setup delay in order to allow the user to start speaking immediately after pressing the button. It is also important that the PoC service be supported in an efficient manner in the radio network since it is expected to be cheaper than circuit switched voice, and since it is likely to become a mass-market service with high penetration.
A typical usage of PoC is for a group of persons (e.g. youths, or professional workers at a building site) to use the PoC terminals to keep the group updated on what is on-going. The group participants may be geographically co-located, for example. Current non-PoC systems use one dedicated radio channel (and core network) resource per group participant, which obviously is costly in terms of both radio and core network resources. It is thus foreseeable that the PoC or comparable service may be used over a multicast service.
When operating in U-Mode, efficiency and delay to decompressor full context (i.e. a part of the access to the channel) are limited in view of the tradeoff between (1) the frequency of context updates (e.g. compressor downward transitions) for the purpose of maintaining synchronized contexts at both ends of the link; and (2) the time for a decompressor having no suitable context to (re)synchronize with the compressor context (e.g. when acquiring the broadcast/multicast channel or after a burst of errors leading to damage to the dynamic part of the context).
The context initialization phase (IR state) normally requires the compressor to start using the lowest compression state. Initially, the transmitted packets contain the information necessary to initialize at least the static part, and can also include the dynamic part of the context.
The compressor must then have enough confidence that the decompressor has the proper context before a transition to a higher compression ratio takes place. This confidence may be achieved in U-mode by sending a number of context initialization packets repeatedly for a large enough interval (an optimistic approach). The use of a number of packets may achieve confidence in less than one round trip time (RTT) but cannot absolutely guarantee that the decompressor does have the proper context other than optimistically expect to be successful with a high percentage rate.
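The optimistic approach can be quantified under a simplifying assumption that is not made in the text above: each context initialization packet is lost independently with the same probability. Under that assumption, the chance that at least one of n repeated packets arrives is 1 - p^n. The function names below are hypothetical.

```python
def delivery_confidence(loss_rate: float, repeats: int) -> float:
    """Probability that at least one of `repeats` packets is received,
    assuming independent losses with probability `loss_rate` each."""
    if not 0.0 <= loss_rate < 1.0:
        raise ValueError("loss_rate must be in [0, 1)")
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    return 1.0 - loss_rate ** repeats


def repeats_needed(loss_rate: float, target: float) -> int:
    """Smallest number of repeats that reaches the target confidence."""
    if not 0.0 < target < 1.0:
        raise ValueError("target must be in (0, 1)")
    repeats = 1
    while delivery_confidence(loss_rate, repeats) < target:
        repeats += 1
    return repeats


# Illustrative values only: 20% packet loss, 99% target confidence.
print(repeats_needed(0.2, 0.99))
```

The confidence approaches 1 as the number of repeats grows but never reaches it for a non-zero loss rate, which matches the observation that the optimistic approach cannot absolutely guarantee a proper context. Bursty losses, which are common on wireless links, would violate the independence assumption.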
In addition, to ensure robustness, a compressor operating in U-mode periodically transits back to a lower compression state (e.g. to IR state with period Timeout_1, or to FO state with period Timeout_2). Transition back to a higher compression state then follows the optimistic approach similarly to the context initialization procedure.
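The combination of optimistic upward transitions and periodic downward transitions can be sketched as a small state machine. This is an illustrative model under stated assumptions, not the procedure of RFC 3095: the two timeouts are counted in packets rather than in time, the repeat counts and default values are arbitrary, transitions triggered by changes in the packet headers are ignored, and the class and attribute names are hypothetical.

```python
from enum import Enum


class CompressorState(Enum):
    IR = 0  # Initialization and Refresh: full headers
    FO = 1  # First Order: dynamic information
    SO = 2  # Second Order: highest compression


class UModeCompressor:
    def __init__(self, ir_repeats=3, fo_repeats=3, timeout_1=100, timeout_2=20):
        self.ir_repeats = ir_repeats  # packets sent in IR before moving up
        self.fo_repeats = fo_repeats  # packets sent in FO before moving up
        self.timeout_1 = timeout_1    # packets without IR before falling to IR
        self.timeout_2 = timeout_2    # packets without IR/FO before falling to FO
        self.state = CompressorState.IR
        self.sent_in_state = 0
        self.since_ir = 0
        self.since_fo = 0

    def next_packet_state(self) -> CompressorState:
        """Return the state used to compress the next packet."""
        # Periodic downward transitions.
        if self.state is not CompressorState.IR and self.since_ir >= self.timeout_1:
            self.state, self.sent_in_state = CompressorState.IR, 0
        elif self.state is CompressorState.SO and self.since_fo >= self.timeout_2:
            self.state, self.sent_in_state = CompressorState.FO, 0

        used = self.state
        self.sent_in_state += 1
        if used is CompressorState.IR:
            self.since_ir = self.since_fo = 0
        elif used is CompressorState.FO:
            self.since_ir += 1
            self.since_fo = 0
        else:
            self.since_ir += 1
            self.since_fo += 1

        # Optimistic upward transitions after enough repeats.
        if used is CompressorState.IR and self.sent_in_state >= self.ir_repeats:
            self.state, self.sent_in_state = CompressorState.FO, 0
        elif used is CompressorState.FO and self.sent_in_state >= self.fo_repeats:
            self.state, self.sent_in_state = CompressorState.SO, 0
        return used


compressor = UModeCompressor(ir_repeats=2, fo_repeats=2, timeout_1=10, timeout_2=4)
print(" ".join(compressor.next_packet_state().name for _ in range(13)))
```

With the small values used in the example, the printed sequence shows the compressor climbing from IR through FO to SO, dropping back to FO after four SO packets, and eventually returning to IR.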
With respect to ROHC U-mode, the periodical refreshes can be seen as two independent processes, namely context initialization and context refreshes. In context initialization, the information conveyed contains both static and dynamic header information (e.g. ROHC IR packets). In context refreshes, the information sent to the decompressor comprises dynamic header information (e.g. ROHC IR-DYN packets), or contains both static and dynamic header information (e.g. ROHC IR packets).
The periodical context refreshes can be further separated based on static fields and dynamic fields. For static fields, the periodical updating of this information is performed less often, and is mainly for initializing an un-initialized context rather than for “refreshing” information being static (e.g. ROHC IR packets sent with period Timeout_1). For dynamic fields, the periodical updating is performed more frequently, and is mainly useful to provide means for the decompressor to recover from severe context damage (e.g. ROHC IR-DYN packets sent with period Timeout_2; ROHC IR packets may also be used).
More specifically for the ROHC framework [ROHC], context initialization requires the compressor to start using the lowest compression state, the Initialization and Refresh (IR state). The first transmitted packets are IR packets to initialize at least the static part (and maybe also the dynamic part) of the context. The static part may include information such as Context Identifier (CID), compression profile, the IP source and destination addresses, the UDP source and destination ports, SSRC etc. The dynamic part includes information such as RTP sequence number (RTP SN), payload type, timestamps, timestamp stride etc.
The ROHC framework requires that initialization first use a number of IR packets, possibly followed by a number of IR-DYN (IR-dynamic) packets. The size of these packet types, excluding the payload bits, is in the order of tens of octets.
Initialization and periodic refreshes of a header compression context thus require bandwidth for the bits necessary to be exchanged between compressor and decompressor, and this step is necessary to ensure that higher compression efficiency may be achieved. The confidence from the compressor that the decompressor has achieved proper context implies a certain delay during which the compression efficiency is far from optimal. In some situations, such as real-time VoIP flows over very narrow bandwidth wireless links using 0-byte header compression algorithms, for example, such delay may impact perceived quality until optimal compression efficiency is reached. While the impact for a constant flow is minimal and confined to the first packets of the flow, it may be more significant for a bursty and discontinuous flow, and should be minimized.
When used over error prone unidirectional links such as wireless broadcast links, a ROHC compressor operating in unidirectional mode (U-mode) faces a trade-off between efficiency and reliability. More specifically, when improving spectral efficiency of header compression operating in a unidirectional mode, both the reliability of the context initialization and the delay to reach the static context state (or full context) at the decompressor must be considered. In other words, two types of delay parameters must be considered. The first type of delay parameter (delay to decompression context initialization) is the delay from the time when the mobile station (MS) joins a service flow on the unidirectional channel (no decompression context exists for that flow) to the time the decompressor in the mobile station (MS) can start decompression (having obtained at least the static context information). The second type of delay parameter (delay to decompressor context repair) is the delay from the time when the decompressor has invalidated the context, or parts of the context, to the time when a packet suitable for re-synchronizing the context is successfully received.
All packets received prior to the time at which the decompressor context is successfully initialized must typically be discarded, leading to a longer access time to the service. When the periodic transition to initialization and refresh (IR) state in the compressor (Timeout_1) is set to a long interval, fewer large IR packets are transmitted, leading to higher bandwidth efficiency. However, the time to initially access the channel and initialize the context for the new flow increases (up to Timeout_1 in the worst case). On the other hand, if the periodic transition to the IR state in the compressor (Timeout_1) is set to happen with a short interval, the decompressor will be able to initialize its context promptly, achieving a lower access time to the service flow. However, the large number of IR/IR-DYN packets sent will lead to lower efficiency.
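The trade-off governed by the IR refresh period can be made concrete with a rough calculation. The sketch below assumes that the period is counted in packets, that each refresh consists of a fixed number of IR packets, that all other packets carry one fixed compressed header size, and that the worst-case access delay equals one full refresh period as stated above. The header sizes, the packet interval, and the function name are illustrative assumptions, not values from any specification.

```python
from typing import Tuple


def ir_refresh_tradeoff(
    refresh_period_packets: int,
    ir_repeats: int,
    ir_header_bytes: int,
    compressed_header_bytes: int,
    packet_interval_ms: float,
) -> Tuple[float, float]:
    """Return (average header bytes per packet, worst-case access delay in ms)
    for one refresh cycle that starts with `ir_repeats` IR packets."""
    if not 1 <= ir_repeats <= refresh_period_packets:
        raise ValueError("ir_repeats must be between 1 and the refresh period")
    compressed_packets = refresh_period_packets - ir_repeats
    total_header_bytes = (
        ir_repeats * ir_header_bytes
        + compressed_packets * compressed_header_bytes
    )
    average_header = total_header_bytes / refresh_period_packets
    worst_case_delay_ms = refresh_period_packets * packet_interval_ms
    return average_header, worst_case_delay_ms


# Assumed values: 40-byte IR headers, 1-byte compressed headers, 20 ms packets.
for period in (10, 100, 1000):
    average, delay = ir_refresh_tradeoff(period, 2, 40, 1, 20.0)
    print(f"period={period:5d}  avg header={average:5.2f} B  worst delay={delay:7.0f} ms")
```

Lengthening the period lowers the average header size toward the compressed header size while the worst-case access delay grows in direct proportion, which is the tension described in the preceding paragraph.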
Since wireless links typically have high error rates, there is a fairly high probability for the transmitted packets to be corrupted and cause repeated decompression failures at the decompressor. Once the decompressor is forced away from Full Context (FC) state by such failures, the decompressor may have to wait for a certain time until it receives the periodic IR or IR-DYN packet(s) from the compressor necessary to re-establish or repair the context. Most or even all packets received during this interval may have to be discarded, causing disruption in the service. Thus, when the periodic transition to First Order (FO) state in the compressor (Timeout_2) is set to a long interval, fewer moderately large IR-DYN (or large IR) packets are transmitted, leading to higher bandwidth efficiency. However, the time to repair a partially damaged context for the flow increases (up to Timeout_2 in the worst case). On the other hand, if the periodic transition to the FO state in the compressor (Timeout_2) is set with a short interval, the decompressor will be able to recover from a loss in dynamic context promptly, achieving higher reliability.
Media is typically provided with media encoding information. As just one illustrative example, consider MPEG video compression. The MPEG-4 Configuration information comprises a Visual Object Sequence Header, a Visual Object Header, and Video Object Layer Header. As stated in RFC 3016: “The Combined Configuration/Elementary stream mode MUST be used so that configuration information will be carried to the same RTP port as the elementary stream (see 6.2.1 “Start codes” of ISO/IEC 14496-2 [2][9][4]). The configuration information MAY additionally be specified by some out-of-band means” (Kikuchi, Y., et al., “RTP Payload Format for MPEG-4 Audio/Visual Streams”, RFC 3016, Internet Engineering Task Force, November 2000).
If the first configuration information and/or an intra frame are sent prior to the decompressor having established its context, the header decompressor will discard these frames due to unsuccessful header decompression. The receiver will have to wait for (at least) the next sending of the configuration information and/or an intra frame before it can start rendering the media.
The MPEG encoded video uses three different frame formats: Format I frames (intra frames), which are coded as a still image, provide random access, and have only moderate compression; Format P frames (predicted frames), which are predicted from the most recently decoded I or P frame; and Format B frames (bidirectional frames), which provide the highest amount of compression but require the closest two I or P frames for prediction. A lost “I” format frame affects at least a whole group of pictures (GOP), and B frames require an additional P frame to be correctly decoded.
As explained above, there is a trade-off in bandwidth efficiency when frequently sending IR packets. For broadcast/multicast services using ROHC in U-mode, it is desirable to ensure a short access time to the IP service (including fast context initialization) as well as a timely context recovery to prevent additional disruption in the service caused by the header compression algorithm. This should be done while minimizing the overhead introduced by the header compression algorithm, whose purpose is to ensure reliability in the absence of a feedback channel between the decompressor and the compressor.
What is needed, therefore, and an object of the present invention, are method and apparatus for header compression which achieve a suitable balance between bandwidth efficiency and short access/recovery time.