This section is intended to provide a background or context to the invention that is recited in the claims. The description herein may include concepts that could be pursued, but are not necessarily ones that have been previously conceived or pursued. Therefore, unless otherwise indicated herein, what is described in this section is not prior art to the description and claims in this application and is not admitted to be prior art by inclusion in this section.
Multimedia applications include services such as local playback, streaming or on-demand, conversational and broadcast/multicast services. Technologies involved in multimedia applications include, among others, media coding, storage and transmission. Different standards have been specified for different technologies. In video communication systems with fluctuating bandwidth demands, in particular, the use of layered coding is beneficial. For example, layered coding may be particularly beneficial in video-enabled mobile phones that must cope with changes in connection speed during the lifetime of a session. Such changes may arise, for example, from a fallback from Wireless Local Area Network (WLAN) to third generation (3G) networks or from 3G networks to Global System for Mobile communications (GSM) networks. In layered coding, a base layer is selected to be conveyable over even the slowest of links. Increased video quality is made possible by adding “enhancement” layers of video, which are conveyed over faster access technologies.
The most recent work related to video standardization is the extension of ITU-T Recommendation H.264 with a layered coding concept. This work is commonly known as “Scalable Video Coding” or SVC. The latest draft of the SVC standard is described in JVT-X201, “Joint Draft 11 of SVC Amendment,” 24th JVT Meeting, Geneva, Switzerland, June-July 2007, available from the International Telecommunication Union (ITU) webpage and incorporated herein by reference in its entirety.
In layered coding arrangements, one can commonly observe a hierarchy of layers. For a given higher layer, there is typically at least one lower layer upon which that higher layer depends. When data from the lower layer is lost, the data of the higher layer becomes much less meaningful, and completely useless in some circumstances. Therefore, if there is a need to discard layers or packets belonging to certain layers, it makes sense to discard the higher layers, or packets belonging to the higher layers, before discarding lower layers or packets belonging to lower layers.
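The discard ordering described above can be illustrated with a minimal Python sketch. It assumes a strictly linear hierarchy in which every layer depends on all lower-numbered layers, and it drops whole layers at a time; the `Packet` type, its fields, and the `discard_to_budget` function are hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    layer: int  # assumed numbering: 0 = base layer, larger = higher enhancement layer
    size: int   # payload size in bytes


def discard_to_budget(packets, budget_bytes):
    """Drop whole layers, highest first, until the total size fits the budget."""
    kept = list(packets)
    total = sum(p.size for p in kept)
    # Visit layers from highest to lowest so that lower layers are discarded last.
    for layer in sorted({p.layer for p in kept}, reverse=True):
        if total <= budget_bytes:
            break
        total -= sum(p.size for p in kept if p.layer == layer)
        kept = [p for p in kept if p.layer != layer]
    return kept
```

Under these assumptions the base layer is removed only when nothing else remains to discard, so any layer that survives still has the lower layers it depends on.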
This layered coding concept can also be extended to multiview video coding (MVC), where each view can be considered as a layer, and each view can be represented by multiple scalable layers. In multiview video coding, video sequences output from different cameras, each corresponding to a view, are encoded into one bitstream. After decoding, to display a certain view, the decoded pictures belonging to that view are displayed. The latest draft of MVC is described in JVT-X209, “Joint Draft 4.0 on Multiview Video Coding”, Geneva, Switzerland, June-July 2007, available from the ITU webpage and incorporated herein by reference in its entirety.
Layered multicast is a transport technique for scalable coded bitstreams, e.g., SVC or MVC bitstreams. A commonly employed technology for the transport of media over Internet Protocol (IP) networks is known as Real-time Transport Protocol (RTP). In layered multicast using RTP, a layer or a subset of the layers of a scalable bitstream is transported in its own RTP session, where each RTP session belongs to a multicast group. Receivers can join or subscribe to desired RTP sessions or multicast groups to receive the bitstream of certain layers. Conventional RTP and layered multicast are described, e.g., in H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson, “RTP: A Transport Protocol for Real-Time Applications”, IETF STD 64, RFC 3550, July 2003, available from the Internet Engineering Task Force (IETF) webpage, and in S. McCanne, V. Jacobson, and M. Vetterli, “Receiver-driven layered multicast” in Proc. of ACM SIGCOMM'96, pp. 117-130, Stanford, Calif., August 1996.
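As a rough illustration of how a receiver might decide which RTP sessions to join, the following Python sketch adds sessions from the base layer upward while their combined bitrate fits the available bandwidth. It is a simplified assumption-based example, not the adaptation procedure of the cited references: the function name `choose_sessions`, the per-session bitrates, and the premise that session i depends on all sessions below it are hypothetical.

```python
def choose_sessions(session_bitrates_kbps, available_kbps):
    """Return indices of the RTP sessions to join, lowest layer first.

    session_bitrates_kbps[i] is the assumed bitrate of the session that
    carries layer i, where layer 0 is the base layer.
    """
    joined = []
    used = 0
    for index, rate in enumerate(session_bitrates_kbps):
        if used + rate > available_kbps:
            break  # higher layers depend on this one, so stop here
        joined.append(index)
        used += rate
    return joined
```

For example, with assumed session bitrates of 64, 128 and 256 kbit/s and 200 kbit/s available, the sketch joins the first two sessions and leaves the third.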
The H.264/AVC RTP payload format is specified in RFC 3984, available from http://www.ietf.org/rfc/rfc3984.txt. RFC 3984 specifies three packetization modes: single network abstraction layer (NAL) unit packetization mode; non-interleaved packetization mode; and interleaved packetization mode. In the interleaved packetization mode, each NAL unit included in a packet is associated with a decoding order number (DON)-related field such that the NAL unit decoding order can be derived. In contrast, no DON-related fields are available when the single NAL unit packetization mode or the non-interleaved packetization mode is used. A recent draft of the SVC RTP payload format is available from the IETF webpage. In this recent draft, a payload content scalability information (PACSI) NAL unit is specified to contain scalability information, among other types of information, for NAL units included in the RTP packet.
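To illustrate how a decoding order can be derived from DON values in the interleaved packetization mode, the following Python sketch compares 16-bit DON values with wraparound and sorts buffered NAL units accordingly. The comparison is modeled on the wraparound-aware DON difference of RFC 3984 and should be checked against the RFC text; the `(don, payload)` tuple layout and the function `decoding_order` are hypothetical, and the sort assumes that all buffered DON values lie within a window of fewer than 32768.

```python
from functools import cmp_to_key


def don_diff(don_m, don_n):
    """Signed distance from DON m to DON n; positive when n follows m."""
    if don_m == don_n:
        return 0
    if don_m < don_n:
        gap = don_n - don_m
        # A gap of 32768 or more is interpreted as a wraparound.
        return gap if gap < 32768 else -(don_m + 65536 - don_n)
    gap = don_m - don_n
    return 65536 - don_m + don_n if gap >= 32768 else -gap


def decoding_order(units):
    """Sort (don, payload) tuples into decoding order.

    Assumes every pair of DON values in `units` is less than 32768 apart,
    so the pairwise comparison yields a consistent ordering.
    """
    return sorted(units, key=cmp_to_key(lambda a, b: -don_diff(a[0], b[0])))
```

With these definitions, a unit with DON 65535 is ordered before a unit with DON 1, because the smaller value is treated as following a wraparound.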
In layered multicast, a receiver that subscribes to more than one RTP session recovers the decoding order of the received NAL units from different RTP sessions before passing them to a decoder. However, complications in NAL unit decoding order recovery arise due to session initiation variation between different RTP sessions, the use of the interleaved packetization mode as specified in RFC 3984 within one or more RTP sessions, and the NAL unit decoding order being different from the output or display order.
The recent draft of the SVC RTP payload format attempts to ensure that the DON over the entire SVC bitstream, referred to as cross-layer DON (CL-DON), can be derived for each NAL unit by requiring the use of the interleaved packetization mode for all the RTP sessions. The recent draft further requires that the DON-related fields be derived based on CL-DON. However, some currently existing RFC 3984-type receivers do not implement the interleaved packetization mode. Therefore, these receivers are not able to join a layered multicast and receive service.
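The role of a cross-layer DON can be illustrated with a short Python sketch in which a receiver merges NAL units from several RTP sessions into a single decoding order. This is only a sketch under stated assumptions: each session is modeled as a list of hypothetical `(cl_don, nal_unit)` tuples, the CL-DON values are assumed to be already unwrapped into increasing integers, each list is assumed to be sorted, and no packets are lost.

```python
import heapq


def merge_sessions(sessions):
    """Merge per-session lists of (cl_don, nal_unit) into one decoding order.

    Each inner list must already be sorted by increasing cl_don.
    """
    merged = heapq.merge(*sessions, key=lambda item: item[0])
    return [nal_unit for _, nal_unit in merged]


# Hypothetical example: a base-layer session and an enhancement-layer session.
base = [(0, "B0"), (2, "B1"), (4, "B2")]
enhancement = [(1, "E0"), (3, "E1"), (5, "E2")]
order = merge_sessions([base, enhancement])
```

A receiver that cannot obtain DON-related fields has no such shared ordering key across sessions, which is the difficulty noted above for receivers lacking the interleaved packetization mode.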