Internet Protocol TV (IPTV) is an emerging technology that allows telecommunications service providers to deliver digital TV (DTV) and other services over telephone lines to subscribers' homes. There are many existing or proposed standards for broadcasting a DTV program to the home. In particular, there are numerous transport-layer encapsulation protocols, either defined by existing standards or being proposed for future standards.
As shown in FIG. 1, in system 100, video and audio are encoded (compressed), either in real time or in non-real time. The video encoders 102 and audio encoders 104 are generally synchronized to the same System Time Clock (STC) 106 (e.g., 27 MHz for MPEG-2 Systems). Samples of the STC are sent either in selected MPEG-2 Transport packets, where they are called Program Clock References (PCRs), or as a timestamp in a Network Time Protocol (NTP) IP packet. Time stamps (TS) locked to the STC are produced, but are generally sent with a coarser time resolution. For example, MPEG-2 Systems typically use a 90 kHz clock for time stamps, which runs at 1/300 of the rate of the 27 MHz system clock. Time stamps are generally associated with one or more access units (coded video or audio frames).
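The relationship between the two clock rates can be illustrated with a minimal sketch. The function names below are hypothetical and chosen only for illustration; the sketch simply divides a 27 MHz STC count by 300 to obtain the coarser 90 kHz value, keeping the remainder as the fine part.

```python
STC_HZ = 27_000_000              # system clock rate for MPEG-2 Systems
TIMESTAMP_HZ = 90_000            # coarser clock rate used for time stamps
RATIO = STC_HZ // TIMESTAMP_HZ   # 300 system clock ticks per time stamp tick


def split_stc(stc_ticks):
    """Split a 27 MHz count into (90 kHz ticks, leftover 27 MHz ticks 0..299)."""
    return divmod(stc_ticks, RATIO)


def timestamp_from_stc(stc_ticks):
    """Coarse 90 kHz time stamp corresponding to a 27 MHz STC sample."""
    return stc_ticks // RATIO


# One second plus 150 system clock ticks.
coarse, fine = split_stc(STC_HZ + 150)
print(RATIO, coarse, fine)  # 300 90000 150
```

The coarse value is what a time stamp can express; the remainder shows the resolution that is lost when only the 90 kHz value is sent.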
The outputs of the encoders are referred to as elementary streams. For IPTV, a number of different encapsulation protocols exist:
One common encapsulation method is to multiplex the video and audio elementary streams (VES and AES) into an MPEG-2 transport stream (TS) using an MPEG-2 Transport Encoder (MPEG-2 TE). PCRs and time stamps provide timing and synchronization information. An integral number of consecutive MPEG-2 TS packets (each 188 bytes long) is encapsulated into a real-time transport protocol (RTP) packet. Each RTP packet is encapsulated into a user datagram protocol (UDP) packet. Each UDP packet is in turn encapsulated into an IP packet.
Another encapsulation method is to bypass the RTP layer and encapsulate the MPEG-2 TS packets directly into UDP/IP.
Yet another encapsulation method is to bypass the MPEG-2 TS layer and encapsulate the audio/video elementary stream packets directly into RTP/UDP/IP. The NTP clock samples and RTP time stamps contain timing and synchronization information.
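As a rough illustration of the first two encapsulation methods, the following sketch computes how many whole 188-byte TS packets fit into one IP packet and groups TS packets accordingly. The header sizes (IPv4 without options, UDP, and the fixed RTP header without extensions) and the 1500-byte MTU are assumptions made for this example, and the function names are hypothetical.

```python
TS_PACKET_BYTES = 188   # MPEG-2 TS packet length
IPV4_HEADER = 20        # assumption: IPv4 header without options
UDP_HEADER = 8
RTP_HEADER = 12         # assumption: fixed RTP header, no CSRC list or extension


def ts_packets_per_ip_packet(mtu=1500, use_rtp=True):
    """Largest integral number of TS packets that fits under the given MTU."""
    overhead = IPV4_HEADER + UDP_HEADER + (RTP_HEADER if use_rtp else 0)
    return (mtu - overhead) // TS_PACKET_BYTES


def group_ts_packets(ts_packets, per_ip_packet):
    """Concatenate consecutive TS packets into payloads, one per IP packet."""
    return [
        b"".join(ts_packets[i:i + per_ip_packet])
        for i in range(0, len(ts_packets), per_ip_packet)
    ]


# Twenty dummy TS packets, each starting with the 0x47 sync byte.
ts_packets = [b"\x47" + bytes(TS_PACKET_BYTES - 1) for _ in range(20)]
n = ts_packets_per_ip_packet()
payloads = group_ts_packets(ts_packets, n)
print(n, [len(p) for p in payloads])  # 7 [1316, 1316, 1128]
```

Under these assumptions, seven TS packets fit per IP packet whether or not the RTP layer is present, because the 12 bytes saved by bypassing RTP are not enough to hold an eighth TS packet.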
Whatever the encapsulation method, the IP packet stream then passes through the telecommunication company's digital subscriber line (DSL) access multiplexer (DSLAM) 110, where it may be mixed with other IP streams. The aggregated IP stream is sent over unshielded twisted pair to a subscriber's home using some version of xDSL (e.g., ADSL, VDSL2, etc.). The DSL signal is demodulated by a subscriber's DSL modem 112. The DSL modem may be integrated into an IPTV set-top box (STB) 114 or may be a separate unit. Inside the IPTV STB, the transport layers are de-encapsulated, and the VES/AES data and timing/synchronization information are sent, via a timing recovery module 116, to the video decoders 118 and audio decoders 120. The outputs of the video and audio decoders are connected to monitor(s) and speakers, respectively.
In unmanaged IP networks, IP packets can be lost, received out of order, delayed or received with jitter. Various technologies can be applied to combat these unwanted effects. For example, packets can be duplicated or protected with forward error correction (FEC) to guard against lost packets. Sequence numbers in RTP headers can be used to restore out-of-order or delayed packets to their correct order. Larger decoder buffers can be used to de-jitter packets. If such precautions are not applied, lost IP packets can produce poor quality of service by introducing glitches into the decoded video, audio or both.
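The use of RTP sequence numbers for re-ordering can be sketched as follows. This is only an illustrative sketch: the class name, the `depth` parameter, and the policy of skipping a missing packet once more than `depth` packets are waiting are assumptions for this example. The sequence number is treated as a 16-bit value that wraps around, as in the RTP header.

```python
def seq_less(a, b):
    """True if 16-bit sequence number a precedes b, allowing for wraparound."""
    return a != b and ((b - a) & 0xFFFF) < 0x8000


class ReorderBuffer:
    """Releases payloads in sequence order; hypothetical skip-on-overflow policy."""

    def __init__(self, first_seq, depth=4):
        self.next_seq = first_seq   # next sequence number to release
        self.depth = depth          # max packets held while waiting for a gap
        self.pending = {}           # seq -> payload, packets that arrived early

    def push(self, seq, payload):
        released = []
        if seq == self.next_seq or seq_less(self.next_seq, seq):
            self.pending[seq] = payload
        # Otherwise the packet is too late (or a duplicate) and is discarded.
        while True:
            if self.next_seq in self.pending:
                released.append(self.pending.pop(self.next_seq))
                self.next_seq = (self.next_seq + 1) & 0xFFFF
            elif len(self.pending) > self.depth:
                # Give up on the missing packet: jump to the nearest waiting one.
                self.next_seq = min(
                    self.pending, key=lambda s: (s - self.next_seq) & 0xFFFF
                )
            else:
                break
        return released


buf = ReorderBuffer(first_seq=65534)
print(buf.push(65535, "b"))  # []
print(buf.push(65534, "a"))  # ['a', 'b']
print(buf.push(1, "d"))      # []
print(buf.push(0, "c"))      # ['c', 'd']
```

A larger `depth` tolerates more re-ordering at the cost of added delay, which mirrors the trade-off of the larger decoder buffers mentioned above.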
Another source of IP packet loss is the DSLAM. If congestion occurs at the DSLAM, the DSLAM must drop IP packets. Naïve packet dropping will produce the poor quality of service discussed above. If the video packets are prioritized, and if the DSLAM is responsive to this prioritization, it would be possible to improve the quality of service.
Denting, or packet dropping, is the action of dropping IP packets at the DSLAM. A DSLAM that incorporates “smart denting” looks at priority signals either in the packet headers or in the video payload and attempts to drop only low-priority video packets. Examples of low-priority video frames are MPEG-2 “B” pictures or H.264 “disposable B” pictures, as are known to those of skill in the art. If the video bitstream contains low-priority pictures, the DSLAM can preferentially drop these pictures so that error propagation at the decoder is eliminated or greatly reduced. This will increase the video quality of service.
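A minimal sketch of the idea behind smart denting follows. It is not a description of any particular DSLAM: the `Packet` fields, the `low_priority` flag (standing in for whatever priority signal is carried in the packet headers or video payload), the byte budget, and the fallback policy are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    seq: int
    size: int            # bytes
    low_priority: bool   # hypothetical flag, e.g. set for B-picture packets


def dent(packets, budget_bytes):
    """Drop packets until the total size fits the budget, low priority first."""
    excess = sum(p.size for p in packets) - budget_bytes
    dropped = set()
    # First pass: drop only low-priority packets.
    for p in packets:
        if excess <= 0:
            break
        if p.low_priority:
            dropped.add(p.seq)
            excess -= p.size
    # Fallback (assumed policy): naive dropping from the tail if still congested.
    for p in reversed(packets):
        if excess <= 0:
            break
        if p.seq not in dropped:
            dropped.add(p.seq)
            excess -= p.size
    kept = [p for p in packets if p.seq not in dropped]
    lost = [p for p in packets if p.seq in dropped]
    return kept, lost


queue = [
    Packet(0, 1000, False),  # e.g. I picture data
    Packet(1, 400, True),    # e.g. B picture data
    Packet(2, 600, False),   # e.g. P picture data
    Packet(3, 400, True),    # e.g. B picture data
]
kept, lost = dent(queue, budget_bytes=2000)
print([p.seq for p in kept], [p.seq for p in lost])  # [0, 2, 3] [1]
```

With mild congestion only a low-priority packet is lost; high-priority packets are sacrificed only when dropping every low-priority packet is still not enough.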
As described herein, it is assumed that the DSLAM can only respond to congestion by dropping whole IP packets. If the IP packets do not contain an MPEG-2 Transport layer, and if one or more video frames (in coding order) are encapsulated in RTP/UDP/IP or UDP/IP packets, then it can be relatively straightforward for the DSLAM to drop low-priority video frames, since there is a direct mapping of video frames to IP packets. However, if an MPEG-2 Transport layer is present, there is currently no simple or natural mapping of video frames to IP packets.
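The lack of a natural mapping can be seen in a small worked example. The sketch below assumes 184 payload bytes per 188-byte TS packet, seven TS packets per IP packet, that each frame starts in a new TS packet, and it ignores PES headers and interleaved audio; the frame labels and sizes are made up for illustration.

```python
import math

TS_PAYLOAD_BYTES = 184   # 188-byte TS packet minus its 4-byte header
TS_PER_IP = 7            # assumed number of TS packets per IP packet


def map_frames_to_ip(frames):
    """frames: list of (label, size_bytes). Returns per-IP-packet lists of labels."""
    ts_labels = []
    for label, size in frames:
        ts_labels.extend([label] * math.ceil(size / TS_PAYLOAD_BYTES))
    return [
        ts_labels[i:i + TS_PER_IP] for i in range(0, len(ts_labels), TS_PER_IP)
    ]


frames = [("I0", 3000), ("B1", 900), ("P2", 1500)]  # made-up sizes, coding order
ip_packets = map_frames_to_ip(frames)
mixed = [i for i, labels in enumerate(ip_packets) if len(set(labels)) > 1]
for i, labels in enumerate(ip_packets):
    print(i, labels)
print(mixed)  # [2, 3]
```

In this example the low-priority frame B1 is spread over IP packets 2 and 3, each of which also carries data from a reference frame, so dropping B1 by discarding whole IP packets would also damage I0 or P2.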
Thus, there is a need for improved systems and methods for digital stream denting that provide a simple mapping of video frames to IP packets.