The present invention relates to the field of digital broadcasting, and in particular to broadcasting with an MPEG2 Transport Stream (TS) used for transmitting digital audio and video over lossy networks.
MPEG2 is widely used as the format of digital television signals that are broadcast by terrestrial (over-the-air), cable, internet, Over The Top (OTT) or direct broadcast satellite TV systems. The MPEG2 TS is a standard format for transmission and storage of audio, video, and Program and System Information Protocol (PSIP) data, and is used in broadcast systems, such as DVB (Digital Video Broadcasting) and ATSC (Advanced Television Systems Committee).
The TS is specified in MPEG-2 Part 1, Systems. A TS specifies a container format encapsulating Packetized Elementary Streams (PES), with error correction and stream synchronization features for maintaining transmission integrity when the signal is degraded. Multiple MPEG programs are combined into a transport stream, which is then sent to a transmitting antenna. On the receiving side, the transport stream is received, decoded and displayed on a display device. This will now be explained with reference to FIG. 1.
FIG. 1 illustrates a conventional communication system 100. As illustrated in the figure, conventional communication system 100 includes a transmitter 102 transmitting a transport stream 122 over a network 106, which is received by a receiver 104. Conventional communication system 100 may include a plurality of receivers; however, for illustration purposes only one receiver is shown.
Receiver 104 includes a tuner 108, a TS buffer 110, a decoder 112, a processor 114, a memory 116, a hard disk drive (HDD) 118 and a display interface unit 120. In this example, tuner 108, TS buffer 110, decoder 112, processor 114, memory 116, HDD 118 and display interface unit 120 are distinct elements. However, in some embodiments, at least two of tuner 108, TS buffer 110, decoder 112, processor 114, memory 116, HDD 118 and display interface unit 120 may be combined as a unitary element. Further, in some embodiments, at least one of tuner 108, TS buffer 110, decoder 112, processor 114, memory 116, HDD 118 and display interface unit 120 may be implemented as a tangible computer-readable media for carrying or having computer-executable instructions or data structures stored thereon. Such tangible computer-readable media can be any available media that can be accessed by a general purpose or special purpose computer. Non-limiting examples of tangible computer-readable media include physical storage and/or memory media such as RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to carry or store desired program code means in the form of computer-executable instructions or data structures and which can be accessed by a general purpose or special purpose computer. When information is transferred or provided over a network or another communications connection (either hardwired, wireless, or a combination of hardwired or wireless) to a computer, the computer properly views the connection as a tangible computer-readable medium. Thus, any such connection is properly termed a tangible computer-readable medium. Combinations of the above should also be included within the scope of tangible computer-readable media.
Tuner 108 is arranged to receive a data signal 124 and to output a signal 126 to TS buffer 110. Decoder 112 is arranged to receive a signal 128 from TS buffer 110. Processor 114 is arranged to bi-directionally communicate with tuner 108 via a signal 130; bi-directionally communicate with TS buffer 110 via a signal 132; bi-directionally communicate with decoder 112 via a signal 134; bi-directionally communicate with memory 116 via a signal 136; bi-directionally communicate with HDD 118 via a signal 138; and communicate with display interface unit 120 via a signal 140. Display interface unit 120 is arranged to output an output signal 142.
A TS packet is the basic unit of data in a transport stream. Tuner 108 is always receiving TS packets via data signal 124. Ideally, receiver 104 can be tuned to a channel at any moment in time. Once tuner 108 detects a TS packet, it forwards the packet to TS buffer 110 via signal 126. TS packets are fed into decoder 112 via signal 128 for decoding the multimedia data. If the multimedia data was compressed and modulated before being transmitted over network 106, tuner 108 may include a de-compressor/demodulator to recover the data before feeding it to TS buffer 110.
Processor 114 provides all the controls on receiver 104. Processor 114 may include a digital signal processor (DSP) and/or a multimedia processor.
Decoder 112 may include an audio/video (AV) decoder. HDD 118 stores multimedia data. Memory 116 may include a program memory as well. Display interface unit 120 provides output signal 142 to display multimedia data on a display device, for example, a TV screen, a monitor, a game console or a mobile device.
In one example, transmitter 102 is a cable TV headend, which transmits multimedia data over cable network 106, and receiver 104 is a set top box (STB). In one example, the STB is a Digital Video Recorder (DVR) that decodes, records and plays audio/video data received via a satellite dish or digital cable.
Transport stream 122 carries a plurality of TS packets over network 106. TS packets are normally 188 bytes in length; however, in some cases the length of the TS packets is modified. A standard format for a 188 byte TS packet will now be described with reference to FIGS. 2A-2E.
Referring now to FIGS. 2A-2E, a conventional TS packet 200 is illustrated. As depicted in FIG. 2A, TS packet 200 includes a header 202 and a payload 204, which together make up the 188 bytes. TS packet 200 is formed at a headend for transmitting multimedia data to a plurality of receivers. For example, TS packet 200 may be transmitted by transmitter 102. Header 202 includes the information about the data, whereas, payload 204 carries the actual data.
Header 202 is further described with the help of FIG. 2B. FIG. 2B illustrates an expanded form of header 202. Header 202 includes a synchronization byte 206 (eight bits), a transport error indicator 208 (one bit), a payload unit start indicator 210 (one bit), a transport priority 212 (one bit), a packet identifier (PID) 214 (thirteen bits), a transport scrambling control 216 (two bits), an adaptation field control 218 (two bits), a continuity counter 220 (four bits) and an adaptation field 222.
Header 202 starts with synchronization byte (sync byte) 206, for which a predetermined synchronization value is reserved, namely hexadecimal forty seven (0x47 or 0100 0111). The one bit for transport error indicator 208, when set, indicates that the TS packet has an uncorrectable error. The one bit for payload unit start indicator 210 indicates the start of payload data. The one bit for transport priority 212, when set, indicates higher priority than other packets with the same PID. PID 214 is assigned by the headend to uniquely identify the stream to which the packet belongs. PID 214 allows receiver 104 to differentiate the stream to which each received packet belongs.
The two bits for transport scrambling control 216 identify whether even or odd scrambling was used. With the two bits for adaptation field control 218: a value of ‘01’ indicates that there is no adaptation field and only payload exists; a value of ‘10’ indicates that only an adaptation field exists; and a value of ‘11’ indicates that both an adaptation field and payload exist.
The four-bit continuity counter 220 is incremented for every TS packet belonging to the same stream, but only when a payload is present.
If adaptation field control 218 is ‘10’ or ‘11’ then an adaptation field 222 is appended to header 202. If adaptation field control 218 is ‘01’, then only payload 204 follows header 202. Adaptation field 222 is illustrated to be ‘a’ bits in length, where ‘a’ is a variable.
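By way of a non-limiting illustration, the four fixed bytes of header 202 described above may be unpacked as in the following sketch. The names `TsHeader`, `parse_ts_header`, `has_adaptation_field` and `has_payload` are hypothetical and are introduced here only for illustration; the sketch assumes the fields are packed most significant bit first in the order listed for FIG. 2B.

```python
from typing import NamedTuple

SYNC_BYTE = 0x47  # value reserved for sync byte 206


class TsHeader(NamedTuple):
    """Hypothetical container for the fixed header fields of FIG. 2B."""
    sync_byte: int
    transport_error_indicator: int
    payload_unit_start_indicator: int
    transport_priority: int
    pid: int
    transport_scrambling_control: int
    adaptation_field_control: int
    continuity_counter: int


def parse_ts_header(packet: bytes) -> TsHeader:
    """Unpack the first four bytes of a 188 byte TS packet."""
    if len(packet) < 4:
        raise ValueError("at least 4 bytes are needed to parse the header")
    b0, b1, b2, b3 = packet[0], packet[1], packet[2], packet[3]
    return TsHeader(
        sync_byte=b0,
        transport_error_indicator=(b1 >> 7) & 0x1,
        payload_unit_start_indicator=(b1 >> 6) & 0x1,
        transport_priority=(b1 >> 5) & 0x1,
        pid=((b1 & 0x1F) << 8) | b2,  # 13 bits spanning bytes 2 and 3
        transport_scrambling_control=(b3 >> 6) & 0x3,
        adaptation_field_control=(b3 >> 4) & 0x3,
        continuity_counter=b3 & 0x0F,
    )


def has_adaptation_field(header: TsHeader) -> bool:
    """True for adaptation field control values '10' and '11'."""
    return header.adaptation_field_control in (0b10, 0b11)


def has_payload(header: TsHeader) -> bool:
    """True for adaptation field control values '01' and '11'."""
    return header.adaptation_field_control in (0b01, 0b11)
```

Note that the sketch does not check that the first byte equals 0x47; it only reports the value found there, so that a caller can decide how to treat a mismatch.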
FIG. 2C illustrates an expanded form of adaptation field 222. As illustrated in the figure, adaptation field 222 includes 8 bit adaptation field length 224, one bit discontinuity indicator 226, one bit random access indicator 228, one bit elementary stream priority indicator 230, five bit flags 232, ‘b’ bits optional fields 234 and ‘c’ bits stuffing bytes 236, where ‘b’ and ‘c’ are variables.
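As a minimal sketch of the layout of FIG. 2C, the leading bytes of adaptation field 222 may be read as follows. The function name `parse_adaptation_field_prefix` and the dictionary keys are hypothetical, and the sketch assumes that the three indicators and the five flag bits share a single byte, most significant bit first, in the order listed above. Optional fields 234 and stuffing bytes 236 are not parsed.

```python
def parse_adaptation_field_prefix(field: bytes) -> dict:
    """Read adaptation field length 224 and the indicator/flag byte that follows it."""
    if len(field) < 1:
        raise ValueError("adaptation field must contain at least the length byte")
    length = field[0]
    result = {"adaptation_field_length": length}
    if length == 0:
        # A zero length means no indicator/flag byte follows.
        return result
    if len(field) < 2:
        raise ValueError("length is non-zero but the flag byte is missing")
    flags_byte = field[1]
    result.update(
        discontinuity_indicator=(flags_byte >> 7) & 0x1,
        random_access_indicator=(flags_byte >> 6) & 0x1,
        elementary_stream_priority_indicator=(flags_byte >> 5) & 0x1,
        flags=flags_byte & 0x1F,  # the five bit flags 232
    )
    return result
```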
FIG. 2D illustrates an expanded form of optional fields 234. As illustrated in the figure, optional fields 234 include a forty two bits Program Clock Reference (PCR) 238, a forty two bits Original Program Clock Reference (OPCR) 240, eight bits splice countdown 242, eight bits transport private data length 244, transport private data 246, eight bits adaptation field extension length 248, three bit flags 250, and ‘d’ bits optional fields 252, where ‘d’ is a variable.
FIG. 2E illustrates an expanded form of optional fields 252. As illustrated in the figure, optional fields 252 include one bit ltw_valid flag 254, fifteen bit ltw offset 256, two reserved bits 258, twenty two bits piecewise rate field 260, four bit splice type 262, thirty three bits DTS_next_au field 264.
In most cases, multimedia data that is transported over a lossy network is packed into the 188 bytes of a TS packet. However, for some applications, a four byte timestamp is prefixed to each 188 byte TS packet while recording, thus resulting in a 192 byte TS packet. This is shown below with the help of FIG. 3.
FIG. 3 illustrates a conventional 192 byte TS packet. As illustrated in the figure, TS packet 300 includes a 4 byte time stamp 302, in addition to header 202 and payload 204, making TS packet 300 192 bytes long. Time stamp 302 is known as “pacing time” and is used for keeping track of the real time at which the data was recorded, in order to play back the recorded data at the same rate at which it was received.
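For illustration only, a 192 byte TS packet may be separated into its time stamp and its 188 byte TS packet as sketched below. The function name `split_timestamped_packet` is hypothetical, and interpreting the four time stamp bytes as a big-endian unsigned integer is an assumption made for the sketch; the byte order and units of the pacing time are not specified above.

```python
from typing import Tuple

TS_PACKET_SIZE = 188   # header 202 plus payload 204
TIMESTAMP_SIZE = 4     # time stamp 302 prefixed while recording


def split_timestamped_packet(record: bytes) -> Tuple[int, bytes]:
    """Split a 192 byte record into (time stamp, 188 byte TS packet)."""
    if len(record) != TIMESTAMP_SIZE + TS_PACKET_SIZE:
        raise ValueError("expected a 192 byte record")
    # Assumption: the time stamp is read as a big-endian unsigned integer.
    timestamp = int.from_bytes(record[:TIMESTAMP_SIZE], "big")
    return timestamp, record[TIMESTAMP_SIZE:]
```

In this layout the sync byte is the fifth byte of each 192 byte record rather than the first.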
For the purposes of discussion, the term TS packet will be used hereon in this specification for TS packet 200 or TS packet 300 accordingly.
The beginning of each TS packet received by receiver 104 is identified by sync byte 206, which is assigned a fixed unique number. Tuner 108 is always receiving TS packets. Once tuner 108 identifies sync byte 206 in a TS packet, the tuner locks on to the sync byte and forwards the TS packet to TS buffer 110. After lock, the byte found every 188 or 192 bytes thereafter (depending on the format) should exactly match sync byte 206. Decoder 112 decodes the TS packets stored in TS buffer 110 to separate out the header and the multimedia data.
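The periodicity check described above may be sketched as follows. This is an illustration under stated assumptions, not the behavior of any particular tuner: the names `sync_holds` and `find_sync_offset` and the parameter `packets_to_check` are hypothetical, and the number of consecutive packets examined (five by default) is chosen arbitrarily.

```python
SYNC_BYTE = 0x47  # value reserved for sync byte 206


def sync_holds(stream: bytes, offset: int, packet_size: int = 188,
               packets_to_check: int = 5) -> bool:
    """Return True if SYNC_BYTE is found at offset and at each following packet boundary."""
    for i in range(packets_to_check):
        pos = offset + i * packet_size
        if pos >= len(stream) or stream[pos] != SYNC_BYTE:
            return False
    return True


def find_sync_offset(stream: bytes, packet_size: int = 188,
                     packets_to_check: int = 5):
    """Return the first offset within one packet length where the check holds, else None."""
    for offset in range(min(packet_size, len(stream))):
        if sync_holds(stream, offset, packet_size, packets_to_check):
            return offset
    return None
```

For 192 byte TS packets the same functions may be called with `packet_size=192`. A check of this kind accepts any byte position that holds 0x47 at every packet interval, so it cannot by itself distinguish the real sync byte from a value that repeats at the same position in every packet.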
Ideally, receiver 104 can be tuned to a channel at any moment of time. Depending on the real time when tuner 108 starts to look for sync byte 206, in some cases, the tuner may receive only a portion of a TS packet for the first TS packet, i.e., real sync byte 206 may have already passed and the tuner detects the TS packet starting from the PID or the payload. In case of a 192 byte TS packet, the TS packet detected by the tuner may start from the middle of the timestamp.
In some cases, the value of sync byte 206 may be emulated somewhere else in the TS packet. In other words, a fake sync byte may be present in addition to the real sync byte. Since tuner 108 looks for a unique value in the TS packet to detect the beginning of the TS packet, if another byte-aligned sequence of bits in the TS packet has the same value, tuner 108 may misinterpret that sequence as a sync byte. For example, when the unique number is expected to be 0x47, any byte in the TS packet that contains 0x47 may be interpreted as a sync byte if the real sync byte was missed.
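The ambiguity can be seen in a short sketch that lists every byte position in a packet holding the reserved value. The function name `candidate_sync_positions` is hypothetical, and the sample packet is constructed purely for illustration.

```python
from typing import List


def candidate_sync_positions(packet: bytes, sync_value: int = 0x47) -> List[int]:
    """Return every byte offset in the packet whose value equals the sync value."""
    return [i for i, value in enumerate(packet) if value == sync_value]


# Illustrative 188 byte packet: real sync byte at offset 0, plus two
# bytes elsewhere in the packet that happen to hold the same value.
sample = bytes([0x47, 0x00, 0x47, 0x10]) + bytes(100) + bytes([0x47]) + bytes(83)
positions = candidate_sync_positions(sample)
```

Here `positions` contains three offsets, only the first of which is the real sync byte; a detector that missed offset 0 and examines single bytes in isolation has no basis for rejecting the other two.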
Emulation of sync byte 206 in the TS packet may be present under certain conditions, which are discussed below.
In one scenario, PID 214 assigned by the headend may contain the unique number reserved for sync byte 206. Because PID 214 is part of each TS packet received by receiver 104, the emulation continues throughout the transport stream.
In another scenario, any of the 4 bytes of timestamp 302 of TS packet 300 can emulate sync byte 206 if it carries the unique number reserved for sync byte 206. This scenario does not arise for 188 byte TS packets.
In another scenario, any byte of payload 204 can match the unique number reserved for sync byte 206, resulting in an emulated sync byte. Furthermore, an emulated sync byte in payload 204 can be repeated in two consecutive TS packets, which may be duplicate packets due to accidental duplication or corruption.
In another scenario, the second byte of header 202 can also emulate sync byte 206. For very low bit rate audio coding, wherein each audio frame is put into one PES packet and each audio PES packet fits into one TS packet, payload unit start indicator 210 is set to one in every TS packet. If, in addition, transport error indicator 208 is zero, transport priority 212 is zero and the 13 bit PID 214 is 0x7xx (so that its five most significant bits are 00111), the second byte of header 202 is 0100 0111, i.e., 0x47. If PID 214 is 0x747, the third byte of header 202 is also 0x47, and the TS packet then has three successive bytes equal to the sync value at the start of header 202.
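The arithmetic of this scenario can be checked with a short sketch that assembles the four fixed header bytes from their fields. The function name `header_bytes` and its default argument values are hypothetical, and the sketch assumes the fields are packed most significant bit first in the order listed for FIG. 2B.

```python
def header_bytes(pid: int, pusi: int, priority: int = 0, tei: int = 0,
                 scrambling: int = 0, afc: int = 0b01, cc: int = 0) -> bytes:
    """Pack the fixed header fields into four bytes, starting with the sync byte."""
    if not 0 <= pid <= 0x1FFF:
        raise ValueError("PID must fit in 13 bits")
    # Second byte: error indicator, start indicator, priority, top 5 PID bits.
    b1 = ((tei & 1) << 7) | ((pusi & 1) << 6) | ((priority & 1) << 5) | (pid >> 8)
    # Third byte: low 8 PID bits.
    b2 = pid & 0xFF
    # Fourth byte: scrambling control, adaptation field control, continuity counter.
    b3 = ((scrambling & 0x3) << 6) | ((afc & 0x3) << 4) | (cc & 0x0F)
    return bytes([0x47, b1, b2, b3])


# Start indicator set, error indicator and priority clear, PID of the form 0x7xx:
second_byte = header_bytes(pid=0x700, pusi=1)[1]   # 0b0100_0111
# PID 0x747 additionally places 0x47 in the third byte:
first_three = header_bytes(pid=0x747, pusi=1)[:3]
```

With these inputs `second_byte` equals 0x47 and `first_three` consists of three bytes that are each 0x47, whereas clearing the start indicator (`pusi=0`) for the same PID yields 0x07 in the second byte.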
Even if the second byte of header 202 does not emulate the sync byte, it is still possible for PID 214, time stamp 302 and payload 204 each to contain 0x47, resulting in three emulated sync bytes in one TS packet.
As discussed above, different scenarios can cause receiver 104 to detect an emulated (fake) sync byte instead of the real sync byte 206. Once tuner 108 locks on to an emulated sync byte, the TS packet is forwarded to decoder 112 for further decoding. Decoder 112 extracts the multimedia data from the TS packet based on the emulated sync byte, which may be located either before or after the real sync byte. The data is therefore misinterpreted, which makes the system unstable and results in a poor end user experience.
In one example, the scenarios discussed above also affect DVR trick play (pause and play) sessions. When a session is paused and then played back, the TS packet data is retrieved from HDD 118 and routed to decoder 112 for decoding. If sync byte detection fails to find the correct sync byte in the TS packet, the wrong payload data is routed to the AV decoders. The AV decoders then decode the wrong AV data throughout the session, resulting in DVR failure.
What is needed is a system and method for detecting TS sync bytes in the TS packets with multiple emulated sync bytes.