Transmission of media information such as audio signals, video signals, and still images is typically based on packetization, i.e. the information to be transmitted is framed into packets. The packets are then transmitted as one or more packet streams. The packet streams can be transmitted e.g. as RTP (Real-time Transport Protocol) packets or as packets of another protocol. Some of the packets may be lost during the transmission. For example, the transmission channel may be affected by disturbances that weaken the signals carrying the packet stream and cause packet losses. At the transmission stage it is possible to add error recovery information to the packet stream, which can be used at the receiving stage to recover lost packets. One known method is based on forward error correction (FEC), in which extra packets carrying error recovery information are inserted into the packet stream. Such extra packets are called repair packets in this description. The repair packets are formed on the basis of the packets that they are to protect. For example, a bitwise XOR operation is performed on the data of the packets, and the calculated FEC values are packetized to form the repair packets. The repair packets and the packets used in forming them together form an FEC block. Therefore, the loss of any packet included in an FEC block may be recoverable by using the information in the repair packets of the same FEC block.
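The XOR-based repair packet mentioned above can be illustrated with a minimal sketch. It assumes a single repair packet per FEC block and zero-padding of shorter packets to a common length; the function names (`xor_bytes`, `make_repair_packet`, `recover_lost_packet`) are hypothetical and are not taken from any FEC specification. Practical schemes also signal the original packet lengths, which this sketch omits.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_repair_packet(source_packets: List[bytes]) -> bytes:
    """Form one repair packet as the XOR of all source packets.

    Assumption: shorter packets are zero-padded to the longest length.
    """
    size = max(len(p) for p in source_packets)
    padded = [p.ljust(size, b"\x00") for p in source_packets]
    return reduce(xor_bytes, padded)


def recover_lost_packet(received: List[Optional[bytes]], repair: bytes) -> bytes:
    """Recover the single missing packet (marked None) of an FEC block.

    The result keeps the zero padding, because packet lengths are not
    carried in this simplified sketch.
    """
    missing = [i for i, p in enumerate(received) if p is None]
    if len(missing) != 1:
        raise ValueError("one XOR repair packet can recover exactly one loss")
    size = len(repair)
    survivors = [p.ljust(size, b"\x00") for p in received if p is not None]
    # XOR-ing the repair packet with every surviving packet leaves the lost one.
    return reduce(xor_bytes, survivors, repair)


block = [b"audio-01", b"video-0002", b"text"]
repair = make_repair_packet(block)
recovered = recover_lost_packet([block[0], None, block[2]], repair)
```

With one XOR repair packet only a single loss per FEC block is recoverable; protecting against more losses requires more repair information per block.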
The packetization can also be applied on different layers of the so-called protocol stacks. The well-known OSI model describes a seven-layer structure in which the physical layer is at the bottom and the application layer is at the top of the protocol stack. Between them, from bottom to top, are the data link layer, the network layer, the transport layer, the session layer, and the presentation layer. The RTP packets can be regarded as packets of the application layer (which, hence, can be called an “RTP layer” in this case).
Many video communication systems provide means for controlling the data transmission rate and buffering. In one-to-one systems, the recipient can send its buffer occupancy status to the originator, which can then tune the transmission rate accordingly (e.g. rate adaptation in 3GPP packet-switched streaming). In unidirectional systems, transmitted streams typically have to comply with a known buffering model of the receiver. Examples of systems that specify a recipient buffering model include MPEG-2 Systems, Annex G of 3GPP packet-switched streaming (3GPP Technical Specification 26.234), and 3GPP multimedia broadcast/multicast system (MBMS, 3GPP Technical Specification 26.346). Senders have to ensure that the transmitted streams comply with the buffering model, i.e. do not cause underflows or overflows of the buffer. Receivers should be capable of receiving valid streams and can use the buffer size of the hypothetical buffer model for allocation of the actual buffer.
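The compliance requirement on senders can be sketched as a simple check of a transmission schedule against a hypothetical buffer. The model below is an assumption made for illustration only and does not reproduce any of the buffering models cited above: packets enter the buffer instantaneously at their arrival times, and the buffer is drained at a constant rate starting a fixed delay after the first arrival. The function name `check_buffer_compliance` and its parameters are hypothetical.

```python
from typing import List, Tuple


def check_buffer_compliance(
    arrivals: List[Tuple[float, float]],  # (arrival time in s, size in bits), sorted by time
    drain_rate: float,                    # constant output rate in bits/s (assumed)
    initial_delay: float,                 # delay before draining starts, in s (assumed)
    buffer_size: float,                   # size of the hypothetical buffer in bits
) -> str:
    """Return "ok", "underflow", or "overflow" for the given schedule."""
    occupancy = 0.0
    drain_start = arrivals[0][0] + initial_delay
    prev_t = arrivals[0][0]
    for t, bits in arrivals:
        # Time spent draining since the previous arrival.
        drain_time = max(0.0, t - max(prev_t, drain_start))
        needed = drain_rate * drain_time
        if needed > occupancy + 1e-9:
            return "underflow"  # the consumer ran out of data before this arrival
        occupancy -= needed
        occupancy += bits
        if occupancy > buffer_size:
            return "overflow"   # the arriving data does not fit into the buffer
        prev_t = t
    return "ok"


schedule = [(0.0, 1000.0), (1.0, 1000.0), (2.0, 1000.0)]
verdict = check_buffer_compliance(schedule, drain_rate=1000.0,
                                  initial_delay=0.5, buffer_size=2000.0)
```

A sender could run such a check on a candidate schedule before transmission; a receiver could use `buffer_size` directly when allocating its actual buffer.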
A simplified system for IP datacasting over DVB-H is described in FIG. 2 as a block diagram. Content servers provide multimedia content over an IP network to IP encapsulators. An IP encapsulator encapsulates the IP streams on top of MPEG-2 transport streams, which are conveyed over a DVB-H network to receiving terminals.
In FIG. 7 an example of media transmission in a DVB network 701 is depicted. In DVB systems, a multiplex 702 is a set of DVB services 703 multiplexed together and carried on one transport stream. Transport streams of different multiplexes 702 can be transmitted in the DVB network 701. The components of a DVB service (e.g. video component, audio component, text component) are included as elementary streams 704 each carrying data of one of the components of one of the DVB services 703. The components of the DVB services may be encapsulated as IP (Internet Protocol) streams 705 containing IP datagrams.
For DVB systems, the so-called multiprotocol encapsulation (MPE) has been introduced. The MPE is intended for encoding network layer (OSI-model layer 3) datagrams (IP packets) into transport streams. Each IP datagram is typically encoded into a single MPE section. A single elementary stream may contain multiple MPE section streams. An elementary stream carrying MPE sections may also carry error correction data, i.e. MPE FEC sections, for supporting error correction of the data packets in the MPE section payloads. MPE sections can be regarded as packets of the data link layer of the OSI protocol stack.
The hypothetical reference decoder (HRD) in some video coding standards is used to verify that produced bitstreams are standard-compliant and that decoders produce standard-compliant output. Standard-compliant decoders are required to be capable of inputting streams that are compliant with the HRD. The HRD is used to prevent “adverse” bitstreams, i.e. it constrains the resource consumption in the decoders, both in terms of memory usage and computational complexity. The input to the HRD arrives either at a constant bitrate or at a rate that alternates piecewise between zero and a constant bitrate. The HRD also allows the video bitrate to fluctuate, which makes it possible to achieve a nearly constant picture rate and quality.
When the media streams are sent in a multiplexed manner, the output of the hypothetical demultiplexer must be compatible with the input requirements for the hypothetical media decoder. Otherwise, compatibility with the media decoder buffer model cannot be guaranteed.
FEC decoding of an MPE FEC frame requires initial buffering (from the reception of the first packet of the MPE FEC frame until the start of media decoding) in the receiving terminal. If the receiver started to decode source RTP packets (i.e. media RTP packets) as soon as the first one was received, any lost source RTP packet would delay decoding until the repair columns of the MPE FEC frame were received. This would consequently cause a pause in the playback.
Furthermore, as explained in the following, pauseless playback may require additional initial buffering beyond the reception of the first MPE FEC frame. Let tai(n) be the reception time of the first bit of an MPE FEC frame of index n in transmission order, and let taf(n) be the reception time of the last bit of the MPE FEC frame. Furthermore, let b(n) be the number of bits in the RTP payloads of a media stream within MPE FEC frame n, and r(n) be the bitrate of the media stream (that is used for verification of HRD compliance). If, for all values of n, b(n)/r(n)=taf(n+1)−taf(n), then the initial buffering duration would always be 0. However, this will not be the case, for some of the following reasons:
First, puncturing (number of “media” columns per MPE FEC frame), FEC code rate (number of FEC columns per MPE FEC frame), and amount of padding may vary.
Second, scheduling of time-slicing bursts may not be as accurate as required in the formula above, but it is likely to follow average bitrates of the stream and the time-slicing burst interval derived from the average bitrate.
Third, an elementary stream and a time slice may contain packets from multiple IP streams. Meeting an accurate bit budget for each IP stream within a time slice is a challenging target for varying-bitrate media such as video.
In summary, initial buffering of one entire MPE FEC frame is not a sufficient condition to guarantee pauseless decoding and playback. Therefore, senders must give receivers information that allows a sufficient but not excessive amount of initial buffering.
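The dependence of the required initial buffering on taf(n), b(n), and r(n) can be made concrete with a small sketch. It rests on two assumptions that are not stated in the text: the media data of MPE FEC frame n becomes available for decoding only at taf(n), and once playback starts the media of frame n is consumed in exactly b(n)/r(n) seconds. The function name `min_initial_buffering` is hypothetical.

```python
from typing import List


def min_initial_buffering(taf: List[float], bits: List[float], rate: List[float]) -> float:
    """Smallest delay after taf[0] at which playback can start without pausing.

    taf[n]  : reception time of the last bit of frame n, in seconds
    bits[n] : b(n), media payload bits in frame n
    rate[n] : r(n), media bitrate for frame n, in bits/s
    """
    playback_offset = 0.0  # media time consumed before frame n is needed
    delay = 0.0
    for n in range(len(taf)):
        # How much later frame n completes than the moment it would be needed
        # if playback had started right at taf[0].
        lateness = (taf[n] - taf[0]) - playback_offset
        delay = max(delay, lateness)
        playback_offset += bits[n] / rate[n]
    return delay


# Frames that arrive exactly as fast as they are played need no extra delay.
steady = min_initial_buffering([1.0, 2.0, 3.0], [1000.0] * 3, [1000.0] * 3)
# A third frame completing 0.5 s late requires 0.5 s of extra buffering.
late = min_initial_buffering([1.0, 2.0, 3.5], [1000.0] * 3, [1000.0] * 3)
```

When b(n)/r(n) equals taf(n+1)−taf(n) for every n, the result is 0, which matches the condition given above. Passing only the entries from index m onward yields, under the same assumptions, the delay for a receiver that starts reception at frame m.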
Clause 13 of ETSI EN 301 192 v1.4.1 specifies the decoder model for DVB data broadcasting. The model consists of a transport buffer and an optional main buffer. The transport buffer is a small (512-byte) buffer to remove duplicates of MPEG-2 TS packets. The main buffer is used to smooth the bitrate to be suitable for media decoders. The operation of the main buffer can be controlled by specifying the output byte rate in the smoothing_buffer_descriptor syntax structure of MPEG-2 Systems. However, there is no mechanism to signal and apply an initial buffering delay in the main buffer, and therefore the main buffer is unsuitable for use in combination with MPE FEC decoding.
As the DVB-H IP datacasting is a multicast/broadcast service, new receivers may “tune in” in the middle of the stream i.e. new receivers may begin to receive the stream later than the first packet of the stream was transmitted. The optimal (minimum) initial buffering delay is usually not constant throughout the stream.
Size of FEC Decoding Buffer
As was shown above, received packets for an MPE FEC frame have to be buffered before the decoding of the FEC packets can be started. Such a buffer is called an FEC decoding buffer in this description. The buffer occupancy level of the FEC decoding buffer depends inter alia on 1) the transmission schedule of the elementary stream, 2) the amount of initial buffering before starting the emptying of the buffer, 3) the method of building an FEC matrix inside the FEC decoding buffer, and 4) the output rate of data from the FEC decoding buffer. The maximum buffer occupancy level determines the required buffer size for the stream. It is evident that different receiving device implementations may implement the FEC decoding and the related buffering differently. For example, players may take different approaches to output rate handling: one device may push data out from the FEC decoding buffer as soon as the buffers “downstream” (e.g. decoder input buffers) allow, and another device may pull data out from the FEC decoding buffer just in time when the next piece of data is needed for decoding. Therefore, the maximum buffer occupancy level may vary in different implementations, and consequently it would be problematic to determine the required FEC decoding buffer size of a particular stream without a hypothetical buffer model.
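The effect of the output strategy on the maximum occupancy can be illustrated with a rough sketch. It is not a model of any actual receiver: bursts are assumed to enter the buffer instantaneously, the "pull" device is approximated by draining at the media rate after an initial delay, and the "push" device is approximated by a very high drain rate with no initial delay. The function name `peak_occupancy` and all numeric values are hypothetical.

```python
from typing import List, Tuple


def peak_occupancy(
    bursts: List[Tuple[float, float]],  # (arrival time in s, size in bytes), sorted by time
    drain_rate: float,                  # output rate in bytes/s (assumed constant)
    initial_delay: float,               # delay before output starts, in s
) -> float:
    """Maximum occupancy of a simplified FEC decoding buffer."""
    occupancy = 0.0
    peak = 0.0
    drain_start = bursts[0][0] + initial_delay
    prev_t = bursts[0][0]
    for t, size in bursts:
        drained = drain_rate * max(0.0, t - max(prev_t, drain_start))
        occupancy = max(0.0, occupancy - drained)  # the buffer cannot go below empty
        occupancy += size
        peak = max(peak, occupancy)
        prev_t = t
    return peak


bursts = [(0.0, 4000.0), (1.0, 4000.0), (2.0, 4000.0)]
# "Pull" style: output at the media rate, starting one burst interval late.
pull_peak = peak_occupancy(bursts, drain_rate=4000.0, initial_delay=1.0)
# "Push" style: output as fast as downstream allows (approximated by a high rate).
push_peak = peak_occupancy(bursts, drain_rate=1e9, initial_delay=0.0)
```

In this example the same transmission schedule needs twice as much buffer in the pull-style device as in the push-style one, which is the kind of implementation dependence that a common hypothetical buffer model removes.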
Encoders and transmitters should also be aware of the supported FEC decoding buffer size of all receivers when performing FEC encoding and transmission scheduling.