When streaming multimedia content over an error-prone network (e.g., a wireless network), whether by unicast, multicast, or broadcast, Forward Error Correction (FEC) is often used to recover packets lost to transmission errors. To recover from burst packet loss, FEC operates over a block of packets, which requires buffering the packets before the FEC redundant packets can be generated. Because of this buffering, the redundant packets are generated with a delay relative to the first protected data packet, and the delay depends on the amount of buffering. For time-constrained data packets to be recovered, these delayed redundant packets must also be received in time. Multimedia content usually comprises several synchronized streams, and the streaming server initially sends out these streams in synchronization regardless of whether FEC is used. If the buffering for these streams is not chosen properly, the FEC packets of different streams will incur different delays, resulting in different FEC decoding delays for different streams. This, in turn, causes unsynchronized media decoding.
To illustrate this problem, consider a streaming session with two synchronized streams, one for video and the other for audio. Suppose the packet rate is Pv packets per second for video and Pa packets per second for audio, and that Kv packets are buffered for video data and Ka packets for audio data in the FEC calculation. To minimize the delay introduced by the FEC calculation, a copy of each packet is made for the calculation and the packet itself is sent out immediately. Once enough data has accumulated, i.e., Kv packets for video and Ka packets for audio, generation of the redundant packets begins. Thus, the minimum delays between transmission of the first data packet and transmission of the first redundant packet are:

Dv = Kv / Pv
Da = Ka / Pa
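The delay relation above can be sketched briefly; the packet rates and block size below are assumed values chosen only for illustration:

```python
# Minimal sketch of the FEC encoding delay described above: redundant
# packets can only be generated after K data packets have been buffered,
# so the first redundant packet lags the first protected data packet by
# D = K / P seconds for a stream sending P packets per second.

def fec_delay(k_packets: int, packet_rate: float) -> float:
    """Delay (seconds) between the first data packet and the first
    redundant packet for a block of k_packets at packet_rate pkt/s."""
    return k_packets / packet_rate

# Hypothetical rates: video at 1000 pkt/s, audio at 50 pkt/s,
# with the same block size K = 100 packets for both streams.
Dv = fec_delay(100, 1000.0)  # video: 0.1 s
Da = fec_delay(100, 50.0)    # audio: 2.0 s
```

With equal block sizes, the lower-rate audio stream takes twenty times longer to fill its block, which is exactly the asymmetry discussed next.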
Typically, audio has a lower data rate than video, i.e., Pv > Pa. If the same buffer size is used for both, i.e., Kv = Ka, the latency of a redundant packet will be larger for audio than for video, i.e., Da > Dv. If the conditions of the network over which the streams are transmitted are good and there is no packet loss, the redundant packets are not used, and both video and audio packets can still be processed in synchronization as if there were no FEC. If there is packet loss, however, the recovery process requires timely arrival of some redundant packets. On the client side, suppose the buffer time for FEC decoding is T, the same for both audio and video; for FEC decoding to succeed, T must cover at least one block of media data. If the video data rate is much greater than the audio data rate (as is the case with high-definition content), then Dv << Da. If the client chooses a buffer time T such that Dv < T < Da, audio data recovery will fall behind video data recovery, resulting in either delayed audio decoding or a lossy audio stream.
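The failure condition can be checked numerically; all rates, block sizes, and the buffer time below are assumed values for illustration:

```python
# Sketch of the client-side condition described above: with equal FEC
# block sizes, the audio block's redundant packets arrive later than the
# video block's, so a buffer time T chosen between the two delays covers
# video losses but not audio losses.

Kv = Ka = 100          # FEC block size in packets, assumed equal
Pv, Pa = 1000.0, 50.0  # packet rates in pkt/s, hypothetical
Dv = Kv / Pv           # video encoding delay: 0.1 s
Da = Ka / Pa           # audio encoding delay: 2.0 s

T = 0.5  # client FEC buffer time in seconds, with Dv < T < Da

# Recovery is timely only if the buffer time covers the block's delay.
video_recoverable = T >= Dv  # True: redundant video packets arrive in time
audio_recoverable = T >= Da  # False: audio recovery falls behind
```

This reproduces the Dv < T < Da case: video losses are recoverable within the buffer, audio losses are not.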
One possible solution is to tune the buffer sizes for video and audio so that Dv and Da are roughly equal. However, if the content is encoded at a variable bit rate (VBR), it is difficult to specify the correct buffer sizes at the start of the session.
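The tuning idea amounts to fixing one target delay and deriving each stream's block size from its packet rate; the function name and numbers below are assumptions for illustration:

```python
# Sketch of the buffer-tuning approach mentioned above: choose one target
# FEC delay D for all streams and size each stream's block so that
# Kv/Pv == Ka/Pa == D. With VBR content the packet rates vary over time,
# which is why a block size fixed at the start can later be wrong.

def block_size_for_delay(target_delay: float, packet_rate: float) -> int:
    """Number of packets to buffer so the FEC encoding delay roughly
    equals target_delay for a stream at packet_rate pkt/s."""
    return max(1, round(target_delay * packet_rate))

D = 0.5                               # target delay in seconds (assumed)
Kv = block_size_for_delay(D, 1000.0)  # video: 500 packets
Ka = block_size_for_delay(D, 50.0)    # audio: 25 packets
```

Both streams now reach their block boundary after the same 0.5 s, so their redundant packets carry the same delay, as long as the assumed rates hold.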
Previously, the International Organization for Standardization/International Electrotechnical Commission (ISO/IEC) Moving Picture Experts Group-2 (MPEG-2) Transport Stream standard placed multiple elementary streams in the same MPEG-2 transport multiplex. Through the use of a Packet Identifier (PID), each packet is associated with a Packetized Elementary Stream (PES); synchronization is therefore performed within a single stream. However, no solution has been found for this synchronization problem when multiple, separate streams are used in a streaming session in which Forward Error Correction is enabled.