In broadcast TV, the audio and video flows are traditionally broadcast together. They are generally provided by a single multimedia source, for example a supplier of multimedia contents, then transported by a single transport protocol over a given transport network, and finally delivered to a single end-user device, for example a decoder or a television, which reads these flows, displays the video data on a screen and plays the audio data through a loudspeaker.
With the rapid development of the Internet and of mobile telecommunication networks, new multimedia applications have appeared in which the sources and/or the transport protocols can be different for the audio flows and the video flows. Interactive applications have also appeared for which the sources and transport protocols can also be different from those of the audio-video contents to which they refer. These applications can in this way be transported through broadband networks.
For these new applications, it is necessary to make sure that the rendering of the audio flow is synchronous with the rendering of the video flow, or that the interactive application is synchronously rendered with the audio-video flow.
An example of a new multimedia application is the generation of an audio flow by a source different from that of the video flow, this audio flow being intended to replace a basic audio flow which would be provided with the video flow. For example, in the case of a football match broadcast on television, the basic audio flow provided with the video flow of the match can be replaced by an audio flow comprising, for example, commentary in a language other than that of the basic audio flow, delivered by a multimedia supplier other than the match broadcaster. In order that the audio flow can be synchronized with the video flow, the said flows must contain common or equivalent timing references. As a general rule, the transport protocol provides these references, or timestamps, to the rendering device so that it can regulate and synchronize the rendering of the two flows.
The timestamp is generally a counter value which indicates the time at which the event associated with this timestamp occurs. The clock frequency of the counter must be known to the rendering device so that it can correctly regulate the flow rendering. The manner in which this clock frequency is given to the rendering device is described in the specifications of the transport layers (MPEG-TS, RTP, etc.).
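As an illustration of why the rendering device needs the counter's clock frequency, the following minimal sketch converts the difference between two timestamps into a duration in seconds. The function name `ticks_to_seconds` and its parameters are hypothetical; the 90 kHz rate and the 32-bit counter width are assumptions chosen because they are common for video carried over RTP, and a different transport layer may define other values.

```python
def ticks_to_seconds(ts_start: int, ts_end: int, clock_hz: int,
                     counter_bits: int = 32) -> float:
    """Return the time elapsed between two timestamps, in seconds.

    ts_start, ts_end: raw counter values taken from the flow.
    clock_hz: clock frequency of the counter (assumed known from the
              transport layer, e.g. 90000 for many video flows).
    counter_bits: width of the counter; the subtraction is done modulo
                  2**counter_bits so that one wraparound is tolerated.
    """
    modulus = 1 << counter_bits
    elapsed_ticks = (ts_end - ts_start) % modulus
    return elapsed_ticks / clock_hz


# One second of video at an assumed 90 kHz timestamp clock.
one_second = ticks_to_seconds(0, 90_000, clock_hz=90_000)

# The same duration when the counter wraps between the two events.
wrapped = ticks_to_seconds(2**32 - 45_000, 45_000, clock_hz=90_000)
```

Without the correct value of `clock_hz`, the same pair of timestamps would be interpreted as a different duration, and the rendering would run too fast or too slow.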
In order that the rendering device can synchronize the two flows, the flows generally refer to a common clock, usually called the “wall clock”. For example, in the case of the RTP protocol (for Real-Time Transport Protocol) specified by the IETF in RFC 3550, a transmitter periodically transmits an RTCP (for Real-time Transport Control Protocol) message called a sender report, which indicates the equivalence between the timestamp and the time given by the common clock. If the audio and video flows are provided by different sources, these two sources must share the same common clock. The NTP protocol (for Network Time Protocol) is typically used to synchronize the two sources on the same clock.
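The mapping carried by the RTCP report described above can be sketched as follows. This is a simplified illustration, not an RTP implementation: the `Report` class, the `to_wall_clock` function and all numeric values are hypothetical, the wall-clock time is represented as plain seconds rather than in the NTP timestamp format, and the clock rates (90 kHz for video, 48 kHz for audio) are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Report:
    """One (wall clock, timestamp) pair, as announced by a transmitter."""
    wall_clock_s: float   # time on the common clock, in seconds
    timestamp: int        # flow timestamp sampled at the same instant


def to_wall_clock(timestamp: int, report: Report, clock_hz: int) -> float:
    """Map a flow timestamp onto the common clock using the latest report."""
    # Signed 32-bit difference, so timestamps slightly before the report
    # and counter wraparound are both handled.
    delta = (timestamp - report.timestamp) & 0xFFFFFFFF
    if delta >= 0x80000000:
        delta -= 0x100000000
    return report.wall_clock_s + delta / clock_hz


# Hypothetical reports from two different sources sharing one wall clock.
video_report = Report(wall_clock_s=1000.0, timestamp=900_000)
audio_report = Report(wall_clock_s=1000.5, timestamp=240_000)

# A video frame and an audio packet with unrelated raw timestamps.
video_time = to_wall_clock(990_000, video_report, clock_hz=90_000)
audio_time = to_wall_clock(264_000, audio_report, clock_hz=48_000)

# Both map to the same wall-clock instant, so they are rendered together.
offset_s = audio_time - video_time
```

The raw timestamps of the two flows are not comparable with each other; only after each is mapped onto the common clock can the rendering device compute the offset between them. This is also why the mapping breaks down when the two sources do not actually share the same clock.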
However, when the two sources are not connected by a network whose transport time is sufficiently reliable, another synchronization mechanism is necessary.
This synchronization problem can also exist between two video flows which are displayed on a single rendering device when the two flows are not provided by the same source or the same transport protocol. The Picture in Picture function, in which one of the video contents is displayed inside the other, is an example of this. Another example concerns cases of 2D to 3D transition where a 2D video is received in the broadcast flow and the 3D complement enabling a 3D display is received in the broadband flow.