In unidirectional broadcast systems, data is transmitted from a sender to one or more recipients using the push method. In such systems, data is transmitted primarily by the streaming method, that is, as continuous data streams. Streaming has the advantage that strict requirements on the accuracy of the data rate can be met, for example when the data rate is coupled to the system clock of the recipient. Audio and video streams are therefore nowadays generally streamed to the recipient, with the data being provided in each case with time marks which specify the point in time at which it is relevant for presentation and/or decoding. The disadvantage of the streaming method is that missing or incorrectly received data cannot be transmitted to the recipient again.
Also known in broadcast systems is data transmission by the download method, in which data is transmitted from the sender to the recipient in the form of data files or data objects and is stored at the recipient. Currently, only additional data which relates to an audio and video stream of the same data transmission session, such as electronic program information (EPG = Electronic Program Guide) and the like, is transmitted to the recipients using the download method. The reason is that this type of additional data is not time-critical, so that it does not need to be synchronized for presentation with the audio and video streams transmitted by the streaming method.
This situation has changed fundamentally, however, in the area of so-called rich media applications, in which graphical scenes are described. In such applications, each graphical scene, which can be valid for a longer period, is made up of audio, video and scene data (graphics and text data); see for example the MPEG standard “LASeR” (Lightweight Application Scene Representation), previously known as MPEG-4 Part 20 or ISO/IEC 14496-20, in which a format for the description of graphical scenes is specified. Since the status of a scene described by the scene data is time-critical, it is necessary to synchronize the scene data with the audio and video streams.
If the data is transmitted between sender and recipient via point-to-point connections, the scene data and the media data (audio and video data) can be streamed in parallel to the recipient, so that they are available to the recipient at the beginning of the period of time in which they are valid.
A temporal synchronization of the states of a scene with the media streams is enabled by the mechanisms of stream synchronization, based for example on RTP (Real Time Transport Protocol), a protocol for the continuous transmission of audio-visual data (streams) over IP-based networks, which is standardized in RFC 3550 (RFC = Request for Comments). In RTP, so-called Sender Reports are sent in parallel to the individual media streams. In these reports the RTP time marks of the media streams are assigned to NTP time marks (NTP = Network Time Protocol), with the NTP time marks being unique in all Sender Reports of the different media streams. NTP is a standard for the synchronization of clocks in computer systems over packet-based communication networks, which is standardized as RFC 958. The Sender Reports use the time mark format of NTP.
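The assignment of RTP time marks to NTP time marks described above can be illustrated with a minimal sketch. It assumes that each Sender Report carries an NTP time mark (32-bit seconds plus 32-bit fraction) together with the RTP time mark that corresponds to the same instant, and that the clock rate of each stream is known. The names `SenderReport`, `ntp_to_seconds` and `rtp_to_wallclock` are hypothetical and chosen for illustration only; they are not taken from any library.

```python
from dataclasses import dataclass

NTP_FRACTION = 1 << 32      # the NTP fraction field counts 1/2**32 seconds
RTP_MODULUS = 1 << 32       # RTP time marks are 32-bit and wrap around


@dataclass(frozen=True)
class SenderReport:
    """The pair of time marks carried by one Sender Report (illustrative)."""
    ntp_seconds: int        # integer part of the NTP time mark
    ntp_fraction: int       # fractional part of the NTP time mark
    rtp_timestamp: int      # RTP time mark for the same instant


def ntp_to_seconds(report: SenderReport) -> float:
    """Convert the NTP time mark of a report to seconds as a float."""
    return report.ntp_seconds + report.ntp_fraction / NTP_FRACTION


def rtp_to_wallclock(rtp_timestamp: int, report: SenderReport,
                     clock_rate: int) -> float:
    """Map an RTP time mark onto the common NTP time line of its report."""
    # Signed 32-bit difference, so that wrap-around of the RTP counter
    # between the report and the packet is handled.
    delta = (rtp_timestamp - report.rtp_timestamp) % RTP_MODULUS
    if delta >= RTP_MODULUS // 2:
        delta -= RTP_MODULUS
    return ntp_to_seconds(report) + delta / clock_rate


# Assumed example values: an audio stream at 48 kHz and a video stream
# at 90 kHz, each with its own Sender Report.
audio_report = SenderReport(ntp_seconds=1000, ntp_fraction=0,
                            rtp_timestamp=48000)
video_report = SenderReport(ntp_seconds=1000, ntp_fraction=1 << 31,
                            rtp_timestamp=90000)

audio_time = rtp_to_wallclock(96000, audio_report, clock_rate=48000)
video_time = rtp_to_wallclock(135000, video_report, clock_rate=90000)
```

With these assumed values both time marks map to the same instant on the NTP time line, so a recipient would present the two items of data together, although the RTP time marks of the two streams are unrelated.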
Since, however, it cannot be ensured in the broadcast method that a recipient is already receiving the stream at the beginning of the period of time in which a respective scene is valid, it is necessary to keep transmitting the scene data at least during the period of time in which the scene is valid, so that a recipient who only connects later can also receive the scene data. A synchronization of scene states based on a synchronization of streams is not possible in this case, however. Furthermore, advance reception of scene data which is either complex, and thus has to be processed at an early stage, or which may need to be used in other scenes, is not possible.
One approach to solving these problems is an application referred to as “HisTV”, in which a synchronization of scene states is implemented by referencing NTP time marks of the audio or video stream in the description of the states of a scene. The disadvantage in this case is that it must be known in advance which time marks will be used in the media streams, which is not possible, however, especially with programs recorded live. Even with already recorded broadcasts, this restricts management of the content at the head end, for example when promotional entries are switched in.