With the convergence of broadcasting and communications, multimedia services delivered through diverse media are becoming common. That is, services using conventional broadcast media, including terrestrial, satellite, and cable media, have diversified on the basis of digital technology, together with the introduction of mobile broadcasting services, such as Digital Multimedia Broadcasting (DMB), Digital Video Broadcasting-Handheld (DVB-H), and Advanced Television Systems Committee-Mobile/Handheld (ATSC-M/H), and hybrid services including Internet Protocol TV (IPTV). In particular, digital broadcasting provides programs having image quality that is dozens of times higher than that of conventional analog broadcasting, together with CD-level sound quality, and provides an increasing number of channels, giving a user a wide range of options. Digital broadcasting also offers new interactive services including home shopping, home banking, electronic mail, and Internet services, thereby providing higher-quality broadcasting services than conventional broadcasting.
Digital broadcasting service quality is divided into three categories: video quality (VQ), which is the image quality of the video content itself, such as the quality, resolution, and color representation of a screen; video quality of service (V-QoS), which is the image quality associated with the process of transmitting multimedia data through a network from a service provider to an end user; and quality of experience (QoE), which is the overall service quality experienced by an end user, covering not only the video content but also the reactivity, interrelationship, usefulness, and ambient conditions of a service. In particular, channel zapping time, which is the time taken from when the user selects a channel until a broadcast image is played on that channel, is used as a main indicator for QoE measurement. To play a broadcast image, not only compressed video and audio data but also synchronization information, which indicates the time at which these data are to be played on a screen after decoding, is necessary. In the conventional art, synchronization information is either included in a data unit that carries audio or video data, according to the Real-time Transport Protocol/RTP Control Protocol (RTP/RTCP), or carried in a data stream that is separate from the audio/video stream and is assigned for this purpose, according to MPEG Media Transport (MMT). Generally, data transmission is performed through a protocol stack including various layers. Thus, to acquire the information needed to play a broadcast image, the receiver must extract encapsulated data from the received original data layer by layer, starting from the lowest physical layer and continuing up to the specific layer of the protocol stack in which the compressed video/audio data and synchronization information are transmitted. The time required for this process ultimately affects channel zapping time.
FIG. 1 is a diagram of a general data transfer protocol stack for a digital broadcasting service. Although FIG. 1 illustrates a protocol stack including four layers 110 to 140 as an example, a protocol stack having a structure of further subdivided layers may be employed as necessary.
Referring to FIG. 1, compressed audio/video data and synchronization information on an image are encapsulated in the data unit used by each layer as they pass through the layers of the data transfer protocol stack. For example, video data compressed in an application/presentation/session layer 110 of a transmitting device is encapsulated in a payload of a data unit used for a network/transport layer 120 and is passed to the next layer. A data link layer 130 then stores the data received from the higher layer in a payload of its own data unit and passes the data to the next layer. This process is repeated until the data reaches a physical layer 140, which is the lowest layer, and a data unit generated in the physical layer 140 is transmitted to a receiving device through a transmission medium.
The receiving device extracts the actual data in the reverse order of the process of the transmitting device. That is, a physical layer 140 extracts the data included in a payload of the data unit received through the transmission medium and passes the data to the next higher layer, which is a data link layer 130. The data link layer 130 analyzes the received data to extract a data unit used for a network/transport layer 120 and passes the data unit to the network/transport layer 120. This process is repeated until the data reaches an application/presentation/session layer 110, which is the highest layer, and the application/presentation/session layer 110 ultimately extracts the compressed video/audio data and synchronization information to play an image on a screen.
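The layer-by-layer encapsulation and extraction described above can be illustrated with a minimal sketch. It is not the data unit format of any real protocol: the three-byte header (a layer identifier followed by a payload length), the layer names, and the functions `wrap`, `unwrap`, `encapsulate`, and `decapsulate` are all assumptions introduced here for illustration only.

```python
import struct

# Hypothetical layer names, ordered from highest to lowest, loosely
# mirroring layers 110 to 140 of FIG. 1. Index in this list = layer id.
LAYERS = ["application", "network/transport", "data link", "physical"]

HEADER = struct.Struct("!BH")  # toy header: 1-byte layer id, 2-byte payload length


def wrap(layer_id: int, payload: bytes) -> bytes:
    """Place the payload inside a toy data unit for the given layer."""
    return HEADER.pack(layer_id, len(payload)) + payload


def unwrap(layer_id: int, unit: bytes) -> bytes:
    """Extract the payload from a toy data unit, checking the header."""
    got_id, length = HEADER.unpack(unit[:HEADER.size])
    if got_id != layer_id:
        raise ValueError(f"expected layer {layer_id}, got {got_id}")
    payload = unit[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated data unit")
    return payload


def encapsulate(data: bytes) -> bytes:
    """Transmitter side: wrap from the highest layer down to the physical layer."""
    unit = data
    for layer_id in range(len(LAYERS)):
        unit = wrap(layer_id, unit)
    return unit


def decapsulate(unit: bytes) -> bytes:
    """Receiver side: unwrap in reverse order, physical layer first."""
    for layer_id in reversed(range(len(LAYERS))):
        unit = unwrap(layer_id, unit)
    return unit


if __name__ == "__main__":
    frame = encapsulate(b"compressed video + sync info")
    print(len(frame), decapsulate(frame))
```

In this sketch the receiver has to perform one extraction step per layer before the payload and its synchronization information become available, which is the per-layer work that the text relates to channel zapping time.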
As described above, the receiving device decodes the received audio/video data and determines the time at which to play a decoded image based on the relevant synchronization information. In the conventional art, synchronization information is transmitted either through a data unit generated for transmitting compressed video or audio information, as in FIG. 1, or through a data stream that is separate from the data stream used for the transmission of audio/video data.
FIGS. 2 and 3 illustrate an example of transmitting synchronization information through a data stream that is separate from the data stream used for the transmission of audio/video data.
Referring to FIGS. 2 and 3, the synchronization information indicates which audio and video data are to be played on a screen at each time. First audio data Audio data-1 and first video data Video data-1 need to be played at T1, and second audio data Audio data-2 and second video data Video data-2 need to be played at T3 and T2, respectively. Also, third audio data Audio data-3 and third video data Video data-3 need to be played at T5 and T3, respectively, and fourth video data Video data-4 and fifth video data Video data-5 need to be played at T4 and T5, respectively. As such, synchronization information is generally transmitted through the same protocol layer (Layer X) as the audio and video data, and this layer is a network/transport layer or a higher layer.
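The play times listed above can be arranged into a per-time schedule, as in the following minimal sketch. The tuple layout of `SYNC_INFO`, the use of integer ticks for T1 to T5, and the function `build_schedule` are assumptions made for illustration and do not represent the format of any actual synchronization stream.

```python
from collections import defaultdict

# (stream, data index, play tick) for the example of FIGS. 2 and 3.
# Ticks 1..5 stand for T1..T5; this tuple layout is hypothetical.
SYNC_INFO = [
    ("audio", 1, 1), ("video", 1, 1),
    ("audio", 2, 3), ("video", 2, 2),
    ("audio", 3, 5), ("video", 3, 3),
    ("video", 4, 4), ("video", 5, 5),
]


def build_schedule(sync_info):
    """Group data units by the tick at which they must be played."""
    schedule = defaultdict(list)
    for stream, index, tick in sync_info:
        schedule[tick].append(f"{stream.capitalize()} data-{index}")
    # Return a plain dict ordered by play tick.
    return dict(sorted(schedule.items()))


if __name__ == "__main__":
    for tick, units in build_schedule(SYNC_INFO).items():
        print(f"T{tick}: {', '.join(units)}")
```

Grouping by tick shows that in this example audio data are due only at T1, T3, and T5, while video data are due at every tick from T1 to T5.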