Recently, with progress in the digitization of data and in compression techniques, applications of digital images and digital sound have been developed in broadcasting, CATV and the like. The merits of using digitized data in broadcasting are as follows. i) Since various data including video, sound, characters and the like can be handled together, integrated services can be provided. ii) By utilizing compression techniques in transmitting/receiving data, a large number of high-quality broadcasts can be carried within a limited transmission bandwidth. iii) Uniform services can be provided using error correcting techniques. iv) Advanced techniques, such as encryption for limited reception, can be utilized with ease.
Packetization is generally employed for transmitting/receiving digitized data or compressed/encoded data. A packet represents a unit of data obtained by dividing the entire data into pieces of a given size. Transmitting/receiving data as packets allows high efficiency and precision in data communications. For example, when packets are exchanged in a computer network, the separate packets are sent to the transfer destination at varied timings through the network and reconstructed into the original data at the transfer destination; for this purpose, information as to the transfer destination, the transmission origin and the order of the packets is added to the respective packets.
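The packet exchange described above can be sketched as follows. This is only an illustrative model: the `Packet` fields and the names `packetize` and `reassemble` are hypothetical and do not correspond to any particular network protocol.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    source: str       # transmission origin (hypothetical field)
    destination: str  # transfer destination (hypothetical field)
    seq: int          # order of this packet within the original data
    payload: bytes


def packetize(data: bytes, size: int, source: str, destination: str) -> list[Packet]:
    """Divide the entire data into packets of at most `size` bytes."""
    return [
        Packet(source, destination, seq, data[offset:offset + size])
        for seq, offset in enumerate(range(0, len(data), size))
    ]


def reassemble(packets: list[Packet]) -> bytes:
    """Rebuild the original data from packets that may arrive in any order."""
    ordered = sorted(packets, key=lambda p: p.seq)
    return b"".join(p.payload for p in ordered)
```

Because each packet carries its own order information, the receiving side can restore the original data even when the packets arrive at varied timings.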
In the case of handling digital data, use of a packetization technique allows various data such as video, sounds, character information and additional information to be packetized and combined to make multiplexed data, which is transmitted/received as a transport stream (TS) for transmission. Accordingly, both the compression technique and the data multiplexing method are important in transmitting the data.
International standards for the data multiplexing method include MPEG2 (ISO/IEC 13818-1, “Information technology-Generic coding of moving pictures and associated audio information: Systems”, 1996.4). A description is given of production of multiplexed data according to the MPEG 2 standard and reproduction of the produced multiplexed data in the prior art.
FIGS. 19(a) to 19(c) and 20(a) to 20(e) are diagrams for explaining multiplexed data which is produced according to MPEG 2 in the prior art, wherein FIGS. 19(a) to 19(c) illustrate TSs for use in transmission of digitized data and FIGS. 20(a) to 20(e) illustrate packets constituting TSs. A description is given of production of multiplexed data in the prior art with reference to FIGS. 19(a) to 19(c) and FIGS. 20(a) to 20(e).
Video data is compressed/encoded for each frame corresponding to a screen, and audio data is compressed/encoded for every given number of samples, such as 1024; one or a plurality of frames are collected into packets which are referred to as PES (packetized elementary stream) packets. It should be noted that the given number of samples of the audio data constitutes a frame in MPEG. FIGS. 20(a) to 20(c) schematically show formats of TS packets as packets constituting TSs. The PES packet includes a header, and the header includes a stream ID indicating the type of the subsequent data area, i.e., video data, audio data or other data, and a DTS (decoding time stamp) and a PTS (presentation time stamp) as time information for synchronizing video with audio to be reproduced. The PES packet is divided into a plurality of TS packets, each 188 bytes long, to be transmitted or stored.
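The division of one PES packet into 188-byte TS packets can be sketched as follows. This is a simplified illustration, not a conforming multiplexer: it assumes the 4-byte TS header layout of ISO/IEC 13818-1 (sync byte 0x47, payload unit start flag, 13-bit PID, 4-bit continuity counter), and it pads the last packet with 0xFF bytes, whereas a real stream would use adaptation-field stuffing. The function name `pes_to_ts` is hypothetical.

```python
TS_PACKET_SIZE = 188  # fixed TS packet length in bytes
TS_HEADER_SIZE = 4    # assumed header length (ISO/IEC 13818-1)
SYNC_BYTE = 0x47


def pes_to_ts(pes: bytes, pid: int) -> list[bytes]:
    """Divide one PES packet into TS packets that all carry the same PID."""
    if not 0 <= pid < 0x2000:
        raise ValueError("PID must fit in 13 bits")
    payload_size = TS_PACKET_SIZE - TS_HEADER_SIZE
    packets = []
    for index, offset in enumerate(range(0, len(pes), payload_size)):
        chunk = pes[offset:offset + payload_size]
        # Flag the TS packet in which the PES packet begins.
        start_flag = 0x40 if offset == 0 else 0x00
        header = bytes([
            SYNC_BYTE,
            start_flag | (pid >> 8),  # upper 5 bits of the PID
            pid & 0xFF,               # lower 8 bits of the PID
            0x10 | (index & 0x0F),    # payload only, continuity counter
        ])
        # Simplification: pad the final chunk with 0xFF instead of stuffing.
        packets.append(header + chunk.ljust(payload_size, b"\xff"))
    return packets
```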
FIGS. 20(d) and 20(e) schematically illustrate formats of TS packets comprising an adaptation field in which various information is included. As shown in the figures, the adaptation field includes a PCR (program clock reference). The PCR carries the time base used for encoding data such as video data or audio data, and is on the same time base as the PTS and the DTS.
FIGS. 19(a) to 19(c) schematically show formats of TS packets. The TS packets have identification numbers, which are called PIDs (packet identifiers). TS packets carrying the same PES packet have the same PID. Each TS packet comprises a header followed by an adaptation field or a data part, as shown in FIG. 19(b). The PID of the TS packet is given as a part of the header as shown in FIG. 19(c).
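Reading the PID out of a TS packet header can be sketched as follows. The bit positions are those of the 4-byte header in ISO/IEC 13818-1 and are stated here as an assumption, since the figures are not reproduced in this text; the function name `parse_ts_header` and the keys of the returned dictionary are hypothetical.

```python
def parse_ts_header(packet: bytes) -> dict:
    """Extract the PID and the adaptation-field/payload flags from a TS packet."""
    if len(packet) != 188 or packet[0] != 0x47:
        raise ValueError("not a 188-byte TS packet starting with sync byte 0x47")
    # The PID is 13 bits: low 5 bits of byte 1 followed by all of byte 2.
    pid = ((packet[1] & 0x1F) << 8) | packet[2]
    # Two control bits say what follows the header.
    control = (packet[3] >> 4) & 0x03
    return {
        "pid": pid,
        "has_adaptation_field": bool(control & 0x2),
        "has_payload": bool(control & 0x1),
        "continuity_counter": packet[3] & 0x0F,
    }
```

A separating means would typically call such a function on every incoming packet and route the packet according to the returned PID.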
In FIGS. 19(a) to 19(c), the data region of the TS packet may include, in addition to the PES, information for program selection which is called PSI (program specific information). The PSI describes a program number and the PIDs of the TS packets carrying the video data PES, audio data PES and data PES packets of that program. Referring to the PSI, the multiplexed data of a specific program is decoded and reproduced to obtain the original images.
According to the prior art, the PES packets and the TS packets are produced by adding various information to data such as video data or audio data, resulting in a TS, which is recorded and stored or transmitted.
A description is given of a prior art multiplexed data reproducing apparatus wherein data multiplexed in the MPEG 2 data multiplexing system is decoded/reproduced. FIG. 21 is a block diagram illustrating the prior art reproducing apparatus. In the figure, a separating means 2101 is for separating a required portion from multiplexed data for each packet, and comprises a first buffer 2111 and a CPU 2112. A control means 2102 is for controlling decoding, and comprises a second buffer 2121 and a CPU 2122. A video decoder 2105 is for decoding video data. An audio decoder 2106 is for decoding audio data.
FIG. 22 is a flowchart illustrating a procedure of control of the control means 2102. A description is given of an operation of the prior art MPEG 2 multiplexed data reproducing apparatus constructed above.
As shown in FIG. 21, multiplexed data is input from a recording medium 2107 or a transmission medium 2108 to the first buffer 2111 and is stored temporarily therein. The CPU 2112 extracts the video PES and the audio PES corresponding to a desired program number, based on the correspondence between a program and PIDs which is obtained from the separated PSI, and outputs the video PES and the audio PES to the video decoder 2105 and the audio decoder 2106, respectively.
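The separation performed by the CPU 2112 can be sketched as follows. This is a minimal model under stated assumptions: the PSI is represented as a plain dictionary from program number to PIDs, and each TS packet is reduced to a `(pid, payload)` pair. Both representations, and the name `separate`, are hypothetical simplifications rather than the actual table formats.

```python
def separate(packets, psi, program_number):
    """Pick out the video and audio data of one program by PID.

    packets: iterable of (pid, payload) pairs (simplified TS packets)
    psi:     {program_number: {"video": pid, "audio": pid}} (simplified PSI)
    """
    pids = psi[program_number]
    video, audio = [], []
    for pid, payload in packets:
        if pid == pids["video"]:
            video.append(payload)  # would go to the video decoder
        elif pid == pids["audio"]:
            audio.append(payload)  # would go to the audio decoder
        # Packets of other programs are discarded.
    return b"".join(video), b"".join(audio)
```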
Each decoder performs decoding processing as directed by the control means 2102. FIG. 22 is a flowchart illustrating a processing procedure of control of the control means 2102. A description is given of the control procedure of the control means 2102, following the flow in FIG. 22. In step 2201, an STC (system time clock) is obtained as the time base of the decoding apparatus on the basis of the PCR in the TS packet. By obtaining the STC, the time base of the reproducing apparatus matches the time base of the encoding apparatus. In step 2202, the video decoder 2105 performs decoding to obtain the PTS or the DTS. Similarly, in step 2203, the audio decoder 2106 performs decoding to obtain the DTS.
In step 2204, it is decided whether the video decoder 2105 has started decoding or not and whether the PTS or the DTS which is obtained in step 2202 matches the STC. When it is decided that the video decoder 2105 has not started processing and the PTS or the DTS matches the STC, step 2206 is performed and the video decoder 2105 starts decoding. Similarly, in step 2205, it is decided whether the audio decoder 2106 has started decoding or not and whether the DTS which is obtained in step 2203 matches the STC or not. When it is decided that the audio decoder 2106 has not started processing and the DTS matches the STC, step 2207 is performed and the audio decoder 2106 starts decoding. For the decision in step 2204, the PTS and the DTS are not both used for comparison; only one of them is compared with the STC. The video decoder 2105 and the audio decoder 2106 have the same time base under the control described above, so that the video and the audio are reproduced and displayed in synchronization.
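The decisions of steps 2204 to 2207 can be sketched as follows. This is only an illustration of the flow as described in the text: the `Decoder` class and the function `control_step` are hypothetical, time stamps are plain integers, and "matches" is modeled as strict equality (an actual apparatus would presumably tolerate a missed tick).

```python
from dataclasses import dataclass


@dataclass
class Decoder:
    name: str
    time_stamp: int        # PTS or DTS obtained in step 2202 / 2203
    started: bool = False


def control_step(stc: int, decoders: list[Decoder]) -> list[str]:
    """Run the start decision once for the current STC value.

    Returns the names of the decoders that start decoding at this STC.
    """
    started_now = []
    for decoder in decoders:
        # Steps 2204 / 2205: not yet started, and time stamp matches the STC.
        if not decoder.started and decoder.time_stamp == stc:
            decoder.started = True  # steps 2206 / 2207
            started_now.append(decoder.name)
    return started_now
```

Since both decoders are compared against the same STC, they start on a common time base.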
A description is given of a prior art multiplexed data reproducing apparatus according to the standard MPEG2 (ISO/IEC 13818-1, “Information technology-Generic coding of moving pictures and associated audio information: Systems”, 1996.4), in terms of use of clock information.
FIG. 23 is a block diagram illustrating the prior art data reproducing apparatus according to MPEG 2. In the figure, a decoder 2301 is for decoding and reproducing compressed video data and audio data. A buffer 2302 is for storing the data temporarily. A clock extraction circuit 2303 is for extracting clock information from input multiplexed data. A synchronization clock generation circuit 2304 is for generating synchronization clock signals on the basis of input clock information. For example, a PLL (phase locked loop) circuit under feedback control may be employed to generate the synchronization clock signals. FIG. 24 illustrates multiplexed digital data as an input of the data reproducing apparatus. In the figure, reference characters d11 to d15 designate compressed digital video data, reference characters d21 to d22 designate compressed digital audio data, and reference characters c1 and c2 designate clock information, which have been multiplexed. The clock information includes a count value of the 27 MHz clock of the apparatus, counted modulo a set value.
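Because the clock value is counted modulo a set value, the interval between two pieces of clock information has to be computed with wrap-around taken into account. The following sketch shows this under one assumption that is not stated in the text: the modulus is taken as 2^33 × 300, the wrap point of the PCR in ISO/IEC 13818-1. The function names are hypothetical.

```python
CLOCK_HZ = 27_000_000     # 27 MHz clock
MODULUS = (1 << 33) * 300  # assumed wrap-around point of the counter


def elapsed_ticks(earlier: int, later: int, modulus: int = MODULUS) -> int:
    """Ticks between two counter samples, allowing one wrap-around."""
    return (later - earlier) % modulus


def elapsed_seconds(earlier: int, later: int) -> float:
    """Same interval expressed in seconds of the 27 MHz clock."""
    return elapsed_ticks(earlier, later) / CLOCK_HZ
```

A synchronization clock generation circuit could compare such intervals with the intervals measured by its own clock in order to steer the feedback loop.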
A description is given of the prior art data reproducing apparatus constructed above. When multiplexed data is input to the apparatus, the clock extraction circuit 2303 separates/extracts the clock information c1 and c2 and outputs the extracted c1 and c2 shown in FIG. 24 to the synchronization clock generation circuit 2304. Video data d11 to d15 and audio data d21 to d25 in FIG. 24 are output to the buffer 2302 and stored therein temporarily for decoding.
The synchronization clock generation circuit 2304 generates synchronized clock signals on the basis of input clock information and outputs the synchronized clock signals to the decoder 2301. The decoder 2301 decodes video data and audio data stored temporarily in the buffer 2302 using the synchronized clock signals, resulting in an output of the apparatus.
FIG. 25 is a diagram for explaining transition of a buffer in the case of decoding video data, where the horizontal axis represents time and the vertical axis represents buffer occupancy. This figure does not illustrate the transition of the buffer 2302 in FIG. 23 itself but illustrates the transition of a virtual buffer as a model. Namely, the transition of a buffer defined as a virtual buffer model in MPEG2 is shown in this figure. It is assumed that data is input to the buffer at a given transfer rate through a transmission path and that a decoder performs decoding in a short time every 1/30 second, i.e., for each frame. Therefore, the data required for decoding each frame is fetched from the buffer every 1/30 second. In encoding according to MPEG2, the buffer status of a decoder is reproduced virtually by using the virtual buffer model, and data is sent under control so that neither overflow nor underflow occurs in the buffer.
Reference numeral 251 in FIG. 25 indicates a normal status. In the normal status, the clocks of the reproducing apparatus are synchronized with the clocks of the encoding apparatus, so that processing is performed without the overflow or underflow mentioned later. Reference numeral 252 indicates a status in which the clocks of the encoding apparatus run somewhat faster than those of the reproducing apparatus. In this case, since the encoding apparatus operates at a higher speed, the transfer rate of the input as seen by the reproducing apparatus becomes higher. As a result, more data is input than is fetched, as represented by 252, and accordingly the buffer occupancy of 252 becomes significantly higher than that of 251, resulting in overflow above the buffer maximum at the point shown by a, and thus in loss of data. On the other hand, when the clocks of the reproducing apparatus run faster, the transfer rate is effectively lower. As a result, as represented by 253, the occupancy gradually becomes lower, resulting in underflow below the lower limit as represented by b in the figure, which causes discontinuity in the reproduction of motion pictures. Thus, when the clocks of the encoding apparatus differ from the clocks of the reproducing apparatus, the speed at which data is transmitted and the speed at which data is decoded differ from each other, causing overflow or underflow in the buffer of the reproducing apparatus.
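The three cases 251, 252 and 253 can be reproduced with a very small simulation. This is a toy model and not the virtual buffer model of the standard: every frame is assumed to have the same size, the buffer capacity and initial occupancy are arbitrary illustrative values, and the clock mismatch is expressed as a single hypothetical parameter `clock_ratio` (encoder clock speed divided by decoder clock speed).

```python
def simulate_buffer(clock_ratio: float,
                    frames: int = 3000,
                    bits_per_frame: int = 100_000,
                    capacity: int = 1_000_000,
                    initial: int = 500_000):
    """Track buffer occupancy over `frames` decoder frame periods.

    Returns (status, frame_index) where status is "ok", "overflow"
    or "underflow".
    """
    occupancy = float(initial)
    for frame in range(frames):
        # Data arriving during one decoder frame period (1/30 s by the
        # decoder clock); a faster encoder clock delivers more of it.
        occupancy += bits_per_frame * clock_ratio
        if occupancy > capacity:
            return "overflow", frame   # point a: data is lost
        # The decoder fetches one frame's worth of data.
        occupancy -= bits_per_frame
        if occupancy < 0:
            return "underflow", frame  # point b: reproduction is interrupted
    return "ok", frames
```

With `clock_ratio` equal to 1.0 the occupancy stays constant, a ratio slightly above 1.0 drifts toward overflow, and a ratio slightly below 1.0 drifts toward underflow. The 1 % mismatch used in such an experiment is deliberately exaggerated so that the drift appears within a few hundred frames.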
For the reason mentioned above, clock information is multiplexed into the multiplexed data as shown in FIG. 24, and synchronized clocks obtained on the basis of the clock information are used in the reproducing apparatus, whereby the problem mentioned above is avoided.
As concerns image encoding, attention has been paid to an object encoding method in which components constituting an image, i.e., a background, characters, moving objects or the like are handled independently, respectively, and encoding is performed for each object. In the object encoding, since encoding is performed for each object, editing such as replacing specific objects can be performed with ease.
However, in the production of multiplexed data and the decoding and reproduction thereof according to the prior art, the following problem arises. According to the prior art, decoding and reproduction can be performed on the basis of the same time base on the assumption that the multiplexed data is produced in a single encoding apparatus. Therefore, in the case of performing object encoding, if every object included in one piece of multiplexed data is encoded in the same encoding apparatus and can be processed using the same time base, decoding and reproduction can be performed with ease. However, since the objects to be edited are not always encoded by the same encoding apparatus, they do not always have the same time base. In such a case, synchronization and reproduction of the objects cannot be performed using the prior art method.
For example, when two multiplexed objects are decoded using clocks synchronized with the clocks of one encoding apparatus, one object can be reproduced with ease, but the clocks for the other object are not synchronized. As a result, the buffer overflows or underflows, and decoding and reproduction cannot be performed normally.