1. Field of the Invention
The present invention relates to a signal transmission method and a signal transmission apparatus. More particularly, the present invention relates to a method and apparatus for signal transmission in which a plurality of streams formed of packetized signals are selected and then concatenated into a single output stream for transmission. When the streams are concatenated, System Time Clocks are synchronized across a plurality of stream output devices for outputting the streams, and the continuity of Program Clock References (PCR), Presentation Time Stamps (PTS), and Decoding Time Stamps (DTS) in the output stream is assured. The stream output devices are controlled so that no stream carrying information is transmitted at the moment the streams are switched.
2. Description of the Related Art
In digital broadcasting, pictures and voices are transmitted using TS's (Transport Streams) complying with the MPEG (Moving Picture Experts Group) 2 Standard, which has been standardized as ISO (International Organization for Standardization)/IEC (International Electrotechnical Commission) 13818-1.
FIG. 12A shows a stream (data chain) ES of compressed picture and voice data. The stream of compressed data is packetized and then tagged with a PES (Packetized Elementary Stream) header, forming the PES stream shown in FIG. 12B. The PES stream is in turn split into packets, each of which is associated with a TS (Transport Stream) header containing a program time reference value PCR (Program Clock Reference), as shown in FIG. 12C. TS packets, each 188 bytes long, are thus created. A single transport stream (TS) is constructed of a plurality of TS packets.
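The packetization described above can be illustrated with a minimal Python sketch. It is not taken from the figures: the function names (`packetize`, `parse_header`, `encode_pcr`, `decode_pcr`) are hypothetical, the PES data is treated as opaque bytes, and the PCR is placed in the adaptation field of the first packet only, following the field layout of ISO/IEC 13818-1 (33-bit base at 90 kHz, 6 reserved bits, 9-bit extension at 27 MHz).

```python
TS_PACKET_SIZE = 188
SYNC_BYTE = 0x47


def encode_pcr(pcr_27mhz):
    """Pack a 27 MHz tick count into the 6-byte PCR field."""
    base = (pcr_27mhz // 300) & ((1 << 33) - 1)  # 33-bit part, 90 kHz units
    ext = pcr_27mhz % 300                        # 9-bit part, 27 MHz remainder
    return ((base << 15) | (0x3F << 9) | ext).to_bytes(6, "big")


def decode_pcr(field):
    """Inverse of encode_pcr: return the PCR as a 27 MHz tick count."""
    value = int.from_bytes(field, "big")
    return (value >> 15) * 300 + (value & 0x1FF)


def packetize(pes, pid, first_pcr=None):
    """Split PES bytes into 188-byte TS packets (hypothetical helper)."""
    packets, pos, cc = [], 0, 0
    while pos < len(pes):
        # Assumption: only the first packet carries a PCR.
        has_af = pos == 0 and first_pcr is not None
        af_body = bytes([0x10]) + encode_pcr(first_pcr) if has_af else b""
        room = 184 - (1 + len(af_body) if has_af else 0)
        payload = pes[pos:pos + room]
        stuff = room - len(payload)
        if stuff > 0:
            # A short last packet is padded with adaptation-field stuffing.
            if not has_af:
                has_af, stuff = True, stuff - 1      # length byte
            if stuff > 0 and not af_body:
                af_body, stuff = b"\x00", stuff - 1  # empty flags byte
            af_body += b"\xff" * stuff
        header = bytes([
            SYNC_BYTE,
            (0x40 if pos == 0 else 0x00) | ((pid >> 8) & 0x1F),  # PUSI + PID high
            pid & 0xFF,                                          # PID low
            ((0x3 if has_af else 0x1) << 4) | cc,                # AFC + counter
        ])
        af = bytes([len(af_body)]) + af_body if has_af else b""
        packets.append(header + af + payload)
        pos += len(payload)
        cc = (cc + 1) & 0x0F
    return packets


def parse_header(pkt):
    """Read back the fields written by packetize."""
    if len(pkt) != TS_PACKET_SIZE or pkt[0] != SYNC_BYTE:
        raise ValueError("not a TS packet")
    afc = (pkt[3] >> 4) & 0x3
    has_af = bool(afc & 0x2)
    pcr = None
    if has_af and pkt[4] > 0 and pkt[5] & 0x10:
        pcr = decode_pcr(pkt[6:12])
    offset = 4 + (1 + pkt[4] if has_af else 0)
    return {
        "pid": ((pkt[1] & 0x1F) << 8) | pkt[2],
        "payload_unit_start": bool(pkt[1] & 0x40),
        "continuity_counter": pkt[3] & 0x0F,
        "pcr": pcr,
        "payload": pkt[offset:],
    }
```

For example, `packetize(pes, 0x100, first_pcr=27_000_000)` applied to 400 bytes of PES data yields three 188-byte packets, the first of which carries a PCR corresponding to one second.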
A signal transmission apparatus that switches among a plurality of TS's for transmission has no suitable point at which two Transport Streams can be concatenated without introducing transients, for the following reasons.
For example, in video encoded in compliance with MPEG 2 (ISO/IEC 13818-2), the amount of information per unit such as a GOP (Group of Pictures) varies depending on the difficulty of encoding, even though the average amount of encoded information is constant. A plurality of ES's of pictures and voices are packetized, forming PES streams. Each PES stream is then split into TS packets, each having a predetermined data amount. The transmission information amount of the video is fixed to an average value. For this reason, the transmission time of the TS per GOP is not constant. Variations therefore occur in the relative delay time between data input to an encoder and the encoded TS picture information output from the encoder.
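The effect described above can be made concrete with a small numerical sketch in Python. The GOP sizes, the channel rate, and the nominal GOP period below are illustrative assumptions, not values from the specification, and the function names are hypothetical.

```python
def gop_transmission_times(gop_bits, channel_rate_bps):
    """Time needed to send each GOP over a constant-rate channel."""
    return [bits / channel_rate_bps for bits in gop_bits]


def gop_boundary_drift(gop_bits, channel_rate_bps, gop_period_s):
    """Offset of each GOP's end in the TS from its nominal end time."""
    drift, elapsed = [], 0.0
    for index, bits in enumerate(gop_bits, start=1):
        elapsed += bits / channel_rate_bps
        drift.append(elapsed - index * gop_period_s)
    return drift


# Assumed example: three GOPs of 0.5 s each, averaging 2 Mbit per GOP,
# sent over a 4 Mbit/s channel.
sizes = [2_000_000, 3_000_000, 1_000_000]
times = gop_transmission_times(sizes, 4_000_000)
drift = gop_boundary_drift(sizes, 4_000_000, 0.5)
```

Under these assumptions the average transmission time per GOP equals the 0.5 s GOP period, yet the second GOP ends 0.25 s later than its nominal position, which is the kind of variation in relative delay that the paragraph above describes.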
Two encoded picture signals, even if encoded from the same source by the same encoding means in the same format, offer no guarantee that the start and end positions of their GOPs coincide with each other.
If TS switching is performed to switch between pictures in such a situation, an overlap may occur between the GOPs at the switching time of the TS's, and the GOPs in the TS's then suffer from information loss. If a gap occurs between the streams at the switching time, part of another GOP is included, adding unwanted information. A receiver apparatus receiving such a TS is unable to perform correct signal processing based on the TS, and a transient may take place in the video signal output obtained through signal processing.
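The overlap and gap cases above can be sketched in Python under a deliberately simplified model, which is an assumption made for illustration only: each TS is reduced to a sorted list of GOP boundary times in 90 kHz ticks, the outgoing stream is left at its first boundary at or after the switching time, and the incoming stream is entered at its first boundary at or after the switching time. The function name `classify_switch` is hypothetical.

```python
import bisect


def _next_boundary(boundaries, switch_time):
    """First GOP boundary at or after switch_time (boundaries sorted)."""
    index = bisect.bisect_left(boundaries, switch_time)
    if index == len(boundaries):
        raise ValueError("no GOP boundary at or after the switching time")
    return boundaries[index]


def classify_switch(out_boundaries, in_boundaries, switch_time):
    """Return ('overlap' | 'gap' | 'aligned', size in ticks)."""
    out_end = _next_boundary(out_boundaries, switch_time)
    in_start = _next_boundary(in_boundaries, switch_time)
    if in_start < out_end:
        # The incoming GOP begins before the outgoing GOP has finished,
        # so part of one of them must be dropped (information loss).
        return ("overlap", out_end - in_start)
    if in_start > out_end:
        # Nothing complete is available between the two boundaries, so a
        # fragment of another GOP would fill the interval (unwanted data).
        return ("gap", in_start - out_end)
    return ("aligned", 0)
```

With outgoing boundaries at 0, 45000, and 90000 ticks and incoming boundaries at 18000, 63000, and 108000 ticks, a switch requested at tick 50000 is classified as an overlap of 27000 ticks (0.3 s at 90 kHz).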
As for voice, voice encoded in compliance with the MPEG 2 BC (Backward Compatible) (ISO/IEC 13818-3) Standard has a constant encoding information amount. Since a TS is formed by multiplexing the encoded voice with the multiplexed video, variations occur in the relative delay time between data input to an encoder and the encoded TS voice information output from the encoder. In voice encoded in compliance with the MPEG 2 AAC (Advanced Audio Coding) (ISO/IEC 13818-7) Standard, the average of the encoding information amount is constant, but the information amount per encoding unit varies depending on the difficulty of encoding. Therefore, variations again occur in the relative delay time between data input to an encoder and the encoded TS voice information output from the encoder.
As is the case with the picture signal, under either of the two Standards, if an overlap occurs between the voice encoding units of the TS's at the switching time, the TS's prior to and subsequent to the switching suffer from information loss in voice encoding units. If a gap occurs between the streams at the switching time, part of another voice encoding unit is included, adding unwanted information. A receiver apparatus receiving such a TS is unable to perform correct signal processing based on the TS, and a transient may take place in the voice signal output obtained through signal processing.
Like the picture and voice data, coded data suffers from variations in the relative delay time between TS data. If an overlap occurs between the data of the TS's at the switching time, the TS subsequent to the switching suffers from partial information loss. If a gap occurs, part of other data is included, adding unwanted information. Unable to perform correct processing, a receiver apparatus cannot present correct information or stops presenting information.
Information such as Program Specific Information (PSI) or Entitlement Control Message (ECM) may be transmitted together with pictures and voices in the TS. Japanese Post Office Regulations proposed the transmission period for transmitting these pieces of information in the DVB (Digital Video Broadcasting) recommendations. If the multiplexing points fail to coincide with each other between the switched TS's, the transmission period in the TS subsequent to switching is disturbed, and the recommended transmission period may not be observed. The display of the pictures and the timing of the audio output may then become unstable. The transmission timing of data such as an EPG (Electronic Program Guide) can also vary. If an overlap occurs between the TS's at switching, the EPG data in the TS's partly suffers from information loss. If a gap occurs between the TS's at switching, part of another EPG may be included, adding unwanted information. In either case, correct processing is not performed, and a receiver apparatus may fail to present correct information or may stop presenting the information.
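How a switch can disturb a repetition period may be illustrated with the following Python sketch. The arrival times, the switching time, and the 100 ms limit are assumed values chosen for the example and are not figures from any regulation or recommendation; the function names are hypothetical.

```python
def splice_arrivals(before_ms, after_ms, switch_ms):
    """Arrival times seen by a receiver when the output changes at switch_ms."""
    kept_before = [t for t in before_ms if t < switch_ms]
    kept_after = [t for t in after_ms if t >= switch_ms]
    return kept_before + kept_after


def max_repetition_interval(arrivals_ms):
    """Largest spacing between consecutive arrivals (needs two or more)."""
    return max(b - a for a, b in zip(arrivals_ms, arrivals_ms[1:]))


def violates_period(arrivals_ms, limit_ms):
    """True if any spacing exceeds the assumed limit."""
    return max_repetition_interval(arrivals_ms) > limit_ms


# Assumed example: both streams repeat a PSI table every 100 ms,
# but their multiplexing points are offset by 90 ms.
before = [0, 100, 200, 300]
after = [90, 190, 290, 390]
spliced = splice_arrivals(before, after, switch_ms=195)
```

Each input stream alone satisfies the assumed 100 ms limit, but the spliced sequence `[0, 100, 290, 390]` contains a 190 ms interval, so `violates_period(spliced, 100)` is true.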
Since a TS typically contains a mix of a plurality of types of TS packets, switching all TS's is even more difficult.
To resolve these problems, the encoded pictures and voices contained in the TS may be decoded, the decoded pictures and voices may then be concatenated using a known technique, and the result may be re-encoded. However, this method creates new problems, such as an increase in the delay of data and degradation in the characteristics of data.
Since encoding and decoding under the MPEG2 Standard need a certain processing time, a system performing decoding/re-encoding and a system performing no decoding/re-encoding differ substantially in the delay time involved. For this reason, the delay time of the entire transmitter system needs to be adjusted to match the decoding/re-encoding system, which has the substantial delay time. If all other systems are adjusted to match the decoding/re-encoding system, which is typically used less frequently, the costs and space involved become substantially large. This delay problem becomes serious in two-way transmission applications.
Since pictures and voices are compressed by lossy coding in the MPEG2 Standard, decoding cannot fully restore the data to its original uncompressed state. If a decoding/re-encoding operation is performed, the data is substantially degraded in image quality and audio quality. Furthermore, since the method of decoding/re-encoding is not applicable to data other than picture and audio data, the pictures and voices may not be correctly presented because of the characteristics of the PSI and ECM.