The present invention relates to an audio-video signal transmission apparatus and can be applied to a case of satellite-broadcasting video and audio signals that have been subjected to data compression according to the MPEG (moving picture experts group) scheme, for instance. The invention is intended to enable efficient processing of video and audio signals by recording, as separate files, a video signal transport stream and an audio signal transport stream that are formed by, for instance, a data-compressed video signal and audio signal, respectively, and multiplexing those transport streams in transmitting a program.
In conventional broadcast systems, video and audio signals are edited in the form of baseband signals and then transmitted after being modulated into a format that is suitable for a transmission line.
FIG. 1 is a block diagram of a satellite broadcasting system 1 that transmits video and audio signals as MPEG transport streams. The satellite broadcasting system 1 broadcasts multiplexed transport streams T1A-T1N that are respectively output from a plurality of servers 2A-2N after time-divisionally multiplexing those with a multiplexer (MUX) 3.
Having approximately the same configuration, the servers 2A-2N each store a video signal SV and an audio signal SA in a recording device such as a hard disk drive (HDD) 4. Each of the servers 2A-2N stores and holds a video signal SV and an audio signal SA after converting those into a baseband format, and selectively outputs those under the control of a system control means 5.
FIFOs (first-in first-out) 5A and 5B hold, on a system-by-system basis, two systems of video signals SV and two systems of audio signals SA that are output from the hard disk drive 4 as shown in FIGS. 2A-2D. For example, while the FIFO 5A sequentially receives and outputs a video signal SV1 and an audio signal AV1 of a program that is currently broadcast (see FIGS. 2A and 2B), the FIFO 5B sequentially receives and outputs with predetermined timing a video signal SV2 and an audio signal AV2 of a commercial message that will be broadcast next (see FIGS. 2C and 2D).
A selecting means 6 selects and outputs video and audio signals that are output from the FIFOs 5A or 5B by switching between the contacts at a predetermined time point t1 under the control of the system control means 5. In this manner, the selecting means 6 sequentially outputs video signals SVM and audio signals AVM according to a preset transmission list, for instance.
An encoder (ENC) 7 converts a video signal SVM and an audio signal AVM into an MPEG multiplexed transport stream T1A. That is, the encoder 7 divides a video signal SVM (see FIG. 2E) into GOPs (groups of pictures) and sequentially codes the GOPs according to the MPEG coding scheme. Further, the encoder 7 generates video packets V1, V2, V3, . . . (see FIG. 2G) by dividing a bit stream that has been obtained by the above coding process into parts of a predetermined number of bytes (for instance, 188 bytes) and sequentially giving IDs or the like thereto. At this time, the encoder 7 controls the coding operation so that eight video packets V1, V2, V3, . . . , for instance, are generated for one GOP. In this manner, the encoder 7 generates a video signal transport stream TV as a series of video packets V1, V2, V3, . . . .
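The packetizing step described above can be illustrated by the following minimal sketch. It only follows the simplified description in the text (fixed-size parts, each given an ID) and does not reproduce the actual MPEG-2 transport packet syntax, in which the 188 bytes include a 4-byte header. The names `Packet`, `packetize`, and `stream_id`, as well as the padding byte, are hypothetical and are introduced here for illustration only.

```python
from dataclasses import dataclass

PACKET_SIZE = 188  # bytes per part; the example figure given in the text


@dataclass
class Packet:
    stream_id: str   # hypothetical label, e.g. "V" for the video stream
    index: int       # running packet number within the stream (1-based)
    payload: bytes


def packetize(bitstream: bytes, stream_id: str, size: int = PACKET_SIZE) -> list[Packet]:
    """Split a coded bit stream into fixed-size parts and tag each with an ID."""
    packets: list[Packet] = []
    for offset in range(0, len(bitstream), size):
        chunk = bitstream[offset:offset + size]
        # Assumption: the last part is padded so that all packets have equal length.
        chunk = chunk.ljust(size, b"\xff")
        packets.append(Packet(stream_id, len(packets) + 1, chunk))
    return packets


# A coded GOP sized so that it yields eight packets, as in the example in the text.
coded_gop = bytes(8 * PACKET_SIZE)
video_packets = packetize(coded_gop, "V")
```

In this sketch the number of packets per GOP follows from the size of the coded data, which corresponds to the encoder controlling the coding operation so that a fixed number of packets results.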
The encoder 7 also divides an audio signal SAM into audio units that approximately correspond to the GOPs of the video signal SV and codes the audio units according to the MPEG scheme. Further, the encoder 7 generates audio packets A1, A2, A3, A4, . . . (see FIG. 2H) by dividing a bit stream that has been obtained by the above coding process into parts of a predetermined number of bytes and giving IDs or the like thereto. At this time, the encoder 7 controls the coding operation so that four audio packets A1, A2, A3, and A4, for instance, are generated for one audio unit. In this manner, the encoder 7 generates an audio signal transport stream TA as a series of audio packets A1, A2, A3, A4, . . .
Further, the encoder 7 generates a multiplexed transport stream T1A (see FIG. 2I) by multiplexing the thus-generated video packets V1, V2, V3, . . . and audio packets A1, A2, A3, A4, . . . . To allow the decoding side to reduce its buffer memory capacity, the encoder 7, in doing the multiplexing, delays the audio packets A1, A2, . . . by a predetermined time with respect to the corresponding video packets V1, V2, . . . .
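The delayed multiplexing may be pictured with the following sketch. It assumes, purely for illustration, that one audio slot follows every two video packets and that the delay is expressed as a number of audio slots; the text specifies only "a predetermined time". The function name `multiplex` and its parameters are hypothetical.

```python
def multiplex(video: list[str], audio: list[str],
              audio_delay: int = 2, video_per_audio: int = 2) -> list[str]:
    """Interleave packets, emitting each audio packet `audio_delay` audio slots late."""
    out: list[str] = []
    audio_slot = 0
    for count, v in enumerate(video, start=1):
        out.append(v)
        if count % video_per_audio == 0:
            # The packet placed in this slot is the one that was due
            # `audio_delay` slots earlier; the first slots stay empty.
            source = audio_slot - audio_delay
            if 0 <= source < len(audio):
                out.append(audio[source])
            audio_slot += 1
    # Audio packets still held back when the video ends are appended last.
    out.extend(audio[max(audio_slot - audio_delay, 0):])
    return out


video = [f"V{n}" for n in range(1, 9)]   # eight video packets of one GOP
audio = [f"A{n}" for n in range(1, 5)]   # four audio packets of one audio unit
muxed = multiplex(video, audio)
```

With these assumed values, the second half of the audio unit (A3, A4) leaves the multiplexer only after the last video packet of the GOP.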
The multiplexer 3 time-divisionally multiplexes and outputs multiplexed transport streams T1A-T1N that are output from the respective servers 2A-2N by assigning those to respective transmission packets. In the satellite broadcasting system 1, a multiplexed signal S1 of the multiplexed transport streams T1A-T1N that has been generated in the above manner is up-linked from a terrestrial station to a broadcasting satellite and broadcast from the broadcasting satellite.
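As a rough illustration of time-divisional multiplexing, the sketch below assigns the packets of several streams to successive transmission slots in round-robin order. The round-robin order is an assumption made for this example, since the text states only that the streams are assigned to respective transmission packets. The function name and the packet labels are hypothetical.

```python
from itertools import zip_longest


def time_division_multiplex(streams: dict[str, list[str]]) -> list[tuple[str, str]]:
    """Assign packets of each stream to successive slots, one stream after another."""
    slots: list[tuple[str, str]] = []
    for group in zip_longest(*streams.values()):
        for name, packet in zip(streams, group):
            # A stream that has run out of packets contributes nothing to the slot group.
            if packet is not None:
                slots.append((name, packet))
    return slots


signal_s1 = time_division_multiplex({
    "T1A": ["a1", "a2"],
    "T1B": ["b1", "b2", "b3"],
})
```

Each slot carries the name of its stream, so that the reception side can later select the desired multiplexed transport stream.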
In a reception side 8, as shown in FIGS. 3A-3E, a desired multiplexed transport stream T1A (see FIG. 3A) is selected, and then separated into a video signal transport stream TV and an audio signal transport stream TA (see FIGS. 3B and 3C) by a demultiplexer (DMUX) 9A. The video signal transport stream TV and the audio signal transport stream TA are decoded into a video signal SV and an audio signal SA (see FIGS. 3D and 3E) by a decoder (DEC) 9B.
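The separation performed by the demultiplexer 9A can be sketched as follows, assuming that each packet carries an ID from which its stream can be identified. Here the ID is modeled, hypothetically, as the first letter of the packet name.

```python
def demultiplex(muxed: list[str]) -> tuple[list[str], list[str]]:
    """Separate a multiplexed packet sequence into a video and an audio sequence."""
    video = [p for p in muxed if p.startswith("V")]
    audio = [p for p in muxed if p.startswith("A")]
    return video, audio


t1a = ["V1", "V2", "V3", "V4", "V5", "V6", "A1", "V7", "V8", "A2", "A3", "A4"]
tv, ta = demultiplex(t1a)
```

The packet order within each stream is preserved, which is what allows the decoder 9B to decode each stream independently.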
Incidentally, because it is compressed, a multiplexed transport stream that is generated in the above manner has a smaller data amount than the baseband video and audio signals. Therefore, if video signals SV and audio signals SA were stored in the form of multiplexed transport streams T1A-T1N, the capacity of the data storing means such as the hard disk drive 4 could be reduced accordingly; that is, the video signals SV and the audio signals SA could be stored more efficiently. Likewise, a video signal SV and an audio signal SA could be transmitted more efficiently within each server if they were transmitted after being converted into a multiplexed transport stream.
FIG. 4 shows a configuration to realize the above concept. A video signal SV and an audio signal SA can be stored after being converted into a multiplexed transport stream by providing the encoder 7 on the input side of the hard disk drive 4.
However, in this configuration, when program switching is made by the selecting means 6, it is difficult for the reception side 8 to decode an audio signal correctly.
In the MPEG scheme, data compression of a video signal is performed by effectively utilizing the correlation between consecutive frames. Therefore, it is difficult to decode a video signal correctly if even one packet of a GOP is lost. For example, as shown in FIGS. 5A-5G, when the program to be broadcast is switched from a first program to a second program, switching between the contacts of the selecting means 6 is made at a time point t1 when one GOP of a multiplexed transport stream T1AA of the first program is finished (see FIG. 5A). As for a multiplexed transport stream T1AB of the second program (see FIG. 5B), the selecting means 6 is switched so that a video packet V201 of one GOP is started at the time point t1. In this manner, video packets of a video signal can be arranged so as to become continuous on a GOP basis (see FIG. 5D), and in the reception side 8 a video signal SV can be reproduced correctly (see FIG. 5F).
However, if switching between the two multiplexed transport streams T1AA and T1AB is made on a GOP basis, audio packets A103 and A104, which form part of one audio unit, are lost because the audio packets are delayed with respect to the corresponding video packets (see FIG. 5E).
As in the case of a video signal transport stream, if even one packet is lost for one audio unit in an audio signal transport stream, an audio signal SA cannot be decoded correctly (see FIG. 5G). That is, such an audio signal will cause abnormal sound in the reception side 8. Thus, this type of broadcasting system cannot be put into practice.
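The loss of audio packets at a GOP-aligned switching point can be reproduced with the following minimal simulation. It is a sketch under stated assumptions: eight video packets per GOP and four audio packets per audio unit follow the examples in the text, whereas the delay of two audio packets and the slot pattern (one audio slot after every two video packets) are chosen here only so that the result matches the example of the lost packets A103 and A104. The function name `mux_periods` is hypothetical.

```python
V_PER_GOP = 8     # video packets per GOP (example from the text)
A_PER_UNIT = 4    # audio packets per audio unit (example from the text)
AUDIO_DELAY = 2   # assumed delay, counted in audio packets


def mux_periods(program: int, n_gops: int) -> list[list[str]]:
    """Return, per GOP period, the multiplexed packet names of one program."""
    periods = []
    for g in range(n_gops):
        period = []
        for k in range(A_PER_UNIT):
            # Two video packets are followed by one audio slot.
            v = g * V_PER_GOP + 2 * k
            period += [f"V{program}{v + 1:02d}", f"V{program}{v + 2:02d}"]
            # The audio slot carries the packet that was due AUDIO_DELAY slots earlier.
            a = g * A_PER_UNIT + k - AUDIO_DELAY
            if a >= 0:
                period.append(f"A{program}{a + 1:02d}")
        periods.append(period)
    return periods


first = mux_periods(1, n_gops=2)    # corresponds to T1AA
second = mux_periods(2, n_gops=2)   # corresponds to T1AB

# Switch at t1, i.e. after the first GOP period of the first program.
spliced = first[0] + second[0] + second[1]

# Audio unit that accompanies the broadcast GOP of the first program.
expected_audio = [f"A1{n:02d}" for n in range(1, A_PER_UNIT + 1)]
lost = [p for p in expected_audio if p not in spliced]
```

Under these assumptions `lost` contains A103 and A104, while the video packets V101 to V108 are followed directly by V201, so the video remains continuous on a GOP basis and only the audio unit is left incomplete.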