In a synchronous reproduction apparatus for synchronously reproducing a video signal and an audio signal, digitally encoded video and audio signals are given respective display time data and packetized, and information in which these signals and data are multiplexed is recorded in a transmission medium, such as a storage medium or a communication medium. The multiplexed bit stream recorded in such a transmission medium is transmitted to a decoder, where the bit stream is demultiplexed into a video signal and an audio signal, and these signals are respectively decoded and reproduced. At the time of reproduction, it is indispensable to synchronize the video signal with the audio signal. For example, synchronization at the lip sync level, which means coincidence between the words of a person and the motion of his/her mouth in a movie or the like, must be kept within an error of about 1/10 second.
In MPEG, which is an international standard for compression coding of moving pictures, such synchronization is achieved as follows. An encoder is provided with a system clock serving as a time base and with an SCR (System Clock Reference) or PCR (Program Clock Reference) serving as a time reference of the system clock, and the system clock is set or corrected to a correct time with reference to the value of the SCR or PCR. Further, the bit streams of the digitally encoded video and audio signals are given PTS (Presentation Time Stamp) data showing the times for presentation of video frames and audio frames, which are the fundamental units of reproduction of a video signal and an audio signal, respectively, and these signals and data are packetized.
A decoder is provided with an internal clock serving as a time base, similar to the system clock of the encoder. When the time shown by the internal clock coincides with the video presentation time and the audio presentation time respectively added to the video frame and the audio frame, the video frame and the audio frame are presented, whereby synchronous reproduction and output of the video signal and the audio signal are realized. The internal clock of the decoder is counted up at 27 MHz, equal to the frequency of the system clock. In addition, SCR or PCR data serving as a time reference for setting or correcting the internal clock to a correct time is transmitted together with the multiplexed bit stream of the video signal and the audio signal, from the encoder through a transmission medium to the decoder.
FIG. 14(a) is a diagram for explaining the concept of synchronous reproduction of a video signal and an audio signal. In the figure, the middle stage shows the internal clock possessed by a decoder, and the time is counted up toward the right side. The upper stage and the lower stage show a reproduced video signal and a reproduced audio signal, respectively, in which video frames and audio frames are successively presented along the time axis. The presentation period of the video frame is 5, and the presentation period of the audio frame is 4. The number given at the head of each frame shows the presentation time corresponding to that frame.
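The timing relationship described for FIG. 14(a) can be sketched as a small simulation. The sketch assumes, purely for illustration, that both streams start at presentation time 0 and that the internal clock advances in unit steps. The function name `frames_due` and the range of clock values are hypothetical and are not taken from the figure.

```python
VIDEO_PERIOD = 5  # presentation period of a video frame (from the text)
AUDIO_PERIOD = 4  # presentation period of an audio frame (from the text)


def frames_due(clock):
    """Return the streams whose presentation time equals the clock value.

    Assumes both streams start at presentation time 0 (illustrative assumption).
    """
    due = []
    if clock % VIDEO_PERIOD == 0:
        due.append("video")
    if clock % AUDIO_PERIOD == 0:
        due.append("audio")
    return due


# Walk the internal clock and report each frame when its time coincides.
for clock in range(0, 21):
    due = frames_due(clock)
    if due:
        print(clock, due)
```

Under these assumptions, a video frame and an audio frame are presented together whenever the clock reaches a common multiple of the two periods, for example at time 20.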
As mentioned above, when the video signal and the audio signal have different presentation frequencies, or when the system clock of the encoder and the internal clock of the decoder have different frequencies, synchronization error may occur between the reproduced video signal and the reproduced audio signal. Furthermore, depending on the system construction method employed for the decoder, the frequencies in the decoder may not coincide with the corresponding frequencies of the encoder, resulting in synchronization error between the reproduced video signal and the reproduced audio signal.
FIG. 14(b) shows a case where the presentation frequency of the video signal is shifted in the decoder. In the figure, the frequency of the video signal in the decoder is 5/6 of the frequency of the video signal in the encoder, so the presentation period of the video frame is changed from 5 to 6. As a result, the relative positions of the video frame and the audio frame are shifted, and a video frame 1301 whose video presentation time is 20 is delayed by time 4 compared with an audio frame whose audio presentation time is 20, resulting in synchronization error between the reproduced video signal and the reproduced audio signal.
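The delay of 4 stated for FIG. 14(b) can be checked with a short calculation. The sketch assumes that the first video frame has presentation time 0 and is actually output at time 0. The function name `presentation_delay` and its parameters are hypothetical.

```python
def presentation_delay(pts, nominal_period, actual_period):
    """Delay of a frame when frames are actually output every `actual_period`.

    Assumes the first frame has presentation time 0 and is output at time 0.
    """
    frame_index = pts // nominal_period        # position of the frame in the stream
    actual_time = frame_index * actual_period  # when the decoder really outputs it
    return actual_time - pts


# Video frame with presentation time 20, nominal period 5, actual period 6.
delay = presentation_delay(pts=20, nominal_period=5, actual_period=6)
print(delay)  # 4
```

The frame with presentation time 20 is the fifth frame, so with a period of 6 it is output at time 24, which is 4 later than the audio frame with presentation time 20.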
As mentioned above, in the decoder, since the video frame and the audio frame are presented when the video presentation time and the audio presentation time respectively possessed by the video frame and the audio frame coincide with the time of the internal clock, decoding and reproduction can be performed while maintaining synchronization between the video signal and the audio signal. To maintain synchronization between the reproduced video signal and the reproduced audio signal, the video presentation time and the audio presentation time possessed by the video frame and the audio frame are compared with the time of the internal clock, and differences of the video presentation time and the audio presentation time from the time of the internal clock are detected, followed by correction of the presentation timing.
When decoding and reproduction of the audio signal are carried out with the audio presentation time of each audio frame being adjusted to the time of the internal clock, some audio frames fail to be presented, resulting in discontinuity in the reproduced audio signal. This case will be described in detail using FIGS. 15(a)-15(c). FIG. 15(a) shows a case where the audio presentation time of each audio frame does not coincide with the time of the internal clock, and synchronization error occurs between the reproduced video signal and the reproduced audio signal. When the synchronization error is removed by adjusting the audio presentation time of each audio frame to the time of the internal clock, time discontinuity occurs in the successive audio frames and the audio signal is not reproduced smoothly, resulting in a degradation in tone quality. This degradation in tone quality is easily perceived by the human ear.
To avoid such a degradation in tone quality, an audio master system, in which an output of a reproduced audio signal is regarded as important, is applied to the conventional decoder, as disclosed in Japanese Published Patent Application No. Hei. 7-50818, for example. The audio master system will be described using FIG. 15(c). As shown in FIG. 15(c), in the audio master system, the time of the internal clock is updated using the audio presentation time of each audio frame, simultaneously with presentation of the audio frame. Hence, no time discontinuity occurs in the successive audio frames, and the audio signal is smoothly reproduced and output. At this time, the video presentation time of each video frame is compared with the time of the internal clock, and presentation of the video frame is advanced or delayed according to the result of the comparison.
FIGS. 16(a) to 16(d) are diagrams for explaining the operation for presenting video frames using the audio master system. FIG. 16(a) shows a state where the internal clock is updated using the audio presentation time of each audio frame, and the video presentation time does not coincide with the time of the internal clock. Initially, the time of the internal clock is subtracted from the video presentation time of the video frame to obtain a differential value, and this differential value is compared with a prescribed range. When the differential value is not within the range, presentation of the video frame is controlled, i.e., advanced or delayed. This range is an allowable range of synchronization error between the reproduced video signal and the reproduced audio signal, and it is set at -5 to +5, for example.
With reference to FIG. 16(a), at time 26 of the internal clock, a differential value of the video frame 1501 is -6, and this value is not within the allowable range of -5 to +5. The fact that the differential value is -6 means that the time for presentation of the video frame 1501 has passed already. In this case, as shown in FIG. 16(b), the video frame 1501 is not presented, and the next video frame 1502 is presented.
Further, in FIG. 16(c), at time 12, a differential value of the video frame 1503 is +6, and this value is not within the allowable range of -5 to +5. The fact that the differential value is +6 means that the time for presentation of the video frame 1503 has not yet been reached. In this case, as shown in FIG. 16(d), the video frame 1504 is presented again.
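The comparison described for FIGS. 16(a) to 16(d) can be summarized as a minimal sketch. The function name `video_action` and the returned labels are hypothetical. The presentation times 20 and 18 used below are not given in the text; they are chosen only so that the differential values come out as the stated -6 and +6.

```python
def video_action(video_pts, clock, tolerance=5):
    """Decide how to handle a video frame under the audio master system.

    The differential value is the video presentation time minus the
    internal clock; the allowable range is -tolerance to +tolerance.
    """
    diff = video_pts - clock
    if diff < -tolerance:
        return "skip"      # presentation time has already passed
    if diff > tolerance:
        return "repeat"    # too early: a frame is presented again instead
    return "present"       # within the allowable range


print(video_action(video_pts=20, clock=26))  # differential -6 -> skip
print(video_action(video_pts=18, clock=12))  # differential +6 -> repeat
print(video_action(video_pts=15, clock=12))  # differential +3 -> present
```

A differential value exactly at -5 or +5 is treated here as within the range, which is one reading of "within the allowable range of -5 to +5".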
As described above, in the decoder using the audio master system, synchronization error is removed by controlling only presentation of the video frame while maintaining temporal continuity of the audio frame, so that synchronization between the reproduced video signal and the reproduced audio signal is maintained without degrading the tone quality.
Hereinafter, a first problem to be solved by the present invention will be described using FIGS. 17(a) and 17(b). FIGS. 17(a) and 17(b) are diagrams for explaining a case where a start portion of data is input in the starting state of the decoder employing the audio master system. As shown in FIG. 17(a), the audio presentation time of the audio frame in the starting state is 0, and the video presentation time of the video frame in the starting state is 1. When the audio presentation time is earlier than the video presentation time, synchronous reproduction of the video signal and the audio signal is possible, so that no problem arises.
However, as shown in FIG. 17(b), when the audio presentation time (time 5) of the audio frame is later than the video presentation time (time 0) of the video frame in the starting state, the count-up of the internal clock is started from the audio presentation time 5 of the initial audio frame 1601. Presentation of the video frame 1602 at the video presentation time 0 is therefore skipped, so that the head of the video signal is lost.
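The first problem can be illustrated with a minimal sketch. It assumes that a video frame is lost at start-up whenever its presentation time is earlier than the value from which the internal clock starts counting. The function name `missed_at_start` is hypothetical, and the video presentation times after the first frame are hypothetical values assuming a video period of 5.

```python
def missed_at_start(first_audio_pts, video_pts_list):
    """Return the video presentation times the internal clock never reaches.

    Under the audio master system the clock starts counting from the
    presentation time of the first audio frame, so earlier video
    presentation times can never coincide with the clock.
    """
    return [pts for pts in video_pts_list if pts < first_audio_pts]


# FIG. 17(a): audio starts at 0, video starts at 1 -> nothing is lost.
print(missed_at_start(0, [1, 6, 11]))   # []

# FIG. 17(b): audio starts at 5, video starts at 0 -> the head frame is lost.
print(missed_at_start(5, [0, 5, 10]))   # [0]
```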
Next, a second problem to be solved by the invention will be described using FIGS. 18(a) and 18(b). FIG. 18(a) shows a case where temporal discontinuity exists in the audio presentation time in the normal state where both the reproduced video signal and the reproduced audio signal are normally output in the decoder employing the audio master system. In FIG. 18(a), the presentation period of video frames is 5 and the presentation period of audio frames is 4, and a time jump occurs between the audio frame 1704 and the audio frame 1705. At this time, as shown in FIG. 18(b), since temporal continuity of the reproduced audio signal is regarded as important in the decoder employing the audio master system, the internal clock is updated using the audio presentation time 20 of the audio frame 1705 simultaneously with presentation of the audio frame 1705. Consequently, although the video frame 1702 is intended as the frame to be presented next, the differential value between the video presentation time 15 of the video frame 1702 and the time 23 of the internal clock is -8, that is, not within the allowable range of -5 to +5, so presentation of the video frame 1702 is skipped and the next video frame 1703 is presented. However, this video frame 1703 is not in the accurately synchronized state.
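The numbers in the second problem can be reproduced with a small simulation of the internal clock under the audio master system. Only the audio presentation time 20 of the audio frame 1705, the video presentation time 15, the clock value 23, and the differential value -8 come from the text. The audio presentation times before the jump, the assumption that audio frames are output back to back from elapsed time 0, and the name `audio_master_clock` are hypothetical.

```python
AUDIO_PERIOD = 4  # presentation period of an audio frame (from the text)


def audio_master_clock(audio_pts, elapsed):
    """Internal clock value after `elapsed` time units of playback.

    The clock is reloaded with the presentation time of each audio frame
    as that frame is presented, then counts up until the next frame.
    Audio frames are assumed to be output back to back from elapsed time 0.
    """
    index = elapsed // AUDIO_PERIOD
    return audio_pts[index] + elapsed % AUDIO_PERIOD


# Hypothetical audio presentation times with a jump from 8 to 20.
audio_pts = [0, 4, 8, 20, 24]

# The video frame with presentation time 15 would normally be due at elapsed time 15.
clock = audio_master_clock(audio_pts, elapsed=15)
diff = 15 - clock
print(clock, diff)  # 23 -8
```

With these assumed values the clock has been pulled ahead by the jump in the audio presentation time, so the differential value for the video frame falls below the allowable range.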
As described above, discontinuity in the audio presentation time causes omission of a video frame, so that smooth presentation of video frames is not possible. In addition, a video frame next to the omitted video frame is not reproduced in the accurately synchronized state.
Next, a third problem to be solved by the invention will be described using FIGS. 19(a)-19(c). FIG. 19(a) shows a case where temporal discontinuity occurs in input data of a multiplexed bit stream due to track jumping or the like in the normal state where both the reproduced video signal and the reproduced audio signal are normally output in the decoder employing the audio master system. In FIG. 19(a), the presentation period of the video frame is 5 and the presentation period of the audio frame is 4, and a time jump occurs between the video frame 1801 and the video frame 1802 and between the audio frame 1804 and the audio frame 1805. At this time, as shown in FIG. 19(b), when the video frames are presented with importance attached to continuity of the reproduced video signal, without applying the audio master system to the decoder, the same effect as a scene change is obtained, and no problem arises.
However, as shown in FIG. 19(c), in the decoder employing the audio master system, when the video frame 1802 is presented, since the internal clock is updated by the presentation time 24 of the audio frame 1805, a differential value between the video presentation time 20 of the video frame 1802 and the time 10 of the internal clock is +10, and this value exceeds the allowable range, so that presentation of the video frame 1801 is performed again. Though the video frame 1802 is intended as a frame to be presented next, since a differential value between the video presentation time 20 of the video frame 1802 and the time 27 of the internal clock is -7, that is, below the allowable range, presentation of the video frame 1802 is skipped and the next video frame 1803 is presented. However, this video frame 1803 is not in the accurately synchronized state.
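The two comparisons described for FIG. 19(c) can be traced with a minimal sketch. The video presentation time 20, the clock values 10 and 27, and the allowable range of -5 to +5 come from the text. The function name `trace_frame` and the action labels are hypothetical.

```python
def trace_frame(video_pts, clock_samples, tolerance=5):
    """List the differential value and resulting action at each clock sample."""
    actions = []
    for clock in clock_samples:
        diff = video_pts - clock
        if diff > tolerance:
            actions.append((diff, "repeat previous"))  # too early
        elif diff < -tolerance:
            actions.append((diff, "skip"))             # too late
        else:
            actions.append((diff, "present"))
    return actions


# Video frame 1802 (presentation time 20) checked at clock 10, then at clock 27.
print(trace_frame(20, [10, 27]))
# [(10, 'repeat previous'), (-7, 'skip')]
```

The trace shows the frame going from too early to too late without ever falling inside the allowable range, which is why it is never presented.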
As described above, in the decoder employing the audio master system, when time discontinuity occurs in the multiplexed bit stream due to track jumping or the like during the normal operation, smooth presentation of video frames cannot be performed because of the skipped video frame and, furthermore, a video frame next to the skipped video frame cannot be reproduced in the accurately synchronized state.