A relatively simple method for performing variable high-speed reproduction from a single source of audiovisual content in streaming is to extract the audiovisual content on a picture basis and distribute it to a reproduction device. However, it has been pointed out that this method causes discontinuity of audio data, namely a data gap, and that, on the reproduction device, the times at which video is output become uneven, the quality of audio reproduction decreases, and underflow or saturation of buffer data occurs.
For variable high-speed reproduction by a reproduction device in streaming distribution, the following methods have been proposed:
(1) a distribution device increases the distribution speed of audiovisual content to the reproduction speed required by the reproduction device, so that the reproduction device can reproduce at the required reproduction speed;
(2) the distribution device holds audiovisual content generated in advance for each of a plurality of reproduction speeds, and selects and distributes the audiovisual content corresponding to the reproduction speed required by the reproduction device;
(3) the distribution device generates in real time, and distributes, audiovisual content that can be reproduced at the reproduction speed required by the reproduction device; and
(4) using the characteristics of the data structure of the audiovisual content to be distributed, the audiovisual content is thinned on a picture basis and distributed.
However, the method (1) has a problem that the network bandwidth used for distribution increases in proportion to the reproduction speed, so that a large portion of the network bandwidth is wasted when the reproduction speed is high; it is therefore not a practical method. The method (2) has a problem that plural versions of audiovisual content are necessary for each piece of audiovisual content, so that an additional storage region is required and management of the content becomes complicated. Moreover, in the method (3), the distribution device needs to decode and re-encode audiovisual content in real time, and the processing load of doing so is generally high. Considering the influence on performance, such as the number of simultaneous distributions that can be supported, it is not a practical method either.
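The proportionality noted for the method (1) can be illustrated with a short calculation. This is only a sketch: the function name and the bit rate used in the example are hypothetical and do not come from any of the proposed methods.

```python
def method1_bandwidth_mbps(content_bitrate_mbps, reproduction_speed):
    """Bandwidth needed when the whole stream is simply sent faster.

    Assumption: the content is distributed unchanged, so the required
    bandwidth is the content bit rate multiplied by the reproduction speed.
    """
    return content_bitrate_mbps * reproduction_speed


# Hypothetical 8 Mbps content at several reproduction speeds.
for speed in (1, 2, 4, 8):
    print(speed, method1_bandwidth_mbps(8.0, speed))
```

Under this assumption, 8 Mbps content reproduced at 8x speed would need 64 Mbps for a single reproduction device.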
The method (4) is an effective method that solves the respective problems of the methods (1), (2) and (3) and can be realized relatively easily. For example, with MPEG-2 and H.264/MPEG-4 AVC, the distribution device can easily extract the data needed for variable high-speed reproduction from audiovisual content.
In general, audiovisual content for streaming distribution is used in a form in which encoded video data and audio data are multiplexed into transport stream packets (hereinafter referred to as TS packets).
Video data and audio data are evenly multiplexed in audiovisual content. Therefore, even if the method (4) is used, when pictures of the audiovisual content are thinned, the audio data contained in the sections of the thinned pictures is thinned as well. As a result, a gap occurs in the audio data transmitted to the reproduction device.
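How thinning on a picture basis leads to an audio gap can be sketched with a toy model. The model below is an assumption made purely for illustration: it is not a TS parser, it treats each picture section as carrying exactly one video packet and one audio packet, and all names (`Packet`, `build_stream`, `thin_by_picture`, `audio_gaps_ms`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    kind: str      # "video" or "audio"
    pts_ms: int    # presentation time in milliseconds
    section: int   # index of the picture section the packet is multiplexed into


def build_stream(num_pictures, frame_ms=40):
    """Evenly multiplexed toy stream: one video and one audio packet per section."""
    stream = []
    for i in range(num_pictures):
        stream.append(Packet("video", i * frame_ms, i))
        stream.append(Packet("audio", i * frame_ms, i))
    return stream


def thin_by_picture(stream, keep_every):
    """Keep only the sections whose index is a multiple of keep_every."""
    return [p for p in stream if p.section % keep_every == 0]


def audio_gaps_ms(stream, frame_ms=40):
    """Missing audio time between consecutive audio packets that survived."""
    pts = [p.pts_ms for p in stream if p.kind == "audio"]
    return [b - a - frame_ms for a, b in zip(pts, pts[1:]) if b - a > frame_ms]


stream = build_stream(12)
print(audio_gaps_ms(stream))                      # no gaps before thinning
print(audio_gaps_ms(thin_by_picture(stream, 4)))  # gaps appear after thinning
```

In this model, keeping every fourth section leaves 120 ms of audio missing between each pair of surviving audio packets, because the audio multiplexed alongside the discarded pictures is discarded with them.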
When a gap occurs in the audio data, the following problems (a) to (c) arise on the reproduction device.
(a) In the case of reproduction with audio output, the reproduction position of the audio data is generally used as a reference, and the output timing of video display is synchronized with the time of the audio being reproduced. Therefore, as gaps increase in the audio data with which the timing is synchronized, the times at which video is output become uneven.
(b) When video data is thinned on a picture basis, the amount of audio data distributed to the reproduction device may not be exactly the amount of the original audio data multiplied by the reciprocal of the reproduction speed (1/reproduction speed), depending on how the audiovisual content is multiplexed. This error accumulates when high-speed reproduction continues for a long time. When the amount of distributed audio data is smaller than the amount of the original audio data multiplied by 1/reproduction speed, buffer underflow occurs on the reproduction device and a problem affecting the reproduction quality arises, for example, images are not reproduced smoothly. Conversely, when the amount of distributed audio data is larger than the amount of the original audio data multiplied by 1/reproduction speed, buffer saturation occurs on the reproduction device and a problem affecting the reproduction quality arises, for example, both audio data and video data become discontinuous.
(c) Because the distributed audio data contains discontinuous portions, reproducing the audio data as it is generates high-frequency components and lowers the sound quality. Moreover, even when the audio waveform is multiplied by a window function so that these high-frequency components are not generated, the sound quality is still lowered.
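The problems (a) to (c) can each be reproduced in a small simulation. The sketch below is illustrative only and rests on stated assumptions: the reproduction device is modeled as playing the received audio segments back-to-back with audio as the master clock, the buffer is a simple level counter, and the largest sample-to-sample step is used as a rough proxy for high-frequency content. All function names and numeric values are hypothetical.

```python
import math


# (a) A video frame is output when the audio clock reaches its PTS.
def video_output_times(audio_segments, video_pts):
    """audio_segments: (start_pts_ms, duration_ms) pairs played back-to-back.
    Returns the wall-clock time in ms at which each video PTS is output."""
    times = []
    for v in video_pts:
        wall = 0
        for start, dur in audio_segments:
            if v < start:        # PTS fell in a gap: output when the clock jumps past it
                break
            if v < start + dur:  # PTS lies inside this segment
                wall += v - start
                break
            wall += dur          # segment fully played before this PTS
        times.append(wall)
    return times


# (b) A small per-step error in delivered audio accumulates in the buffer.
def simulate_audio_buffer(delivered_ms, consumed_ms, steps, capacity_ms, initial_ms):
    level = initial_ms
    for step in range(1, steps + 1):
        level += delivered_ms - consumed_ms
        if level < 0:
            return ("underflow", step)
        if level > capacity_ms:
            return ("saturation", step)
    return ("ok", steps)


# (c) Splicing across a gap creates a jump; a window removes it by attenuating audio.
def sine(n, freq_hz, rate_hz=48000):
    return [math.sin(2 * math.pi * freq_hz * i / rate_hz) for i in range(n)]


def splice(samples, cut_start, cut_end):
    """Drop samples[cut_start:cut_end], as if that audio had been thinned out."""
    return samples[:cut_start] + samples[cut_end:]


def max_step(samples):
    return max(abs(b - a) for a, b in zip(samples, samples[1:]))


def fade_around(samples, pos, width):
    """Raised-cosine ramp to zero on both sides of the splice at index pos."""
    out = list(samples)
    for k in range(width):
        gain = 0.5 * (1 - math.cos(math.pi * k / width))  # 0 at the splice
        if pos - 1 - k >= 0:
            out[pos - 1 - k] *= gain
        if pos + k < len(out):
            out[pos + k] *= gain
    return out
```

For (a), audio segments of equal length such as `[(0, 40), (160, 40), (320, 40)]` give evenly spaced output times for video PTS values 0, 160 and 320, whereas unequal segments such as `[(0, 40), (160, 80), (320, 40)]` give intervals of 40 ms and then 80 ms. For (b), delivering 240 ms or 260 ms of audio per step against a consumption of 250 ms eventually empties or fills the buffer. For (c), the window brings the step at the splice down to the level of the surrounding waveform, but only by attenuating the audio near the splice, which is itself audible.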
[Patent Document 1] Japanese Unexamined Patent Application Publication No. JP-A 11-355719
To address the abovementioned problems (a) to (c) arising from a gap in the audio data, Patent Document 1 discloses the following technique. First, multiplexing is performed so that packets of speech data having a PTS (Presentation Time Stamp) that falls between the PTS of an I-picture and the PTS of the next picture are inserted between the packets configuring the I-picture. Then, at the time of high-speed reproduction, the speech data having a PTS between the PTS of the I-picture and the PTS of the next picture is separated and output, so that favorable speech synchronized with the image is output even in high-speed reproduction using compressed video data.
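The selection rule described for Patent Document 1 can be sketched as a filter on PTS values. This is an illustration under assumptions, not the implementation of that document: the function name is hypothetical, the interval is treated as including the I-picture PTS and excluding the next picture's PTS, and the example values assume 25 pictures per second and audio frames of 1920 ticks on the 90 kHz PTS clock.

```python
def speech_pts_for_i_picture(audio_pts_list, i_picture_pts, next_picture_pts):
    """Return the audio PTS values that fall between the I-picture's PTS
    and the next picture's PTS (lower bound inclusive, upper bound exclusive;
    the choice of bounds is an assumption)."""
    return [pts for pts in audio_pts_list
            if i_picture_pts <= pts < next_picture_pts]


# PTS values in 90 kHz ticks: one picture lasts 3600 ticks at 25 pictures/s.
audio_pts = [n * 1920 for n in range(10)]
print(speech_pts_for_i_picture(audio_pts, 0, 3600))
print(speech_pts_for_i_picture(audio_pts, 7200, 10800))
```

Only the audio packets selected this way would be multiplexed between the packets of the corresponding I-picture, so they travel with the I-picture when other pictures are skipped.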
However, at the time of high-speed reproduction, there is the question of whether speech reproduced at high speed can be heard at all. Audio is therefore reproduced during high-speed reproduction up to a desired speed, but may not be reproduced during high-speed reproduction at a speed higher than the desired speed (there is no need to reproduce the audio because it cannot be heard). In the method disclosed in Patent Document 1, however, all audio data is distributed even when reproduction of audio is unnecessary because the reproduction speed exceeds the desired speed, so that distribution bandwidth is wasted. Moreover, in the method of Patent Document 1, the audio distribution bandwidth increases in proportion to the high-speed reproduction rate, so that the waste of distribution bandwidth grows as the reproduction rate becomes higher.
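The waste described above can be put into numbers with a simple model. The sketch assumes, as stated in the text, that the audio distribution bandwidth is proportional to the reproduction rate; the parameter `max_audible_speed` stands for the "desired speed" up to which audio is reproduced, and both it and the bit rate in the example are hypothetical values.

```python
def audio_bandwidth_kbps(audio_bitrate_kbps, reproduction_rate):
    """Audio bandwidth when all audio data is distributed at the given rate."""
    return audio_bitrate_kbps * reproduction_rate


def wasted_audio_bandwidth_kbps(audio_bitrate_kbps, reproduction_rate, max_audible_speed):
    """Audio bandwidth spent on audio that the device will not reproduce."""
    if reproduction_rate > max_audible_speed:
        return audio_bandwidth_kbps(audio_bitrate_kbps, reproduction_rate)
    return 0


# Hypothetical 128 kbps audio, with audio reproduced only up to 2x speed.
for rate in (1, 2, 4, 8):
    print(rate, wasted_audio_bandwidth_kbps(128, rate, 2))
```

With these example values, nothing is wasted at 1x or 2x, while 512 kbps and 1024 kbps of audio are distributed without being reproduced at 4x and 8x respectively.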