1. Field of the Invention
The present invention relates to data conversion apparatuses and methods, data distribution apparatuses and methods, and data distribution systems. More specifically, the present invention relates to a data conversion apparatus and method, a data distribution apparatus and method, and a data distribution system, in which multimedia data including video data, etc. is distributed via a network in such a manner that a trick play of the data is allowed on a receiver terminal.
2. Description of the Related Art
A conventional data distribution system is known in which multimedia data including, for example, video data and audio data are compressed and stored on a server apparatus, and the multimedia data is distributed from the server apparatus via a transmission medium to a decoding terminal on the receiving end, where the multimedia data is decoded for playing.
When the conventional data distribution system is used in a network at home, the video data is encoded in accordance with, for example, ISO/IEC 13818-2 (MPEG-2 Video). ISO/IEC 13818-2 dictates that the video data be encoded so as not to cause overflow or underflow of a decoder buffer conforming to MPEG-2 standards, which is called a VBV (Video Buffer Verifier) buffer.
More specifically, referring to FIG. 14, the video data must be encoded with reference to decode time stamps (DTS), taking into account the buffer size of the VBV buffer (vbv_buffer_size), so as not to cause overflow or underflow of the VBV buffer. The VBV buffer receives the video data at the transmission rate of the video data (reflected in the gradient of the line indicating the buffer occupancy) and outputs the video data at a decode timing specified by the DTS.
For example, the nth video frame (n) having a data size S(n) is removed from the VBV buffer at a decoding time DTS(n). The video data to be removed from the VBV buffer must have already been input to the VBV buffer before the decoding time thereof. The time between the beginning of the input of a video frame and the decoding time of the video frame is referred to as a VBV delay (vbv_delay), which is encoded in the header of data into which the video frame is encoded.
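The buffer behavior described above can be illustrated with a minimal sketch. It assumes a constant input rate and instantaneous removal of each frame at its DTS. The names Frame and check_vbv are hypothetical and are not taken from ISO/IEC 13818-2.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    size_bits: int  # S(n), the data size of the nth frame
    dts: float      # DTS(n), the decoding time in seconds


def check_vbv(frames, bit_rate, vbv_buffer_size, start_time=0.0):
    """Return (frame index, problem) pairs for a simplified VBV model.

    Assumption: data enters the buffer at a constant bit_rate starting
    at start_time, and frame n is removed all at once at DTS(n).
    """
    problems = []
    occupancy = 0.0
    now = start_time
    for n, frame in enumerate(frames):
        # The buffer fills at the transmission rate until the decoding time.
        occupancy += bit_rate * (frame.dts - now)
        if occupancy > vbv_buffer_size:
            problems.append((n, "overflow"))
            occupancy = vbv_buffer_size
        # The whole frame must already be in the buffer at its decoding time.
        if occupancy < frame.size_bits:
            problems.append((n, "underflow"))
        occupancy = max(occupancy - frame.size_bits, 0.0)
        now = frame.dts
    return problems


frames = [Frame(400_000, 0.5), Frame(100_000, 0.54), Frame(100_000, 0.58)]
print(check_vbv(frames, bit_rate=1_000_000, vbv_buffer_size=1_000_000))
```

In this example the third frame has not fully arrived by its decoding time, so the sketch reports an underflow for frame index 2.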
Methods of encoding video frames, provided by ISO/IEC 13818-2, include I-pictures (intra pictures), in which encoding is based only on data within a frame, and P-pictures (predictive pictures) and B-pictures (bidirectionally predictive pictures), in which encoding employs inter-frame motion prediction. The presentation time of a B-picture is equal to the decoding time thereof. Meanwhile, the presentation time of an I-picture or a P-picture is equal to the decoding time of the next I-picture or P-picture so that the I-picture and the P-picture are allowed to be used for prediction.
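The relationship between decoding time and presentation time stated above can be sketched as follows. The function name presentation_times and the (picture type, DTS) tuple layout are assumptions made for illustration only.

```python
def presentation_times(pictures):
    """Derive presentation times from pictures listed in decoding order.

    pictures: list of (picture_type, dts) tuples, picture_type in "I", "P", "B".
    Returns a list of presentation times; an entry is None when the next
    I-picture or P-picture is not yet known.
    """
    pts = [None] * len(pictures)
    last_anchor = None  # index of the most recent I-picture or P-picture
    for i, (picture_type, dts) in enumerate(pictures):
        if picture_type == "B":
            # A B-picture is presented at its own decoding time.
            pts[i] = dts
        else:
            # An I/P-picture is presented when the next I/P-picture is decoded.
            if last_anchor is not None:
                pts[last_anchor] = dts
            last_anchor = i
    return pts


decode_order = [("I", 0), ("P", 1), ("B", 2), ("B", 3), ("P", 4), ("B", 5), ("B", 6)]
print(presentation_times(decode_order))
```

The last P-picture in the example has no presentation time yet because no later I-picture or P-picture has been seen.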
Furthermore, in the conventional data distribution system, as video data for transmission, for example, elementary stream data is packetized using transport streams (TS) defined in ISO/IEC 13818-1 (MPEG-2 system), and time-multiplexed with other elementary stream data, etc. The unit of decoding in the elementary stream data, such as a video picture, is referred to as an access unit. Referring to FIG. 15, when TS is packetized, an elementary stream constituted of a plurality of access units (AU), shown in (a), is initially packetized into a packet structure called a PES packet, shown in (b). The header of the PES packet may include encoded time information (i.e., decode time and presentation time) of the first access unit in the PES packet. The PES packet is then packetized into a transport packet and time-multiplexed with other transport packets, whereby a single multiplexed stream is formed, as shown in (c). Stuffing data is inserted as required in order to align the first byte of the PES packet with the beginning of the payload of the transport packet.
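The effect of the alignment restriction on packet count and stuffing can be sketched as follows. The sketch relies on the 188-byte transport packet with a 4-byte header and assumes, as a simplification, that no other adaptation field data (such as a clock reference) occupies payload space. The function name packetize_pes is hypothetical.

```python
TS_PACKET_SIZE = 188  # bytes per transport packet
TS_HEADER_SIZE = 4    # bytes of fixed transport packet header
TS_PAYLOAD_SIZE = TS_PACKET_SIZE - TS_HEADER_SIZE


def packetize_pes(pes_size):
    """Return (transport packet count, stuffing bytes) for one PES packet.

    Assumption: each PES packet starts at the beginning of a transport
    packet payload, so the unused space of its last transport packet
    must be filled with stuffing.
    """
    packets, remainder = divmod(pes_size, TS_PAYLOAD_SIZE)
    if remainder == 0:
        return packets, 0
    return packets + 1, TS_PAYLOAD_SIZE - remainder


print(packetize_pes(1000))  # size in bytes of a hypothetical PES packet
```

Because the result depends only on the size of each PES packet, a multiplexing schedule needs the access unit sizes but not the access unit contents.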
Furthermore, ISO/IEC 13818-1 defines a decoder model, shown in FIG. 16, for decoding of TS.
Referring to FIG. 16, the decoder model includes a switching unit 101 for switchingly outputting TS which have been input, a transport buffer (TB) 102, a multiplexing buffer (MB) 103, an elementary buffer (EB) 104, a video decoder (DVideo) 105, a reordering buffer 106, and a switching unit 107. Video packets in the TS which have been input are selected by the switching unit 101 and input to the transport buffer 102 at the input rate, fed to the video decoder 105 via the buffers 102, 103, and 104, and are then output.
In the decoder model, the buffer size of each of the transport buffer 102, the multiplexing buffer 103, and the elementary buffer 104 is predetermined, and the data transfer rates between the buffers are also defined.
In the decoder model, the elementary buffer 104 is equivalent to the VBV buffer for the video data. ISO/IEC 13818-1 dictates that the video data be packetized so as not to cause overflow or underflow of any of the buffers.
More specifically, in order to satisfy the decoder model and alignment restrictions in the multiplexed data, the server apparatus must determine a multiplexing schedule with reference to the size and time information of access units. When the server apparatus converts the elementary stream data so as to allow trick play thereof, the size and time information of access units in the elementary stream data are changed. Thus, after the data is converted for trick play, real-time processing is required for multiplexing.
As an example of the multiplexing apparatus, an authoring system which multiplexes a plurality of elementary streams to generate multiplexed data is disclosed in Japanese Unexamined Patent Application Publication No. 9-162830. The multiplexing apparatus includes an encoding unit which generates access unit information including access unit size and time information, so that a multiplexing schedule can be determined using the access unit information and without parsing the elementary streams themselves.
A data distribution system with a trick play capability includes, for example, a server apparatus 200 and a decoding terminal 300, as shown in FIG. 17.
Referring to FIG. 17, in the data distribution system, the server apparatus 200 includes a data storage unit 201, a trick play controlling unit 202 to which a trick play request signal is to be input, a data conversion unit 203 which generates trick play data based on a trick play control signal in accordance with the trick play request signal input from the trick play controlling unit 202, a multiplexing unit 204, and a transmitter unit 205. The decoding terminal 300 includes a receiver unit 301 for receiving transmission data from the server apparatus 200 via a transmission medium 400, and a decoding unit 302 for decoding the trick play data from the receiver unit 301 and presenting the decoded data on a display apparatus (not shown) for the user.
In the data distribution system, the data conversion unit 203 includes a decoder to which the trick play control signal from the trick play controlling unit 202 and video data from the data storage unit 201 are input, and an encoder which re-encodes the data decoded by the decoder.
The decoder, in accordance with the trick play control signal from the trick play controlling unit 202, reads designated video data from the data storage unit 201 by a most suitable method in accordance with the type of trick play. For example, if the trick play control signal specifies a fast-forward play as the type of trick play, the decoder reads the designated video data from the data storage unit 201 while skipping B-pictures not used for decoding.
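The selective reading for a fast-forward play can be sketched as follows. The sketch assumes that picture types are already known for each stored picture; the function name select_for_fast_forward is hypothetical.

```python
def select_for_fast_forward(picture_types):
    """Return the indices of pictures to read for a fast-forward play.

    Assumption: B-pictures are not used as references for decoding other
    pictures, so they can be skipped while I/P-pictures are kept.
    """
    return [i for i, picture_type in enumerate(picture_types) if picture_type != "B"]


stored = ["I", "P", "B", "B", "P", "B", "B"]
print(select_for_fast_forward(stored))
```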
The decoder decodes the video data which has been read, and supplies the decoded data to the encoder as a decoded video signal. The decoded video signal reflects the way in which the video data has been read from the data storage unit 201; that is, the decoded video signal reflects the type of trick play.
The encoder encodes the decoded video signal from the decoder, and outputs the encoded data to the multiplexing unit 204 as trick play video data. Because the decoded video signal from the decoder reflects the type of trick play, the trick play video data is in accordance with the type of trick play. For example, if the video data is encoded in accordance with ISO/IEC 13818-2, the trick play data output from the encoder is in accordance with ISO/IEC 13818-2.
The multiplexing unit 204 needs to parse the trick play data and thereby retrieve data size and time information thereof in order to determine a multiplexing schedule which satisfies the decoder model and alignment restrictions of the multiplexed data.
When the server apparatus 200 transmits the trick play data for trick play, for example, a fast-forward play or a pause, via the transmission medium 400 to the decoding terminal 300, a trick play request signal from a user requesting a trick play is input to the trick play controlling unit 202. In response thereto, the data conversion unit 203 reads elementary stream data from the data storage unit 201, and converts the elementary stream data into trick play data in accordance with the trick play request signal. Thus, in order to allow the trick play, the server apparatus 200 needs to convert the elementary stream data and to multiplex the converted elementary stream data in real time; the multiplexed data cannot be prepared and stored in the data storage unit 201 in advance.
Furthermore, the data conversion unit 203 re-encodes the video data by the decoder and the encoder, incurring a high processing load and possibly degrading the picture quality. In addition, because the processing delay relating to the data conversion is large, the delay between the input of the trick play request signal and the presentation of the contents of the trick play on the decoding terminal 300 increases.
Alternatively, the trick play data may be generated by modifying the bitstream directly, without re-encoding by the decoder and the encoder. For example, in order to perform a pause as the type of trick play, repeat pictures, indicating no change from a previous picture, are inserted in an elementary stream for normal play. Repeat pictures in accordance with MPEG-2 Video are constituted entirely of skipped macroblocks, indicating that a picture used for prediction is to be repeated. Because the data size of the repeat pictures is small, stuffing data is added so that the requirements for the VBV buffer are satisfied.
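The amount of stuffing needed for a repeat picture can be estimated with a rough sketch. It assumes a constant bit rate and one particular policy, namely that each picture should consume exactly the data arriving during one frame period so that the buffer occupancy stays constant during the pause. The function name and the policy are assumptions, not requirements of MPEG-2 Video.

```python
def stuffing_bytes_for_repeat(bit_rate, frame_rate, repeat_picture_bytes):
    """Return stuffing bytes to add to one repeat picture.

    Assumption: to hold the VBV occupancy steady at a constant bit rate,
    each picture must remove as many bytes as arrive in one frame period.
    """
    bytes_per_frame_period = int(bit_rate / frame_rate) // 8
    return max(bytes_per_frame_period - repeat_picture_bytes, 0)


# Hypothetical values: 4 Mbit/s stream, 25 frames/s, 200-byte repeat picture.
print(stuffing_bytes_for_repeat(4_000_000, 25, 200))
```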
When the data conversion unit 203 produces the trick play data in accordance with the trick play request by converting the elementary stream data, the multiplexing unit 204 needs to parse the converted elementary stream data, i.e., the trick play data, to retrieve the data size and time information thereof, in order to determine a multiplexing schedule which satisfies the decoder model and alignment restrictions of multiplexed data. Because the elementary stream data has been converted for the trick play, the multiplexing of the elementary stream data must be performed in real time. As the number of elementary streams to be multiplexed, such as video data, audio data, and caption data, increases, the processing load of the multiplexing unit 204 for parsing the elementary stream data also increases. Furthermore, the multiplexing unit 204 also suffers an increased processing load as the resolution and/or the rate of video data increases.
Furthermore, when the elementary stream data is multiplexed multiple times in different combinations, the multiplexing unit 204 needs to parse the elementary stream data each time.
Furthermore, in the server apparatus 200, when the data conversion unit 203 converts the elementary stream data for normal play, the processing load of the data conversion unit 203 for the conversion and input/output considerably increases as the number and/or rate of the elementary streams increases, similarly to the multiplexing unit 204.