1. Technical Field of the Invention
The present invention relates generally to a stream editing apparatus designed to edit streams consisting of a combination of video and audio data codes, and more particularly to a stream editing apparatus designed to connect two streams, in real time, to produce a single stream without deterioration in image quality.
2. Background Art
In the MPEG format, which is a typical compression coding technique for video and audio data, compression coding of moving pictures is achieved by decreasing the amount of data using the discrete cosine transform (DCT) and predictive coding.
FIG. 11 shows an encoder designed to perform such a coding operation.
The encoder includes a subtractor 41, a DCT circuit 42, a quantizer 43, a variable length coding circuit 49, a buffer 50, an inverse quantizer 44, an inverse DCT circuit 45, an adder 46, an image memory 47, and a motion vector detector 48.
When the interframe predictive coding is performed, the motion vector detector 48 compares input image data with image data stored in the image memory 47 to calculate a motion vector (MV) and outputs it to the image memory 47 and the variable length coding circuit 49.
The subtractor 41 reads motion-compensated image data out of the image memory 47 and outputs a difference between the input image data and the image data read from the image memory 47 to the DCT circuit 42. The DCT circuit 42 performs the discrete cosine transform on the difference data inputted from the subtractor 41, outputs the resulting DCT coefficients to the quantizer 43, and informs the variable length coding circuit 49 of the type of the DCT.
The quantizer 43 quantizes the DCT coefficients at quantization steps specified by a quantizer matrix (Quant) from the buffer 50 and outputs results of the quantization and the quantizer matrix (Quant) to the variable length coding circuit 49. The results of the quantization are also outputted to the inverse quantizer 44.
The variable length coding circuit 49 codes a motion vector (MV) calculated by the motion vector detector 48, the type of DCT, the quantizer matrix (Quant), and the output of the quantizer 43. These coded data are stored in the buffer 50 temporarily and then outputted in the form of a stream.
The output of the quantizer 43 is also inverse-quantized by the inverse quantizer 44 and then subjected to the inverse DCT in the inverse DCT circuit 45. The output of the inverse DCT circuit 45 is added by the adder 46 to the image data read out of the image memory 47 to reproduce the input image data which is, in turn, stored in the image memory 47 again for use in the subtraction operation on subsequent input image data.
When intraframe coding, which does not use the interframe prediction, is performed, only the DCT is performed without reading the image data out of the image memory 47.
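The coding loop of FIG. 11 described above can be sketched as follows. This is a minimal illustration only, not the circuit itself: it assumes a one-dimensional block of eight samples in place of two-dimensional image blocks, a single scalar quantization step in place of the quantizer matrix (Quant), and it omits motion vector detection and variable length coding. The names `dct`, `idct`, and `encode_block` are hypothetical.

```python
import math

N = 8  # block length; a 1-D block stands in for the 2-D blocks used in practice


def _scale(k):
    # Orthonormal scaling factor for DCT coefficient k.
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)


def dct(block):
    """Orthonormal DCT-II of a length-N sequence."""
    return [
        _scale(k) * sum(x * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                        for n, x in enumerate(block))
        for k in range(N)
    ]


def idct(coeffs):
    """Inverse of dct()."""
    return [
        sum(_scale(k) * c * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
            for k, c in enumerate(coeffs))
        for n in range(N)
    ]


def encode_block(block, reference, step):
    """Return (quantized levels, locally reconstructed block).

    reference is the prediction read from the image memory;
    pass None for intraframe coding, where no prediction is read.
    """
    prediction = reference if reference is not None else [0.0] * N
    residual = [x - p for x, p in zip(block, prediction)]   # subtractor 41
    levels = [round(c / step) for c in dct(residual)]       # DCT 42, quantizer 43
    dequantized = [level * step for level in levels]        # inverse quantizer 44
    decoded_residual = idct(dequantized)                    # inverse DCT 45
    reconstructed = [p + r for p, r in zip(prediction, decoded_residual)]  # adder 46
    return levels, reconstructed
```

The reconstructed block, not the original input, is what would be written back to the image memory, so that the encoder predicts from the same data a decoder will have.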
When the interframe prediction is used, the forward interframe predictive coding using a previous image stored in the image memory 47 and the backward interframe predictive coding using a future image stored in the image memory 47 may also be performed.
In the MPEG format, each input frame is coded into one of three pictures: an I-picture containing only intra-macroblocks, a P-picture containing intra- and forward interframe predictive coded macroblocks, and a B-picture containing intra-, forward interframe predictive coded, and backward interframe predictive coded macroblocks.
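The relation between picture types and the macroblock types they may contain, as listed above, can be written as a small lookup. The mode names `"intra"`, `"forward"`, and `"backward"` and the function `is_valid_picture` are labels chosen for this sketch only.

```python
# Macroblock types permitted in each picture type, as listed in the text.
ALLOWED_MACROBLOCKS = {
    "I": {"intra"},
    "P": {"intra", "forward"},
    "B": {"intra", "forward", "backward"},
}


def is_valid_picture(picture_type, macroblock_modes):
    """Check that every macroblock mode is permitted for the picture type."""
    return set(macroblock_modes) <= ALLOWED_MACROBLOCKS[picture_type]
```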
FIGS. 10(a) to 10(d) show a sequence of processes which adds audio data to video data coded in the above described manner to produce a data stream in a format suitable for a storage medium.
Moving pictures are placed in a frame order different from that when they are inputted and coded into I-, P-, and B-pictures to produce a video elementary stream (ES). The audio data is compressed at an interval of, for example, 24 ms to produce an audio ES. These ESs are each divided into suitable units and packetized together with headers. To each header, a PTS (Presentation Time Stamp) is added which indicates the time the packetized data is to be reproduced.
The thus formed video and audio packets are multiplexed as a pack along with a header to form a stream. FIG. 9 shows typical packed streams. The streams are recorded in a storage medium such as a DVD. The playback is achieved by reading the streams out of the storage medium and decoding and reproducing each packet at the time specified by the PTS.
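The reordering of pictures mentioned above follows from backward prediction: a B-picture can be coded only after the later reference picture it depends on. A minimal sketch, assuming each picture is represented as a hypothetical `(picture_type, display_index)` pair:

```python
def display_to_coding_order(pictures):
    """Reorder (picture_type, display_index) pairs from display to coding order.

    B-pictures are held back until the next I- or P-picture (their
    backward reference) has been emitted.
    """
    coded, pending_b = [], []
    for picture in pictures:
        if picture[0] == "B":
            pending_b.append(picture)
        else:
            coded.append(picture)      # reference picture goes first
            coded.extend(pending_b)    # then the B-pictures that precede it
            pending_b.clear()
    coded.extend(pending_b)  # trailing B-pictures are simply appended in this sketch
    return coded
```

For example, a display order I0 B1 B2 P3 B4 B5 P6 is coded as I0 P3 B1 B2 P6 B4 B5, while the PTS of each packet still carries the display time.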
Usually, production of streams of desired contents from a plurality of streams recorded in a storage medium is accomplished by reading the streams out of the storage medium and rearranging them. For instance, when it is required to split a stream A into two sections and insert a stream B between them to produce a stream C, the first section of the stream A is decoded and then encoded again to prepare a leading portion of the stream C, after which the second section of the stream A is decoded and then encoded again to produce a trailing portion of the stream C. Such stream editing, however, encounters a drawback in that the re-encoding operation degrades the quality of the edited image.
Editing techniques for coupling the streams directly to each other have also been proposed, but they are subject to the restriction that the streams must be coupled at an I-picture of video data.
In the MPEG format, video data is grouped in units of a GOP (Group of Pictures) consisting of one I-picture and a plurality of P- and B-pictures. Each GOP usually contains fifteen (15) pictures (0.5 sec.). If a stream is divided at one of the P-pictures or the B-pictures, it is difficult to reproduce a frame image in units of a picture, thus leading to a problem that the image continues to be distorted until a subsequent I-picture appears.
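The GOP arithmetic above can be illustrated with a short sketch. It assumes a fixed GOP size of fifteen pictures and a frame rate of 30 pictures per second, which is the rate implied by fifteen pictures lasting 0.5 sec.; the function names are hypothetical.

```python
def gop_bounds(picture_index, gop_size=15):
    """Return (first, last) picture indices of the GOP containing picture_index.

    Assumes every GOP has exactly gop_size pictures, counted from picture 0.
    """
    first = (picture_index // gop_size) * gop_size
    return first, first + gop_size - 1


def gop_duration_seconds(gop_size=15, frame_rate=30.0):
    """Playback time covered by one GOP."""
    return gop_size / frame_rate
```

With these assumptions, a cut at picture 37 falls inside the GOP spanning pictures 30 to 44, so pictures 38 to 44 would lose their references if the stream were simply divided there.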
The above editing is usually performed in an off-line operation and thus consumes much time.
It is therefore a principal object of the present invention to avoid the disadvantages of the prior art.
It is another object of the present invention to provide a stream editing system designed to connect two streams, in real time, to produce a single stream without deterioration in signal quality such as image quality.
According to one aspect of the invention, there is provided a stream editing apparatus which is designed to switch one of two input streams used in an editing operation to the other at an edit point set in each of the streams. The apparatus comprises: (a) a first decoder decoding a first stream; (b) a second decoder decoding a second stream; (c) an encoder re-encoding at least one of the first and second streams decoded by the first and second decoders; and (d) a controller controlling editing of the first and second streams to produce a third stream made up of a combination of a leading segment of the first stream preceding the edit point set in the first stream and a trailing segment of the second stream following the edit point set in the second stream, the controller, in producing the third stream, combining a portion of at least one of the leading and trailing segments of the first and second streams which is decoded and re-encoded by a corresponding one of the first and second decoders and the encoder and which is defined to have a given length from the edit point set in the one of the first and second streams with other portions of the leading and trailing segments of the first and second streams before being decoded and re-encoded by the first and second decoders and the encoder.
In the preferred mode of the invention, each of the first and second streams is made up of a plurality of groups of pictures (GOP). The controller defines the length of the portion which is decoded and re-encoded and which is to be combined in the third stream from the edit point to a leading portion of one of the GOPs in which the edit point is set.
A header generator may further be provided which generates a series of headers for the third stream.
The encoder may re-encode the at least one of the first and second streams using coding information derived by decoding the at least one of the first and second streams through the first and second decoders.
The coding information includes a type of picture, a quantizer matrix (Quant), and a type of discrete cosine transform (DCT).
When it is required for the encoder to change a bit rate used in coding of one of the first and second streams in a re-encoding operation on the at least one of the first and second streams, the encoder uses a Quant which is changed according to the relation below:
after-change Quant = before-change Quant × (after-change bit rate / before-change bit rate)
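The relation above can be evaluated as in the following sketch, which applies it exactly as stated in the text. The function name `rescale_quant`, the rounding to an integer, and the clamping range of 1 to 31 are assumptions added for this illustration and are not taken from the description.

```python
def rescale_quant(quant_before, bitrate_before, bitrate_after, q_min=1, q_max=31):
    """Scale a Quant value by the ratio of after-change to before-change bit rate.

    Rounding and the [q_min, q_max] clamp are assumptions of this sketch.
    """
    scaled = quant_before * (bitrate_after / bitrate_before)
    return max(q_min, min(q_max, round(scaled)))
```

Because only the ratio of the two bit rates enters the relation, the rates may be given in any unit as long as both use the same one.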
Decoded video and/or audio data and the coding information may be transferred to the encoder through two separate signal lines from the first and second decoders disposed in an independent unit.
The apparatus may further include a demultiplexing means splitting the first and second streams into video packets and audio packets and a multiplexing means multiplexing the video packets in one of the first and second streams and the audio packets in the other stream.
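The splitting and recombination described above can be sketched as follows. The packet representation, a `(kind, pts, payload)` tuple, and the function names are hypothetical, and interleaving purely by PTS is a simplification; a real multiplexer would also have to respect decoder buffer constraints.

```python
def demultiplex(stream):
    """Split a list of (kind, pts, payload) packets into (video, audio) lists."""
    video = [packet for packet in stream if packet[0] == "video"]
    audio = [packet for packet in stream if packet[0] == "audio"]
    return video, audio


def multiplex(video_packets, audio_packets):
    """Interleave video and audio packets in order of increasing PTS.

    sorted() is stable, so a video packet precedes an audio packet
    that carries the same PTS.
    """
    return sorted(video_packets + audio_packets, key=lambda packet: packet[1])


def video_from_first_audio_from_second(first_stream, second_stream):
    """Combine the video of one stream with the audio of the other."""
    video, _ = demultiplex(first_stream)
    _, audio = demultiplex(second_stream)
    return multiplex(video, audio)
```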
When the portion which is to be combined in the third stream is defined in the trailing segment of the first stream, the encoder may re-encode the portion after completion of editing of the second stream.
The controller, in producing the third stream, may select a first portion of the leading segment of the first stream before being decoded and re-encoded, a second portion of the leading segment of the first stream following the first portion and preceding the edit point which is outputted from the encoder, a first portion of the trailing segment of the second stream following the edit point which is outputted from the encoder, and a second portion of the trailing segment of the second stream following the first portion of the second stream before being decoded and re-encoded. The controller defines a range of the second portion of the first stream from the edit point to a leading portion of one of the GOPs in which the edit point is set and a range of the first portion of the second stream from the edit point to an end of one of the GOPs in which the edit point is set.
According to the second aspect of the invention, there is provided a stream editing method of switching one of two input streams used in an editing operation to the other at an edit point set in each of the input streams. The method comprises the steps of: (a) decoding a first and a second stream using two decoders; (b) re-encoding at least one of the first and second streams decoded in the decoding step; and (c) controlling editing of the first and second streams to produce a third stream made up of a combination of a leading segment of the first stream preceding the edit point set in the first stream and a trailing segment of the second stream following the edit point set in the second stream, the controlling step, in producing the third stream, combining a portion of at least one of the leading and trailing segments of the first and second streams which is decoded and re-encoded and which is defined to have a given length from the edit point set in the one of the first and second streams with other portions of the leading and trailing segments of the first and second streams before being decoded and re-encoded.
In the preferred mode of the invention, each of the first and second streams is made up of a plurality of groups of pictures (GOP). The length of the portion which is decoded and re-encoded and which is to be combined in the third stream is defined from the edit point to a leading portion of one of the GOPs in which the edit point is set.
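The four portions selected by the controller can be laid out as in the sketch below. It is an illustration under stated assumptions rather than the controller itself: GOPs are taken to have a fixed size of fifteen pictures, pictures are indexed from 0 in display order, the edit point of the first stream is the last picture kept, and the edit point of the second stream is the first picture kept. `Segment` and `plan_edit` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    source: str       # "first" or "second" input stream
    first: int        # first picture index, inclusive
    last: int         # last picture index, inclusive
    reencoded: bool   # True if taken from the encoder output


def plan_edit(edit_first, edit_second, length_second, gop_size=15):
    """Return the portions of the third stream in output order.

    edit_first:  last picture kept from the first stream (assumption).
    edit_second: first picture kept from the second stream (assumption).
    """
    # Start of the GOP in the first stream that contains its edit point.
    gop_start = (edit_first // gop_size) * gop_size
    # End of the GOP in the second stream that contains its edit point.
    gop_end = min((edit_second // gop_size) * gop_size + gop_size - 1,
                  length_second - 1)
    plan = [
        Segment("first", 0, gop_start - 1, False),             # copied as is
        Segment("first", gop_start, edit_first, True),         # decoded, re-encoded
        Segment("second", edit_second, gop_end, True),         # decoded, re-encoded
        Segment("second", gop_end + 1, length_second - 1, False),  # copied as is
    ]
    # Drop empty portions, e.g. when the edit point lies in the first GOP.
    return [segment for segment in plan if segment.first <= segment.last]
```

Only the two GOPs that contain the edit points pass through the encoder; every other GOP is copied unchanged, which is what limits the loss of image quality to the neighborhood of the edit point.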
The method may further include the step of generating a series of headers for the third stream.
The encoding step re-encodes the at least one of the first and second streams using coding information derived by decoding the at least one of the first and second streams through the decoding step.
The coding information includes a type of picture, a quantizer matrix (Quant), and a type of discrete cosine transform (DCT).
When it is required for the encoding step to change a bit rate used in coding of one of the first and second streams, the encoding step uses a Quant which is changed according to a relation below
after-change Quant = before-change Quant × (after-change bit rate / before-change bit rate)
The method may further include the step of splitting the first and second streams into video packets and audio packets and the step of multiplexing the video packets in one of the first and second streams and the audio packets in the other stream.
When the portion which is to be combined in the third stream is defined in the trailing segment of the first stream, the encoding step may re-encode the portion after completion of editing of the second stream.
The controlling step, in producing the third stream, may select a first portion of the leading segment of the first stream before being decoded and re-encoded, a second portion of the leading segment of the first stream following the first portion and preceding the edit point which is decoded and re-encoded, a first portion of the trailing segment of the second stream following the edit point which is decoded and re-encoded, and a second portion of the trailing segment of the second stream following the first portion of the second stream before being decoded and re-encoded. The range of the second portion of the first stream is defined from the edit point to a leading portion of one of the GOPs in which the edit point is set. The range of the first portion of the second stream is defined from the edit point to an end of one of the GOPs in which the edit point is set.