1. Field of the Invention
The present invention relates to a method and an apparatus, employing a digital moving picture and audio compression standard (Moving Picture Experts Group: hereinafter abbreviated as MPEG), for dividing and editing MPEG-2 transport stream data formed by time-division multiplexing of encoded digital moving video data and digital audio data.
2. Related Background Art
In the MPEG standard defined as a known technology in ISO/IEC 13818, an MPEG-2TS process is employed as a data process for transmitting video signals and audio signals in satellite digital broadcasting and terrestrial digital broadcasting in Japan, the U.S.A. and European countries.
By recording the data compressed by such MPEG-2TS process in a digital state on a randomly accessible recording medium, such as a hard disk, an optical disk or a semiconductor memory capable of high-speed recording and reproduction, and thereby storing such data as a data file accessible to the user, it becomes possible to view a high-quality AV program repeatedly at any time, and to achieve random access reproduction or highly flexible program editing, without any deterioration in the quality of the video and the audio.
FIG. 6 shows a structure of MPEG-2TS data recorded on a recording medium. The MPEG-2TS data are formed of TS packets of a fixed size of 188 bytes, each of which is constituted of header information of 4 bytes and a payload information portion of 184 bytes carrying the actual AV information.
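The fixed packet layout described above can be illustrated by a minimal Python sketch. The function name split_ts_packets and the constant names are assumptions introduced here for illustration only. The sketch treats all 184 bytes after the header as payload, as in the description above, and ignores the optional adaptation field that the standard permits within those bytes.

```python
TS_PACKET_SIZE = 188                                  # fixed TS packet size in bytes
TS_HEADER_SIZE = 4                                    # header information
TS_PAYLOAD_SIZE = TS_PACKET_SIZE - TS_HEADER_SIZE     # 184 bytes


def split_ts_packets(data: bytes):
    """Yield (header, payload) pairs for each 188-byte TS packet in data.

    Hypothetical helper: assumes data starts on a packet boundary and
    contains only whole packets.
    """
    if len(data) % TS_PACKET_SIZE != 0:
        raise ValueError("data length is not a multiple of 188 bytes")
    for offset in range(0, len(data), TS_PACKET_SIZE):
        packet = data[offset:offset + TS_PACKET_SIZE]
        yield packet[:TS_HEADER_SIZE], packet[TS_HEADER_SIZE:]


if __name__ == "__main__":
    # Two dummy packets filled with zeros, for illustration only.
    for header, payload in split_ts_packets(bytes(2 * TS_PACKET_SIZE)):
        print(len(header), len(payload))
```

Because every packet has the same size, any byte offset that is a multiple of 188 is a valid packet boundary, which is what makes random access into the recorded file straightforward.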
In the header information of the TS packet, there is provided an identifier (packet ID: hereinafter called PID) for identifying whether the payload information of the TS packet following the header information is video data or audio data. Also in the header information, there is provided an information bit (unit start indicator) for indicating whether new PES packet data are started in the payload information. A unit start indicator of “1” indicates that a new PES packet is started, while a value of “0” indicates that the ensuing payload data are a continued part of PES packet data.
Also, as special information of the TS packet, there is defined a program map table (PMT) for managing map information of data constituting the stream, and a unique PID is assigned to the TS packets having a video signal and to the TS packets having an audio signal. Such MPEG-2TS technology is described in detail for example in Hiroshi Fujiwara, ISO/IEC 13818 series, Point Zukaishiki Saishin MPEG Kyokasho, published by Nippon Denki Kogyokai, edited by ASCII Publishing (Aug. 1, 1994) and in All of video and audio compression technology, Interface additional edit., edited by Hiroshi Fujiwara (Apr. 1, 2000).
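As an illustration of the header fields described above, the following Python sketch extracts the PID and the unit start indicator from a 4-byte TS header. It relies on the bit layout of the MPEG-2 transport stream header (a sync byte of 0x47, the unit start indicator in bit 6 of the second byte, and a 13-bit PID spanning the second and third bytes). The names TsHeader and parse_ts_header, and the example PID value, are assumptions made for this sketch.

```python
from typing import NamedTuple

SYNC_BYTE = 0x47  # first byte of every TS packet


class TsHeader(NamedTuple):
    pid: int                  # 13-bit packet identifier
    payload_unit_start: bool  # True when a new PES packet starts in the payload


def parse_ts_header(header: bytes) -> TsHeader:
    """Decode the PID and unit start indicator from a 4-byte TS header."""
    if len(header) < 4:
        raise ValueError("TS header must be at least 4 bytes")
    if header[0] != SYNC_BYTE:
        raise ValueError("missing sync byte 0x47")
    payload_unit_start = bool(header[1] & 0x40)      # bit 6 of byte 1
    pid = ((header[1] & 0x1F) << 8) | header[2]      # low 5 bits of byte 1 + byte 2
    return TsHeader(pid, payload_unit_start)


if __name__ == "__main__":
    # Hypothetical header: unit start indicator set, PID 0x100.
    print(parse_ts_header(bytes([0x47, 0x41, 0x00, 0x10])))
```

Which PID values denote video and which denote audio is not fixed in advance. A reader of the stream would obtain that mapping from the PMT before classifying packets.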
In the following, there will be explained an editing process for dividing, into two, MPEG-2TS data having a data structure as shown in FIG. 6 and recorded on a randomly accessible recording medium. FIG. 7 illustrates such dividing editing process.
In FIGS. 6 and 7, V indicates TS packet data having video information, and A indicates TS packet data having audio information. The TS packets having the video information include a white-boxed V and a hatched V, in which the hatched V indicates video information TS packet data including a GOP start code. Also the TS packets having the audio information include a white-boxed A and a hatched A, in which the hatched A indicates audio information TS packet data including an audio frame start code. A suffix to each packet is a packet number indicating a timing of synchronized reproduction of video information and audio information. For example a video information TS packet V0 and an audio information packet A0 are reproduced in synchronization.
A timing of multiplexing a video signal and an audio signal is ordinarily determined by a function of an encoding apparatus for the audio signal and the video signal and a decoding rule of the MPEG standard. As shown in FIGS. 6 and 7, the video information packet and the audio information packet to be synchronously reproduced are stored in physically distant locations, so that the TS packetized video signal of one GOP mostly contains a TS packetized audio frame signal belonging to another GOP.
In case of a demand from the user to divide the MPEG-2TS data, recorded on the recording medium in the above-described manner, at a position DIV at a boundary of the GOP units as shown in FIG. 7, there are generated, by the dividing editing, a stream from the head to the dividing position and a stream from the dividing position to the end.
In case of dividing such MPEG-2TS data at the dividing position DIV shown in FIG. 7, since the TS packets having the audio information are present in the vicinity of the dividing position among the TS packets having the video information of one GOP, a stream prepared as the data from the head of the stream to the dividing position includes incomplete audio data in which the last data having the audio information do not satisfy the data structure of an audio frame as a minimum decoding unit, as shown in FIG. 7. Also a stream prepared as the data from the dividing position to the stream end includes incomplete audio data in which the initial data having the audio information do not satisfy the data structure of an audio frame as a minimum decoding unit, as shown in FIG. 7.
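The effect described above can be reproduced with a toy model in Python. The sketch below is not the packet arrangement of FIG. 7. It assumes a hypothetical stream in which each packet is labelled only by its kind ("V" or "A") and by the number of the GOP or audio frame to which it belongs, and it reports the audio frames whose packets fall on both sides of a dividing position. The names Packet, STREAM and broken_audio_frames are introduced here for illustration.

```python
from typing import List, NamedTuple, Set


class Packet(NamedTuple):
    kind: str   # "V" for video, "A" for audio
    unit: int   # GOP number for video, audio frame number for audio


def broken_audio_frames(stream: List[Packet], div: int) -> Set[int]:
    """Return audio frame numbers split by a cut before index div.

    A frame is broken when some of its packets lie before the dividing
    position and the rest lie after it, so neither resulting stream
    holds a complete, decodable audio frame.
    """
    head = {p.unit for p in stream[:div] if p.kind == "A"}
    tail = {p.unit for p in stream[div:] if p.kind == "A"}
    return head & tail


# Hypothetical stream: audio frame 6 starts inside GOP 0 and continues
# after the first packet of GOP 1.
STREAM = [
    Packet("V", 0), Packet("V", 0), Packet("A", 5), Packet("V", 0),
    Packet("A", 6),
    Packet("V", 1),          # GOP boundary: dividing position at index 5
    Packet("A", 6), Packet("V", 1), Packet("A", 7),
]

if __name__ == "__main__":
    print(broken_audio_frames(STREAM, 5))
```

In this model a cut at the GOP boundary leaves the first part of audio frame 6 at the end of the first stream and the remainder at the start of the second stream, which corresponds to the incomplete audio data described above.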
In addition, a system for recording and editing a video signal and an audio signal, compression encoded in the MPEG method, in a TS packet state is associated with the following drawbacks. The video signal is MPEG compression encoded in the unit of a GOP, but may not be compression encoded with a fixed bit rate, as the length of the image data of a frame is variable depending on the picture type, such as an I picture, a P picture or a B picture, or on the picture pattern.
On the other hand, as the audio signal is compressed with a fixed rate, the packets of the corresponding video and audio signals may be located in physically distant positions on the TS data, even when such corresponding video and audio signals are simultaneously encoded. In case the packets of the corresponding video and audio signals are located in physically distant positions on the TS data, an editing operation of dividing the multiplexed TS data in the middle thereof may result in drawbacks such as a deviation between the timing of connection of the video signal and that of the audio signal, or a deficiency in either data.
Therefore, the prior editing method explained with reference to FIG. 7 generates a discontinuity in the audio frame structure in the vicinity of the dividing position, thus resulting in an abnormal noise giving an unpleasant feeling to the user or in a soundless state.