The present invention relates to a method and an apparatus for editing audio data.
In recent years, video data can be handled on a home computer, owing to the falling price of secondary storage devices and the high compression ratio of video data achieved by MPEG (Moving Picture Experts Group), the de facto international standard technique for compressing video data.
MPEG is the international standard for compression of moving pictures, established by the ISO (International Organization for Standardization). MPEG-1 was made public first, and MPEG-2 was established afterward; MPEG-2 is the compression standard for broadcasting. MPEG-1 is a technique for transferring a picture at a rate of about 1.5 Mbps and regenerating the transferred picture at a resolution of about 352×240 pixels and at a rate of about 30 frames per second (for NTSC) or about 24 frames per second (for PAL). It is widely known that the picture quality of decoded MPEG-1 data corresponds to that of a VHS video cassette. By contrast, MPEG-2 is a technique for regenerating a picture of about 720×480 pixels at a transfer rate of 4.0 to 8.0 Mbps. Compared with MPEG-1, the picture quality of MPEG-2 corresponds to that of an LD (Laser Disc).
Normally, MPEG data is generated by encoding (compressing), in the MPEG format, an analog moving picture input from a camera or a capture board. The captured MPEG data can be regenerated on a personal computer in which an MPEG decoder (implemented in software or hardware) is installed.
MPEG data takes the form of an MPEG system stream, composed by multiplexing an MPEG video stream (the compressed video data) with an MPEG audio stream (the compressed audio data). What is normally called MPEG data is this MPEG system stream. The MPEG video stream or the MPEG audio stream alone may also be regenerated, for example by a software-implemented decoder.
MPEG video normally has a picture rate of 30 frames per second, in which case the regenerating time length of video data consisting of 900 frames is 30 seconds. Hence, at 30 frames per second, the regenerating time length of one frame is about 33 ms. MPEG audio data, on the other hand, is divided into three layers (layer 1, layer 2 and layer 3) and supports sampling frequencies of 32 kHz, 44.1 kHz and 48 kHz. Further, the AAU (Audio Access Unit), which is the compression unit of MPEG audio data, contains 384 samples for layer 1 or 1152 samples for layer 2 and layer 3.
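The timing relations above can be checked with a short sketch (the parameter values are the common ones cited in the text, not taken from any particular stream):

```python
# Illustrative arithmetic: playback duration of one video frame and of one
# audio access unit (AAU) at common MPEG parameters.

FRAME_RATE = 30          # video frames per second (NTSC)
SAMPLES_PER_AAU = 1152   # layer 2 / layer 3; layer 1 uses 384

def frame_duration_ms(frame_rate: float) -> float:
    """Playback time of a single video frame in milliseconds."""
    return 1000.0 / frame_rate

def aau_duration_ms(sampling_hz: float, samples: int = SAMPLES_PER_AAU) -> float:
    """Playback time of one AAU in milliseconds."""
    return 1000.0 * samples / sampling_hz

print(round(frame_duration_ms(FRAME_RATE), 2))  # ≈ 33.33 ms per frame
print(round(aau_duration_ms(44100), 2))         # ≈ 26.12 ms per AAU at 44.1 kHz
print(900 // FRAME_RATE)                        # 900 frames -> 30 seconds
```

Note that one AAU rarely covers a whole number of video frames, which is why audio and video lengths tend to diverge during editing.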
Like normal uncompressed data, MPEG data may be used as is or subjected to editing operations such as partial deletion and pasting of data pieces. If a video data piece is pasted together with an audio data piece, the two must be synchronized with each other. In practice, however, the two pieces often have different lengths, which disadvantageously produces a lag between the video data piece and the audio data piece.
This disadvantage will be described below with reference to FIG. 2. BGM (audio data) B of 20 seconds is pasted with video data A. The pasted data is then followed by video data C of 900 frames (30 seconds) and audio data D of 30 seconds. As clearly shown, a lag arises between the start edges of video data C and audio data D. When video data C and audio data D are then followed by video data E and audio data F, each having the same regenerating time length as C and D, the lag of synchronization persists.
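The way the mismatch carries forward can be modeled with a short sketch. The segment lengths below are assumptions chosen to echo the example (video A is taken as 30 seconds; FIG. 2 itself does not state its length):

```python
# Model of the synchronization lag from the pasting example: each pasted
# pair (video_seconds, audio_seconds) is appended in turn, and any length
# difference carries over into every following segment.

def cumulative_lag(segments):
    """Return the audio-behind-video lag (seconds) after each pasted pair."""
    video_t = audio_t = 0.0
    lags = []
    for video_len, audio_len in segments:
        video_t += video_len
        audio_t += audio_len
        lags.append(video_t - audio_t)
    return lags

# Hypothetical lengths: A+B (30 s video over 20 s BGM), then C+D (30 s
# each), then E+F (30 s each). The 10 s lag never goes away on its own.
print(cumulative_lag([(30, 20), (30, 30), (30, 30)]))  # [10.0, 10.0, 10.0]
```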
A technique for overcoming this lag of synchronization is described in JP-A-09-37204. That technique separates the compressed data into compressed moving picture data and compressed audio data and compares the regenerating times of the two. If the audio data has a shorter regenerating time than the moving picture data, prepared silent PCM data is compressed to generate silent compressed audio data of the necessary length, which is pasted onto the audio data. The moving picture data and the audio data are then synthesized with each other.
However, the foregoing technique requires compressing the silent PCM data. Hence, if the silent portion to be added extends over a long time, compressing the PCM data disadvantageously takes a considerable time.
Moreover, while an MPEG system stream is created by an encoder, the moving picture data is input into the encoder continuously, whereas the sound may be discontinued or the audio may be paused by a mute function. In such a case, the encoder compresses the silent audio data in order to create the moving picture and audio data. As above, if the silent interval continues for a considerably long time, compressing the data disadvantageously also takes a long time.
It is an object of the present invention to create dummy audio data that does not require a compressing process.
It is a further object of the present invention to adjust the regenerating time length of audio data with dummy audio data that does not require compression, thereby resolving the lag of synchronization between the video data and the audio data.
According to the invention, an editing method and an editing system are disclosed that create, without compression, dummy audio data consisting of a header portion containing at least information (e.g., a syncword) indicating the start of an audio decode unit (e.g., an AAU) and dummy data that is ignored in decoding. The decoder starts retrieval of the next header portion without decoding this dummy audio data; as a result, the silent interval continues for the length of time during which the next data is being retrieved.
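A minimal sketch of this idea follows: reuse the 4-byte MPEG audio header (which begins with the 12-bit syncword 0xFFF) from the existing stream, and follow it with filler bytes that contain no syncword, so a decoder skips them while scanning for the next header. The frame size, filler value and example header bytes are illustrative assumptions, not values taken from the patent:

```python
# Hedged sketch of a dummy AAU: copied header + non-decodable padding.
# No audio compression is performed at any point.

SYNCWORD_FIRST_BYTE = 0xFF

def make_dummy_aau(header: bytes, frame_size: int, filler: int = 0x00) -> bytes:
    """Build one dummy AAU from a real 4-byte MPEG audio frame header."""
    if len(header) != 4 or header[0] != SYNCWORD_FIRST_BYTE or (header[1] & 0xF0) != 0xF0:
        raise ValueError("expected a 4-byte MPEG audio header starting with syncword 0xFFF")
    # Filler bytes of 0x00 can never form the all-ones syncword, so the
    # decoder ignores them and resynchronizes at the next real header.
    return header + bytes([filler]) * (frame_size - len(header))

# Example header bytes: MPEG-1 Layer II, 128 kbps, 44.1 kHz (chosen for
# illustration); the matching frame size is 144 * 128000 / 44100 ≈ 417 bytes.
hdr = bytes([0xFF, 0xFD, 0x90, 0x00])
dummy = make_dummy_aau(hdr, frame_size=417)
print(len(dummy), dummy[:4].hex())  # 417 fffd9000
```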
According to an aspect of the invention, when video data is pasted with audio data and the audio data has a shorter regenerating time length than the video data, dummy audio data is composed by synthesizing a header portion, corresponding to header information extracted from the audio data, with dummy data that is ignored on the regenerating device side. Composed dummy audio data corresponding to the shortage time of the audio data is added to the MPEG audio stream.
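The length-adjustment step reduces to simple arithmetic: given the shortage between the video and audio regenerating times, compute how many dummy AAUs must be appended. The function and parameter names below are assumptions for illustration:

```python
import math

def dummy_aaus_needed(shortage_seconds: float,
                      sampling_hz: int = 44100,
                      samples_per_aau: int = 1152) -> int:
    """Number of dummy AAUs covering at least the shortage time."""
    aau_seconds = samples_per_aau / sampling_hz   # ≈ 26.12 ms at 44.1 kHz
    return math.ceil(shortage_seconds / aau_seconds)

# 10 s of missing audio at 44.1 kHz, layer 2:
print(dummy_aaus_needed(10.0))  # 383
```

Because no PCM data is compressed, appending these AAUs is a matter of writing bytes, which is what makes the adjustment fast even for long silent intervals.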
In the process of creating moving picture and audio data while capturing the video data and the audio data, if the audio data is silent, dummy audio data consisting of the compressed-audio header and the dummy data is created and synthesized with the video data to create the moving picture and audio data.
The present invention offers numerous effects, the greatest of which is that no compression of audio data is necessary when creating the dummy audio data. When audio data is pasted with video data and the audio data is shorter than the video data, the regenerating time length of the audio data can therefore be adjusted in a short time. Further, the process of creating moving picture and audio data containing a silent portion is shortened.