This invention relates to a method and apparatus for coding audio signals and a method and apparatus for decoding coded audio signals, and is suitably applied to, for example, recording and reproducing apparatuses that code audio signals in blocks to transmit, record, and reproduce them together with video signals.
Conventional methods for coding audio signals in blocks to reduce the amount of data include sub-band coding and transform coding. For example, the audio coding method called ATRAC (Adaptive Transform Acoustic Coding) used for MiniDiscs (MD) and the coding method called PASC (Precision Adaptive Sub-band Coding) used for digital compact cassettes (DCC) code DCT (Discrete Cosine Transform) coefficients or band-divided data.
In this manner, conventional audio coding methods use quasi-instantaneous companding, which is also used in the audio standard of MPEG (Moving Picture Experts Group). Since the level of audio signals varies relatively slowly, quasi-instantaneous companding divides the signal into blocks each containing a specified number of samples and compresses and expands the data block by block.
Video signals, which carry images that have one-to-one correspondence with sounds carried by audio signals, are edited on the basis of frames or fields, but in the audio coding method that uses blocks as the coding unit, the length of coded blocks is determined independently of the number of samples per video signal frame or field.
Thus, when a coded audio signal is transmitted, recorded, or reproduced with a coded video signal, temporal information is added to the video and audio signals to be transmitted to enable decoding synchronization between them, and on reception or reproduction, a system controller in a receiver or a reproducing device decodes these signals based on the temporal information added to the transmitted data.
In this case, however, when an attempt is made to decode the audio signal in synchronism with the video signal, there occurs a period of time during which the audio signal cannot be decoded. For example, when an audio signal sampled at 48 [kHz] is coded according to Layer I of the MPEG audio standard, the signal is converted into an array of blocks each having a length of 384 samples. On the other hand, in a 525/59.94 video system (a video system using 525 scanning lines and a field frequency of 59.94 [Hz]), the number of audio samples corresponding to one video frame is 1601 or 1602 if the audio signal is sampled at 48 [kHz].
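The figures in the preceding paragraph can be checked with a short calculation. The following Python sketch is illustrative only and is not part of the described apparatus; it assumes the exact frame rate of the 525/59.94 system is 30000/1001 frames per second and that each frame boundary is rounded down to a whole sample. The function names are hypothetical.

```python
import math
from fractions import Fraction

SAMPLE_RATE = 48_000                   # audio sampling frequency [Hz]
FRAME_RATE = Fraction(30_000, 1_001)   # assumed exact 525/59.94 frame rate
BLOCK_LEN = 384                        # samples per MPEG Layer I block

def frame_boundary(n):
    """Sample index of the n-th video frame boundary, rounded down."""
    return math.floor(n * SAMPLE_RATE / FRAME_RATE)

def frame_sizes(count):
    """Number of audio samples in each of the first `count` video frames."""
    return [frame_boundary(n + 1) - frame_boundary(n) for n in range(count)]

def split_at_boundary(n):
    """Samples of the straddling block before and after boundary n.

    Returns None when the boundary coincides with a block boundary.
    """
    offset = frame_boundary(n) % BLOCK_LEN
    return (offset, BLOCK_LEN - offset) if offset else None

print(frame_sizes(5))        # [1601, 1602, 1601, 1602, 1602]
print(split_at_boundary(1))  # (65, 319)
```

Five frames contain 8008 samples in total, which is not a multiple of 384, so free-running 384-sample blocks cannot stay aligned with the frame boundaries.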
As a result, when an attempt is made to decode the coded video and audio signals simultaneously, some coded audio blocks extend across two video frames. Thus, if the coded audio signal is decoded after switching on the basis of the frames or fields of the video signal, as in editing, decoded data may be missing in the coded blocks before or after the switching point. In the worst case, if a first audio signal having 383 samples of a block before a video frame boundary and one sample after the boundary is connected to a second audio signal having one sample of a block before a video frame boundary and 383 samples after the boundary, the audio signal cannot be decoded during the period of time corresponding to the sum of the 383 samples in the first audio signal and the 383 samples in the second audio signal (that is, 766 samples) and, because of sub-band coding, during the periods of time corresponding to 256 samples before and after that period (512 samples in total).
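The worst-case figures above can be tallied as follows. This Python sketch is only an arithmetic illustration; it assumes that the 766-sample and 512-sample periods simply add, and the conversion to milliseconds at 48 [kHz] is supplied here for orientation rather than taken from the description. The function and parameter names are hypothetical.

```python
def worst_case_gap(block_len=384, filter_span=256, sample_rate=48_000):
    """Tally the undecodable span for the worst-case splice described above."""
    # Each signal contributes a block with all but one sample on the kept side.
    split_samples = (block_len - 1) + (block_len - 1)
    # Sub-band coding affects a further span on each side of that period.
    filter_samples = 2 * filter_span
    total = split_samples + filter_samples
    duration_ms = total * 1000 / sample_rate
    return split_samples, filter_samples, total, duration_ms

print(worst_case_gap())  # (766, 512, 1278, 26.625)
```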
This invention proposes a method and apparatus for coding audio signals and a method and apparatus for decoding coded audio signals that leave no period of time in which decoded data is missing, even if an audio signal coded in blocks not in synchronism with the frames or fields of a video signal is switched and decoded on the basis of these frames or fields.
To solve the above problem, this invention provides an audio signal coding method for coding an input audio signal in specified data units to form coded audio data separated into coded blocks, wherein the audio signal is block coded in such a way that an integral number of audio coded blocks are filled in the period of time corresponding to one frame or field of the video signal, thereby forming an array of coded blocks in synchronism with the frames or fields of the video signal.
The method according to this invention comprises the steps of blocking and coding an input audio signal in specified data units to form coded blocks separated into blocks, aligning the leading position of one of the coded blocks with a corresponding frame or field boundary in a video signal, and forming an array of coded blocks in synchronism with the frames or fields of the video signal by arranging those coded blocks which follow the coded block the leading position of which has been aligned with the corresponding frame or field boundary in the video signal in such a way that an integral number of coded blocks are filled in the period of time corresponding to one frame or field.
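One possible reading of these steps is sketched below in Python. It is a minimal illustration under stated assumptions, not the claimed implementation: the block that contains a frame boundary is assumed to be the one whose leading position is moved to that boundary, the frame rate is assumed to be 30000/1001 frames per second with 48 [kHz] audio and 384-sample blocks, and all names (`align_blocks`, `phase`, and so on) are hypothetical.

```python
import math
from fractions import Fraction

SAMPLES_PER_FRAME = Fraction(48_000 * 1_001, 30_000)  # 1601.6 samples per frame
BLOCK_LEN = 384                                       # samples per coded block

def frame_boundary(n):
    """Sample index of the n-th video frame boundary, rounded down."""
    return math.floor(n * SAMPLES_PER_FRAME)

def align_blocks(num_frames):
    """Assign whole coded blocks to video frames.

    For each frame, the block containing the frame boundary is treated as
    the block whose leading position is aligned with that boundary; the
    blocks that follow it fill the rest of the frame period. `phase` is the
    distance in samples from the block's original start to the boundary.
    """
    layout = []
    for n in range(num_frames):
        start, end = frame_boundary(n), frame_boundary(n + 1)
        first = start // BLOCK_LEN        # block containing this boundary
        next_first = end // BLOCK_LEN     # block containing the next boundary
        layout.append({
            "frame": n,
            "blocks": list(range(first, next_first)),
            "phase": start - first * BLOCK_LEN,
        })
    return layout

for entry in align_blocks(6):
    print(entry["frame"], len(entry["blocks"]), entry["phase"])
```

Under these assumptions every frame carries a whole number of blocks (four, with an occasional fifth), and each block belongs to exactly one frame, so a cut at a frame boundary never divides a block.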
This invention also provides an audio signal encoder for coding an input audio signal in specified data units to form coded audio data separated into coded blocks, comprising a coding means for blocking and coding an input audio signal in specified data units to form coded blocks separated into blocks, a detection means for determining the phase difference between the frame or field boundaries in the video signal and the coded blocks to detect the coded block corresponding to a particular frame or field boundary based on the phase difference, and a memory means for inputting output from the coding means, aligning, based on the results of detection by the detection means, the leading position of the coded block with a corresponding frame or field boundary, and outputting coded blocks in synchronism with the frames or fields of the video signal by arranging those coded blocks which follow the coded block the leading position of which has been aligned with the corresponding frame or field boundary in such a way that an integral number of coded blocks are filled in the period of time corresponding to one frame or field.
The method according to this invention codes an input audio signal using the audio signal coding steps including the steps of blocking and coding an input audio signal in specified data units to form coded blocks separated into blocks, aligning the leading position of one of the coded blocks with a corresponding frame or field boundary in the video signal, forming coded audio data in synchronism with the frames or fields of the video signal by arranging those coded blocks which follow the coded block the leading position of which has been aligned with the corresponding frame or field boundary in the video signal in such a way that an integral number of coded blocks are filled in the period of time corresponding to one frame or field, and adding to the coded audio data, information representing the phase difference between the frame or field boundary in the video signal and the coded block not subjected to alignment of the leading position with the corresponding frame or field boundary, and decodes the coded audio data using coded audio data decoding steps including the steps of detecting relevant phase difference information from the coded audio data including the information on the phase difference and recovering the original phase relationship between the coded block and the video signal based on the detected phase difference information.
This invention includes an audio signal coding section comprising a coding means for blocking and coding an input audio signal in specified data units to form coded blocks separated into blocks, a detection means for determining the phase difference between the frame or field boundaries in the video signal and the coded blocks to detect the coded block corresponding to a particular frame or field boundary based on the phase difference, and a memory means for inputting output from the coding means, aligning, based on the results of detection by the detection means, the leading position of the coded block with a corresponding frame or field boundary, and outputting coded blocks in synchronism with the frames or fields by arranging those coded blocks which follow the coded block the leading position of which has been aligned with the corresponding frame or field boundary in such a way that an integral number of coded blocks are filled in the period of time corresponding to one frame or field, and a phase difference addition means for adding the phase difference detected by the detection means to the corresponding coded block output from the memory means in synchronism with the frames or fields; and a coded audio data decoding section comprising a phase difference information detection means for detecting phase difference information from the coded audio data formed by the audio signal coding section and a memory means for recovering the original phase relationship between the coded block and the video signal based on the detected phase difference information.
Since an integral number of coded blocks are filled in the period of time corresponding to one frame or field in the video signal in order to form an array of coded blocks in synchronism with the frames or fields of the video signal, no audio coded block is divided at a switching point even if switching such as editing is carried out on the basis of the frames or fields of the video signal. Consequently, the decoding section can decode even the coded audio data near the switching point, thereby reducing periods of time in which decoded data is missing.
Since the information on the phase difference between the video signal and the audio coded block detected by the detection means is added to the coded audio data, the decoding section can recover the original phase relationship between the audio coded block and the video signal based on the phase difference information.
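The recovery of the original phase relationship can be illustrated with a small round trip. The Python sketch below assumes that the phase difference is stored as a count of samples by which the first block of a frame was shifted to reach the frame boundary; the container `CodedFrame`, its fields, and both functions are hypothetical and do not describe an actual data format.

```python
from dataclasses import dataclass
from typing import List

BLOCK_LEN = 384  # samples per coded block (MPEG Layer I)

@dataclass
class CodedFrame:
    frame_start: int     # sample index of the video frame boundary
    phase: int           # assumed field: boundary minus original block start
    blocks: List[bytes]  # coded blocks; the first is aligned with the boundary

def attach_phase(frame_start, original_first_start, blocks):
    """Coding side: record how far the first block was shifted."""
    return CodedFrame(frame_start, frame_start - original_first_start, blocks)

def recover_block_starts(frame):
    """Decoding side: restore each block's original sample position."""
    first = frame.frame_start - frame.phase
    return [first + k * BLOCK_LEN for k in range(len(frame.blocks))]

coded = attach_phase(1601, 1536, [b"blk"] * 4)
print(coded.phase)                  # 65
print(recover_block_starts(coded))  # [1536, 1920, 2304, 2688]
```

Because the shift travels with the coded data, the decoding side in this sketch needs no knowledge of the coder's timing beyond the frame boundary itself.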
Thus, according to this invention, an integral number of coded blocks are filled in the period of time corresponding to one frame or field of the video signal in order to form audio coded blocks in synchronism with each frame or field of the video signal, so most of the coded audio data can be decoded even if switching is executed on the basis of the frames or fields.
In addition, the phase difference information, which represents the phase difference between the original coded block and the video signal and which has been used for synchronization, is added to the coded audio data comprising coded audio blocks in synchronism with the frames or fields of the video signal, so the phase can be managed easily during decoding and the configuration of the decoder can be simplified.