The present invention relates to an audio coding method, an audio coding apparatus, and a data storage medium. More particularly, the present invention relates to an audio coding method and an audio coding apparatus using a subband coding scheme according to an MPEG (Moving Picture Experts Group) standard, and a data storage medium which contains a program for implementing the audio coding method.
In recent years, with the spread of multimedia personal computers and the Internet, it has become possible to reproduce moving pictures or audio according to the MPEG standard by software on a personal computer (PC), and coded data according to the MPEG standard has come into wide use.
Expensive hardware is commonly used as the encoder for creating coded data. While coded data is sometimes created by software, this coding process requires several times the real time needed to play back the moving picture or audio, so it takes considerable time and effort, and software encoding has therefore not become widespread.
In order to make it possible for a PC user to create coded data at a low cost and with ease, it is required that coded data be created in real time by software processing.
Hereinafter, a description will be given of an example of a conventional audio coding method. FIG. 11 is a block diagram showing an MPEG audio encoder standardized by ISO/IEC 11172-3 as a format of coded audio data.
Turning to FIG. 11, subband analysis means 202 divides an input digital audio signal into 32 frequency components, and scale factor calculation means 203 calculates scale factors for the respective subband signals and makes the dynamic ranges of the respective subband signals uniform. The input digital audio signal is also subjected to an FFT (Fast Fourier Transform) process by FFT means 204. Based on this result, psychoacoustic analysis means 205 derives a relationship model of an SMR (Signal to Mask Ratio) based on a psychoacoustic model utilizing characteristics of the human auditory sense. Then, using this model, bit allocation means 206 determines the number of bits to be allocated to each subband signal. According to the number of bits allocated to each subband signal, quantization/encoding means 207 quantizes/encodes each subband signal. Bit stream creating means 209 creates a bit stream comprising quantized/encoded data from the quantization/encoding means 207 and header information and auxiliary information which have been encoded by auxiliary information encoding means 208, and outputs the bit stream.
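The role of the scale factor calculation can be illustrated with a simplified sketch. It assumes that the scale factor of a subband is simply the peak absolute sample value in a block, and that each subband is normalized by that peak so that all subbands share the same dynamic range. The function name and data layout are hypothetical, and the standardized encoder selects its scale factors from a fixed table rather than using the raw peak.

```python
def scale_and_normalize(subband_samples):
    """Return (scale_factors, normalized) for one block of subband samples.

    subband_samples: list with one list of float samples per subband.
    Assumption: the scale factor is the peak absolute value of the block.
    """
    scale_factors = []
    normalized = []
    for samples in subband_samples:
        peak = max((abs(s) for s in samples), default=0.0)
        scale_factors.append(peak)
        if peak > 0:
            # After division, every non-silent subband peaks at 1.0.
            normalized.append([s / peak for s in samples])
        else:
            # A silent subband is passed through unchanged.
            normalized.append(list(samples))
    return scale_factors, normalized
```

After normalization, the quantizer sees signals of comparable range in every subband, and the scale factors carry the level information separately.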
In this conventional audio coding method, a coding process is performed for each subband by utilizing the fact that band power is distributed nonuniformly. Therefore, audio quality is determined by the bit distribution for each subband signal using the psychoacoustic model. In addition, since the audio coding method has been standardized with storage media in mind, it is well suited to creating high-quality coded data, but is less suited to a coding process in real time. The psychoacoustic model which determines audio quality requires a large amount of computation.
The conventional audio coding method and audio coding apparatus are constructed as described above. They are well suited to creating high-quality coded data for a storage medium, but are less suited to real-time software processing on a PC given the processing ability of current CPUs, because use of the psychoacoustic model requires high processing ability. Even when operation is performed on a PC equipped with a high-performance CPU capable of real-time processing, if another application occupies a large part of the CPU's processing, processing cannot be performed in real time. As a consequence, discontinuity of audio might occur.
It is an object of the present invention to provide an audio coding method and an audio coding apparatus which are capable of creating high-quality coded data with no discontinuity, without being affected by the processing ability of a CPU on a personal computer or by how much of the CPU's processing another application occupies, and a data storage medium which contains a program for implementing this coding process.
Other objects and advantages of the invention will become apparent from the detailed description that follows. The detailed description and specific embodiments described are provided only for illustration, since various additions and modifications within the spirit and scope of the invention will be apparent to those skilled in the art from the detailed description.
According to a 1st aspect of the present invention, in an audio coding method in which a digital audio signal is divided into a plurality of frequency subbands and a coding process is performed for each subband, there are provided plural bit allocation means having different processing amounts, for generating bit allocation information for each subband, and the bit allocation means to be used is switched according to external control information, so that one of the plural bit allocation means is selected and used to perform bit allocation, whereby the coding process is performed. Therefore, the bit allocation means with the optimum processing amount is always selected and used, and a coding process which does not exceed the amount of processing on the CPU that can be occupied by the coding process is realized during operation. Thereby, when coding the input signal in real time, processing of the input signal will not be delayed. As a result, audio can be reproduced with no discontinuity.
According to a 2nd aspect of the present invention, in the audio coding method of the 1st aspect, a load value indicating a processing amount of a central processing unit which can be occupied by the coding process is used as the external control information, and the bit allocation means is selected such that the processing amount of the central processing unit which can be occupied by the coding process is not exceeded, according to the load value, with reference to a data table which contains respective processing amounts of coding operation by the respective bit allocation means in the coding process on the central processing unit. Therefore, the central processing unit does not accept a request beyond its processing ability, whereby the whole system is controlled smoothly.
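The selection described in the 2nd aspect can be sketched as a table lookup. The sketch below assumes a hypothetical data table whose entries pair each bit allocation means with a processing amount expressed as a fraction of the CPU, and it assumes that a larger processing amount corresponds to higher quality. The names and the numeric values are illustrative only and do not come from the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AllocatorEntry:
    name: str
    cost: float  # hypothetical processing amount, as a fraction of the CPU


# Hypothetical data table of processing amounts per bit allocation means.
COST_TABLE = [
    AllocatorEntry("psychoacoustic", 0.60),
    AllocatorEntry("mixed", 0.35),
    AllocatorEntry("low_load", 0.10),
]


def select_allocator(load_value, table=COST_TABLE):
    """Pick the costliest entry whose cost does not exceed load_value.

    load_value: processing amount the coding process may occupy.
    If nothing fits, fall back to the cheapest entry (an assumption).
    """
    affordable = [entry for entry in table if entry.cost <= load_value]
    if not affordable:
        return min(table, key=lambda entry: entry.cost)
    return max(affordable, key=lambda entry: entry.cost)
```

Calling `select_allocator` once per frame with a fresh load value would give the frame-by-frame switching behaviour, under the same assumptions.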
According to a 3rd aspect of the present invention, in the audio coding method of the second aspect, processing amount control information from monitoring means for monitoring a processing amount of the central processing unit which can be occupied by the coding process is used as the load value. Therefore, within the highest performance of the central processing unit which can be occupied by the coding process, bit allocation means according to the optimum processing amount is selected. Thereby, when coding the input signal in real time, processing of the signal will not be delayed. As a result, audio can be reproduced with no discontinuity.
According to a 4th aspect of the present invention, in the audio coding method of the 1st aspect, the bit allocation performed by the bit allocation means includes: a process using highly-efficient bit allocation for performing bit allocation with higher efficiency, which realizes high-quality coded data; and a process using low-load bit allocation for performing bit allocation with a lower load, which performs processing less than the process using highly-efficient bit allocation. Therefore, the encoder carries out a coding process by using processing for higher-quality coded audio data or lower-load processing.
According to a 5th aspect of the present invention, in the audio coding method of the 1st aspect, the bit allocation means to be used in the coding process is changed frame by frame corresponding to a minimum unit decodable into an audio signal. Therefore, in the coding process in real time, when another application which occupies processing on the CPU suddenly increases, the coding process is performed frame by frame according to the amount of processing on the CPU which can be occupied by the coding process. In addition, audio quality or processing amount can be controlled in real time.
According to a 6th aspect of the present invention, in the audio coding method of the 1st aspect, subband signals of the plural frequency subbands into which the digital audio signal is divided are separated into groups each composed of a predetermined number of subband signals continuous in a frequency axis direction, the bit allocation is performed for each of the groups, and the bit allocation information is generated for each subband. Therefore, a bit allocation process adapted to the characteristics of respective subbands is selected, whereby the coding process is performed.
According to a 7th aspect of the present invention, in the audio coding method of the 6th aspect, the subband signals are separated into groups variably such that either the number of groups or the number of subband signals continuous in the frequency axis direction in each group is specified according to either the external control information or processing amount control information from monitoring means. Therefore, grouping is conducted dynamically according to the usage state of the CPU.
According to an 8th aspect of the present invention, in the audio coding method of the 7th aspect, the number of subband signals is changed frame by frame corresponding to a minimum unit decodable into an audio signal. Therefore, bit allocation is selected from several alternatives, and thereby an encoder with higher precision is realized.
According to a 9th aspect of the present invention, in the audio coding method of the 8th aspect, when the subband signals are separated into groups, at least one group to which bit allocation is not performed is provided. The number of groups and the number of subband signals continuous in the frequency axis direction in each group are changed, frame by frame corresponding to the minimum unit decodable into the audio signal, according to external control information or processing amount control information from monitoring means. Subband signals in a group to which bit allocation is not performed need not be coded, and therefore the bits are allocated to subbands in another group to which bit allocation should be performed. As a result, the amount of processing on the CPU which is occupied by the coding process is controlled, and simultaneously, audio quality of subband signals in another group is improved.
According to a 10th aspect of the present invention, in the audio coding method of the 6th aspect, the subband signals are separated into groups, and then subband signals in a low-band group are subjected to highly-efficient bit allocation which realizes high-quality coded data and subband signals in a high-band group are subjected to low-load bit allocation which performs processing less than the highly-efficient bit allocation. Therefore, for the low band, to which the human ear is highly sensitive, high-quality coded audio data is obtained, while for the high band, to which the human ear is less sensitive, low-load bit allocation is performed, whereby the coding process is performed while reducing the total processing amount.
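The grouping of the 6th aspect and the per-group choice of the 10th aspect can be sketched together. The sketch assumes the 32 subbands of the conventional encoder described earlier, represents each group as a contiguous range of subband indices, and assumes that only the lowest group receives highly-efficient bit allocation. The function names and the method labels are hypothetical.

```python
NUM_SUBBANDS = 32  # subband count of the conventional encoder described earlier


def split_into_groups(boundaries, num_subbands=NUM_SUBBANDS):
    """Split subband indices into contiguous groups along the frequency axis.

    boundaries: strictly increasing start indices of every group but the first.
    """
    edges = [0, *boundaries, num_subbands]
    if any(low >= high for low, high in zip(edges, edges[1:])):
        raise ValueError("boundaries must be strictly increasing and in range")
    return [range(low, high) for low, high in zip(edges, edges[1:])]


def assign_allocators(groups):
    """Label the lowest group high_efficiency and all others low_load.

    Assumption: only the first (lowest-frequency) group is the low-band group.
    """
    return [
        ("high_efficiency" if index == 0 else "low_load", group)
        for index, group in enumerate(groups)
    ]
```

For example, `assign_allocators(split_into_groups([16]))` labels subbands 0 to 15 for highly-efficient allocation and subbands 16 to 31 for low-load allocation. Changing the `boundaries` argument from frame to frame would correspond to the variable grouping of the 7th and 8th aspects.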
According to an 11th aspect of the present invention, in the audio coding method of the 6th aspect, allocatable bit calculation means for determining the number of bits allocatable to bit allocation means for each group is provided, for distributing bits allocatable to all groups such that bits are allocated to bit allocation means for each group, by using a ratio of each group to all groups which has been weighted based on characteristics of respective subbands in each group. Therefore, bits are distributed to bit allocation means for each group which realizes high-quality coded audio data taking psychoacoustic characteristics into account.
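The weighted distribution of the 11th aspect amounts to giving each group a share of the total bits proportional to its weight. The following sketch assumes one non-negative weight per group (for example derived from minimum audible limit values or signal levels, as in the 12th to 14th aspects) and uses largest-remainder rounding so that the integer shares add up to the total. The rounding rule and the function name are assumptions, not part of the description.

```python
def distribute_bits(total_bits, group_weights):
    """Split total_bits across groups in proportion to group_weights.

    Uses largest-remainder rounding so the shares sum exactly to total_bits.
    """
    total_weight = sum(group_weights)
    if total_weight <= 0:
        raise ValueError("at least one group must have a positive weight")
    exact = [total_bits * weight / total_weight for weight in group_weights]
    bits = [int(share) for share in exact]  # floor of each exact share
    leftover = total_bits - sum(bits)
    # Hand the leftover bits to the groups with the largest fractional parts.
    order = sorted(
        range(len(bits)), key=lambda i: exact[i] - bits[i], reverse=True
    )
    for i in order[:leftover]:
        bits[i] += 1
    return bits
```

A group with weight zero receives no bits under this rule, which is one way a group excluded from bit allocation could be represented.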
According to a 12th aspect of the present invention, in the audio coding method of the 11th aspect, weighting based on characteristics of respective subbands in each group is weighting based on predetermined minimum audible limit values for respective subbands. Therefore, bit allocation effective to human hearing is performed.
According to a 13th aspect of the present invention, in the audio coding method of the 11th aspect, weighting based on characteristics of respective subbands in each group is weighting based on subband signal levels of respective frequency subbands in each group obtained by subjecting the input digital audio signal to subband analysis. Thereby, effective bit allocation is performed.
According to a 14th aspect of the present invention, in the audio coding method of the 11th aspect, weighting based on characteristics of respective subbands in each group is weighting based on spectrum signal levels in each group obtained by linearly transforming the input audio signal. Thereby, effective bit allocation is performed.
According to a 15th aspect of the present invention, in the audio coding method of the 6th aspect, signals in a group at levels higher than a predetermined threshold are subjected to highly-efficient bit allocation which realizes high-quality coded audio data, and signals in a group at levels lower than the predetermined threshold are subjected to low-load bit allocation which performs processing less than the highly efficient bit allocation. Since less significant subband signals are subjected to the low-load processing, higher-quality coded data is achieved.
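The threshold rule of the 15th aspect can be sketched as a per-group comparison. The sketch assumes one representative level per group, expressed in decibels, and assumes that a level exactly equal to the threshold is treated as low-load, a case the text leaves open. The function name and the labels are hypothetical.

```python
def choose_method_by_level(group_levels_db, threshold_db):
    """Return one method label per group based on its signal level.

    Levels above threshold_db get high_efficiency, all others low_load.
    Assumption: a level equal to the threshold is treated as low_load.
    """
    return [
        "high_efficiency" if level > threshold_db else "low_load"
        for level in group_levels_db
    ]
```

The levels passed in could come from subband analysis, from a linear transform of the input, or from minimum audible limit values, as the 16th to 18th aspects describe.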
According to a 16th aspect of the present invention, in the audio coding method of the 15th aspect, the levels of signals in each group are levels of subband signals obtained by subjecting the input digital audio signal to subband analysis. Thereby, effective bit allocation is performed.
According to a 17th aspect of the present invention, in the audio coding method of the 15th aspect, the levels of signals in each group are levels of spectrum signals obtained by linearly transforming the input digital audio signal. Thereby, effective bit allocation is performed.
According to an 18th aspect of the present invention, in the audio coding method of the 15th aspect, the levels of signals in each group are predetermined minimum audible limit values for respective subbands. Therefore, bit allocation effective to human hearing is performed.
According to a 19th aspect of the present invention, in the audio coding method of the 4th, 10th, or 15th aspect, the process using the highly-efficient bit allocation is performed according to a relationship of a signal to mask ratio based on a predetermined psychoacoustic model, and the process using the low-load bit allocation is performed by adding predetermined minimum audible limit values for respective subbands to signal levels of plural frequency subbands. Therefore, the processing amount of the system can be reduced without degrading audio quality.
According to a 20th aspect of the present invention, in the audio coding method of the 19th aspect, the psychoacoustic model is a psychoacoustic model specified according to an MPEG (Moving Picture Experts Group) standard. Therefore, the same effects as described above are obtained in the audio coding process according to the MPEG standard.
According to a 21st aspect of the present invention, in the audio coding method of the 5th or 8th aspect, the frame corresponding to the minimum unit which is decodable into the audio signal is a frame specified according to an MPEG standard. Therefore, the same effects as described above are obtained in the audio coding process according to the MPEG standard.
According to a 22nd aspect of the present invention, in the audio coding method of the 1st aspect, the bit allocation means generates bit allocation information for each subband according to information output from a predetermined psychoacoustic model, generates the bit allocation information according to the information output from the predetermined psychoacoustic model every N (N=1, 2, 3 . . . ) frames, and generates the bit allocation information for frames for which the bit allocation information is not generated, according to the information output from the psychoacoustic model and signal information of the respective subbands. Therefore, the load on the CPU in the time axis direction is reduced.
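The every-N-frames scheme of the 22nd aspect can be sketched with a cache. The sketch assumes that the psychoacoustic model outputs a per-subband masking level in decibels, that this output is refreshed on frames whose index is a multiple of N, and that intermediate frames combine the cached masking levels with the current subband levels by subtraction to obtain a signal to mask ratio. The class name, the `mask_model` callable, and the subtraction rule are assumptions made for illustration.

```python
from typing import Callable, List, Optional, Sequence


class IntermittentMasking:
    """Run a costly masking model every n frames and reuse it in between."""

    def __init__(
        self, mask_model: Callable[[Sequence[float]], List[float]], n: int
    ) -> None:
        if n < 1:
            raise ValueError("n must be at least 1")
        self.mask_model = mask_model  # hypothetical psychoacoustic model
        self.n = n
        self.model_calls = 0
        self._cached_mask: Optional[List[float]] = None

    def smr(self, frame_index: int, levels_db: Sequence[float]) -> List[float]:
        """Return per-subband SMR (level minus mask) for this frame."""
        if frame_index % self.n == 0 or self._cached_mask is None:
            # Refresh frame: run the full model and remember its output.
            self._cached_mask = self.mask_model(levels_db)
            self.model_calls += 1
        # Every frame: combine cached mask with the current subband levels.
        return [lvl - mask for lvl, mask in zip(levels_db, self._cached_mask)]
```

With `n` set to 1 the model runs on every frame; larger values trade accuracy of the masking estimate for a lower average load over time.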
According to a 23rd aspect of the present invention, in the audio coding method of the 1st aspect, a psychoacoustic model which is capable of controlling a processing amount stepwise is provided, the processing amount of the psychoacoustic model is controlled according to the external control information, and bit allocation information for each subband is generated so that processing is performed by the use of the psychoacoustic model according to a predetermined processing amount. Therefore, the load on the CPU is controlled by using psychoacoustic effects.
According to a 24th aspect of the present invention, in the audio coding method of the 1st aspect, plural psychoacoustic models according to different processing amounts are provided, and a psychoacoustic model to be used is changed to generate bit allocation information for each subband according to the external control information such that a psychoacoustic model is selected from the plural psychoacoustic models and used to perform processing. Therefore, the load on the CPU is controlled by using psychoacoustic effects with ease.
According to a 25th aspect of the present invention, in an audio coding method in which a digital audio signal is divided into plural frequency subbands, bit allocation is generated for each subband, and a coding process is performed for each subband to make transmission at a given bit rate, a range of bit allocation for a frame in which data is inserted into a coded data stream is controlled, and thereby the amount of coded audio data is controlled variably. Therefore, effective use of a band is realized by using various data for surplus subbands.
According to a 26th aspect of the present invention, in the audio coding method of the 25th aspect, the range of bit allocation is controlled frame by frame according to external control information, and thereby the amount of coded audio data is controlled variably. Therefore, the load on the CPU can be reduced effectively.
According to a 27th aspect of the present invention, in the audio coding method of the 26th aspect, data amount control information from means for monitoring a buffer for storing data to be added is used as the external control information. Therefore, data to-be-added can be used with priority.
According to a 28th aspect of the present invention, in the audio coding method of the 1st aspect, load value information of respective processing of either the plural bit allocation means or plural psychoacoustic models is output externally, according to performance of a central processing unit on which the coding process is performed, at initialization prior to the coding process. Therefore, information relating to performance of the central processing unit to be used is obtained prior to the coding process, whereby the load on the CPU can be reduced effectively.
According to a 29th aspect of the present invention, in the audio coding method of the 28th aspect, the load value information is output externally in ascending or descending order. Therefore, the coding means can be selected quickly.
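The initialization step of the 28th and 29th aspects can be sketched as a short benchmark. The sketch assumes that each bit allocation means or psychoacoustic model is available as a callable taking one frame, that the load value is the average wall-clock time per call on the current CPU, and that the results are reported in ascending order. The function name, the `allocators` mapping, and the use of elapsed time as the load value are assumptions.

```python
import time


def measure_load_values(allocators, sample_frame, repeats=5):
    """Time each allocator on sample_frame and return (name, seconds) pairs.

    allocators: dict mapping a name to a callable that takes one frame.
    The result is sorted in ascending order of measured time per call.
    """
    results = []
    for name, allocate in allocators.items():
        start = time.perf_counter()
        for _ in range(repeats):
            allocate(sample_frame)
        elapsed = (time.perf_counter() - start) / repeats
        results.append((name, elapsed))
    return sorted(results, key=lambda item: item[1])
```

Because the list is already ordered, a caller can scan it from the costly end and stop at the first entry that fits the available processing amount.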
According to a 30th aspect of the present invention, in an audio coding method in which a video signal and an audio signal are coded by the same central processing unit, a coding process is performed according to plural different operation amounts, and a coding amount of either the audio signal or the video signal is changed, and thereby the total operation amount of processing on the central processing unit is controlled. Therefore, in the process for both the audio signal and the video signal, processing associated with the load on the CPU is conducted.
According to a 31st aspect of the present invention, in an audio coding method in which a video signal and an audio signal are coded by the same central processing unit, a coding process is performed using plural coding schemes according to different operation amounts, and a coding scheme for coding the audio signal is changed, and thereby the total operation amount of processing on the central processing unit is controlled. Therefore, in the process for both the audio signal and the video signal, processing associated with the load on the CPU is conducted.
According to a 32nd aspect of the present invention, in the audio coding method of the 30th or 31st aspect, the processing on the central processing unit is controlled according to external control information. Therefore, the load on the CPU can be effectively reduced.
According to a 33rd aspect of the present invention, in an audio coding method in which a digital audio signal is subjected to time/frequency transformation, to generate quantization information, and thereby a coding process is performed, there are provided plural quantization information calculation means according to different operation amounts, and quantization information calculation means to be used is changed to generate quantization information according to external control information such that quantization information calculation means is selected from the plural quantization information calculation means and used. Therefore, in the coding apparatus which performs time/frequency transformation, the load placed on the CPU can be reduced.
According to a 34th aspect of the present invention, there is provided an audio coding apparatus which performs an audio coding process by using the audio coding method of any of the 1st to 33rd aspects. Therefore, in equipment such as a VTR camera which incorporates the audio coding method, the effects as described above are obtained.
According to a 35th aspect of the present invention, there is provided a data storage medium for storing steps of the audio coding method of any of the 1st to 33rd aspects. Therefore, the audio coding method is incorporated by the use of the data storage medium, whereby the same effects as described above are obtained.