The present technology relates to audio coding devices and audio coding methods, audio decoding devices and audio decoding methods, and programs. More particularly, the present technology relates to an audio coding device and an audio coding method, an audio decoding device and an audio decoding method, and a program capable of coding audio signals by adaptively using a more suitable window function.
Transform coding methods such as MP3 (Moving Picture Experts Group Audio Layer-3), AAC (Advanced Audio Coding), and ATRAC (Adaptive Transform Acoustic Coding) are generally well known as methods of coding audio signals.
As an audio coding device for coding audio signals, a device is known which divides an audio signal into a plurality of bands and then performs orthogonal transformation and quantization on a band-by-band basis (for example, refer to Japanese Patent No. 2906483).
FIG. 1 is a block diagram showing an example of a configuration of an audio coding device which codes audio signals.
An audio coding device 10 shown in FIG. 1 includes a windowing part 11, a frequency converting part 12, a normalization coefficient determining part 13, a normalization coefficient coding part 14, a normalizing part 15, a quantizing part 16, a coding part 17 and a multiplexing part 18.
The audio coding device 10 receives an audio signal T, which is a PCM (Pulse Code Modulation) signal segmented into fixed sections called frames, in the form of frame data T[J]. The audio coding device 10 codes the frame data T[J]. Here, J is an index assigned to each frame in order, starting from the first frame.
The windowing part 11 of the audio coding device 10 multiplies the input frame data T[J] by a window function WF, and supplies the resulting multiplied data WFT[J] to the frequency converting part 12. The frequency converting part 12 performs a frequency conversion on the multiplied data WFT[J] supplied from the windowing part 11 to obtain a frequency spectrum SP[J]. The frequency converting part 12 supplies the frequency spectrum SP[J] to the normalization coefficient determining part 13 and the normalizing part 15.
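The windowing and frequency conversion described above can be sketched as follows. This is only an illustration under stated assumptions: the text does not name the window function WF or the frequency conversion, so a sine window and a direct (unoptimized) MDCT are assumed here, and the function names are hypothetical.

```python
import math

def sine_window(length):
    """Sine window of the given length (an assumed choice for WF)."""
    return [math.sin(math.pi * (i + 0.5) / length) for i in range(length)]

def mdct(x):
    """Direct O(N^2) MDCT (assumed transform): 2N samples -> N coefficients."""
    n = len(x) // 2
    n0 = 0.5 + n / 2.0
    return [
        sum(x[t] * math.cos(math.pi / n * (t + n0) * (k + 0.5))
            for t in range(2 * n))
        for k in range(n)
    ]

def window_and_transform(frame, wf):
    """Windowing part 11 followed by frequency converting part 12."""
    wft = [s * w for s, w in zip(frame, wf)]  # WFT[J] = T[J] * WF
    return mdct(wft)                          # SP[J]

# Usage: one 16-sample frame T[J] yields 8 spectral values SP[J].
frame = [math.cos(0.7 * i) for i in range(16)]
sp = window_and_transform(frame, sine_window(16))
```

With an MDCT, a frame of 2N samples yields N spectral values, which is why consecutive frames are typically overlapped by half a frame.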
The normalization coefficient determining part 13 determines a normalization coefficient SF[J] representing an outline (hereinafter referred to as an envelope) of the frequency spectrum SP[J] based on the frequency spectrum SP[J] supplied from the frequency converting part 12, and supplies it to the normalization coefficient coding part 14 and the normalizing part 15.
The normalization coefficient coding part 14 calculates a bit number NSF[J] necessary for coding the normalization coefficient SF[J] supplied from the normalization coefficient determining part 13, and supplies it to the quantizing part 16. Also, the normalization coefficient coding part 14 codes the normalization coefficient SF[J], and supplies the resulting coded normalization coefficient HSF[J] to the multiplexing part 18.
The normalizing part 15 normalizes the frequency spectrum SP[J] supplied from the frequency converting part 12 by using the normalization coefficient SF[J] supplied from the normalization coefficient determining part 13, and supplies a resultant normalized spectrum NSP[J] to the quantizing part 16.
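As a rough sketch of how an envelope might be determined and applied, the following assumes that the spectrum is split into equal-width bands and that the normalization coefficient of each band is an integer power-of-two exponent covering the band's peak magnitude. Neither assumption comes from the text, and the names `determine_sf`, `normalize` and `band_width` are hypothetical.

```python
import math

def determine_sf(sp, band_width):
    """Per band, the smallest integer e with 2**e >= max |SP| (assumed rule)."""
    sf = []
    for start in range(0, len(sp), band_width):
        peak = max(abs(v) for v in sp[start:start + band_width])
        # An all-zero band gets exponent 0 by convention in this sketch.
        sf.append(math.ceil(math.log2(peak)) if peak > 0 else 0)
    return sf

def normalize(sp, sf, band_width):
    """Divide each spectral value by its band's scale 2**SF."""
    return [v / 2.0 ** sf[i // band_width] for i, v in enumerate(sp)]

# Usage: four bands of two spectral values each.
sp = [0.5, -3.0, 0.1, 0.2, 8.0, 1.0, 0.0, 0.0]
sf = determine_sf(sp, 2)    # SF[J], one exponent per band
nsp = normalize(sp, sf, 2)  # NSP[J], every value within [-1, 1]
```

Under these assumptions every normalized value lies within [-1, 1], so the quantizer can use a fixed input range regardless of signal level.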
The quantizing part 16 quantizes the normalized spectrum NSP[J] supplied from the normalizing part 15 based on a piece of quantization information P[J] representing a quantization bit number as a quantization accuracy, and supplies a resultant quantization spectrum QSP[J] to the coding part 17. At this time, the quantizing part 16 obtains a bit number NQSP[J] fed back from the coding part 17 corresponding to the quantization spectrum QSP[J], and adjusts the quantization information P[J] so that the bit number NQSP[J] becomes a predetermined value. The quantizing part 16 supplies the adjusted quantization information P[J] to the multiplexing part 18.
The coding part 17 calculates the bit number NQSP[J] necessary for coding the quantization spectrum QSP[J] supplied from the quantizing part 16. Here, when the bit number NB[J] of a code string B[J], which will be described below, is predetermined, the bit number NQSP[J] must be no greater than a value NQ obtained by subtracting, from the bit number NB[J], the bit number NP[J] of the quantization information P[J] and the bit number NSF[J] required for coding the normalization coefficient SF[J]. Therefore, the coding part 17 feeds the bit number NQSP[J] back to the quantizing part 16, and the quantizing part 16 adjusts the quantization information P[J] so that the bit number NQSP[J] is the value NQ or less. Also, the coding part 17 codes the quantization spectrum QSP[J], and supplies the resulting coded spectrum HSP[J] to the multiplexing part 18.
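The feedback between the quantizing part 16 and the coding part 17 can be illustrated with the sketch below, which computes the budget as NQ = NB[J] - NP[J] - NSF[J] and lowers the quantization bit number until the coded size fits. The uniform quantizer, the toy variable-length code used to count bits, and all function and parameter names are assumptions made for illustration; the text does not specify them.

```python
def quantize(nsp, p):
    """Uniform quantizer (assumed): map values in [-1, 1] to p-bit signed integers."""
    levels = (1 << (p - 1)) - 1
    return [int(round(v * levels)) for v in nsp]

def count_bits(qsp):
    """Bit cost under a hypothetical code: Elias-gamma of |q| + 1, plus a sign bit."""
    total = 0
    for q in qsp:
        total += 2 * (abs(q) + 1).bit_length() - 1  # gamma code length
        if q != 0:
            total += 1                              # sign bit
    return total

def fit_to_budget(nsp, nb, np_bits, nsf, p_max=16):
    """Lower the quantization bit number P until NQSP <= NQ."""
    nq = nb - np_bits - nsf          # NQ = NB - NP - NSF
    for p in range(p_max, 0, -1):
        qsp = quantize(nsp, p)
        nqsp = count_bits(qsp)       # value fed back from the coding part
        if nqsp <= nq:
            return p, qsp, nqsp
    raise ValueError("bit budget too small even at the coarsest accuracy")

# Usage: 64-bit code string, 4 bits for P, 16 bits for the coded envelope.
nsp = [0.125, -0.75, 0.4, 0.8, 1.0, 0.125, 0.0, 0.0]
p, qsp, nqsp = fit_to_budget(nsp, nb=64, np_bits=4, nsf=16)
```

The sketch makes the trade-off visible: any increase in NSF[J] shrinks NQ directly and forces a smaller quantization bit number for the same spectrum.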
The multiplexing part 18 multiplexes the coded normalization coefficient HSF[J] from the normalization coefficient coding part 14, the quantization information P[J] from the quantizing part 16 and the coded spectrum HSP[J] from the coding part 17, and transmits the resultant code string B[J].
FIG. 2 is a block diagram showing an example of a configuration of an audio decoding device for decoding the code string B[J] transmitted from the audio coding device 10 shown in FIG. 1.
An audio decoding device 30 shown in FIG. 2 includes a decomposing part 31, a decoding part 32, an inverse quantizing part 33, a normalization coefficient decoding part 34, an inverse normalizing part 35, an inverse frequency converting part 36, a windowing part 37 and an overlapping part 38.
The decomposing part 31 in the audio decoding device 30 decomposes the code string B[J] transmitted from the audio coding device 10, shown in FIG. 1, into the coded spectrum HSP[J], the quantization information P[J] and the coded normalization coefficient HSF[J]. The decomposing part 31 supplies the coded spectrum HSP[J] to the decoding part 32, the quantization information P[J] to the inverse quantizing part 33, and the coded normalization coefficient HSF[J] to the normalization coefficient decoding part 34.
The decoding part 32 decodes the coded spectrum HSP[J] supplied from the decomposing part 31, and supplies a resultant quantization spectrum QSP[J] to the inverse quantizing part 33. The inverse quantizing part 33 performs an inverse quantization on the quantization spectrum QSP[J] supplied from the decoding part 32 based on the quantization information P[J] supplied from the decomposing part 31 to obtain a normalized spectrum NSP[J]. The inverse quantizing part 33 supplies the normalized spectrum NSP[J] to the inverse normalizing part 35.
The normalization coefficient decoding part 34 decodes the coded normalization coefficient HSF[J] supplied from the decomposing part 31, and supplies a resultant normalization coefficient SF[J] to the inverse normalizing part 35. The inverse normalizing part 35 performs an inverse normalization by using the normalization coefficient SF[J] supplied from the normalization coefficient decoding part 34 and the normalized spectrum NSP[J], and supplies the resultant frequency spectrum SP[J] to the inverse frequency converting part 36.
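The inverse quantization and inverse normalization steps can be sketched as below. As an assumption for illustration only, the quantizer is taken to be uniform with P[J] interpreted as a signed bit depth, and the normalization coefficient is taken to be a power-of-two exponent per equal-width band; the function names are hypothetical.

```python
def inverse_quantize(qsp, p):
    """Map p-bit signed integers back to values in [-1, 1] (assumed quantizer)."""
    levels = (1 << (p - 1)) - 1
    if levels == 0:
        return [0.0] * len(qsp)  # p == 1 carries no amplitude information
    return [q / levels for q in qsp]

def inverse_normalize(nsp, sf, band_width):
    """Multiply each value by its band's scale 2**SF to restore SP[J]."""
    return [v * 2.0 ** sf[i // band_width] for i, v in enumerate(nsp)]

# Usage: 8-bit quantization, two bands of two values each.
qsp = [127, -64, 0, 32]
nsp = inverse_quantize(qsp, 8)            # NSP[J]
sp = inverse_normalize(nsp, [3, -1], 2)   # SP[J], approximate
```

The restored spectrum differs from the encoder's spectrum by the quantization error, which shrinks as the quantization bit number grows.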
The inverse frequency converting part 36 performs an inverse frequency conversion on the frequency spectrum SP[J] supplied from the inverse normalizing part 35, and supplies a resultant time axis data ST[J] to the windowing part 37.
The windowing part 37 multiplies the time axis data ST[J] supplied from the inverse frequency converting part 36 by a window function WB. The window function WF in the windowing part 11 shown in FIG. 1 and the window function WB are subject to the following constraint. That is, when the quantization bit number is infinite (the quantization accuracy is infinite), the frame data T[J] input to the audio coding device 10 and the frame data T[J], which will be described below, output from the audio decoding device 30 coincide with each other. The windowing part 37 supplies the multiplied data WBT[J] obtained as a result of the multiplication to the overlapping part 38.
The overlapping part 38 holds the multiplied data WBT[J] supplied from the windowing part 37. Also, the overlapping part 38 adds the held multiplied data WBT[J−1] of the frame of index J−1 and the multiplied data WBT[J] while overlapping them, for example, by half of one frame. The overlapping part 38 outputs the resulting frame data T[J] as a decoding result. Note that, to simplify the description, the frame data as the decoding result is represented here as T[J], the same as the frame data before coding. In practice, however, the decoding result and the frame data before coding are not identical.
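The constraint on WF and WB, together with the half-frame overlap-add, can be checked numerically with the sketch below. It assumes an MDCT as the frequency conversion and the same sine window for both WF and WB, a pair that satisfies the Princen-Bradley condition; these are illustrative choices and not stated in the text. Quantization is omitted, which corresponds to the case of infinite quantization accuracy.

```python
import math

def sine_window(length):
    """Sine window, used here for both WF and WB (an assumption)."""
    return [math.sin(math.pi * (i + 0.5) / length) for i in range(length)]

def mdct(x):
    """Direct MDCT: 2N samples -> N coefficients."""
    n = len(x) // 2
    n0 = 0.5 + n / 2.0
    return [sum(x[t] * math.cos(math.pi / n * (t + n0) * (k + 0.5))
                for t in range(2 * n)) for k in range(n)]

def imdct(spec):
    """Direct inverse MDCT with 2/N scaling: N coefficients -> 2N samples."""
    n = len(spec)
    n0 = 0.5 + n / 2.0
    return [(2.0 / n) * sum(spec[k] * math.cos(math.pi / n * (t + n0) * (k + 0.5))
                            for k in range(n)) for t in range(2 * n)]

def codec_roundtrip(signal, n):
    """Code and decode without quantization; len(signal) must be a multiple of n."""
    w = sine_window(2 * n)
    padded = [0.0] * n + list(signal) + [0.0] * n
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * n + 1, n):
        frame = padded[start:start + 2 * n]              # T[J], hop of N samples
        sp = mdct([s * wv for s, wv in zip(frame, w)])   # parts 11 and 12
        st = imdct(sp)                                   # part 36: ST[J]
        wbt = [s * wv for s, wv in zip(st, w)]           # part 37: WBT[J]
        for i, v in enumerate(wbt):                      # part 38: overlap-add
            out[start + i] += v
    return out[n:n + len(signal)]

# Usage: 16 samples, frames of 8 samples overlapped by half.
signal = [math.sin(0.3 * i) for i in range(16)]
decoded = codec_roundtrip(signal, 4)
```

Each decoded frame on its own contains time-domain aliasing; it cancels only when adjacent windowed frames are summed, which is why the overlapping part must hold the previous frame.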
In the audio coding device 10 shown in FIG. 1, when the ratio of the bit number NSF[J] necessary for coding the normalization coefficient SF[J] to the bit number NB[J] of the code string B[J] increases, the bit number NQSP[J] available for coding the frequency spectrum SP[J] decreases. As a result, the quantization accuracy of the frequency spectrum SP[J] may decrease, resulting in a deterioration of sound quality.
Therefore, by reducing the number of frequency spectra SP[J] to be coded, the bit number NQSP[J] can be reduced without lowering the quantization accuracy of the frequency spectrum SP[J], thereby preventing the deterioration of sound quality.
When the number of frequency spectra SP[J] to be coded is reduced, it is generally the high-frequency spectra SP[J] that are mainly removed. In this case, the decoded sound may lack high-frequency components, resulting in a so-called boxy sound. Also, it is well known that, when the number of frequency spectra SP[J] to be coded changes from frame to frame, the change may cause a deterioration of sound quality.
On the other hand, it is known that, even when identical frame data T[J] is input to the audio coding device 10, the bit number NSF[J] required for coding the normalization coefficient SF[J] and the quantization error change depending on the shape of the window function WF.