Adaptive encoding of orthogonal transform coefficients such as DFT (Discrete Fourier Transform) and MDCT (Modified Discrete Cosine Transform) coefficients is a known method for encoding speech signals and audio signals at low bit rates (for example, about 10 to 20 kbit/s). For example, AMR-WB+ (Extended Adaptive Multi-Rate Wideband), which is a standard technique, has the TCX (transform coded excitation) encoding mode. In TCX encoding, a gain is determined for a coefficient string obtained by normalizing an audio digital signal sequence in the frequency domain with a power spectrum envelope coefficient string, so that the sequence obtained by dividing each coefficient in the coefficient string by the gain can be encoded with a predetermined number of bits.
<TCX Encoder 1000>
FIG. 1 illustrates an exemplary configuration of an encoder 1000 that performs conventional TCX encoding. Components in FIG. 1 will be described below.
<Frequency-Domain Transformer 1001>
A frequency-domain transformer 1001 transforms an input audio digital signal to an MDCT coefficient string X(1), . . . , X(N) at N points in the frequency domain on a frame-by-frame basis in a given time period and outputs the MDCT coefficient string. Here, N is a positive integer.
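As a minimal sketch of such a transform, the following computes a direct-form MDCT of one frame. The 2N-sample frame with 50% overlap, the sine window, and the function name `mdct` are assumptions made for illustration; the description above does not specify them.

```python
import math

def mdct(frame):
    """Direct O(N^2) MDCT: 2N windowed samples -> N coefficients.

    Hypothetical helper; coeffs[k] plays the role of X(k + 1) in the text.
    """
    two_n = len(frame)
    if two_n == 0 or two_n % 2:
        raise ValueError("frame length must be a positive even number")
    n_coef = two_n // 2
    # Sine window (an assumption; the text does not name a window).
    window = [math.sin(math.pi * (n + 0.5) / two_n) for n in range(two_n)]
    coeffs = []
    for k in range(n_coef):
        acc = 0.0
        for n in range(two_n):
            phase = math.pi / n_coef * (n + 0.5 + n_coef / 2.0) * (k + 0.5)
            acc += window[n] * frame[n] * math.cos(phase)
        coeffs.append(acc)
    return coeffs
```

A practical encoder would use a fast algorithm rather than this quadratic loop; the direct form is shown only because it mirrors the defining sum.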
<Power-Spectrum-Envelope-Coefficient-String Arithmetic Unit 1002>
A power-spectrum-envelope-coefficient-string arithmetic unit 1002 performs linear prediction analysis of an audio digital signal in each frame to obtain linear predictive coefficients and uses the linear predictive coefficients to obtain and output a power spectrum envelope coefficient string W(1), . . . , W(N) of the audio digital signal at N points.
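One way to picture the second half of this step is to evaluate the all-pole model spectrum at N frequencies. The sketch below takes the linear predictive coefficients as given (the analysis itself is omitted) and assumes the sign convention A(z) = 1 + Σ a_i z^(−i), a residual energy parameter, and a frequency grid at π(n − 0.5)/N; none of these are stated in the description.

```python
import cmath
import math

def power_spectrum_envelope(lpc, n_points, residual_energy=1.0):
    """Return W(1..N) = residual_energy / |A(e^{j*omega_n})|^2.

    `lpc` holds a_1..a_p for A(z) = 1 + sum_i a_i z^-i (assumed convention).
    """
    envelope = []
    for n in range(1, n_points + 1):
        omega = math.pi * (n - 0.5) / n_points  # assumed frequency grid
        a_of_z = 1.0 + sum(a * cmath.exp(-1j * omega * (i + 1))
                           for i, a in enumerate(lpc))
        envelope.append(residual_energy / abs(a_of_z) ** 2)
    return envelope
```

With a single coefficient such as `[-0.9]`, the resulting envelope falls monotonically from low to high frequencies, which is the kind of spectral tilt the normalizer is meant to flatten.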
<Weighted Envelope Normalizer 1003>
A weighted envelope normalizer 1003 uses the power spectrum envelope coefficient string obtained by the power-spectrum-envelope-coefficient-string arithmetic unit 1002 to normalize each of the coefficients in the MDCT coefficient string obtained by the frequency-domain transformer 1001 and outputs a weighted normalized MDCT coefficient string XN(1), . . . , XN(N). Here, in order to achieve quantization that minimizes perceived distortion, the weighted envelope normalizer 1003 uses a weighted power spectrum envelope coefficient string, obtained by moderating the power spectrum envelope, to normalize the coefficients in the MDCT coefficient string on a frame-by-frame basis. As a result, the weighted normalized MDCT coefficient string XN(1), . . . , XN(N) does not have as steep a slope of amplitude or as large variations in amplitude as the input MDCT coefficient string, but it retains variations in magnitude similar to those of the power spectrum envelope coefficient string of the audio digital signal. That is, the weighted normalized MDCT coefficient string has somewhat greater amplitudes in a region of coefficients corresponding to low frequencies and has a fine structure due to a pitch period.
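The effect of normalizing with a moderated envelope can be illustrated with a small sketch. It assumes that the moderation is done by raising W(n) to a power β with 0 < β < 1 and that the amplitude divisor is the square root of the moderated power value; both the exponent and the helper name are hypothetical choices, not taken from the description.

```python
def weighted_envelope_normalize(X, W, beta=0.5):
    """XN(n) = X(n) / W(n)**(beta / 2), with 0 < beta < 1 (assumed form).

    beta = 1 would divide by the full amplitude envelope; smaller values
    flatten the spectrum only partially, so some of its tilt remains.
    """
    if len(X) != len(W):
        raise ValueError("X and W must have the same number of points")
    if any(w <= 0.0 for w in W):
        raise ValueError("power envelope values must be positive")
    return [x / (w ** (beta / 2.0)) for x, w in zip(X, W)]
```

For example, with X = [8.0, 2.0] and W = [16.0, 1.0], the amplitude ratio between the two coefficients shrinks from 4:1 to 2:1 rather than to 1:1, so the low-frequency coefficient stays somewhat larger.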
<Initializer 1004>
An initializer 1004 sets an initial value of gain (global gain) g. The initial value of the gain can be determined, for example, from the energy of a weighted normalized MDCT coefficient string XN(1), . . . , XN(N) and the number of bits allocated beforehand to a code output from a variable-length encoder 1006. The number of bits allocated beforehand to a code output from the variable-length encoder 1006 is hereinafter referred to as the number B of allocated bits. The initializer 1004 also sets 0 as the initial value of the number of updates of gain.
<Gain Update Loop Processor 1130>
A gain update loop processor 1130 determines a gain such that the sequence obtained by dividing each coefficient in a weighted normalized MDCT coefficient string XN(1), . . . , XN(N) by the gain can be encoded with a predetermined number of bits. The gain update loop processor 1130 outputs an integer signal code, obtained by variable-length encoding of the sequence that results from dividing each coefficient in the weighted normalized MDCT coefficient string XN(1), . . . , XN(N) by the determined gain, and a gain code, obtained by encoding the determined gain.
The gain update loop processor 1130 includes a quantizer 1005, the variable-length encoder 1006, a determiner 1007, a gain expansion updater 1131, a gain reduction updater 1132, a truncation unit 1016, and a gain encoder 1017.
<Quantizer 1005>
The quantizer 1005 quantizes a value obtained by dividing each coefficient in a weighted normalized MDCT coefficient string XN(1), . . . , XN(N) by gain g to obtain and output a quantized normalized coefficient sequence XQ(1), . . . , XQ(N), which is a sequence of integer values.
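A minimal sketch of this step follows. Rounding to the nearest integer (with halves rounded away from zero) is an assumption; the description only says that the divided values are quantized to integers.

```python
import math

def quantize(xn, gain):
    """Return XQ(1..N): each XN(n) / gain rounded to the nearest integer."""
    if gain <= 0.0:
        raise ValueError("gain must be positive")
    xq = []
    for x in xn:
        v = x / gain
        magnitude = int(math.floor(abs(v) + 0.5))  # round half away from zero
        xq.append(magnitude if v >= 0.0 else -magnitude)
    return xq
```

For instance, `quantize([4.0, -2.6, 0.4, 1.0], 2.0)` yields `[2, -1, 0, 1]`, while the larger gain 8.0 yields `[1, 0, 0, 0]`: a larger gain produces smaller integers.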
<Variable-Length Encoder 1006>
The variable-length encoder 1006 encodes a quantized normalized coefficient sequence XQ(1), . . . , XQ(N) to obtain and output a code. The code is referred to as an integer signal code. The variable-length encoding may use, for example, a method that encodes a plurality of coefficients in the quantized normalized coefficient sequence at a time. In addition, the variable-length encoder 1006 measures the number of bits in the integer signal code obtained by the variable-length encoding. The number of bits is hereinafter referred to as the number c of consumed bits.
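To make the notion of consumed bits concrete, the sketch below uses a signed Exp-Golomb code as a stand-in variable-length code. This is purely an illustrative choice: the description does not name the code actually used, and unlike the method mentioned above, this stand-in encodes one coefficient at a time.

```python
def exp_golomb(value):
    """Signed Exp-Golomb codeword for one integer, as a string of bits."""
    # Map signed to unsigned: 0, 1, -1, 2, -2, ... -> 0, 1, 2, 3, 4, ...
    u = 2 * value - 1 if value > 0 else -2 * value
    binary = bin(u + 1)[2:]
    return "0" * (len(binary) - 1) + binary

def variable_length_encode(xq):
    """Return (integer signal code, number c of consumed bits)."""
    code = "".join(exp_golomb(v) for v in xq)
    return code, len(code)
```

With this stand-in, `variable_length_encode([0, 1, -1, 2])` returns the 12-bit code `"101001100100"`: values near zero cost few bits, which is why dividing by a larger gain reduces c.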
<Determiner 1007>
The determiner 1007 outputs the gain, the integer signal code, and the number c of consumed bits when the number of updates of gain is equal to a predetermined number.
When the number of updates of gain is less than the predetermined number, the determiner 1007 performs control to cause the gain expansion updater 1131 to perform a next process if the number c of consumed bits measured by the variable-length encoder 1006 is greater than the number B of allocated bits, or to cause the gain reduction updater 1132 to perform a next process if the number c of consumed bits is smaller than the number B of allocated bits. If the number c of consumed bits is equal to the number B of allocated bits, the current value of gain is optimum, and the determiner 1007 therefore outputs the gain, the integer signal code, and the number c of consumed bits.
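The three-way decision made by the determiner can be sketched as a small function. The return labels and parameter names are hypothetical; the use of `>=` instead of strict equality for the update count is a defensive choice for the sketch.

```python
def determine(c, B, num_updates, max_updates):
    """Decide the next step of the gain update loop.

    Returns "output" (stop and emit the gain, code and c),
    "expand" (too many bits: raise the gain), or
    "reduce" (bits to spare: lower the gain).
    """
    if num_updates >= max_updates or c == B:
        return "output"
    return "expand" if c > B else "reduce"
```

Expansion is chosen when c exceeds B because dividing by a larger gain yields smaller quantized integers, which a variable-length code represents with fewer bits.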
<Gain Expansion Updater 1131>
The gain expansion updater 1131 sets a value greater than the current value of gain g as a new gain g′ (g′>g). The gain expansion updater 1131 includes a lower limit gain setter 1008, a first branch controller 1009, a first gain updater 1010, and a gain expander 1011.
<Lower Limit Gain Setter 1008>
The lower limit gain setter 1008 sets the current value of gain g as the lower limit gain gmin (gmin←g). The lower limit gain gmin means the lowest value of gain allowed.
<First Branch Controller 1009>
When the lower limit gain gmin is set by the lower limit gain setter 1008, the first branch controller 1009 performs control to cause the first gain updater 1010 to perform a next process if the upper limit gain gmax has already been set or to cause the gain expander 1011 to perform a next process if the upper limit gain gmax has not yet been set.
<First Gain Updater 1010>
The first gain updater 1010 sets the average of the current value of gain g and the upper limit gain gmax as a new value of gain g (g←(g+gmax)/2). This is because an optimum value of gain is between the current value of gain g and the upper limit gain gmax. Since the current value of gain g has been set as the lower limit gain gmin, it can be said that the average of the upper limit gain gmax and the lower limit gain gmin is set as a new value of gain g (g←(gmax+gmin)/2). Then the control returns to the process in the quantizer 1005.
<Gain Expander 1011>
The gain expander 1011 sets a value greater than the current value of gain g as a new value of gain g. For example, the gain expander 1011 sets a value that is equal to the current value of gain g plus a gain change amount Δg, which is a predetermined value, as a new value of gain g (g←g+Δg). If, for example, the upper limit gain gmax has not been set and the number c of consumed bits has been greater than the number B of allocated bits several times in succession, a value greater than the predetermined value is used as the gain change amount Δg. Then the control returns to the process in the quantizer 1005.
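The four units of the gain expansion updater can be sketched as a single function. Using `None` to represent a limit that has not been set, and doubling Δg after repeated overshoots, are assumptions made for illustration; the description only says that a larger change amount is used in that case.

```python
def gain_expansion_update(g, gmax, delta, repeated_overshoot=False):
    """One pass through units 1008 to 1011. Returns (new_g, gmin).

    gmax is None while the upper limit gain has not been set.
    """
    gmin = g                               # lower limit gain setter 1008
    if gmax is not None:                   # first branch controller 1009
        new_g = (g + gmax) / 2.0           # first gain updater 1010
    else:
        # Gain expander 1011; the factor 2 is a hypothetical choice.
        step = 2.0 * delta if repeated_overshoot else delta
        new_g = g + step
    return new_g, gmin
```

For example, `gain_expansion_update(4.0, 8.0, 1.0)` bisects to 6.0, whereas `gain_expansion_update(4.0, None, 1.0)` steps to 5.0 because no upper limit is known yet.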
<Gain Reduction Updater 1132>
The gain reduction updater 1132 sets a value smaller than the current value of gain g as a new gain g′<g. The gain reduction updater 1132 includes an upper limit gain setter 1012, a second branch controller 1013, a second gain updater 1014, and a gain reducer 1015.
<Upper Limit Gain Setter 1012>
The upper limit gain setter 1012 sets the current value of gain g as the upper limit gain gmax (gmax←g). The upper limit gain gmax means the highest gain allowed.
<Second Branch Controller 1013>
When the upper limit gain gmax is set by the upper limit gain setter 1012, the second branch controller 1013 performs control to cause the second gain updater 1014 to perform a next process if the lower limit gain gmin has already been set or to cause the gain reducer 1015 to perform a next process if the lower limit gain gmin has not yet been set.
<Second Gain Updater 1014>
The second gain updater 1014 sets the average of the current value of gain g and the lower limit gain gmin as a new value of gain g (g←(g+gmin)/2). This is because an optimum gain value is between the current value of gain g and the lower limit gain gmin. Since the current value of gain g has been set as the upper limit gain gmax, it can be said that the average of the upper limit gain gmax and the lower limit gain gmin is set as a new value of gain g (g←(gmax+gmin)/2). Then the control returns to the process in the quantizer 1005.
<Gain Reducer 1015>
The gain reducer 1015 sets a value smaller than the current value of gain g as a new value of gain g. For example, the gain reducer 1015 sets a value equal to the current value of gain g minus a gain change amount Δg, which is a predetermined value, as a new value of gain g (g←g−Δg). If, for example, the lower limit gain gmin has not been set and the number c of consumed bits has been smaller than the number B of allocated bits several times in succession, a value greater than the predetermined value is used as the gain change amount Δg. Then the control returns to the process in the quantizer 1005.
<Truncation Unit 1016>
When the number c of consumed bits output from the determiner 1007 is greater than the number B of allocated bits, the truncation unit 1016 removes, from the integer signal code output from the determiner 1007, code corresponding to quantized normalized coefficients at the high frequency side by the amount by which the number c of consumed bits exceeds the number B of allocated bits, and outputs the resulting code as a new integer signal code. That is, the truncation unit 1016 removes c−B bits of code, taken from the part corresponding to quantized normalized coefficients at the high frequency side, and outputs the remaining code as the new integer signal code.
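A minimal sketch of the truncation, assuming the integer signal code is a bit string in which the code for low-frequency coefficients comes first, so that removing c−B bits from the high frequency side amounts to keeping the first B bits:

```python
def truncate_code(integer_signal_code, B):
    """Drop the last c - B bits when the code is longer than B bits.

    Assumes the bit string is ordered from low to high frequency.
    """
    c = len(integer_signal_code)
    if c <= B:
        return integer_signal_code   # already fits, nothing to remove
    return integer_signal_code[:B]   # remove c - B bits from the tail
```

Note that the cut may fall in the middle of a codeword; how a decoder treats such a partial codeword is outside the scope of this sketch.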
<Gain Encoder 1017>
The gain encoder 1017 encodes gain output from the determiner 1007 with a predetermined number of bits to obtain and output a gain code.
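The description does not say how the gain is mapped to a fixed number of bits. As one hypothetical possibility, the sketch below quantizes log2(g) uniformly over an assumed range; the 7-bit width, the range limits, and both function names are illustrative assumptions only.

```python
import math

def encode_gain(g, n_bits=7, log2_min=-10.0, log2_max=10.0):
    """Return a gain code of exactly n_bits bits (uniform in log2 domain)."""
    if g <= 0.0:
        raise ValueError("gain must be positive")
    levels = (1 << n_bits) - 1
    position = (math.log2(g) - log2_min) / (log2_max - log2_min)
    index = min(levels, max(0, int(round(position * levels))))  # clamp
    return format(index, "0{}b".format(n_bits))

def decode_gain(code, log2_min=-10.0, log2_max=10.0):
    """Inverse mapping: recover the quantized gain from its code."""
    levels = (1 << len(code)) - 1
    fraction = int(code, 2) / levels
    return 2.0 ** (log2_min + fraction * (log2_max - log2_min))
```

A logarithmic scale is chosen here so that the relative error of the decoded gain is roughly constant over the whole range; gains outside the assumed range are clamped to its ends.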