Adaptive encoding of orthogonal transform coefficients, such as those of the DFT (discrete Fourier transform) and the MDCT (modified discrete cosine transform), is known as a low-bit-rate (for example, on the order of 10 kbit/s to 20 kbit/s) encoding method for speech signals and audio signals. For example, AMR-WB+ (Extended Adaptive Multi-Rate Wideband), which is a standard technique described in Non-Patent Literature 1, has TCX (transform coded excitation) encoding modes. In TCX encoding, a coefficient string is obtained by normalizing a frequency-domain audio signal sequence by using a power spectral envelope sequence. A gain is then decided for the coefficient string so that the sequence obtained by dividing each coefficient in the coefficient string by the gain can be encoded with a predetermined number of bits, thereby allowing encoding with the total number of bits allocated to each frame.
<Encoder 500>
FIG. 1 illustrates an exemplary configuration of a conventional encoder 500 for TCX encoding. The components illustrated in FIG. 1 will be described below.
<Frequency-Domain Transformer 5001>
A frequency-domain transformer 5001 transforms an input time-domain speech/audio digital signal (hereinafter referred to as an input audio signal) in each frame, which is a predetermined time interval, into an MDCT coefficient string X(1), . . . , X(N) at N points in the frequency domain and outputs the MDCT coefficient string. Here, N is a positive integer.
<Power-Spectral Envelope-Sequence Arithmetic Unit 5002>
A power-spectral envelope-sequence arithmetic unit 5002 performs linear prediction analysis of an input audio signal on a frame-by-frame basis to obtain linear predictive coefficients and uses the linear predictive coefficients to obtain and output a power spectral envelope sequence W(1), . . . , W(N) of the input audio signal at N points. The linear predictive coefficients are encoded using a conventional encoding technique and the resulting predictive coefficient code is transmitted to a decoding side.
<Weighted Envelope Normalizer 5003>
A weighted envelope normalizer 5003 uses each of the values in the power spectral envelope sequence W(1), . . . , W(N) obtained by the power-spectral envelope-sequence arithmetic unit 5002 to normalize the value of each of the coefficients X(1), . . . , X(N) in the MDCT coefficient string obtained by the frequency-domain transformer 5001 and outputs a weighted normalized MDCT coefficient string XN(1), . . . , XN(N). In order to achieve quantization that minimizes auditory distortion, the weighted envelope normalizer 5003 uses a weighted power spectral envelope sequence produced by smoothing the power spectral envelope to normalize each coefficient in the MDCT coefficient string in each frame. Consequently, the weighted normalized MDCT coefficient string XN(1), . . . , XN(N) has a smaller amplitude slope and smaller amplitude fluctuations than the input MDCT coefficient string X(1), . . . , X(N) but still has magnitude variations similar to those of the power spectral envelope sequence of the input audio signal; that is, it has slightly greater amplitudes in the region of coefficients corresponding to low frequencies and has a fine structure due to the pitch period.
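The normalization step can be illustrated with a minimal Python sketch. The text above does not specify how the power spectral envelope is smoothed or how the division is carried out, so the moving-average smoothing, the division by the square root of the smoothed envelope, and the names `smooth_envelope`, `normalize` and `radius` are all assumptions made only for illustration.

```python
import math

def smooth_envelope(w, radius=1):
    """Moving-average smoothing of a power spectral envelope W(1..N).

    Assumption: the source only says the envelope is smoothed; a moving
    average is used here as a simple stand-in.
    """
    n = len(w)
    out = []
    for i in range(n):
        lo = max(0, i - radius)
        hi = min(n, i + radius + 1)
        out.append(sum(w[lo:hi]) / (hi - lo))
    return out

def normalize(x, w, radius=1):
    """Divide each coefficient X(n) by the square root of the smoothed envelope.

    Assumption: W is a power envelope, so its square root is taken to obtain
    an amplitude before dividing.
    """
    ws = smooth_envelope(w, radius)
    return [xi / math.sqrt(wi) for xi, wi in zip(x, ws)]

# Example: with radius=0 the envelope is used unsmoothed.
xn = normalize([2.0, 4.0], [4.0, 4.0], radius=0)
```

Because the envelope is smoothed before the division, part of the spectral shape remains in XN(1), . . . , XN(N), which is consistent with the residual low-frequency emphasis described above.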
<Gain Adjustment Encoder 5100>
A gain adjustment encoder 5100 divides each of the coefficients in an input weighted normalized MDCT coefficient string XN(1), . . . , XN(N) by a gain g and quantizes the results of the division to obtain a quantized normalized coefficient sequence XQ(1), . . . , XQ(N), which is a sequence of integer values. The gain g is chosen such that the number of bits of the integer signal code obtained by encoding the quantized normalized coefficient sequence is as large as possible while being smaller than or equal to the number B of allocated bits, which is the number of bits allocated in advance. The gain adjustment encoder 5100 outputs a gain code corresponding to the gain g and also outputs the integer signal code.
The gain adjustment encoder 5100 comprises an initializer 5104, a frequency-domain-sequence quantizer 5105, a variable-length encoder 5106, a determiner 5107, a minimum gain setter 5108, a first branching unit 5109, a first gain updater 5110, a gain increaser 5111, a maximum gain setter 5112, a second branching unit 5113, a second gain updater 5114, a gain reducer 5115, a truncation unit 5116 and a gain encoder 5117.
<Initializer 5104>
The initializer 5104 sets an initial value of the gain g. The initial value of the gain can be decided from factors such as the energy of a weighted normalized MDCT coefficient string XN(1), . . . , XN(N) and the number of bits allocated in advance to a code output from the variable-length encoder 5106. The number of bits allocated in advance to a code output from the variable-length encoder 5106 will be hereinafter referred to as the number B of allocated bits. The initializer 5104 also sets 0 as the initial value of the number of updates of gain.
<Frequency-Domain-Sequence Quantizer 5105>
The frequency-domain-sequence quantizer 5105 quantizes the values that are obtained by dividing each of the coefficients in a weighted normalized MDCT coefficient string XN(1), . . . , XN(N) by the gain g to obtain and output a quantized normalized coefficient sequence XQ(1), . . . , XQ(N), which is a sequence of integer values.
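As a minimal Python sketch of this step, each coefficient can be divided by the gain and rounded to an integer. The rounding rule (round half away from zero) and the function name `quantize` are assumptions; the text above states only that the divided values are quantized to integers.

```python
import math

def quantize(xn, g):
    """Return XQ(n) = round(XN(n) / g) for every coefficient.

    Assumption: ties are rounded away from zero so that positive and
    negative coefficients are treated symmetrically.
    """
    if g <= 0:
        raise ValueError("gain must be positive")
    return [int(math.copysign(math.floor(abs(v / g) + 0.5), v)) for v in xn]

# A larger gain gives smaller integer magnitudes, hence a shorter code.
xq_small_gain = quantize([1.0, -2.6, 5.0], 2.0)
xq_large_gain = quantize([1.0, -2.6, 5.0], 4.0)
```

This is the property the gain search relies on: increasing g tends to reduce the number of consumed bits, and decreasing g tends to increase it.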
<Variable-Length Encoder 5106>
The variable-length encoder 5106 encodes an input quantized normalized coefficient sequence XQ(1), . . . , XQ(N) by using variable-length encoding to obtain and output a code. The code will be referred to as an integer signal code. The variable-length encoding may use a method that encodes a plurality of coefficients in the quantized normalized coefficient sequence together, for example. The variable-length encoder 5106 measures the number of bits of the integer signal code obtained as a result of the variable-length encoding. The number of bits will be hereinafter referred to as the number c of consumed bits.
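The relationship between the integer signal code and the number c of consumed bits can be sketched in Python as follows. The specific variable-length code is not given above, so an order-0 signed Exp-Golomb code applied to one coefficient at a time is used purely as a stand-in; the function names are likewise hypothetical.

```python
def exp_golomb_signed(v):
    """Order-0 Exp-Golomb codeword, as a '0'/'1' string, for a signed integer.

    Stand-in code only: the variable-length code used by encoder 5106 is
    not specified in the text.
    """
    u = 2 * v - 1 if v > 0 else -2 * v   # map 0, 1, -1, 2, ... to 0, 1, 2, 3, ...
    b = bin(u + 1)[2:]                   # binary digits of u + 1
    return "0" * (len(b) - 1) + b        # zero prefix, then the binary digits

def variable_length_encode(xq):
    """Return (integer signal code, number c of consumed bits)."""
    code = "".join(exp_golomb_signed(v) for v in xq)
    return code, len(code)

code, c = variable_length_encode([0, 1, -1, 2])
# code == "101001100100", c == 12
```

Small magnitudes receive short codewords, so c shrinks as the gain g grows.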
<Determiner 5107>
When the number of updates of the gain is equal to a predetermined number or when the number c of consumed bits measured by the variable-length encoder 5106 is equal to the number B of allocated bits, the determiner 5107 outputs a gain, an integer signal code and the number c of consumed bits.
When the number of updates of the gain is smaller than the predetermined number of updates and the number c of consumed bits measured by the variable-length encoder 5106 is greater than the number B of allocated bits, the determiner 5107 performs control to cause the minimum gain setter 5108 to perform the next process; when the number of updates of the gain is smaller than the predetermined number of updates and the number c of consumed bits measured by the variable-length encoder 5106 is less than the number B of allocated bits, the determiner 5107 performs control to cause the maximum gain setter 5112 to perform the next process.
<Minimum Gain Setter 5108>
The minimum gain setter 5108 sets the current value of the gain g as the lower bound gmin of the gain (gmin←g). The lower bound gmin of the gain represents the minimum allowable value of the gain.
<First Branching Unit 5109>
When the upper bound gmax of the gain has already been set, the first branching unit 5109 performs control to cause the first gain updater 5110 to perform the next process; otherwise, the first branching unit 5109 performs control to cause the gain increaser 5111 to perform the next process. Further, the first branching unit 5109 adds 1 to the number of updates of gain.
<First Gain Updater 5110>
The first gain updater 5110 sets the average between the current value of the gain g and the upper bound gmax of the gain as a new value of the gain g (g←(g+gmax)/2). This is because an optimum value of the gain is between the current value of the gain g and the upper bound gmax of the gain. Since the current value of the gain g has been set as the lower bound gmin of the gain, it can be also said that the average between the upper bound gmax of the gain and the lower bound gmin of the gain is set as a new value of the gain g (g←(gmax+gmin)/2). The set new gain g is input into the frequency-domain-sequence quantizer 5105.
<Gain Increaser 5111>
The gain increaser 5111 sets a value greater than the current value of the gain g as a new value of the gain g. For example, the gain increaser 5111 sets the current value of the gain g plus an amount Δg by which the gain is to be changed, which is a predetermined positive value, as a new value of the gain g (g←g+Δg). Further, when it is found a plurality of successive times that the number c of consumed bits is greater than the number B of allocated bits without the upper bound gmax of the gain being set, the gain increaser 5111 uses a value greater than the predetermined value as the amount Δg by which the gain is to be changed. The set new gain g is input into the frequency-domain-sequence quantizer 5105.
<Maximum Gain Setter 5112>
The maximum gain setter 5112 sets the current value of the gain g as the upper bound gmax of the gain (gmax←g). The upper bound gmax of the gain represents the maximum allowable value of the gain.
<Second Branching Unit 5113>
When the lower bound gmin of the gain has already been set, the second branching unit 5113 performs control to cause the second gain updater 5114 to perform the next process; otherwise, the second branching unit 5113 performs control to cause the gain reducer 5115 to perform the next process. Further, the second branching unit 5113 adds 1 to the number of updates of gain.
<Second Gain Updater 5114>
The second gain updater 5114 sets the average between the current value of the gain g and the lower bound gmin of the gain as a new value of the gain g (g←(g+gmin)/2). This is because an optimum gain value is between the current value of the gain g and the lower bound gmin of the gain. Since the current value of the gain g has been set as the upper bound gmax of the gain, it can be also said that the average between the upper bound gmax of the gain and the lower bound gmin of the gain is set as a new value of the gain g (g←(gmax+gmin)/2). The set new gain g is input into the frequency-domain-sequence quantizer 5105.
<Gain Reducer 5115>
The gain reducer 5115 sets a value smaller than the current value of the gain g as a new value of the gain g. For example, the gain reducer 5115 sets the current value of the gain g minus an amount Δg by which the gain is to be changed, which is a predetermined positive value, as a new value of the gain g (g←g−Δg). Further, for example, when it is found a plurality of successive times that the number c of consumed bits is smaller than the number B of allocated bits without the lower bound gmin of the gain being set, the gain reducer 5115 uses a value greater than the predetermined value as the amount Δg by which the gain is to be changed. The set new gain g is input into the frequency-domain-sequence quantizer 5105.
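The bound setters, branching units, gain updaters, gain increaser and gain reducer described above together perform one update of the gain. A minimal Python sketch of that single update is given below. Representing an unset bound as `None`, using a fixed Δg (`delta`), and the function name `update_gain` are assumptions; the enlargement of Δg after repeated misses and the update counter are omitted for brevity.

```python
def update_gain(g, gmin, gmax, c, B, delta):
    """Perform one gain update and return the new (g, gmin, gmax).

    gmin / gmax are None while the corresponding bound has not been set.
    """
    if c > B:
        gmin = g                     # minimum gain setter 5108: gmin <- g
        if gmax is not None:
            g = (g + gmax) / 2       # first gain updater 5110
        else:
            g = g + delta            # gain increaser 5111
    elif c < B:
        gmax = g                     # maximum gain setter 5112: gmax <- g
        if gmin is not None:
            g = (g + gmin) / 2       # second gain updater 5114
        else:
            g = g - delta            # gain reducer 5115 (no guard against g <= 0 here)
    return g, gmin, gmax

# Too many bits and no upper bound yet: the gain is increased by delta.
state = update_gain(4.0, None, None, c=120, B=100, delta=1.0)
# Then too few bits with a lower bound already set: bisect between the bounds.
state = update_gain(state[0], state[1], state[2], c=90, B=100, delta=1.0)
```

Once both bounds are set, every further update halves the interval between gmin and gmax, which is an ordinary bisection search.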
<Truncation Unit 5116>
When the number c of consumed bits output from the determiner 5107 is greater than the number B of allocated bits, the truncation unit 5116 removes the amount of code equivalent to the bits by which the number c of consumed bits exceeds the number B of allocated bits from the code corresponding to quantized normalized coefficients on the high frequency side in an integer signal code output from the determiner 5107 and outputs the resulting code as a new integer signal code. For example, the truncation unit 5116 removes a portion of code corresponding to quantized normalized coefficients on the high frequency side that correspond to the number of bits by which the number c of consumed bits exceeds the number B of allocated bits, c−B, from the integer signal code and outputs the remaining code as a new integer signal code. On the other hand, when the number c of consumed bits output from the determiner 5107 is not greater than the number B of allocated bits, the truncation unit 5116 outputs the integer signal code output from the determiner 5107.
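The truncation can be illustrated with a short Python sketch. It assumes that the integer signal code is available as a list of per-coefficient codewords ordered from low to high frequency and that whole codewords are dropped from the high-frequency end; both points, and the function name `truncate`, are assumptions made for illustration.

```python
def truncate(codewords, B):
    """Keep the longest low-frequency prefix of codewords that fits in B bits.

    codewords: list of '0'/'1' strings ordered from low to high frequency.
    Returns (new integer signal code, number of bits it uses).
    """
    kept = []
    used = 0
    for cw in codewords:
        if used + len(cw) > B:
            break                    # this and all higher-frequency codewords are removed
        kept.append(cw)
        used += len(cw)
    return "".join(kept), used

codewords = ["010", "1", "00100", "011"]     # 12 bits in total
short_code, short_bits = truncate(codewords, 8)
full_code, full_bits = truncate(codewords, 12)
```

When the code already fits (c is not greater than B), the function returns it unchanged, matching the second case described above.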
<Gain Encoder 5117>
The gain encoder 5117 encodes the gain output from the determiner 5107 using a predetermined number of bits to obtain and output a gain code.
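Taken together, the units 5104 to 5115 form an iterative search for the gain. The following Python sketch shows one way the loop could be put together. The rounding rule, the signed Exp-Golomb bit count used in place of the unspecified variable-length code, the fixed Δg, the halving guard that keeps the gain positive, and all function and parameter names are assumptions, not details given above.

```python
import math

def quantize(xn, g):
    # Assumed rounding rule: round half away from zero.
    return [int(math.copysign(math.floor(abs(v / g) + 0.5), v)) for v in xn]

def count_bits(xq):
    # Stand-in variable-length code: order-0 signed Exp-Golomb (an assumption).
    total = 0
    for v in xq:
        u = 2 * v - 1 if v > 0 else -2 * v
        total += 2 * (u + 1).bit_length() - 1
    return total

def gain_loop(xn, B, g0, delta=0.25, max_updates=16):
    """Search for a gain whose code uses as many bits as possible up to B.

    Returns (g, xq, c). As in the text, c may still exceed B when the update
    limit is reached; the truncation unit handles that case.
    """
    g, gmin, gmax = g0, None, None               # initializer 5104
    updates = 0
    while True:
        xq = quantize(xn, g)                     # quantizer 5105
        c = count_bits(xq)                       # variable-length encoder 5106
        if updates == max_updates or c == B:     # determiner 5107
            return g, xq, c
        if c > B:
            gmin = g                             # minimum gain setter 5108
            g = (g + gmax) / 2 if gmax is not None else g + delta
        else:
            gmax = g                             # maximum gain setter 5112
            if gmin is not None:
                g = (g + gmin) / 2
            else:
                # Hypothetical guard: halve instead of subtracting when
                # g - delta would not stay positive.
                g = g - delta if g > delta else g / 2
        updates += 1

g, xq, c = gain_loop([8.0, -4.0, 2.0, 1.0], B=10, g0=1.0, delta=1.0)
# g == 5.0, xq == [2, -1, 0, 0], c == 10
```

In this example the consumed bits are 24, 18, 12 and 12 for g = 1, 2, 3 and 4, and the loop stops at g = 5 where c equals B exactly.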
On the other hand, Patent Literature 1 describes a variable-length encoding method that uses periodicity to efficiently encode integer signals. In the method, a quantized normalized coefficient sequence is rearranged so that one or a plurality of successive samples including a sample corresponding to a fundamental frequency and one or a plurality of successive samples including a sample corresponding to an integer multiple of the fundamental frequency are put together. The rearranged sample string is encoded using variable-length encoding to obtain an integer signal code. This reduces variations in amplitude of adjacent samples to increase the efficiency of the variable-length encoding.
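The rearrangement can be illustrated with a minimal Python sketch. It assumes that the fundamental frequency corresponds to a known bin spacing `period` (with 0-based indices), that `width` neighbouring samples on each side of every multiple are gathered, and that the gathered samples are placed at the low-frequency end of the output. These details and the function name `rearrange` are illustrative assumptions and are not taken from Patent Literature 1.

```python
def rearrange(xq, period, width=1):
    """Gather samples around integer multiples of `period` at the front.

    Returns (rearranged sequence, index order). The order is a permutation
    of range(len(xq)), so a decoder that knows `period` and `width` can
    undo the rearrangement.
    """
    if period < 1:
        raise ValueError("period must be at least 1")
    n = len(xq)
    picked, seen = [], set()
    k = 1
    while k * period - width < n:
        for i in range(k * period - width, k * period + width + 1):
            if 0 <= i < n and i not in seen:
                picked.append(i)
                seen.add(i)
        k += 1
    rest = [i for i in range(n) if i not in seen]
    order = picked + rest
    return [xq[i] for i in order], order

# Peaks at indices 4 and 8 (period 4) with small neighbours.
xq = [0, 0, 0, 1, 5, 1, 0, 1, 4, 1, 0, 1]
rearranged, order = rearrange(xq, period=4, width=1)
# rearranged == [1, 5, 1, 1, 4, 1, 1, 0, 0, 0, 0, 0]
```

After the rearrangement the large-amplitude samples sit next to one another and the zeros form a single run, which is the kind of local similarity a variable-length encoder can exploit.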
Patent Literature 1 also describes a method for obtaining an integer signal code by selecting one of two encoding methods, whichever uses or is expected to use fewer bits for an integer signal code; one of the encoding methods uses periodicity and encodes a rearranged sample string by using variable-length encoding to obtain an integer signal code whereas the other method does not use periodicity and encodes the original, unrearranged sample string by using variable-length encoding to obtain an integer signal code. This enables an integer signal code having fewer bits with the same degree of encoding distortion to be obtained.
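The selection between the two encoding methods can be sketched in Python as a comparison of bit counts. The cost function `toy_bits` below is a toy context-adaptive Rice code whose parameter follows the magnitude of the previous sample; it is a hypothetical stand-in chosen only because its cost depends on the sample order. The rearrangement layout, the tie-breaking toward the unrearranged string, and all names are also assumptions.

```python
def rearrange(xq, period, width=1):
    """Gather samples around multiples of `period` at the front (assumed layout)."""
    if period < 1:
        raise ValueError("period must be at least 1")
    n = len(xq)
    picked, seen = [], set()
    k = 1
    while k * period - width < n:
        for i in range(k * period - width, k * period + width + 1):
            if 0 <= i < n and i not in seen:
                picked.append(i)
                seen.add(i)
        k += 1
    order = picked + [i for i in range(n) if i not in seen]
    return [xq[i] for i in order], order

def toy_bits(seq):
    """Toy cost: Rice code with parameter k = bit length of the previous magnitude."""
    total, prev = 0, 0
    for v in seq:
        u = 2 * v - 1 if v > 0 else -2 * v   # signed to unsigned mapping
        k = abs(prev).bit_length()
        total += (u >> k) + 1 + k            # unary quotient + stop bit + k remainder bits
        prev = v
    return total

def select_encoding(xq, period, width=1, count_bits=toy_bits):
    """Return (method, bits) for whichever method needs fewer bits."""
    plain_bits = count_bits(xq)
    rearranged, _ = rearrange(xq, period, width)
    periodic_bits = count_bits(rearranged)
    if periodic_bits < plain_bits:
        return "periodic", periodic_bits
    return "plain", plain_bits               # ties go to the unrearranged string

method, bits = select_encoding([0, 0, 0, 1, 5, 1, 0, 1, 4, 1, 0, 1], period=4)
```

In practice the decoder would also need side information indicating which method was selected; how that is signalled is not described above and is not modelled here.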