When time-series signals, such as audio signals and video information, are transmitted on a communication channel or recorded on an information recording medium, it is effective in terms of transmission efficiency or recording efficiency to transmit or record the time-series signals after they are converted to compressed codes. In recent years, the increasing use of broadband access and the increasing capacity of storage devices have raised the importance of lossless compression encoding methods that allow the original signal to be reproduced exactly, relative to lossy compression encoding methods that place the highest priority on high compression rates (refer to Non-patent literature 1, for example). In such circumstances, a predictive coding scheme named MPEG-4 ALS has been approved as an international standard of the Moving Picture Experts Group (MPEG) (see Non-patent literature 2, for example). The predictive coding is a lossless compression coding of acoustic signals based on a short-term prediction analysis, which is an autocorrelation analysis using adjacent time-series signals, and/or a long-term prediction analysis, which is an autocorrelation analysis using time-series signals apart from each other by a delay value (a pitch period).
FIG. 1 is a block diagram for illustrating a functional configuration of an encoder 2100 based on a conventional predictive coding scheme. FIG. 2 is a block diagram for illustrating a functional configuration of a decoder 2200 based on the conventional predictive coding scheme. FIG. 3A is a block diagram for illustrating a functional configuration of a residual coding unit 2120 shown in FIG. 1, and FIG. 3B is a block diagram for illustrating a functional configuration of a residual decoding unit 2220 shown in FIG. 2. FIG. 4 is a graph for illustrating a relationship between a prediction order and a code amount in a predictive coding scheme using a short-term prediction analysis. In FIG. 4, the abscissa represents the prediction order, and the ordinate represents the code amount. First, a conventional predictive coding scheme using a short-term prediction analysis will be described with reference to these drawings.
(Encoding Method)
Pulse code modulation (PCM) time-series signals x(n) that are sampled and quantized are input to a frame buffer 2111 in the encoder 2100 (FIG. 1). In the expression x(n), the character n denotes an index of a discrete time, and a discrete time corresponding to an index n is referred to as a “discrete time n”. Smaller indices n represent earlier discrete times. In addition, the time-series signal x(n) means a time-series signal at a discrete time n.
The frame buffer 2111 buffers time-series signals x(n) (n=0, . . . , N−1) in a predetermined time segment (referred to as a “frame” hereinafter) (the character N represents a predetermined integer equal to or greater than 2). A time segment including discrete times n=0, . . . , N−1 will be expressed as a “time segment (0, . . . , N−1)” hereinafter. One frame of time-series signals x(n) (n=0, . . . , N−1) buffered is passed to a short-term prediction analysis unit 2112 in a predictive coding unit 2110. The short-term prediction analysis unit 2112 calculates first-order to Popt-th-order PARCOR coefficients k(m) (m=1, 2, . . . , Popt) by short-term prediction analysis.
[Short-term Prediction Analysis and Optimum Prediction Order]
In the short-term prediction analysis, it is assumed that a linear combination of a time-series signal x(n) at a time n and P time-series signals x(n−1), x(n−2), . . . , x(n−P) at times n−1, n−2, . . . , n−P preceding the time n weighted with respective coefficients α(m) (m=1, . . . , P), is a prediction residual e(n) (the number P is referred to as a “prediction order”, the coefficient α(m) is referred to as a “short-term prediction coefficient”, and the prediction residual e(n) is referred to also as a “prediction error”). A linear prediction model based on the assumption is expressed by the following formula (1). In the linear prediction analysis, for the input time-series signals x(n) (n=0, . . . , N−1), coefficients, such as the short-term prediction coefficients α(m) (m=1, 2, . . . , P) that minimize the energy of the prediction residuals e(n) (n=0, . . . , N−1), or PARCOR coefficients k(m) (m=1, 2, . . . , P) that can be converted into the short-term prediction coefficients, are calculated.
e(n)=x(n)+α(1)·x(n−1)+α(2)·x(n−2)+ . . . +α(P)·x(n−P)  (1)
Specific examples of the short-term prediction analysis include sequential methods, such as the Levinson-Durbin method and the Burg method, and methods of solving, for each prediction order, the simultaneous equations whose solutions are short-term prediction coefficients that minimize the prediction residuals, such as the autocorrelation method and the covariance method.
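As an illustration of one of the sequential methods named above, the following is a minimal sketch of a Levinson-Durbin recursion written for the sign convention of formula (1). The function name levinson_durbin, the argument names, and the sign chosen for the returned PARCOR values are assumptions made for this example (sign conventions for PARCOR coefficients differ between references); it is not the procedure of Non-patent literature 2.

```python
def levinson_durbin(r, order):
    """Return (alpha, parcor, err) for autocorrelation values r[0..order].

    alpha[m-1] holds alpha(m) in the convention of formula (1):
    e(n) = x(n) + sum_m alpha(m) * x(n - m).
    Assumes r[0] > 0 and a non-degenerate signal (err stays positive).
    """
    a = [1.0] + [0.0] * order   # a[0] is the implicit leading coefficient 1
    err = float(r[0])           # residual energy of the order-0 model
    parcor = []
    for m in range(1, order + 1):
        # Correlation between the current residual and the sample m steps back.
        acc = r[m] + sum(a[j] * r[m - j] for j in range(1, m))
        k = -acc / err          # sign convention is an assumption of this sketch
        prev = a[:]
        for j in range(1, m):
            a[j] = prev[j] + k * prev[m - j]
        a[m] = k
        err *= 1.0 - k * k      # residual energy shrinks at each order
        parcor.append(k)
    return a[1:], parcor, err


if __name__ == "__main__":
    alpha, parcor, err = levinson_durbin([1.0, 0.5, 0.25], 2)
    print(alpha, parcor, err)
```

For the autocorrelation values 1, 0.5, 0.25 the sketch yields α(1) = −0.5 and α(2) = 0 (up to floating-point rounding), which corresponds to predicting each sample as half of the preceding one.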
A linear finite impulse response (FIR) filter expressed by the following formula (2) for estimating a time-series signal y(n) at a time n from P time-series signals x(n−1), x(n−2), . . . , x(n−P) at preceding times n−1, n−2, . . . , n−P is referred to as a “short-term prediction filter”.
y(n)=−{α(1)·x(n−1)+α(2)·x(n−2)+ . . . +α(P)·x(n−P)}  (2)
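The relationship between formulas (1) and (2), namely e(n) = x(n) − y(n), can be sketched as follows. The function names are hypothetical, and the handling of samples before the start of the frame (they are simply left out of the sum) is an assumption of this example.

```python
def short_term_prediction(x, alpha, n):
    """Formula (2): y(n) = -sum_{m=1..P} alpha(m) * x(n - m).

    Samples before index 0 are treated as unavailable and skipped
    (an assumption of this sketch).
    """
    order = min(len(alpha), n)
    return -sum(alpha[m - 1] * x[n - m] for m in range(1, order + 1))


def prediction_residuals(x, alpha):
    """Formula (1) rewritten as e(n) = x(n) - y(n)."""
    return [x[n] - short_term_prediction(x, alpha, n) for n in range(len(x))]


if __name__ == "__main__":
    # alpha(1) = -1 predicts each sample as the previous one.
    print(prediction_residuals([1, 2, 3, 4], [-1]))
```

With α(1) = −1 and the ramp 1, 2, 3, 4, every residual equals 1: the first sample has no predecessor and is passed through, and each later sample differs from its prediction by the constant step.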
The character Popt denotes a positive integer that represents an optimum prediction order P, which is referred to as an “optimum prediction order”. In the scheme disclosed in Non-patent literature 2, the optimum prediction order Popt is determined based on the minimum description length (MDL) principle. According to the MDL principle, the best model is a model that minimizes the code word length, which is equal to the sum of the description length of the model and the description length of data by the model. That is, according to the scheme disclosed in Non-patent literature 2, the optimum prediction order Popt is a prediction order P that minimizes the code amount required for lossless decoding, which is given by the following formula (3).
(code amount required for lossless decoding)=(code amount required for PARCOR coefficients)+(code amount required for a prediction residual)  (3)
As schematically shown by the straight line 4A in FIG. 4, the code amount required for PARCOR coefficients increases in proportion to the prediction order. In addition, in general, as the prediction order increases, the energy of the prediction residual decreases, and the code amount in a case of performing entropy coding of the prediction residual decreases logarithmically as schematically shown by the curve 4B. Therefore, as schematically shown by the curve 4C that is a sum of the straight line 4A and the curve 4B, the code amount required for lossless decoding does not monotonically decrease as the prediction order increases but is minimized for a certain prediction order. The short-term prediction analysis unit 2112 searches the integers from a minimum prediction order Pmin to a maximum prediction order Pmax and determines, as the optimum prediction order Popt, a prediction order that minimizes the code amount required for lossless decoding.
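The search over prediction orders can be sketched as below. The cost model is an assumption made only for illustration: a fixed number of bits per PARCOR coefficient stands in for the straight line 4A, and (N/2)·log2 of the mean-square residual, which is correct only up to an additive constant, stands in for the curve 4B. The function name choose_order and all parameter names are hypothetical.

```python
import math


def choose_order(mean_sq_residual, n_samples, bits_per_coef, p_min, p_max):
    """Pick the order in [p_min, p_max] that minimizes formula (3).

    mean_sq_residual[p] is the mean-square prediction residual obtained
    with order p and must be positive. Both cost terms are rough stand-ins
    chosen for this sketch, not the costs of any particular codec.
    """
    best_p, best_bits = None, math.inf
    for p in range(p_min, p_max + 1):
        coef_bits = bits_per_coef * p                               # line 4A
        resid_bits = 0.5 * n_samples * math.log2(mean_sq_residual[p])  # curve 4B
        total = coef_bits + resid_bits                              # curve 4C
        if total < best_bits:
            best_p, best_bits = p, total
    return best_p


if __name__ == "__main__":
    energies = [16.0, 4.0, 2.0, 1.9, 1.89]   # index = prediction order
    print(choose_order(energies, n_samples=100, bits_per_coef=10,
                       p_min=1, p_max=4))
```

In this made-up example the residual energy keeps falling after order 2, but too slowly to pay for the extra coefficients, so the search settles on order 2.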
As an alternative to the adaptive determination of the optimum prediction order Popt described above, the optimum prediction order Popt may be a fixed value (this is the end of the description of [Short-term prediction analysis and optimum prediction order]).
The calculated PARCOR coefficients k(m) (m=1, 2, . . . , Popt) are passed to a quantizer 2113, and the quantizer 2113 quantizes the PARCOR coefficients k(m) to produce quantized PARCOR coefficients i(m) (m=1, 2, . . . , Popt). The quantized PARCOR coefficients i(m) (m=1, 2, . . . , Popt) are passed to a coefficient coding unit 2114, and the coefficient coding unit 2114 performs variable length coding of the quantized PARCOR coefficients i(m). The quantized PARCOR coefficients i(m) (m=1, 2, . . . , Popt) are also passed to a short-term prediction coefficient converter 2115. The optimum prediction order Popt is also fed to the short-term prediction coefficient converter 2115, and the short-term prediction coefficient converter 2115 uses these to calculate the short-term prediction coefficients α(m) (m=1, 2, . . . , Popt). Then, a short-term prediction unit 2116 calculates short-term prediction values y(n) (n=0, . . . , N−1) according to the short-term prediction filter (formula (2)) at P=Popt, using the time-series signals x(n) (n=0, . . . , N−1) in one frame, the short-term prediction coefficients α(m) (m=1, 2, . . . , Popt) and the optimum prediction order Popt. Then, a subtraction unit 2117 calculates the prediction residuals e(n) by subtracting the short-term prediction values y(n) from the time-series signals x(n), respectively (a prediction filter processing).
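The conversion performed by a short-term prediction coefficient converter can be illustrated with the standard step-up recursion. The sketch assumes that the PARCOR values have already been dequantized to real numbers and that their sign convention matches formula (1), so that the last coefficient of each intermediate order equals the PARCOR value of that order; the function name parcor_to_alpha is hypothetical, and the quantization used in Non-patent literature 2 is not modeled.

```python
def parcor_to_alpha(parcor):
    """Step-up recursion: PARCOR values k(1..P) -> alpha(1..P).

    The returned list holds alpha(m) at index m-1, in the convention
    e(n) = x(n) + sum_m alpha(m) * x(n - m) of formula (1).
    """
    alpha = []
    for m, k in enumerate(parcor, start=1):
        # alpha_new(j) = alpha(j) + k * alpha(m - j) for j = 1..m-1,
        # then alpha_new(m) = k.
        alpha = [alpha[i] + k * alpha[m - 2 - i] for i in range(m - 1)] + [k]
    return alpha


if __name__ == "__main__":
    print(parcor_to_alpha([0.5, 0.25]))
```

For the values 0.5 and 0.25 the recursion gives α(1) = 0.5 + 0.25·0.5 = 0.625 and α(2) = 0.25.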
The prediction residuals e(n) (n=0, . . . , N−1) are integers within a predetermined range. For example, when the input time-series signals x(n) are expressed in integers with a finite number of bits, and the linear prediction values y(n) are output values of a linear prediction filter where the filter coefficients are integer linear prediction coefficients obtained, for example, by rounding off decimal places, the prediction residuals e(n) in integer representation with a finite number of bits (or represented by integers within a predetermined range) can be obtained by subtracting the linear prediction values y(n) from the time-series signals x(n), respectively. When the time-series signals x(n) or the linear prediction values y(n) are not expressed in integers, the prediction residuals e(n) may be obtained by expressing the differences calculated by subtracting the linear prediction values y(n) from the time-series signals x(n), with integers having a finite number of bits, respectively. The residual coding unit 2120 (FIG. 3A) performs Golomb-Rice coding of the prediction residuals e(n) (n=0, . . . , N−1) represented by integers. In the Golomb-Rice coding, first, a parameter calculator 2121 generates a parameter s, which is an integer, using the input prediction residuals e(n) (n=0, . . . , N−1) (the parameter s is sometimes referred to as a “Rice parameter”).
[Generation of Parameter s]
An optimum value of the parameter s depends on the amplitude of the input prediction residuals e(n) (n=0, . . . , N−1). Typically, it is assumed that the prediction residuals e(n) in a certain discrete time segment, such as a frame or a sub-frame that is a time segment obtained by dividing the frame, have a uniform amplitude, and the parameter s is set for each discrete time segment based on the average amplitude of the prediction residuals e(n) in the segment.
However, for a discrete time segment (a frame, a sub-frame or the like) that is randomly accessed, the assumption that all the prediction residuals e(n) in the discrete time segment have a uniform amplitude does not hold. That is, for a randomly accessed discrete time segment, no time-series signal before the discrete time segment can be used for calculation with the short-term prediction filter (formula (2)). Therefore, for discrete times from the first discrete time to the Popt-th discrete time in the discrete time segment, the number of time-series signals that can be used for calculation with the short-term prediction filter is less than the optimum prediction order Popt. As a result, the prediction residuals e(n) at the first discrete time to the Popt-th discrete time in the discrete time segment often have larger amplitudes than the prediction residuals e(n) at the (Popt+1)-th and following discrete times.
Thus, as illustrated below, according to the method disclosed in Non-patent literature 2, a value uniquely determined from the length of the bits representing the time-series signal x(n) is used as the parameter s at the discrete time n=0, a value obtained by adding a fixed value to a parameter determined from the average amplitude of the prediction residuals e(n) at the discrete time n=3 and the following discrete times is used as the parameter s at the discrete times n=1, 2, and the parameter determined from the average amplitude of the prediction residuals e(n) at the discrete time n=3 and the following discrete times is used as the parameter s at the discrete times n=3, . . . , N−1. For example, the length of the bits representing the time-series signal x(n) minus 4 is used as the parameter s at the discrete time n=0, the parameter determined from the average amplitude of the prediction residuals e(n) plus 3 is used as the parameter s at the discrete time n=1, the parameter determined from the average amplitude of the prediction residuals e(n) plus 1 is used as the parameter s at the discrete time n=2, and the parameter determined from the average amplitude of the prediction residuals e(n) is used as the parameter s at the discrete times n=3, . . . , N−1 (this is the end of the description of [Generation of parameter s]).
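The position-dependent assignment in the example above can be written out as follows. The offsets (bit length minus 4, plus 3, plus 1) are taken from that example; the function names and the way the base parameter is derived from the average amplitude (floor of log2 of the mean absolute residual from n=3 onward) are assumptions of this sketch, since the text does not state the exact rule.

```python
import math


def estimate_base_parameter(residuals):
    """Hypothetical estimator: floor(log2(mean |e(n)|)) over n >= 3."""
    tail = residuals[3:]
    if not tail:
        return 0
    mean_abs = sum(abs(e) for e in tail) / len(tail)
    return int(math.floor(math.log2(mean_abs))) if mean_abs >= 1.0 else 0


def rice_parameters(bit_length, base_s, n_samples):
    """Parameter s for each discrete time n of a randomly accessed segment."""
    params = []
    for n in range(n_samples):
        if n == 0:
            params.append(bit_length - 4)   # fixed by the sample bit length
        elif n == 1:
            params.append(base_s + 3)
        elif n == 2:
            params.append(base_s + 1)
        else:
            params.append(base_s)
    return params


if __name__ == "__main__":
    print(rice_parameters(bit_length=16, base_s=5, n_samples=6))
```

For 16-bit samples and a base parameter of 5, the first six parameters are 12, 8, 6, 5, 5, 5.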
Then, the prediction residuals e(n) (n=0, . . . , N−1) and the parameters s are input to a separating calculator 2122a in a coding unit 2122. The separating calculator 2122a performs a predetermined division using these values to calculate integer quotients q(n) (n=0, . . . , N−1) and information sub(n) (n=0, . . . , N−1) that represents the remainders thereof. Essentially, the division is a calculation that divides the prediction residual e(n) by 2^s. However, because of the need to distinguish between positive and negative values of the prediction residuals e(n), to reduce the code length, and so on, the operation may be modified from a simple division of the prediction residuals e(n) by the modulus 2^s. Then, a variable length coding unit 2122b performs Alpha coding of the quotients q(n) to produce information prefix(n). The generated information prefix(n) and the information sub(n) are input to a combining unit 2122c. The combining unit 2122c outputs residual codes Ce(n) corresponding to the prediction residuals e(n), each of which is a bit combination value prefix(n)|sub(n) of the information prefix(n) and the information sub(n). In addition to the residual codes Ce(n), the residual coding unit 2120 outputs the parameters s.
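A minimal sketch of the separation into quotient and remainder, followed by unary coding of the quotient, is given below. Two details are assumptions of this example rather than statements of the text: signed residuals are folded onto non-negative integers by interleaving (0, −1, 1, −2, . . . map to 0, 1, 2, 3, . . .), and the Alpha code is written as q ones followed by a terminating zero. The function name rice_encode is hypothetical.

```python
def rice_encode(e, s):
    """Return the code prefix(n)|sub(n) for one residual as a bit string."""
    # Fold the sign into the value so that the division acts on u >= 0
    # (one possible modification of the plain division by 2**s).
    u = 2 * e if e >= 0 else -2 * e - 1
    q = u >> s                      # integer quotient of u / 2**s
    sub = u & ((1 << s) - 1)        # remainder, stored in exactly s bits
    prefix = "1" * q + "0"          # unary (Alpha) code of the quotient
    suffix = format(sub, "b").zfill(s) if s > 0 else ""
    return prefix + suffix


if __name__ == "__main__":
    for e, s in [(5, 2), (-3, 1), (0, 0)]:
        print(e, s, rice_encode(e, s))
```

For e = 5 and s = 2, the folded value is 10, the quotient is 2 and the remainder is 2, which gives the code 11010.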
The optimum prediction order Popt selected by the short-term prediction analysis unit 2112, a coefficient code Ck generated by the predictive coding unit 2110, and the residual codes Ce(n) and the parameters s generated by the residual coding unit 2120 are passed to a combining unit 2130, and the combining unit 2130 combines them to produce a code Cg.
(Decoding Method)
The code Cg input to the decoder 2200 (FIG. 2) is separated by a separator 2210 into the optimum prediction order Popt, the coefficient code Ck, the residual codes Ce(n) (n=0, . . . , N−1) and the parameters s. The optimum prediction order Popt and the coefficient code Ck are input to a predictive decoding unit 2230, and the residual codes Ce(n) (n=0, . . . , N−1) and the parameters s are input to the residual decoding unit 2220.
A separator 2221 in the residual decoding unit 2220 (FIG. 3B) separates the input residual codes Ce(n) into the information prefix(n) and the information sub(n), respectively. A variable length decoding unit 2222 decodes the resulting information prefix(n) to produce the quotients q(n). The information sub(n), the quotients q(n) and the parameters s are input to a combining calculator 2223, and the combining calculator 2223 reproduces the prediction residuals e(n) using these values.
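The separation, variable length decoding and recombination can be sketched for a single code word as follows. The sketch assumes the same conventions as a matching encoder would have to use, which are not fixed by the text: a unary prefix of q ones ended by a zero, an s-bit remainder, and signed residuals folded onto non-negative integers by interleaving. The function name rice_decode is hypothetical.

```python
def rice_decode(code, s):
    """Recover one residual e(n) from its bit string prefix(n)|sub(n)."""
    q = code.index("0")                       # count of leading ones = quotient
    sub_bits = code[q + 1:q + 1 + s]          # the s remainder bits
    sub = int(sub_bits, 2) if s > 0 else 0
    u = (q << s) | sub                        # u = q * 2**s + sub
    # Undo the sign folding: even u -> non-negative, odd u -> negative.
    return u // 2 if u % 2 == 0 else -((u + 1) // 2)


if __name__ == "__main__":
    print(rice_decode("11010", 2), rice_decode("1101", 1), rice_decode("0", 0))
```

The code 11010 with s = 2 has quotient 2 and remainder 2, so the folded value is 10 and the residual is 5.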
On the other hand, the coefficient code Ck is input to a coefficient decoding unit 2231 in the predictive decoding unit 2230. The coefficient decoding unit 2231 decodes the coefficient code Ck to produce the quantized PARCOR coefficients i(m) (m=1, 2, . . . , Popt). The quantized PARCOR coefficients i(m) (m=1, 2, . . . , Popt) are passed to a short-term prediction coefficient converter 2232. The short-term prediction coefficient converter 2232 uses the quantized PARCOR coefficients i(m) (m=1, 2, . . . , Popt) to calculate the short-term prediction coefficients α(m) (m=1, 2, . . . , Popt) of the short-term prediction filter (formula (2)) of the optimum prediction order Popt. A short-term prediction unit 2233 generates short-term prediction values y(n) (n=0, . . . , N−1) according to the short-term prediction filter (formula (2)) for P=Popt, using the calculated short-term prediction coefficients α(m) (m=1, 2, . . . , Popt) and the time-series signals x(n) previously output from an adder 2234. The adder 2234 sums the short-term prediction values y(n) and the prediction residuals e(n) reproduced by the residual decoding unit 2220 to produce the lossless decoded values x(n) (n=0, . . . , N−1) of the time-series signals, respectively (an inverse prediction filtering processing).
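Why the inverse prediction filtering reproduces the input exactly can be seen in a small round trip. The sketch assumes that the encoder and the decoder round the prediction value to an integer in the same way (here with floor) and that samples before the start of the frame are unavailable; the function names and the test signal are made up for this example.

```python
import math


def _predict(history, alpha):
    """Integer-rounded formula (2), using only samples already in history."""
    n = len(history)
    order = min(len(alpha), n)
    y = -sum(alpha[m - 1] * history[n - m] for m in range(1, order + 1))
    return math.floor(y)


def encode_residuals(x, alpha):
    """Encoder side: e(n) = x(n) - y(n) with integer y(n)."""
    return [x[n] - _predict(x[:n], alpha) for n in range(len(x))]


def decode_samples(e, alpha):
    """Decoder side: x(n) = e(n) + y(n), y(n) built from earlier outputs."""
    x = []
    for residual in e:
        x.append(residual + _predict(x, alpha))
    return x


if __name__ == "__main__":
    signal = [3, -1, 4, 1, -5, 9, 2, -6]
    coeffs = [-0.9, 0.2]
    residuals = encode_residuals(signal, coeffs)
    print(residuals, decode_samples(residuals, coeffs) == signal)
```

Because the decoder forms each prediction from samples it has already reconstructed exactly, it computes the same rounded value as the encoder did, so adding the residual restores the original integer sample.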