Compressive coding techniques have been used in recent years for transmitting audio signal data or image information data over communication lines and for recording such data on recording media. Lossless compression of floating-point data, which can be readily edited and processed, is also important, and such coding techniques are disclosed in Non-patent literature 1 and patent literature 1, for example. In these coding methods, a sequence of floating-point data samples is divided into frames, each containing a plurality of samples. A bit shift amount is determined for each frame so that the largest amplitude value in the frame becomes the maximum value in the range of amplitudes that can be represented in an integer format with a given number of bits. The bit shift amount thus determined is used to separate each sample into an integer signal and an error signal, each of which is then coded frame by frame.
Although not shown in patent literature 1, a functional configuration for coding that can be implemented according to the art disclosed therein is shown in FIG. 1. A coding apparatus 800 includes a frame buffer 810, a shift amount calculating section 820, an integer signal/error signal separator 830, an integer signal coder 840, an error signal coder 850, and a multiplexer 860.
The concept of the coding is shown in FIG. 2. Each frame includes multiple sample values, each being formed by a bit stream containing a finite number of significant digits. FIG. 2 shows floating-point notation in which a mantissa is represented by a predetermined number of quantized bits, for example 32 bits, excluding the sign bit. Each string of bits running horizontally represents one sample. The shaded bits in FIG. 2 are the significant digits of the floating-point form, that is, the predetermined most significant digits and the digits represented by the mantissa in floating-point notation. Each shaded bit contains a 0 or 1; the other bits, which do not correspond to significant digits, contain 0s. To encode sample values frame by frame, the sample values in the frame are separated into an integer part and an error part (all or part of an input signal excluding the integer part). The dashed-line boxes in FIG. 2 indicate integer parts. An integer part is determined by shifting all samples in a frame by the same number of bits in the same direction so that the largest amplitude value in the frame is the maximum value that can be represented by the integer part. The separated integer part and error part are separately coded and are then combined into coded data.
The concept shown in FIG. 2 can be applied to integer representations as well as floating-point representations. The same method can be applied to any representation in which only a bit string starting from the most significant bit (MSB), which represents the amplitude, to the least significant bit (LSB) a finite number of bits away from the MSB can contain 0s or 1s and the other bits are all 0s. For example, a 32-bit or 64-bit integer representation of each sample may include particular 24 bits each containing a 0 or 1 and the other bits containing 0s.
A typical floating-point representation is the IEEE 754 32-bit floating-point format. A floating-point value is represented as
\[ (-1)^{S} \times 1.M \times 2^{E-E_0} \tag{1} \]
where S denotes a sign part, M denotes a mantissa, and E denotes an exponent. According to IEEE 754, the sign part S is represented by 1 bit, the mantissa M by 23 bits, and the exponent E by 8 bits, so that any value is represented by a total of 32 bits, and E0=2^7−1=127. Accordingly, E−E0 in Equation 1 can take any integer value in the range −127≦E−E0≦128. If E−E0=−127, the binary representation of the exponent E is all 0s; if E−E0=128, the binary representation of the exponent E is all 1s. That is, in this floating-point notation, a sample value is normalized so that the decimal point lies between the most significant bit of the binary representation of the sample value that contains 1 and the next significant bit, and the 23 bits after the decimal point, excluding the MSB containing 1, are represented by M. The number of digits of the integer part of the binary representation of the sample value is equal to E−E0 plus 1.
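The field layout described above can be illustrated with a short sketch. It is not taken from the cited literature; the function names `decompose_float32` and `reconstruct` are hypothetical, and the reconstruction covers only normalized numbers (0 < E < 255).

```python
import struct

E0 = 127  # exponent bias of the IEEE 754 32-bit format


def decompose_float32(x):
    """Return the bit fields (S, E, M) of x stored as an IEEE 754 32-bit float."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = bits >> 31                # 1 bit
    exponent = (bits >> 23) & 0xFF   # 8 bits, biased by E0
    mantissa = bits & 0x7FFFFF       # 23 bits after the implicit leading 1
    return sign, exponent, mantissa


def reconstruct(sign, exponent, mantissa):
    """Evaluate Equation (1) for a normalized number (assumes 0 < E < 255)."""
    return (-1) ** sign * (1 + mantissa / 2 ** 23) * 2.0 ** (exponent - E0)


s, e, m = decompose_float32(-6.5)
# -6.5 is -1.625 x 2^2, so E - E0 is 2 and the integer part 110b has 3 digits.
print(s, e - E0, hex(m))
```

For the sample value −6.5 the integer part of the magnitude is 110 in binary, which has three digits, in agreement with E−E0 plus 1.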
The sample with the largest amplitude in a frame can be made the maximum value that can be represented by an integer part consisting of Q quantized bits by means of a bit shift. The sample value is first normalized by shifting it ΔEmax bits toward the LSB so that the MSB is in the one's place, and is then shifted Q−1 bits toward the MSB, where ΔEmax=E−E0 is the exponent of the sample with the largest amplitude. The result is that the sample value is bit-shifted by Q−1−ΔEmax. Since the number of quantized bits Q is a predetermined fixed value, ΔEmax=Sj is referred to as the bit-shift amount of a frame j for convenience. In the following description, an example will be described in which the number of quantized bits Q of the signal of an integer part is 24, including the sign bit, all sample values in a frame are shifted by the same number of bits, and the signal of the integer part (hereinafter referred to as the “integer signal”) and the signal of the error part (hereinafter referred to as the “error signal”) are separately coded.
FIG. 3 shows a possible processing flow in the coding apparatus 800 shown in FIG. 1. The frame buffer 810 temporarily stores digital input signal sample values and forms a frame with NF sample values Xi (i=1, . . . , NF) (S810). The shift amount calculating section 820 determines a shift amount Sj for each frame by using the method described with reference to FIG. 2 (S820). The integer signal/error signal separator 830 uses the shift amount Sj to separate each of the NF samples in the frame input signal into an integer part and an error part (S830). The integer signal coder 840 encodes the integer signal separated in the integer signal/error signal separator 830 by using linear predictive coding (S840). The error signal coder 850 encodes the error signal separated in the integer signal/error signal separator 830 (S850). The multiplexer 860 combines the code representing the coded integer signal, the code representing the error signal, and the shift amount to provide coded data (S860). Because the number of quantized bits Q of the integer part is predetermined, (Q−1−Sj) can be obtained from the shift amount Sj received at a decoding end.
FIG. 4 shows details of a possible exemplary processing flow (step S820 of FIG. 3) in the shift amount calculating section 820 in FIG. 1. In the exemplary processing, a sample value is represented in the IEEE 754 32-bit floating-point format. A similar processing flow is described in patent literature 1. The shift amount calculating section 820 first reads all samples (NF samples) in a frame input signal (S8201). Then, an initial value of 1 is set in a variable i and −127 (=−E0) is set in ΔEmax (S8202). The exponent E−E0, that is, Ei−127, of the i-th sample in the current frame is calculated and assigned to variable ΔEi (S8203). A decision is made as to whether ΔEi>ΔEmax (S8204). If this is true, ΔEi is set as ΔEmax (S8205).
Then a decision is made as to whether i<NF (S8206). If i<NF, then i+1 is assigned to i (S8207) and the process returns to step S8203; otherwise, a decision is made as to whether ΔEmax>−127 (S8208). If ΔEmax>−127, then ΔEmax is used as the shift amount Sj (S8209) and the process ends. If ΔEmax≦−127, all samples in the frame are 0 and therefore the shift amount Sj is set to 0 (S8210). This processing is equivalent to determining the bit shift amount Sj, specifically (Q−1−Sj), such that bit-shifting the sample values assigns the largest amplitude among the samples in the frame to the largest amplitude in the range between the maximum value and the minimum value that can be represented by the integer part.
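The flow of FIG. 4 as described above can be sketched as follows. This is only an illustrative reading of steps S8201 to S8210; the helper names `exponent_field` and `shift_amount` are hypothetical and do not appear in the cited literature.

```python
import struct

E0 = 127  # exponent bias of the IEEE 754 32-bit format


def exponent_field(x):
    """Return the 8-bit exponent field E of x stored as a 32-bit float."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return (bits >> 23) & 0xFF


def shift_amount(frame):
    """Return the shift amount Sj of one frame following FIG. 4."""
    delta_e_max = -127                      # S8202
    for x in frame:                         # S8203 to S8207, one pass per sample
        delta_e = exponent_field(x) - E0    # S8203
        if delta_e > delta_e_max:           # S8204
            delta_e_max = delta_e           # S8205
    # S8208 to S8210: a frame consisting only of zeros yields Sj = 0
    return delta_e_max if delta_e_max > -127 else 0


print(shift_amount([0.5, -6.5, 3.0]))
```

In this frame the sample with the largest amplitude is −6.5, whose exponent E−E0 is 2, so the shift amount Sj is 2.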
FIG. 5 shows a variation (step S820′) of the possible processing flow at the shift amount calculation step (step S820) of FIG. 3. A sample represented in the 32-bit floating-point format of the IEEE 754 contains a special value such as a NaN (Not a Number) or an unnormalized number if E−E0 is 128 or −127. This variation differs from the processing shown in FIG. 4 in that only the values within the range −127<E−E0<128 among the samples in a frame are used to calculate the shift amount in determining the largest amplitude. Furthermore, in analysis of the i-th sample, the decimal point of the i-th sample is moved by using ΔEmax obtained so far and decision is made as to whether the value after the place shift is in the range that can be represented by a given number of quantized bits Q. If the value exceeds the range that can be represented by the given number of quantized bits Q as a result of the place shift, then 1 is added to ΔEmax so that the value does not exceed the range, which is another difference from the processing of FIG. 4.
Specifically, the processing flow differs as follows. Step S8221 is added between steps S8202 and S8203, where a decision is made as to whether −127<Ei−127<128 (S8221). If this is true, the process proceeds to step S8203; otherwise the process proceeds to step S8206. Furthermore, step S8220 is added between steps S8205 and S8206. At step S8220, first Xi multiplied by 2 to the power of (Q−1−ΔEmax) (that is, the value of Xi shifted by Q−1−ΔEmax bits) is assigned to X′i (S8222). A decision is made as to whether X′i>2^(Q−1)−1 or whether X′i<−2^(Q−1) (S8223). If the decision at step S8223 is true, 1 is added to ΔEmax (S8224); otherwise the process proceeds to step S8206.
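The variation of FIG. 5 can be sketched in the same manner. The sketch assumes that the range check of step S8220 is applied to every sample that passes step S8221, which is one possible reading of the description; the names `exponent_field` and `shift_amount_guarded` are hypothetical.

```python
import struct

E0 = 127  # exponent bias of the IEEE 754 32-bit format


def exponent_field(x):
    """Return the 8-bit exponent field E of x stored as a 32-bit float."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return (bits >> 23) & 0xFF


def shift_amount_guarded(frame, q=24):
    """Return the shift amount Sj following the variation of FIG. 5."""
    delta_e_max = -127                                   # S8202
    for x in frame:
        delta_e = exponent_field(x) - E0
        if not (-127 < delta_e < 128):                   # S8221: skip 0, NaN, etc.
            continue
        if delta_e > delta_e_max:                        # S8204
            delta_e_max = delta_e                        # S8205
        shifted = x * 2.0 ** (q - 1 - delta_e_max)       # S8222
        # S8223: does the shifted value leave the Q-bit two's complement range?
        if shifted > 2 ** (q - 1) - 1 or shifted < -(2 ** (q - 1)):
            delta_e_max += 1                             # S8224
    return delta_e_max if delta_e_max > -127 else 0      # S8208 to S8210


print(shift_amount_guarded([0.5, -6.5, 3.0]))
```

For the frame above the result is 3 rather than 2, because −6.5 shifted by Q−1−2 bits would fall below −2^(Q−1).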
FIG. 6 shows a detailed possible procedure (step S830 of FIG. 3) for separating an input signal Xi into an integer signal Yi and an error signal Zi using the shift amount Sj. The following process is sequentially performed for each of the NF samples Xi. The NF samples are taken from the frame buffer into the internal memory (S8301). An initial value of 1 is assigned to i, which indicates the number of a sample (S8302). A decision is made as to whether the exponent (Ei−127) of the input sample Xi is greater than −127 and less than 128 (S8303). If it is determined at step S8303 that the exponent is out of the range given above, the i-th sample has the value 0 or a special value such as an unnormalized value or NaN. Therefore, 0 is assigned to the integer part Yi of the sample after digit alignment and Xi is assigned to the error part Zi (S8309).
On the other hand, if it is determined at step S8303 that the exponent value is within the range, Xi is multiplied by 2 to the power of (Q−1−Sj) to obtain X′i (S8304). This means that if (Q−1−Sj) is positive, Xi is shifted by (Q−1−Sj) bits toward the MSB, and if (Q−1−Sj) is negative, Xi is shifted by −(Q−1−Sj) bits toward the LSB. Alternatively, E′i in the exponent value (E′i−127) of X′i is obtained from the exponent part Ei of sample Xi as E′i=Ei+(Q−1−Sj). This processing is equivalent to shifting all samples by (Q−1−Sj) bits to align decimal points so that the sample with the largest amplitude in the frame does not exceed the maximum amplitude that can be represented by the number of quantized bits Q of the integer part, by multiplying each of the samples in the frame by 2 to the power of (Q−1−Sj), which is common to all the samples.
A decision is made as to whether the exponent value (E′i−127) of the obtained X′i is greater than −127 and less than 128 (S8305). If the exponent value is out of the range, 0 is assigned to the integer part Yi (S8309). If the exponent value is within the range, a decision is made as to whether X′i is positive (S8306). If X′i is positive, the digits after the decimal point of X′i are discarded and the resulting value is set as the integer part Yi (S8307). If X′i is negative, the digits after the decimal point of X′i are rounded up and the rounded value is set as the integer part Yi (S8308). If Yi is not zero, the decimal portion of X′i is set as the error part Zi (S8307 and S8308). A decision is made as to whether i is less than NF (S8310). If i is less than NF, i+1 is assigned to i (S8311) and the process is repeated for the next sample. If i is greater than or equal to NF, the process ends. Separation between the integer signal and the error signal is not limited to the procedure described above; a number of separation methods are described in patent literature 1.
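The separation procedure of FIG. 6 can be sketched as below. The handling of an in-range sample whose integer part becomes 0 is not spelled out in the description; the sketch assumes that Xi itself is then kept as the error part, as in step S8309. The names `exponent_field` and `separate` are hypothetical.

```python
import math
import struct

E0 = 127  # exponent bias of the IEEE 754 32-bit format


def exponent_field(x):
    """Return the 8-bit exponent field E of x stored as a 32-bit float."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return (bits >> 23) & 0xFF


def separate(frame, s_j, q=24):
    """Split each sample Xi into an integer part Yi and an error part Zi."""
    shift = q - 1 - s_j
    ints, errs = [], []
    for x in frame:
        y, z = 0, x                                   # S8309: default outcome
        e = exponent_field(x)
        if -127 < e - E0 < 128:                       # S8303
            shifted = x * 2.0 ** shift                # S8304
            if -127 < e + shift - E0 < 128:           # S8305, using E'i = Ei + shift
                y = math.trunc(shifted)               # S8307/S8308: round toward zero
                if y != 0:
                    z = shifted - y                   # fractional part of X'i
                # Assumption: when y == 0 the error part stays equal to Xi.
        ints.append(y)
        errs.append(z)
    return ints, errs


# Q = 4 is used only to keep the numbers small; the description uses Q = 24.
print(separate([0.75, -6.5, 3.25], s_j=3, q=4))
```

Discarding the fraction of a positive value and rounding up a negative value both amount to rounding toward zero, which is why a single truncation covers steps S8307 and S8308 in this sketch.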
FIG. 7 shows a possible functional configuration of the integer signal coder 840 shown in FIG. 1. The integer signal coder 840 includes a segmentation section 8401, a linear prediction analyzing section 8402, a linear prediction coefficient coder 8403, a linear prediction coefficient decoder 8404, an inverse filter 8407, a sample buffer 8408, a residue signal coder 8409, and a multiplexer 8410. The segmentation section 8401 subdivides a frame of digital sampling value strings of an input integer signal into subframes. If frames are not to be subdivided, the segmentation section 8401 can be omitted. Hereinafter, division into frames and division into subframes are collectively referred to as framing.
The linear prediction analyzing section 8402 performs linear prediction analysis of a framed input integer signal (hereinafter referred to as an “input integer signal”) and outputs linear prediction coefficients. The order of the linear prediction coefficients is denoted by P. The linear prediction coefficient coder 8403 encodes the linear prediction coefficients provided by the linear prediction analyzing section 8402 and outputs a linear prediction coefficient code. The linear prediction coefficient decoder 8404 decodes the output from the linear prediction coefficient coder 8403 and outputs P-order quantized linear prediction coefficients. In this example, the output from the linear prediction coefficient coder 8403 is decoded by the linear prediction coefficient decoder 8404 to obtain quantized linear prediction coefficients. However, the linear prediction coefficient decoder 8404 may be omitted and a linear prediction coefficient code and its corresponding quantized linear prediction coefficients may be obtained from the linear prediction coefficient coder 8403.
The inverse filter 8407 restores the signal conveyed by the linear prediction coefficient code, that is, the predicted signal, by using the P-order quantized linear prediction coefficients outputted from the linear prediction coefficient decoder 8404, the sample values in the previous frame held in the sample buffer 8408, and the sample values in the current frame. The inverse filter 8407 then subtracts the restored signal from the input integer signal to output a residue signal. At least the last P sample values of the current frame are held in the sample buffer 8408. The residue signal coder 8409 codes the residue signal outputted from the inverse filter 8407 and outputs a residue code. The multiplexer 8410 combines the linear prediction coefficient code outputted from the linear prediction coefficient coder 8403 with the residue code outputted from the residue signal coder 8409 and outputs the combined result as an integer signal code. The linear prediction analyzing section 8402 may also use the last P samples in the previous frame for linear prediction analysis. In this case, the linear prediction analyzing section 8402 receives the last P samples of the previous frame from the sample buffer 8408 as indicated by the dashed line and box in FIG. 7.
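The operation of the inverse filter 8407 can be illustrated by the following sketch. It assumes that the predicted value is rounded to the nearest integer before subtraction and that missing history is treated as zeros; neither choice is stated in the description. The function name `lpc_residue` and its parameters are hypothetical.

```python
import math


def lpc_residue(samples, coeffs, history):
    """Return the residue of an integer frame under P-order linear prediction.

    samples: integer sample values of the current frame
    coeffs:  P quantized prediction coefficients, most recent sample first
    history: sample values of the previous frame (the sample buffer)
    """
    p = len(coeffs)
    # Keep the last P samples of the previous frame, padding with zeros.
    past = ([0] * p + list(history))[-p:] if p else []
    buf = past + list(samples)
    residue = []
    for n in range(p, len(buf)):
        predicted = sum(c * buf[n - 1 - k] for k, c in enumerate(coeffs))
        # Assumption: the prediction is rounded to the nearest integer.
        residue.append(buf[n] - math.floor(predicted + 0.5))
    return residue


# A second-order predictor 2*x[n-1] - x[n-2] follows a linear ramp exactly.
print(lpc_residue([3, 4, 5, 7], coeffs=[2.0, -1.0], history=[1, 2]))
```

Because the rounding is applied to the prediction rather than to the residue, the residue remains an integer and a decoder applying the same rounding can recover the samples exactly.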
FIG. 8 shows a possible functional configuration of a decoding apparatus 900 corresponding to the coding apparatus 800 shown in FIG. 1. FIG. 9 shows a processing flow in the decoding apparatus 900. The decoding apparatus 900 includes a demultiplexer 910, an integer signal decoder 920, an error signal decoder 930, and an integer/error signal combiner 940. The integer/error signal combiner 940 includes a reverse shifter 950 and an error component adder 960. The demultiplexer 910 stores and demultiplexes coded data (S910). The integer signal decoder 920 decodes the integer signal (S920). The error signal decoder 930 decodes the error signal (S930). The reverse shifter 950 of the integer/error signal combiner 940 reversely shifts the decoded integer signal (that is, shifts it in the direction opposite to the shift in coding) in accordance with the shift amount outputted from the demultiplexer 910 (S950). The error component adder 960 of the integer/error signal combiner 940 combines the reversely shifted integer signal with the error signal (S960).
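The recombination performed by the integer/error signal combiner 940 can be sketched as follows. The sketch assumes that the error signal carries the fractional part of the shifted sample when the integer part is nonzero and the original sample itself when the integer part is 0; it also adds the error before the reverse shift, which is arithmetically equivalent to reverse-shifting both parts and then adding. The function name `combine` is hypothetical.

```python
def combine(ints, errs, s_j, q=24):
    """Rebuild the samples Xi from integer parts Yi and error parts Zi."""
    # Reverse shift: undo the multiplication by 2^(Q-1-Sj) applied in coding.
    scale = 2.0 ** -(q - 1 - s_j)
    samples = []
    for y, z in zip(ints, errs):
        if y == 0:
            samples.append(z)                 # assumption: Zi holds Xi unchanged
        else:
            samples.append((y + z) * scale)   # S950 and S960 in one expression
    return samples


# Q = 4 is used only to keep the numbers small; the description uses Q = 24.
print(combine([0, -6, 3], [0.75, -0.5, 0.25], s_j=3, q=4))
```

Only the shift amount Sj has to be transmitted, since Q is fixed in advance and the decoder can form Q−1−Sj by itself.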
FIG. 10 shows a possible exemplary functional configuration of the integer signal decoder 920 in FIG. 8. The integer signal decoder 920 includes a demultiplexer 9201, a linear prediction coefficient decoder 9202, a residue signal decoder 9203, a sample buffer 9206, and a synthesis filter 9207. Coded data is received and stored at the demultiplexer 9201, where it is demultiplexed into a linear prediction coefficient code and a residue code. The linear prediction coefficient decoder 9202 decodes the linear prediction coefficient code and outputs linear prediction coefficients. The residue signal decoder 9203 decodes the residue code and outputs a residue signal. The synthesis filter 9207 uses the linear prediction coefficients outputted from the linear prediction coefficient decoder 9202, the sample values in the previous frame held in the sample buffer 9206, and the sample values in the current frame to synthesize a signal. The synthesis filter 9207 also adds the restored signal and the residue signal together to obtain an integer signal.
An input signal in integer form can be losslessly coded by performing linear prediction, for example, and applying lossless coding to linear prediction coefficients and linear prediction residues separately as described in Non-patent literature 2. In the coding method described in Non-patent literature 2, linear prediction coefficients are obtained for each frame of input data sample value strings in integer form, then the linear prediction coefficients are coded, an inverse filter (also called an analysis filter) is formed by using the linear prediction coefficients quantized in the coding process, a linear prediction residue signal is obtained, and the linear prediction residue signal is coded.
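The operation of the synthesis filter 9207 can be illustrated by the following sketch. It assumes that the coder rounded each predicted value to the nearest integer and treated missing history as zeros, so that the decoder repeats the same steps; these conventions are not stated in the description. The function name `lpc_synthesize` and its parameters are hypothetical.

```python
import math


def lpc_synthesize(residue, coeffs, history):
    """Rebuild an integer frame from its residue under P-order prediction.

    residue: decoded residue signal of the current frame
    coeffs:  P decoded prediction coefficients, most recent sample first
    history: sample values of the previous frame (the sample buffer)
    """
    p = len(coeffs)
    # Keep the last P samples of the previous frame, padding with zeros.
    buf = ([0] * p + list(history))[-p:] if p else []
    for r in residue:
        predicted = sum(c * buf[-1 - k] for k, c in enumerate(coeffs))
        # Assumption: same rounding rule as the one used by the coder.
        buf.append(r + math.floor(predicted + 0.5))
    return buf[p:]


print(lpc_synthesize([0, 0, 0, 1], coeffs=[2.0, -1.0], history=[1, 2]))
```

Each reconstructed sample is fed back into the buffer, so the prediction for the next sample uses exactly the values that were available at the coding end.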
Non-patent literature 1: Dai Yang and Takehiro Moriya, “Lossless Compression for Audio Data in the IEEE Floating-Point Format,” AES Convention Paper 5987, AES 115th Convention, New York, N.Y., USA, Oct. 10-13, 2003
Non-patent literature 2: Tilman Liebchen and Yuriy A. Reznik, “MPEG-4 ALS: An Emerging Standard for Lossless Audio Coding,” Proceedings of the Data Compression Conference (DCC '04), pp. 1068-0314/04, 2004
Patent literature 1: Brochure of WO2004/114527