This invention relates to a coding device and method for generating a code string by changing the compression rate of a code string generated by code string generation processing in accordance with limitation of the capacity of a transmission line or the like. The invention also relates to a decoding device and method for decoding a code string having the compression rate changed in accordance with the coding device and method. The invention also relates to a program recording medium for recording the coding method and the decoding method as software programs. The invention further relates to a data recording medium in which a code string having the compression rate changed in accordance with the coding method is recorded.
There are various techniques of high-efficiency coding of audio signals (including speech signals). For example, there is known a subband coding (SBC) technique, which is a non-blocked frequency subband coding system for splitting audio signals on the time base into a plurality of frequency bands and coding the plurality of frequency bands without blocking the audio signals, and a blocked frequency subband coding system, that is, a so-called transform coding system for converting (by spectrum conversion) signals on the time base to signals on the frequency base, then splitting the signals into a plurality of frequency bands, and coding the signals of each band. Also, a high-efficiency coding technique which combines the above-described subband coding and transform coding is considered. In this case, after band splitting is carried out in accordance with the subband coding, the signals of each band are spectrum-converted to signals on the frequency base and the spectrum-converted signals of each band are coded.
As a filter for the above-described band splitting, a QMF (quadrature mirror filter) is employed. This QMF filter is described in R. E. Crochiere, Digital coding of speech in subbands, Bell Syst. Tech. J. Vol. 55, No. 8, 1976. Also, a bandwidth filter splitting technique is described in Joseph H. Rothweiler, Polyphase Quadrature filters: A new subband coding technique, ICASSP 83, BOSTON.
As the above-described spectrum conversion, there is known spectrum conversion in which input audio signals are blocked on the basis of a predetermined unit time (frame) and converted from the time base to the frequency base by carrying out discrete Fourier transform (DFT), discrete cosine transform (DCT) or modified discrete cosine transform (MDCT) for each block. MDCT is described in J. P. Princen, A. B. Bradley, Subband/Transform Coding Using Filter Bank Designs Based on Time Domain Aliasing Cancellation, Univ. of Surrey, Royal Melbourne Inst. of Tech., ICASSP 1987.
As the signals split into each band by filtering or spectrum conversion are thus quantized, a band where quantization noise is generated can be controlled and more auditorily efficient coding can be carried out by utilizing the characteristics such as a masking effect. If normalization is carried out for each band with the maximum value of absolute values of signal components in each band before quantization is carried out, more auditorily efficient coding can be carried out.
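The per-band normalization and quantization described above can be sketched as follows. This is a minimal illustration only: the function names, the uniform quantizer, and the `steps` parameter (number of quantization levels on each side of zero) are assumptions made for the example and are not taken from the text.

```python
def normalize_and_quantize(band, steps):
    """Normalize a band by its peak magnitude, then quantize uniformly.

    `steps` is a hypothetical parameter: the number of quantization
    levels on each side of zero.
    """
    scale = max(abs(v) for v in band)  # normalization coefficient
    if scale == 0.0:
        return 0.0, [0] * len(band)
    # After division by the peak, every value lies in [-1, 1].
    codes = [round(v / scale * steps) for v in band]
    return scale, codes


def dequantize(scale, codes, steps):
    """Invert normalize_and_quantize, up to quantization error."""
    return [c / steps * scale for c in codes]


band = [0.02, -0.5, 0.25, 0.125]
scale, codes = normalize_and_quantize(band, steps=4)
restored = dequantize(scale, codes, steps=4)
```

In this example the smallest component (0.02) falls below half a quantization step and is restored as zero, while the larger components are restored exactly. Normalizing by the band peak keeps the quantizer range fully used regardless of the absolute level of the band.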
With respect to the frequency splitting width for quantizing each frequency component obtained by frequency band splitting, for example, band splitting in consideration of human auditory characteristics is carried out. Specifically, audio signals are split into a plurality of bands (for example, 25 bands) with a bandwidth broader in higher frequency areas, generally referred to as critical bands. In coding the data of each band in this case, predetermined bit distribution for each band or adaptive bit allocation for each band is carried out. For example, in coding coefficient data obtained by MDCT processing by using bit allocation, the MDCT coefficient data of each band obtained by MDCT processing for each block is coded with an adaptive number of allocated bits. Two techniques for such bit allocation are known.
One technique is disclosed in R. Zelinski and P. Noll, Adaptive Transform Coding of Speech Signals, IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. ASSP-25, No. 4, August 1977. In this technique, bit allocation is carried out on the basis of the magnitude of signals of each band. In accordance with this technique, the quantization noise spectrum is flat and the noise energy is minimum. However, since the auditory masking effect is not utilized, the noise actually perceived is not optimum.
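As an illustration of magnitude-based allocation, the following sketch uses the familiar textbook rule in which each band receives the average bit count plus half the base-2 logarithm of the ratio of its energy to the geometric mean of all band energies. It is offered under that assumption only and is not claimed to reproduce the procedure of the cited paper; the function name and the example energies are hypothetical.

```python
import math


def allocate_bits(band_energies, avg_bits):
    """Return a (possibly fractional) bit count for each band.

    Assumed rule: b_i = avg_bits + 0.5 * log2(e_i / geometric_mean(e)).
    All energies must be positive. Results are not rounded or clamped,
    so a practical coder would still need to handle negative values.
    """
    log_e = [math.log2(e) for e in band_energies]
    mean_log = sum(log_e) / len(log_e)  # log2 of the geometric mean
    return [avg_bits + 0.5 * (le - mean_log) for le in log_e]


bits = allocate_bits([16.0, 4.0, 1.0, 0.25], avg_bits=3.0)
```

With these example energies the allocation is 4.5, 3.5, 2.5 and 1.5 bits, which sums to the same total as a uniform allocation of 3 bits per band. Each fourfold increase in energy earns one additional bit, which is what keeps the quantization noise level equal across bands.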
The other technique is disclosed in M. A. Krasner, The critical band coder: digital encoding of the perceptual requirements of the auditory system, MIT, ICASSP 1980. In this technique, fixed bit allocation is carried out by utilizing the auditory masking effect and thus obtaining a necessary signal-to-noise ratio for each band. In this technique, however, since bit allocation is fixed, a satisfactory characteristic value is not obtained even when characteristics are measured with a sine-wave input.
In order to solve these problems, there is proposed a high-efficiency coding device for divisionally using all the bits usable for bit allocation, for a predetermined fixed bit allocation pattern of each subblock and for bit distribution depending upon the magnitude of signals of each block, and causing the division ratio to depend upon the signals related with input signals so that the division rate for the fixed bit allocation is increased as the spectrum of the signals becomes smoother.
According to this method, in the case where the energy is concentrated at a specified spectrum as in a sine wave input, a large number of bits are allocated to a block including that spectrum, thereby enabling significant improvement in the overall signal-to-noise characteristic. Since the human auditory sense is generally acute to a signal having a steep spectral component, improvement in the signal-to-noise characteristic by using such a method not only leads to improvement in the numerical value of measurement but also is effective for improving the sound quality perceived by the auditory sense.
In addition to the foregoing methods, various other methods for bit allocation are proposed. Therefore, if a fine and precise model with respect to the auditory sense is realized and the capability of the coding device is improved, auditorily more efficient coding can be carried out.
For example, the present Assignee has proposed a method for separating tonal components which are particularly important in terms of the auditory sense from spectral signals and coding these tonal components separately from the other spectral components. Thus, it is possible to efficiently code audio signals at a high compression rate without generating serious deterioration in the sound quality perceived by the auditory sense.
In the case where DFT or DCT is used as a method for converting waveform signals to the spectrum, M units of independent real-number data are obtained by carrying out conversion with a time block consisting of M samples. In general, M1 samples of each of adjacent blocks are caused to overlap each other in order to reduce connection distortion between time blocks. Therefore, in DFT or DCT, M units of real-number data are quantized and coded with respect to (M-M1) samples on the average.
On the other hand, in the case where MDCT is used as a method for conversion to the spectrum, M units of independent real-number data are obtained from 2M samples having M samples caused to overlap M samples of the adjacent period. Therefore, M units of real-number data are quantized and coded with respect to M samples on the average.
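The difference between the two cases can be checked with simple arithmetic. The sketch below computes how many real-number coefficients must be coded per newly input sample; the block sizes are arbitrary example values, not figures from the text, and the function name is hypothetical.

```python
def coefficients_per_new_sample(block_len, overlap, coeffs_per_block):
    """Coefficients to be coded for each sample the block advances by."""
    hop = block_len - overlap  # new samples consumed per block
    return coeffs_per_block / hop


# DFT or DCT: a block of M samples yields M coefficients, with M1
# samples shared with the adjacent block (example: M = 1024, M1 = 128).
dct_rate = coefficients_per_new_sample(1024, 128, 1024)

# MDCT: a block of 2M samples yields M coefficients, with M samples
# shared with the adjacent block (example: M = 1024).
mdct_rate = coefficients_per_new_sample(2048, 1024, 1024)
```

For the example sizes, DFT or DCT requires 1024/896 (about 1.14) coefficients per input sample, whereas MDCT requires exactly one, even though each MDCT block overlaps its neighbor by half.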
In a decoding device, waveform elements obtained by inversely converting each block of codes thus obtained by using MDCT are added to each other while being caused to interfere with each other. Thus, waveform signals can be reconstituted.
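The overlap-and-add reconstruction can be demonstrated numerically. The following is a direct, unoptimized sketch assuming a sine window applied at both analysis and synthesis and a 2/M scaling in the inverse transform; these choices, the function names, and the test signal are assumptions for illustration and are not specified in the text.

```python
import math


def sine_window(M):
    """Sine window of length 2M; satisfies w[n]^2 + w[n+M]^2 = 1."""
    return [math.sin(math.pi * (n + 0.5) / (2 * M)) for n in range(2 * M)]


def mdct(block, window):
    """2M windowed time samples -> M spectral coefficients."""
    M = len(block) // 2
    return [
        sum(block[n] * window[n]
            * math.cos(math.pi / M * (n + 0.5 + M / 2) * (k + 0.5))
            for n in range(2 * M))
        for k in range(M)
    ]


def imdct(coeffs, window):
    """M coefficients -> 2M windowed samples that still contain aliasing."""
    M = len(coeffs)
    return [
        window[n] * (2.0 / M)
        * sum(coeffs[k]
              * math.cos(math.pi / M * (n + 0.5 + M / 2) * (k + 0.5))
              for k in range(M))
        for n in range(2 * M)
    ]


def analyze_synthesize(signal, M):
    """Transform half-overlapping blocks, invert, and overlap-add."""
    window = sine_window(M)
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - 2 * M + 1, M):
        block = signal[start:start + 2 * M]
        rebuilt_block = imdct(mdct(block, window), window)
        for n in range(2 * M):
            out[start + n] += rebuilt_block[n]
    return out


M = 8
signal = [math.sin(0.3 * n) + 0.5 * math.cos(1.1 * n) for n in range(6 * M)]
rebuilt = analyze_synthesize(signal, M)
# The first and last M samples are covered by only one block, so the
# comparison is restricted to the fully overlapped interior.
max_err = max(abs(a - b) for a, b in zip(signal[M:-M], rebuilt[M:-M]))
```

Each inverse-transformed block on its own differs from the input because of time-domain aliasing; the aliasing terms of adjacent blocks have opposite signs and cancel when the overlapping halves are added, so `max_err` is at the level of floating-point rounding.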
In general, by elongating the time block for conversion, the frequency resolution of spectrum is increased and the energy is concentrated at a specified spectral component. Therefore, more efficient coding than in the case where DFT or DCT is used can be carried out by using MDCT in which adjacent blocks are caused to overlap each other by half so as to carry out conversion with a large block length and in which the number of resultant spectral signals is not increased from the number of original time samples. Also, the inter-block distortion of waveform signals can be reduced by causing adjacent blocks to have sufficiently long overlap.
In actual generation of a code string, first, quantization precision information and normalization coefficient information are coded with a predetermined number of bits for each band to be normalized and quantized, and then the normalized and quantized spectral signals may be coded.
For coding spectral signals, a method using a variable-length code such as a Huffman code is known. The Huffman code is described in David A. Huffman, A Method for the Construction of Minimum-Redundancy Codes, Proceedings of the I. R. E., pp. 1098-1101, September 1952.
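A minimal sketch of Huffman code construction is given below for reference. The symbol set (small quantized spectral values) and their frequencies are hypothetical example data; the function name is likewise an assumption.

```python
import heapq
from itertools import count


def huffman_code(freqs):
    """Build a prefix code {symbol: bit string} from {symbol: frequency}."""
    tiebreak = count()  # unique counter so the tables are never compared
    heap = [(f, next(tiebreak), {sym: ""}) for sym, f in freqs.items()]
    heapq.heapify(heap)
    if len(heap) == 1:
        # A single symbol still needs one bit to be representable.
        return {sym: "0" for sym in heap[0][2]}
    while len(heap) > 1:
        # Merge the two least frequent subtrees, prefixing their codes.
        f0, _, t0 = heapq.heappop(heap)
        f1, _, t1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in t0.items()}
        merged.update({s: "1" + c for s, c in t1.items()})
        heapq.heappush(heap, (f0 + f1, next(tiebreak), merged))
    return heap[0][2]


# Hypothetical histogram of quantized spectral values in one block.
freqs = {0: 8, 1: 4, -1: 2, 2: 1, -2: 1}
code = huffman_code(freqs)
total_bits = sum(freqs[s] * len(code[s]) for s in freqs)
```

For this histogram the 16 values are coded in 30 bits, compared with 48 bits for a fixed 3-bit code, because the most frequent value receives the shortest codeword.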
Generally, with respect to a code string generated by a coding device, sub information S made up of the quantization precision and normalization coefficient and main information M made up of the quantization spectrum are arranged in this order, as shown in FIG. 1, in each code string block constituted by coded data obtained by coding a time signal for each predetermined time. The sub information S is auxiliary information for restoring original spectral components and includes a plurality of parameters such as sub information S1, S2, . . . , Sn.
Meanwhile, in some cases, a code string having a compression rate changed in accordance with a change of the transmission line capacity of a transmission medium is produced from a code string which has already been generated. In general, to regenerate a code string having a changed compression rate from a given code string, the given code string is first decomposed and its signal components are decoded so that the number of bits can be adjusted. Then, bit redistribution is calculated and the quantization precision and normalization coefficient are changed, in addition to limitation of the frequency band. Finally, re-quantization and generation of a code string are carried out.
In the conventional method, however, in generating a code string having a changed compression rate from a code string outputted from the coding device, the operation scale substantially similar to that of decoding and coding of acoustic waveform signals is required. Therefore, the conventional method is not suitable for processing which requires high-speed operation, for example, real-time processing for converting the compression rate.
In view of the foregoing status of the art, it is an object of the present invention to provide a coding device and method which enable generation of a code string having a compression rate changed at a high speed with a small quantity of operation.
In view of the foregoing status of the art, it is another object of the present invention to provide a decoding device and method for decoding a code string having a compression rate changed at a high speed with a small quantity of operation.
It is still another object of the present invention to provide a program recording medium in which a program enabling generation of a code string having a compression rate changed at a high speed with a small quantity of operation is recorded, and a program recording medium in which a program enabling decoding of the code string is recorded.
It is a further object of the present invention to provide a data recording medium in which a code string having a compression rate changed at a high speed with a small quantity of operation is recorded.
In order to solve the foregoing problems, in a coding device and method according to the present invention, when a code string is to be generated from an input signal, a code string equivalent to minimum necessary information for decoding an entire code string block equivalent to a frame, that is, each time unit, is arranged at a leading part of the code string block. In the remaining part, codes such as a normalization coefficient, the number of quantization steps and a spectrum coefficient corresponding to a partial spectral component are collectively used as a unit, and code strings are stored in the order from a code string of the highest importance for decoding a part of the code string block.
Then, a code string having a different length in accordance with a selected compression rate is cut out from the leading part of the code string block of each unit time, thus enabling regeneration of a code string of a different length. Therefore, a code string having a changed compression rate can be generated at a high speed with a small quantity of operation or a simple structure.
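The arrangement and cut-out operation described above can be sketched as follows. The byte layout is purely an assumption for illustration: a fixed-length header holding the minimum necessary information, followed by units that each carry a one-byte length prefix (so each payload must be shorter than 256 bytes). The function names, the numeric importance values, and the choice to cut only at unit boundaries are likewise hypothetical and are not specified in the text.

```python
def build_block(header, units):
    """Assemble one code string block.

    `units` is a list of (importance, payload) pairs. Units are stored
    after the header in descending order of importance.
    """
    ordered = sorted(units, key=lambda u: u[0], reverse=True)
    block = bytearray(header)
    for _, payload in ordered:
        block.append(len(payload))  # hypothetical 1-byte length prefix
        block += payload
    return bytes(block)


def truncate_block(block, header_len, max_len):
    """Keep the header and as many whole leading units as fit in max_len."""
    if max_len < header_len:
        raise ValueError("max_len is too small to hold the header")
    end = header_len
    while end < len(block):
        unit_len = 1 + block[end]  # length prefix plus payload
        if end + unit_len > max_len:
            break
        end += unit_len
    return block[:end]


full = build_block(b"HD", [(0.2, b"hi-band"), (0.9, b"low"), (0.5, b"mid!")])
short = truncate_block(full, header_len=2, max_len=12)
```

Here `full` is 19 bytes long and `short` keeps the header and the two most important units in 11 bytes. Changing the compression rate in this sketch only requires walking the length prefixes and slicing the block; no signal component is decoded or re-quantized.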
Also, in a decoding device and method according to the present invention, to decode codes generated by coding a signal of each predetermined unit time on the side of a coding device, a code string having partial code strings, including auxiliary data for decoding generated for each of a plurality of frequency bands from the codes on the side of the coding device and main data expressing components of the signal, arrayed in a predetermined order from a leading part of a code string block of each predetermined unit time is decomposed into the codes, and an output signal is generated on the basis of the codes obtained by decomposition.
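The decomposition on the decoding side can be sketched in the same spirit. The layout assumed here is hypothetical: a fixed-length header, then units each consisting of a one-byte length followed by a band index, a normalization index, the number of quantization steps, and signed 8-bit quantized values. The mapping from normalization index to scale (a power of two) and all function names are also assumptions made for the example.

```python
def decompose_block(block, header_len):
    """Split a block into its header and the complete units that follow."""
    header = block[:header_len]
    units = []
    pos = header_len
    while pos < len(block):
        unit_len = block[pos]
        payload = block[pos + 1:pos + 1 + unit_len]
        if unit_len < 3 or len(payload) < unit_len:
            break  # malformed or cut-off trailing unit: stop here
        band, scale_index, steps = payload[0], payload[1], payload[2]
        # Interpret the remaining bytes as signed 8-bit values.
        codes = [b - 256 if b > 127 else b for b in payload[3:]]
        units.append((band, scale_index, steps, codes))
        pos += 1 + unit_len
    return header, units


def rebuild_spectrum(units, num_bands, band_width):
    """Dequantize each received band; bands not received stay at zero."""
    spectrum = [0.0] * (num_bands * band_width)
    for band, scale_index, steps, codes in units:
        scale = 2.0 ** -scale_index  # hypothetical normalization table
        for i, c in enumerate(codes):
            spectrum[band * band_width + i] = c / steps * scale
    return spectrum


block = (b"H"
         + bytes([5, 0, 1, 2, 2, 0xFF])   # band 0: codes [2, -1]
         + bytes([5, 1, 2, 4, 4, 0])      # band 1: codes [4, 0]
         + bytes([5, 2, 0]))              # band 2: cut off mid-unit
header, units = decompose_block(block, header_len=1)
spectrum = rebuild_spectrum(units, num_bands=3, band_width=2)
```

In this example the third unit is incomplete, so only bands 0 and 1 are restored and band 2 is left at zero. A shortened block therefore still decodes, with the missing bands simply absent from the output signal.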
Also, in a program recording medium according to the present invention, a coding program is recorded. The coding program includes a transform step of converting an input signal to a plurality of units of information of each frequency band, a coding step of coding the information of each band from the transform step, and a code string generation step of generating a plurality of partial code strings made up of auxiliary data and main data with respect to codes equivalent to information of each predetermined unit time from the coding step and rearranging the partial code strings in the order from a partial code string of the highest importance from a leading part of a code string block of each predetermined unit time, thus generating a code string.
Also, in a program recording medium according to the present invention, a decoding program for decoding codes generated by coding a signal of each predetermined unit time on the side of a coding device is recorded. The decoding program includes a decomposition step of decomposing into the codes a code string having partial code strings, including auxiliary data for decoding generated for each of a plurality of frequency bands from the codes on the side of the coding device and main data expressing components of the signal, arrayed in a predetermined order from a leading part of a code string block of each predetermined unit time, and a signal generation step of generating an output signal on the basis of the codes obtained by decomposition of the decomposition step.
Moreover, in a data recording medium according to the present invention, a code string is recorded. The code string is generated by converting an input signal to a plurality of units of information of each of a plurality of frequency bands, coding the information of each band, forming a plurality of partial code strings made up of auxiliary data and main data with respect to codes equivalent to information of each predetermined unit time, and rearranging the plurality of partial code strings in the order from a partial code string of the highest importance from a leading part of a code string block of each predetermined unit time.
FIG. 1 shows the format of a code string block generated by a conventional coding device.
FIG. 2 is a block diagram showing an audio coding device as an embodiment of the coding device and method according to the present invention.
FIG. 3 is a block diagram showing details of a transform circuit constituting the audio coding device.
FIG. 4 is a block diagram showing details of a code string generation circuit constituting the audio coding device.
FIG. 5 shows the level of absolute value of spectral components from the transform circuit, in decibel.
FIG. 6 shows the format of an exemplary code string block generated by the code string generation circuit.
FIG. 7 shows the format of another exemplary code string block generated by the code string generation circuit.
FIG. 8 is a flowchart for explaining the flow of processing in a compression rate change circuit constituting the audio coding device.
FIG. 9 is a block diagram showing the structure of an exemplary decoding device for decoding an audio signal from a code string generated by the audio coding device shown in FIG. 2.
FIG. 10 is a block diagram showing details of an inverse transform circuit constituting the decoding device.
FIG. 11 is a block diagram showing the structure of another exemplary decoding device for decoding an audio signal from a code string generated by the audio coding device shown in FIG. 2.
FIG. 12 shows an exemplary structure of an embodiment of a transmission system to which the present invention is applied.
FIG. 13 is a block diagram showing an exemplary hardware structure of a server 61 of FIG. 12.
FIG. 14 is a block diagram showing an exemplary hardware structure of a client terminal 63 of FIG. 12.