1. Field of the Invention
The present invention relates to a method and an apparatus for transforming coded information signals. The invention also relates to a program providing medium on which a program for transforming coded information signals is recorded.
2. Description of the Related Art
High-efficiency coding methods are known in which the amount of data for audio or sound signals is compressed with very little loss in the acoustic quality. Various high-efficiency coding methods for coding audio or sound signals are available and include, for example, the non-block frequency band division technique, i.e., the subband coding (SBC) technique, and the block frequency-band division technique, i.e., the transform coding technique. In the subband coding technique, an audio signal in the time domain is divided into a plurality of frequency bands and coded rather than forming the audio signal into blocks. In the transform coding technique, a signal in the time domain is transformed (spectrum-transformed) into a signal in the frequency domain, and is divided into a plurality of frequency bands. The signal component in each band is then coded.
As a filter used for frequency division, a quadrature mirror filter (QMF) may be used, which is discussed in the technical document "R. E. Crochiere, Digital coding of speech in subbands, Bell Syst. Tech. J., Vol. 55, No. 8, 1976".
In the above-described QMF filter, aliasing components generated by signals decimated to a half rate after performing frequency division are canceled by aliasing components generated when the signals in the respective bands are synthesized. Because of this characteristic, the loss incurred by coding can be almost completely eliminated if the signal components in the respective bands are coded with a sufficiently high precision.
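The cancellation of aliasing components described above can be illustrated with a minimal sketch. It assumes the shortest possible QMF pair (the two-tap Haar filters), which is chosen here only for brevity and is not a filter named in the cited document; the function names are hypothetical.

```python
import math

SCALE = 1.0 / math.sqrt(2.0)  # two-tap Haar low-pass/high-pass gain (assumed filter)

def qmf_analysis(x):
    """Split an even-length signal into low and high bands, each decimated to half rate."""
    low = [SCALE * (x[2 * i] + x[2 * i + 1]) for i in range(len(x) // 2)]
    high = [SCALE * (x[2 * i] - x[2 * i + 1]) for i in range(len(x) // 2)]
    return low, high

def qmf_synthesis(low, high):
    """Recombine the two half-rate bands; aliasing from decimation cancels here."""
    out = []
    for l, h in zip(low, high):
        out.append(SCALE * (l + h))
        out.append(SCALE * (l - h))
    return out

signal = [0.0, 1.0, 0.5, -0.5, 0.25, 0.75, -1.0, 0.0]
low_band, high_band = qmf_analysis(signal)
restored = qmf_synthesis(low_band, high_band)
```

With unquantized band signals the reconstruction error is at the level of floating-point rounding, which corresponds to the statement that the coding loss can be almost completely eliminated when the band signals are coded with sufficient precision.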
In the technical document "Joseph H. Rothweiler, Polyphase Quadrature Filters - A New Subband Coding Technique, ICASSP 83, BOSTON, 1983", a polyphase quadrature filter (PQF) used in the equal-bandwidth filter division technique is described. In this PQF, aliasing components generated by the signal components between the adjacent bands, which are decimated to a rate in accordance with the bandwidth after performing frequency division, are canceled by aliasing components generated by the signal components between the adjacent bands when the signal components in the respective bands are synthesized. Because of this characteristic, the loss incurred by coding can be almost completely eliminated if the signal components in the respective bands are coded with a sufficiently high precision.
As the aforementioned spectrum transform, the following type of spectrum transform, for example, is known. An input audio signal is formed into blocks with a predetermined unit time (frame), and discrete Fourier transform (DFT), discrete cosine transform (DCT), modified discrete cosine transform (MDCT), etc. may be performed on the signal component in each block, thereby transforming the time domain signal into signal components in the frequency domain. The MDCT is discussed in, for example, the technical document "J. P. Princen, A. B. Bradley, Subband/Transform Coding Using Filter Bank Designs Based on Time Domain Aliasing Cancellation, ICASSP 1987, Univ. of Surrey Royal Melbourne Inst. of Tech.".
According to the above-described DFT or DCT used as a method for transforming a waveform signal into a spectrum, the waveform signal is transformed by a time block having M number of samples so as to obtain M independent items of real-number data. In order to reduce the connection distortion between the time blocks, samples are generally overlapped by M1 number of samples between adjacent blocks. Accordingly, in the DFT or DCT, M items of real-number data are quantized and coded in relation to (M-M1) number of samples.
On the other hand, according to the above-described MDCT used as a method for transforming a waveform signal into a spectrum, M independent items of real-number data are obtained from 2M samples having an overlapping portion between adjacent time blocks of M number of samples. Accordingly, in the MDCT, M items of real-number data are quantized and coded in relation to M samples. For example, in a decoder, the codes obtained by using the MDCT are inverse-transformed in the respective blocks so as to produce waveform elements. The waveform elements are then added while interfering with each other, thereby reconstructing the waveform signal.
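The relationship between 2M input samples, M coefficients, and overlap-add reconstruction can be sketched as follows. This is a direct (non-fast) evaluation for illustration only; the sine window, the 2/M scaling, and all function names are assumptions made for this sketch and are not taken from the cited document.

```python
import math

def sine_window(m):
    """Sine window of length 2M; satisfies w[n]^2 + w[n+M]^2 = 1 (assumed window)."""
    return [math.sin(math.pi * (n + 0.5) / (2 * m)) for n in range(2 * m)]

def mdct(block, m):
    """2M windowed time samples -> M real coefficients."""
    w, n0 = sine_window(m), 0.5 + m / 2.0
    return [sum(w[n] * block[n] * math.cos(math.pi / m * (n + n0) * (k + 0.5))
                for n in range(2 * m)) for k in range(m)]

def imdct(coeffs, m):
    """M coefficients -> 2M windowed waveform elements that still contain time aliasing."""
    w, n0 = sine_window(m), 0.5 + m / 2.0
    return [w[n] * (2.0 / m) * sum(coeffs[k] * math.cos(math.pi / m * (n + n0) * (k + 0.5))
                                   for k in range(m)) for n in range(2 * m)]

def analyze_and_resynthesize(signal, m):
    """Transform blocks that overlap by M samples, then overlap-add the inverse transforms.
    len(signal) must be a multiple of M; M zeros are padded on each side."""
    padded = [0.0] * m + list(signal) + [0.0] * m
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * m + 1, m):
        coeffs = mdct(padded[start:start + 2 * m], m)   # M values per M new samples
        for n, y in enumerate(imdct(coeffs, m)):
            out[start + n] += y                         # aliasing cancels in the sum
    return out[m:m + len(signal)]

test_signal = [math.sin(0.3 * n) + 0.5 * math.cos(1.1 * n) for n in range(32)]
rebuilt = analyze_and_resynthesize(test_signal, 8)
```

Each block advances by M samples and yields M coefficients, so the number of coded values equals the number of time samples, while adjacent blocks still overlap by half their length.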
Generally, by increasing the length of the time block used for transforming, the frequency resolution of the spectra is increased to concentrate energy in a specific spectral component. Thus, by using the MDCT in which transforming is conducted with a longer block having an overlapping portion by an amount of a half block between adjacent blocks, and in which the obtained number of spectral signal components is not more than the original number of time samples, coding can be performed with higher efficiency than in the aforementioned DFT or DCT.
Additionally, a sufficient length of the overlapping portions is provided between adjacent blocks, thereby reducing the interblock distortion of a waveform signal. However, a larger work area for transforming is required with an increased length of the transform block, thereby hampering the miniaturization of, for example, reproduction means. This causes an increase in cost, particularly when it is difficult to increase the integration level of a semiconductor.
According to the above description, signal components divided into the respective bands by using a filter or spectrum transform are quantized, which makes it possible to control the bands in which quantizing noise is generated. By further utilizing characteristics, such as the masking effect, acoustically higher-efficiency coding can be performed.
The masking effect is an effect in which louder sounds acoustically mask softer sounds. By utilizing this effect, the generated quantizing noise can be acoustically masked by the original signal sound. Thus, the sound quality of the compressed signal is almost the same as that of the original signal. For effectively utilizing the masking effect, however, it is necessary to control the generation of the quantizing noise in the time domain or in the frequency domain. For example, if quantizing noise is generated for a few milliseconds or greater during a small magnitude of signal immediately before the attack portion in which the magnitude of signal sharply increases, it is no longer masked by the signal sound. This further brings about the loss of sound quality to such a degree as to be uncomfortable from an auditory point of view, which is referred to as "pre-echo". To overcome this drawback, the block length used in transforming a waveform signal into spectral signal components is changed in accordance with the characteristics of the signal component in the corresponding block. Before performing quantization, the signal component in each band is normalized by the maximum of the absolute value of the signal component, thereby making it possible to perform higher-efficiency coding.
To determine the frequency division width used for quantizing the individual frequency components, a band division technique may be employed by considering human auditory characteristics. For example, an audio signal may be divided into a plurality of bands, for example, 25 bands, whose bandwidths increase toward the higher range. Such bands are generally referred to as "critical bands".
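The normalization performed before quantization can be sketched as follows. The uniform symmetric quantizer, the way the word length is used, and the function names are assumptions for illustration; the text above specifies only that each band is normalized by the maximum absolute value.

```python
def normalize_and_quantize(band, bits):
    """Normalize a band by its maximum absolute value, then quantize uniformly.
    `bits` (>= 2) is the assumed word length per spectral component."""
    scale = max(abs(v) for v in band) or 1.0   # normalizing coefficient
    levels = (1 << (bits - 1)) - 1             # symmetric range -levels..+levels
    codes = [round(v / scale * levels) for v in band]
    return scale, codes

def dequantize(scale, codes, bits):
    """Invert the quantizer using the transmitted normalizing coefficient."""
    levels = (1 << (bits - 1)) - 1
    return [c / levels * scale for c in codes]

band = [0.02, -0.5, 0.31, 0.07]
scale, codes = normalize_and_quantize(band, bits=4)
approx = dequantize(scale, codes, bits=4)
```

Because the values are scaled to the full quantizer range before rounding, the quantizing error in this sketch is bounded by half a step of the normalized range regardless of the absolute level of the band.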
Data in each band is coded by performing predetermined bit allocation or adaptive bit allocation. For example, in coding coefficient data obtained by the aforementioned MDCT according to the above-described bit allocation, the MDCT coefficient data in each band obtained by performing the aforementioned MDCT on each block is coded with the adaptively allocated number of bits.
As the bit allocation methods, the following two methods are known.
In a first method, bit allocation is conducted based on the magnitude of a signal component in each band. According to this method, a quantizing noise spectrum becomes flat to minimize noise energy. However, since the masking effect is not utilized, the actual acoustic perception is not optimized. The above-described first method is discussed in the technical document "R. Zelinski and P. Noll, Adaptive Transform Coding of Speech Signals, IEEE Transactions on Acoustics, Speech and Signal Processing, vol. ASSP-25, No. 4, August 1977".
In a second method, a signal-to-noise ratio required for each band is obtained by utilizing the auditory masking effect, and then, fixed bit allocation is performed. According to this method, however, bit allocation is fixed even when the characteristics are measured with a sine wave input, thus failing to exhibit very good characteristic values. The above-described second method is discussed in the technical document "M. A. Krasner, MIT, The critical band coder - digital encoding of the perceptual requirements of the auditory system, ICASSP 1980".
To solve the problems inherent in the aforementioned methods, the following high-efficiency coding method has been proposed. All the bits usable for bit allocation are divided between the fixed bit allocation pattern in which a predetermined number of fixed bits are allocated to the individual small blocks, and the bit allocation pattern in which bits are allocated according to the magnitude of a signal component in each block. The division ratio between the patterns is determined by a spectral signal (for example, by a normalized signal) related to the input signal. The ratio of the fixed bit allocation pattern becomes higher in response to a smoother spectrum of the signal.
According to the above method, if sine waves are input, in which case, energy concentrates in a specific spectral component, a greater number of bits are allocated to the block containing such a specific spectral component, thereby remarkably improving the overall signal-to-noise ratio. Generally, humans are extremely sensitive to signals having sharp spectral components. Accordingly, by improving the signal-to-noise ratio by using the above method, not only the measurement values are improved, but also the sound quality is effectively enhanced.
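The division of the available bits between a fixed pattern and a magnitude-dependent pattern can be sketched as follows. The smoothness measure (spectral flatness, the ratio of geometric to arithmetic mean), the logarithmic weighting of the adaptive share, and all names are assumptions made for this sketch; the method described above does not prescribe them.

```python
import math

def spectral_flatness(energies):
    """Geometric mean / arithmetic mean: near 1 for a smooth spectrum, near 0 for a tonal one."""
    eps = 1e-12
    geo = math.exp(sum(math.log(e + eps) for e in energies) / len(energies))
    arith = sum(energies) / len(energies) + eps
    return min(1.0, geo / arith)

def mixed_allocation(energies, fixed_pattern, total_bits):
    """Return real-valued bit counts per block.
    The fixed pattern's share of total_bits grows as the spectrum becomes smoother."""
    ratio = spectral_flatness(energies)             # share given to the fixed pattern
    fixed_sum = sum(fixed_pattern)
    weights = [math.log2(1.0 + e) for e in energies]
    weight_sum = sum(weights) or 1.0
    return [total_bits * ratio * f / fixed_sum
            + total_bits * (1.0 - ratio) * w / weight_sum
            for f, w in zip(fixed_pattern, weights)]

pattern = [4, 3, 2, 1]                              # hypothetical fixed pattern
flat_alloc = mixed_allocation([1.0, 1.0, 1.0, 1.0], pattern, 40)
tonal_alloc = mixed_allocation([100.0, 0.01, 0.01, 0.01], pattern, 40)
```

In this sketch a flat spectrum reproduces the fixed pattern, whereas a sine-like input moves most of the budget to the block containing the dominant component, which is the behavior described above.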
In addition to the aforementioned method, many other bit allocation methods have been proposed. As acoustic models become more precise and coding apparatuses attain higher performance, higher-efficiency coding can be achieved from the auditory point of view. In such methods, generally, the real-number bit-allocation reference value that can most faithfully implement the calculated signal-to-noise ratio is determined, and the integer approximating the bit-allocation reference value is determined to be the number of bits to be allocated.
In constructing an actual code string, the quantizing precision information and the normalizing coefficient information for each band in which quantization and normalization are conducted are first coded with a predetermined number of bits, and the normalized and quantized spectral signal is then coded. In the technical document "ISO/IEC 11172-3: 1993(E)", a high-efficiency coding method in which the number of bits representing the quantizing precision information varies according to the band is described. This method is standardized in such a manner that the number of bits representing the quantizing precision information becomes smaller with respect to a higher frequency range.
Hitherto, a method is also known in which, instead of the quantizing precision information being coded directly, a decoder determines the quantizing precision information from the normalizing coefficient information. In this method, however, the relationship between the normalizing coefficient information and the quantizing precision information is fixed when the standards are set. It is thus impossible to control the quantizing precision later based on a higher-level acoustic model. Additionally, if there is a variation in the compression ratio, it is necessary to determine the relationship between the normalizing coefficient information and the quantizing precision information according to each compression ratio.
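One way of approximating real-number bit-allocation reference values by integers can be sketched as follows. The largest-remainder rule and the explicit bit budget are assumptions introduced for this sketch; the text above states only that an integer approximating each reference value is used.

```python
import math

def to_integer_bits(reference, budget):
    """Round real-valued, non-negative reference values to integers without exceeding budget.
    Each value is floored first; leftover bits go to the largest fractional parts."""
    bits = [math.floor(r) for r in reference]
    leftover = max(0, budget - sum(bits))
    by_fraction = sorted(range(len(reference)),
                         key=lambda i: reference[i] - bits[i], reverse=True)
    for i in by_fraction[:leftover]:
        bits[i] += 1
    return bits

allocated = to_integer_bits([3.6, 2.7, 1.7], budget=8)
```

Simple per-band rounding would also satisfy the description; the budget-aware variant is shown only because a fixed frame size is a common constraint.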
If an acoustic signal is formed of a plurality of channels, the aforementioned methods are applicable to the respective channels of the acoustic signal. For example, the above-described methods are applicable to an L channel corresponding to the left side speaker and to an R channel corresponding to the right side speaker. Also, the above methods may be used for a signal (L+R)/2 obtained by adding the signals of the respective L and R channels. Alternatively, high-efficiency coding may be performed by employing the above methods for a signal (L+R)/2 and a signal (L-R)/2 obtained from the signals of the two channels.
By focusing attention on the fact that stereo sound is dominantly influenced by signals in the low frequency range, a method for narrowing the band of the signal (L-R)/2 more than that of the signal (L+R)/2 may be considered. According to this method, efficient coding can be performed with fewer bits while maintaining good acoustic stereo sound.
Another high-efficiency coding method for coding quantized spectral signals is known using a variable-length code, which is described in the technical document "D. A. Huffman, A Method for Construction of Minimum Redundancy Codes, Proc. I.R.E., 40, p. 1098 (1952)".
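The conversion between the L and R channels and the signals (L+R)/2 and (L-R)/2 can be sketched as follows; the function names are hypothetical.

```python
def to_sum_difference(left, right):
    """Form (L+R)/2 and (L-R)/2 sample by sample."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side

def from_sum_difference(mid, side):
    """Recover L = mid + side and R = mid - side."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right

left_in = [0.5, 0.25, -0.75]
right_in = [0.5, -0.25, 0.25]
mid, side = to_sum_difference(left_in, right_in)
left_out, right_out = from_sum_difference(mid, side)
```

When the two channels are similar, the (L-R)/2 signal is small, so it can be coded with fewer bits than either original channel.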
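Construction of such a variable-length code for quantized spectral values can be sketched as follows. The sketch follows the well-known Huffman procedure of repeatedly merging the two least frequent entries; the function name and the sample data are hypothetical.

```python
import heapq
from collections import Counter

def huffman_code(symbols):
    """Return a dict mapping each symbol to a prefix-free bit string."""
    freq = Counter(symbols)
    if len(freq) == 1:                       # degenerate case: one distinct symbol
        return {next(iter(freq)): "0"}
    # Heap entries: (count, unique tiebreak, partial code table)
    heap = [(count, i, {sym: ""}) for i, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        c1, _, t1 = heapq.heappop(heap)      # two least frequent subtrees
        c2, _, t2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in t1.items()}
        merged.update({s: "1" + code for s, code in t2.items()})
        heapq.heappush(heap, (c1 + c2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

quantized = [0, 0, 0, 0, 1, 1, -1, 2]        # hypothetical quantized spectral values
table = huffman_code(quantized)
total_bits = sum(len(table[q]) for q in quantized)
```

For this sample the most frequent value receives a one-bit code, and the total is 14 bits compared with 16 bits for a fixed two-bit code.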
The following method is considered. Tone components, which are particularly important from the acoustic point of view, i.e., signal components having energy concentrating around a specific frequency are removed from a spectral signal and are coded separately from the other spectral components. According to this method, audio signals can be efficiently coded with a high compression ratio with very little loss in acoustic quality.
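The separation of tone components from the remaining spectral components can be sketched as follows. The detection rule (a local peak exceeding a multiple of the mean magnitude), the number of neighboring components extracted with each peak, and all names are assumptions for this sketch; the method above does not specify how tone components are identified.

```python
def split_tonal(spectrum, ratio=4.0, width=1):
    """Split a spectrum into (tonal, residual) parts of equal length.
    A component is treated as tonal if it is a local peak whose magnitude
    exceeds `ratio` times the mean magnitude (assumed criterion)."""
    mean_mag = sum(abs(v) for v in spectrum) / len(spectrum)
    tonal = [0.0] * len(spectrum)
    residual = list(spectrum)
    for k in range(1, len(spectrum) - 1):
        mag = abs(spectrum[k])
        is_peak = mag >= abs(spectrum[k - 1]) and mag >= abs(spectrum[k + 1])
        if is_peak and mag > ratio * mean_mag:
            lo, hi = max(0, k - width), min(len(spectrum), k + width + 1)
            for j in range(lo, hi):          # move the peak and its neighbors
                tonal[j] = spectrum[j]
                residual[j] = 0.0
    return tonal, residual

spectrum = [0.1, 0.2, 0.1, 5.0, 0.3, 0.1, 0.2, 0.1]
tonal_part, residual_part = split_tonal(spectrum)
```

The two parts sum to the original spectrum, so each can be coded with parameters suited to it, for example finer quantization for the few tonal components.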
In this manner, various methods for enhancing the coding efficiency are being progressively developed. By employing standards integrating a newly developed method, the recording period becomes longer, and audio signals having a higher quality can be recorded with the same recording period.
As a method for mapping time-series audio signals in the time domain or the frequency domain, a method for combining the aforementioned band division coding technique and the transform coding technique is considered. In this method, after band division is performed by using a band division filter, a signal component in each band is spectrum-transformed into a signal component in the frequency domain. Coding is then performed on the spectral signal component in each band.
The advantages of performing spectrum transform by, for example, the MDCT after conducting band division by a band division filter are as follows.
The transform block length can be optimally set for each band so as to optimize the generation of quantizing noise in the time domain or the frequency domain from an acoustic point of view, thereby improving the sound quality. Spectrum transform, such as MDCT, is usually performed by a fast computation method, such as fast Fourier transform (FFT), which requires a memory area having a size proportional to the block length. By performing spectrum transform, after conducting band division, on a signal which has been decimated in proportion to the bandwidth of each band, the number of samples of spectrum transform for obtaining the same frequency resolution can be decreased, thereby requiring only a smaller memory area for spectrum transform.
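The reduction in transform size can be checked with a short calculation. The 44.1 kHz sampling rate, the four equal-width bands, and the target resolution are assumed values chosen for illustration, and the function name is hypothetical.

```python
def mdct_size_for_resolution(sample_rate_hz, resolution_hz):
    """Number of MDCT coefficients M such that each coefficient spans resolution_hz.
    M coefficients cover 0..sample_rate/2, so M = sample_rate / (2 * resolution)."""
    return round(sample_rate_hz / (2 * resolution_hz))

full_rate = 44100.0            # assumed sampling rate in Hz
bands = 4                      # assumed number of equal-width bands
resolution = 44100.0 / 2048    # assumed target resolution in Hz per coefficient

full_band_size = mdct_size_for_resolution(full_rate, resolution)
per_band_size = mdct_size_for_resolution(full_rate / bands, resolution)
```

Under these assumptions a full-band transform needs 1024 coefficients, whereas each decimated band needs only 256 for the same resolution, so a work area proportional to the block length shrinks by the decimation factor when the bands are processed one at a time.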
If it is desired that a coded signal be reproduced in a decoder having the smallest possible scale of hardware, though sound quality does not have to be high, signal data only in the low frequency range is processed, thereby achieving the above-mentioned result.
Thus, the compression method in which spectrum transform is performed by using a combination of the band division filter and spectrum transform, such as MDCT, can be implemented by using comparatively small-scale hardware. Accordingly, this type of compression method is very convenient for portable recorders. However, the amount of computation is increased since a large number of product-sum operations are required to implement the band division filter.
When code strings transmitted via a communication channel having a relatively small transmission capacity are to be recorded on a recording medium having a comparatively large recording capacity, or when code strings are to be transmitted via a communication channel having a large transmission capacity over a short period and are to be recorded at a high rate on a recording medium having a relatively large recording capacity, it is necessary to employ a high-efficiency coding method in such a communication channel. To meet the above requirement, spectrum transform having a long transform block is desirably used to obtain a high frequency resolution.
Additionally, when code strings are to be recorded on a recording medium having a relatively large capacity, spectrum transform having a comparatively short transform block is desirably employed in order to implement coding or decoding in comparatively small-scale hardware. In particular, when code strings are to be recorded on a recording medium for use in portable machines, it is convenient that spectrum transform be performed after conducting band division in order to reduce the memory size of a decoder. If the signal transmitted via a communication channel is completely decoded and reproduced into the time-series signal, and the time-series signal is then coded to obtain a code string for a recording medium, a predetermined code string can be recorded on the recording medium. However, this requires the processing of the band division filter, which increases the amount of computation. In particular, when the code string is to be transmitted via a communication channel having a comparatively large transmission capacity over a short period and is to be recorded on a recording medium having a relatively large recording capacity, it is necessary to perform fast code-string transform. However, by performing band division processing, which requires a large amount of computation, the time required for recording the code string on the recording medium becomes longer.
In particular, when transforming signals of a plurality of channels, a greater amount of processing is required. This makes it more difficult to perform fast transform by using conventional methods.
Accordingly, in view of the above background, it is an object of the present invention to provide an information coding method and apparatus, a coding transform method and apparatus, a code transform control method and apparatus, and an information recording method and apparatus, all in which fast data transform is performed by enabling intercode data transform, and also to provide a program providing medium on which a program implementing one of the above-described methods is recorded.
In order to achieve the above object, according to one aspect of the present invention, there is provided an information coding method including: an input step of inputting a first code string obtained by coding a spectral signal which has been transformed with a first block length after a time-series information signal had been divided into a first band group; a decoding step of decoding the input first code string into the spectral signal; a spectral-signal transform step of transforming the decoded spectral signal into a spectral signal which is transformed with a second block length after being divided into a second band group; and a coding step of coding the transformed spectral signal into a second code string.
According to another aspect of the present invention, there is provided an information coding apparatus including input means for inputting a first code string obtained by coding a spectral signal which has been transformed with a first block length after a time-series information signal had been divided into a first band group. Decoding means decodes the input first code string into the spectral signal. Spectral-signal transform means transforms the decoded spectral signal into a spectral signal which is transformed with a second block length after being divided into a second band group. Coding means codes the transformed spectral signal into a second code string.
According to still another aspect of the present invention, there is provided a program providing medium for providing an information coding program to an information processing apparatus. The information coding program includes: an input step of inputting a first code string obtained by coding a spectral signal which has been transformed with a first block length after a time-series information signal had been divided into a first band group; a decoding step of decoding the input first code string into the spectral signal; a spectral-signal transform step of transforming the decoded spectral signal into a spectral signal which is transformed with a second block length after being divided into a second band group; and a coding step of coding the transformed spectral signal into a second code string.
According to a further aspect of the present invention, there is provided a code transform method including: an input step of inputting a first code string obtained by coding time-series information signals corresponding to a plurality of channels; and a transform step of transforming the input first code string into a second code string which reproduces the time-series information signals having the same higher band when being decoded.
According to a yet further aspect of the present invention, there is provided a code transform apparatus including input means for inputting a first code string obtained by coding time-series information signals corresponding to a plurality of channels. Transform means transforms the input first code string into a second code string which reproduces the time-series information signals having the same higher band when being decoded.
According to a further aspect of the present invention, there is provided a code transform method including: an input step of inputting a first code string obtained by coding time-series information signals corresponding to a plurality of channels; and a transform step of transforming the input first code string into a second code string by reproducing the plurality of channels of the time-series information signals having the same higher band and by assigning weights to the respective channels of the information signals.
According to a further aspect of the present invention, there is provided a code transform apparatus including input means for inputting a first code string obtained by coding time-series information signals corresponding to a plurality of channels. Transform means transforms the input first code string into a second code string by reproducing the plurality of channels of the time-series information signals having the same higher band and by assigning weights to the respective channels of the information signals.
According to a further aspect of the present invention, there is provided a program providing medium for providing a code transform program to an information processing apparatus. The code transform program includes: an input step of inputting a first code string obtained by coding time-series information signals corresponding to a plurality of channels; and a transform step of transforming the input first code string into a second code string which reproduces the time-series information signals having the same higher band when being decoded.
According to a further aspect of the present invention, there is provided a program providing medium for providing a code transform program to an information processing apparatus. The code transform program includes: an input step of inputting a first code string obtained by coding time-series information signals corresponding to a plurality of channels; and a transform step of transforming the input first code string into a second code string by reproducing the plurality of channels of the time-series information signals having the same higher band and by assigning weights to the respective channels of the information signals.
According to a further aspect of the present invention, there is provided a code transform control method in which a plurality of code transform operations for transforming a first code string into a second code string are selectable. The code transform control method includes the step of selecting one of the plurality of code transform operations based on input transform-operation-rate control information.
According to a further aspect of the present invention, there is provided a code transform control apparatus including a plurality of code transform operation means for transforming a first code string into a second code string. Code transform selection means selects one of the plurality of code transform operation means based on input transform-operation-rate control information.
According to a further aspect of the present invention, there is provided a program providing medium for providing a code transform control program to an information processing apparatus. In the code transform control program, a plurality of code transform operations for transforming a first code string into a second code string are selectable. The code transform control program includes a step of selecting one of the plurality of code transform operations based on input transform-operation-rate control information.
According to a further aspect of the present invention, there is provided an information recording method including: an input step of inputting a first code string obtained by coding a spectral signal which has been transformed with a first block length after a time-series information signal had been divided into a first band group; a transform step of transforming the first code string into a second code string obtained by coding a spectral signal which has been transformed with a second block length after being divided into a second band group; and a recording step of recording the second code string on a recording medium.
According to a further aspect of the present invention, there is provided an information recording apparatus including input means for inputting a first code string obtained by coding a spectral signal which has been transformed with a first block length after a time-series information signal had been divided into a first band group. Transform means transforms the first code string into a second code string obtained by coding a spectral signal which has been transformed with a second block length after being divided into a second band group. Recording means records the second code string on a recording medium.
According to a further aspect of the present invention, there is provided a program providing medium for providing an information recording program to an information processing apparatus. The information recording program includes: an input step of inputting a first code string obtained by coding a spectral signal which has been transformed with a first block length after a time-series information signal had been divided into a first band group; a transform step of transforming the first code string into a second code string obtained by coding a spectral signal which has been transformed with a second block length after being divided into a second band group; and a recording step of recording the second code string on a recording medium.
According to a further aspect of the present invention, there is provided a code transform method including: an input step of inputting a first code string obtained by coding a spectral signal which has been transformed after a time-series information signal had been divided into a first band group; a decoding step of decoding the input first code string into the spectral signal; a spectral-signal transform step of transforming the decoded spectral signal into a spectral signal which is transformed after being divided into a second band group by inverse-transforming part of the spectral signal of a higher band into a decimated time-series signal and by transforming the decimated time-series signal into a lower-band spectral signal within the higher band; and a coding step of coding the transformed spectral signal into a second code string.
According to a further aspect of the present invention, there is provided a code transform apparatus including input means for inputting a first code string obtained by coding a spectral signal which has been transformed after a time-series information signal had been divided into a first band group. Decoding means decodes the input first code string into the spectral signal. Spectral-signal transform means transforms the decoded spectral signal into a spectral signal which is transformed after being divided into a second band group by inverse-transforming part of the spectral signal of a higher band into a decimated time-series signal and by transforming the decimated time-series signal into a lower-band spectral signal within the higher band. Coding means codes the transformed spectral signal into a second code string.
According to a further aspect of the present invention, there is provided a code transform method including: an input step of inputting a first code string obtained by coding a spectral signal which has been transformed from a time-series information signal; a decoding step of decoding the input first code string into the spectral signal; a spectral-signal transform step of inverse-transforming only lower-band spectral components of the decoded spectral signal into a decimated time-series signal and of transforming the decimated time-series signal into lower-band spectral components of a second code string; and a coding step of coding the transformed spectral signal.
According to a further aspect of the present invention, there is provided a code transform apparatus including input means for inputting a first code string obtained by coding a spectral signal which has been transformed from a time-series information signal. Decoding means decodes the input first code string into the spectral signal. Spectral-signal transform means inverse-transforms only lower-band spectral components of the decoded spectral signal into a decimated time-series signal and transforms the decimated time-series signal into lower-band spectral components of a second code string. Coding means codes the transformed spectral signal.
According to a further aspect of the present invention, there is provided a program providing medium for providing a code transform program to an information processing apparatus. The code transform program includes: an input step of inputting a first code string obtained by coding a spectral signal which has been transformed after a time-series information signal had been divided into a first band group; a decoding step of decoding the input first code string into the spectral signal; a spectral-signal transform step of transforming the decoded spectral signal into a spectral signal which is transformed after being divided into a second band group by inverse-transforming part of the spectral signal of a higher band into a decimated time-series signal and by transforming the decimated time-series signal into a lower-band spectral signal within the higher band; and a coding step of coding the transformed spectral signal into a second code string.
According to a further aspect of the present invention, there is provided a program providing medium for providing a code transform program to an information processing apparatus. The code transform program includes: an input step of inputting a first code string obtained by coding a spectral signal which has been transformed from a time-series information signal; a decoding step of decoding the input first code string into the spectral signal; a spectral-signal transform step of inverse-transforming only lower-band spectral components of the decoded spectral signal into a decimated time-series signal and of transforming the decimated time-series signal into lower-band spectral components of a second code string; and a coding step of coding the transformed spectral signal.