1. Field of the Invention
This invention relates to an information encoding method for encrypting and encoding information signals, such as PCM audio signals, a recording medium having the encoded signals recorded thereon, and a decoding device for decoding the encoded signals.
2. Description of the Related Art
There is so far known a method of software circulation in which information signals, such as acoustic signals or video signals, are encrypted and then broadcast or recorded on a recording medium so that only a person who has purchased a key is permitted to view and hear the signals. As a method for encryption, there is known a method of giving an initial value of a random number string as a key signal for a bitstring of PCM acoustic signals, and transmitting or recording on a recording medium a bitstring corresponding to the exclusive-OR of the generated 0/1 random numbers and the above-mentioned PCM bitstring. By using this method, only a person who has acquired the key signal can reproduce the acoustic signals correctly, while a person who has not acquired the key signal can reproduce only noise.
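The scrambling described above can be illustrated by the following minimal sketch. It assumes that the key signal is an integer used as the seed of a pseudo-random generator and that the combining operation is a bitwise exclusive-OR, so that applying the same operation twice with the same key restores the input. The function names keystream_bits and scramble are hypothetical, and Python's general-purpose random module is used only for illustration; it is not a cryptographically secure generator.

```python
import random

def keystream_bits(key: int, n: int) -> list[int]:
    """Generate n pseudo-random 0/1 values, using the key as the initial value (seed)."""
    rng = random.Random(key)  # illustration only, not cryptographically secure
    return [rng.getrandbits(1) for _ in range(n)]

def scramble(bits: list[int], key: int) -> list[int]:
    """XOR each PCM bit with the keystream; the same call also descrambles."""
    return [b ^ k for b, k in zip(bits, keystream_bits(key, len(bits)))]

pcm_bits = [1, 0, 1, 1, 0, 0, 1, 0]          # hypothetical PCM bitstring
scrambled = scramble(pcm_bits, key=1234)      # what is transmitted or recorded
restored = scramble(scrambled, key=1234)      # holder of the key recovers the input
```

A receiver using a different key obtains a different keystream and therefore, in general, a bitstring unrelated to the original.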
Also in widespread use is a method of compressing acoustic signals and broadcasting the compressed signals or recording them on a recording medium, and recording media capable of recording encoded audio or speech signals, such as magneto-optical discs, are used extensively. Among the methods for high-efficiency encoding of audio or speech signals, there are a sub-band coding (SBC) method, which is a non-blocking frequency spectrum splitting method of splitting the time-domain audio signals into plural frequency bands by a filter without blocking and encoding the resulting signals of the frequency bands, and so-called transform coding, which is a blocking frequency spectrum splitting method of transforming time-domain signals into frequency-domain signals by orthogonal transform and encoding the spectral components from one frequency band to another. There is also known a high-efficiency encoding technique which is a combination of sub-band coding and transform coding, in which the time-domain signals are split into plural frequency bands by SBC and the resulting band signals are orthogonally transformed into spectral components, which are encoded from band to band.
Among the filters used for the above-mentioned frequency band splitting is a so-called QMF filter, as discussed in R. E. Crochiere, Digital Coding of Speech in Subbands, Bell Syst. Tech. J., Vol.55, No.8, 1976. The technique of dividing the frequency spectrum into equal-width frequency bands is discussed in Joseph H. Rothweiler, Polyphase Quadrature Filters--A New Subband Coding Technique, ICASSP 83 BOSTON.
Among the above-mentioned techniques for orthogonal transform is a technique in which an input audio signal is blocked every pre-set unit time, such as every frame, and discrete Fourier transform (DFT), discrete cosine transform (DCT) or modified DCT (MDCT) is applied to each block for converting the signals from the time axis to the frequency axis. Discussions of the MDCT are found in J. P. Princen and A. B. Bradley, Subband/Transform Coding Using Filter Bank Designs Based on Time Domain Aliasing Cancellation, ICASSP 1987.
If the above-mentioned DFT or DCT is used as a method for transforming waveform signals into spectral signals, and the transform is applied to a time block composed of M samples, M independent real-number data are obtained. It is noted that, for reducing junction distortions between time blocks, a given time block is usually overlapped by M1 samples with each of the two neighboring blocks, so that, in DFT or DCT, M real-number data on an average are quantized and encoded for (M-M1) samples.
On the other hand, if the above-mentioned MDCT is used as a method for orthogonal transform, M independent real-number data are obtained from 2M samples overlapped by M samples with each of the two neighboring time blocks. Thus, in MDCT, M real-number data on an average are quantized and encoded for M samples. A decoding device re-constructs the waveform signals by adding together, with interference between neighboring blocks, the waveform elements obtained on inverse transform in each block from the codes obtained by MDCT.
In general, if a time block for transform is lengthened, the frequency resolution of the spectrum is improved such that the signal energy is concentrated in specified frequency components. With MDCT, each block overlaps one half of each of the two neighboring blocks, so that the transform is carried out with long block lengths, while the number of the resulting spectral signals is not increased beyond the number of the original time samples. Therefore, by using MDCT, encoding can be carried out with higher efficiency than if the DFT or DCT is used. Moreover, since the neighboring blocks have a sufficiently long overlap with each other, the inter-block distortion of the waveform signals can be reduced.
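The overlap and reconstruction properties of the MDCT described above can be checked numerically. The following is a minimal sketch, not an implementation taken from the cited publication: it assumes a sine window (one common choice of window whose squares, taken half a block apart, sum to one), a direct O(M^2) evaluation of the transform, and zero padding of M samples at each end of the signal. The function names are hypothetical, and the signal length is assumed to be a multiple of M.

```python
import math

def sine_window(M):
    """Window of length 2M with w[n]^2 + w[n+M]^2 = 1 (assumed choice)."""
    return [math.sin(math.pi * (n + 0.5) / (2 * M)) for n in range(2 * M)]

def mdct(block, window):
    """2M windowed time samples -> M real spectral coefficients."""
    M = len(block) // 2
    return [sum(window[n] * block[n]
                * math.cos(math.pi / M * (n + 0.5 + M / 2) * (k + 0.5))
                for n in range(2 * M))
            for k in range(M)]

def imdct(coeffs, window):
    """M coefficients -> 2M windowed, time-aliased samples."""
    M = len(coeffs)
    return [window[n] * (2.0 / M)
            * sum(coeffs[k] * math.cos(math.pi / M * (n + 0.5 + M / 2) * (k + 0.5))
                  for k in range(M))
            for n in range(2 * M)]

def analyze_synthesize(signal, M):
    """Transform blocks that advance by M samples, then overlap-add the inverses."""
    w = sine_window(M)
    padded = [0.0] * M + list(signal) + [0.0] * M  # give edge blocks a neighbor
    out = [0.0] * len(padded)
    n_coeffs = 0
    for start in range(0, len(padded) - 2 * M + 1, M):
        coeffs = mdct(padded[start:start + 2 * M], w)
        n_coeffs += len(coeffs)
        for n, v in enumerate(imdct(coeffs, w)):
            out[start + n] += v  # aliasing terms of neighboring blocks cancel
    return out[M:M + len(signal)], n_coeffs

signal = [math.sin(0.3 * n) for n in range(32)]
reconstructed, n_coeffs = analyze_synthesize(signal, M=8)
max_error = max(abs(a - b) for a, b in zip(signal, reconstructed))
```

Each block spans 2M samples but contributes only M coefficients, and the overlap-added output agrees with the input to within floating-point rounding when no quantization is applied.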
By quantizing signals split into plural frequency bands by a filter or orthogonal transform, the frequency band in which the quantization noise occurs can be controlled, so that encoding can be achieved with psychoacoustically higher efficiency by exploiting acoustic characteristics such as masking effects. If the signal components are normalized with the maximum values of the absolute values of the signal components in the respective bands, encoding can be achieved with a still higher efficiency.
As frequency bands for quantizing the frequency components obtained on splitting the frequency spectrum, it is known to split the frequency spectrum in such a manner as to take account of the psychoacoustic characteristics of the human auditory system. Specifically, the audio signals are divided into a plurality of bands, such as 25 bands, using bandwidths increasing with increasing frequency. These bands are known as critical bands. In encoding the band-based data, encoding is carried out by fixed or adaptive bit allocation on the band basis. In encoding coefficient data obtained by MDCT processing, encoding is carried out with an adaptively allocated number of bits for the band-based MDCT coefficients obtained by block-based MDCT processing.
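The normalization of a band by the maximum absolute value of its components, followed by quantization, may be sketched as follows. This is an illustration under stated assumptions: the scale factor is stored as an unquantized floating-point value, the quantizer is a uniform signed quantizer, and bits is assumed to be 2 or more. Practical coders typically select the scale factor from a pre-set table; the function names here are hypothetical.

```python
def quantize_band(spectrum, bits):
    """Normalize one band by its peak magnitude, then quantize uniformly."""
    scale = max(abs(c) for c in spectrum) or 1.0   # avoid dividing by zero
    levels = (1 << (bits - 1)) - 1                 # e.g. bits=4 gives codes -7..7
    codes = [round(c / scale * levels) for c in spectrum]
    return scale, codes

def dequantize_band(scale, codes, bits):
    """Rebuild approximate spectral values from the scale factor and codes."""
    levels = (1 << (bits - 1)) - 1
    return [q * scale / levels for q in codes]

band = [0.02, -0.5, 0.2, 0.1]                      # hypothetical spectral components
scale, codes = quantize_band(band, bits=4)
decoded = dequantize_band(scale, codes, bits=4)
```

Because every band uses the full range of its codes regardless of its absolute level, the quantization error in a band is bounded by half a step of that band's own scale, which is why normalization improves efficiency for low-level bands.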
As these bit allocation techniques, the following two techniques are known. The first is described in R. Zelinsky and P. Noll, `Adaptive Transform Coding of Speech Signals`, IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. ASSP-25, No.4, August 1977.
In the technique disclosed in this publication, bit allocation is based on the amplitudes of signals of the respective bands. This technique produces a flat quantization noise spectrum and minimizes the noise energy, but the noise level perceived by the listener is not optimum because the technique does not effectively exploit the psychoacoustic masking effect.
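An amplitude-based allocation that yields a flat quantization noise spectrum can be illustrated with the classic log-energy rule, in which each band receives the average number of bits plus half the base-2 logarithm of the ratio of its energy to the geometric mean of all band energies. The rule is used here as an assumed stand-in for the cited technique, not as a reproduction of it; the function name is hypothetical, band energies are assumed positive, and the noise model assumes that each added bit lowers the noise energy by a factor of four.

```python
import math

def amplitude_based_allocation(band_energies, avg_bits):
    """Real-valued bits per band: avg_bits + 0.5*log2(energy / geometric mean)."""
    half_log = [0.5 * math.log2(e) for e in band_energies]
    mean_half_log = sum(half_log) / len(half_log)
    return [avg_bits + h - mean_half_log for h in half_log]

energies = [16.0, 4.0, 1.0, 0.25]            # hypothetical band energies
bits = amplitude_based_allocation(energies, avg_bits=4)

# Modeled noise energy per band, assuming noise ~ energy * 2**(-2*bits).
noise = [e * 2.0 ** (-2.0 * b) for e, b in zip(energies, bits)]
```

With these inputs the allocation is 5.5, 4.5, 3.5 and 2.5 bits, the total equals the budget of 16 bits, and the modeled noise energy is the same in every band. The result can be negative for very weak bands, so a practical allocator would clamp it.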
The second is described in M. A. Krasner, `The Critical Band Coder--Digital Encoding of Speech Signals Based on the Perceptual Requirements of the Auditory System`, ICASSP 1980, MIT. In this publication, the psychoacoustic masking mechanism is used to determine a fixed bit allocation that produces the necessary signal-to-noise ratio for each critical band. However, if this technique is used to measure characteristics of a sine wave input, non-optimum results are obtained because of the fixed allocation of bits among the critical bands.
For overcoming these problems, there is proposed a high-efficiency encoding device in which the total number of bits usable for bit allocation is divided between a fixed bit allocation pattern, pre-fixed from one small block to another, and a bit allocation dependent on the signal amplitudes of the respective blocks. The bit number division ratio between the fixed bit allocation and the bit allocation dependent on the signal amplitudes is made dependent on a signal related to an input signal, such that the proportion of bits given to the fixed bit allocation becomes larger the smoother the signal spectrum.
This technique significantly improves the signal-to-noise ratio on the whole by allocating more bits to a block including a particular signal spectrum exhibiting concentrated signal energy, such as a sine wave input. Since the human auditory mechanism is sensitive to signals having acute spectral components, improving the signal-to-noise ratio characteristics by the above technique not only increases the measured values but also improves the sound quality as perceived by the listener.
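The division of the bit budget according to spectral smoothness may be sketched as follows. The source does not specify how smoothness is measured; this sketch assumes the spectral flatness measure (ratio of the geometric mean to the arithmetic mean of the band energies, equal to 1 for a flat spectrum and approaching 0 for a tonal one) and uses it directly as the division ratio. Both choices, and the function names, are assumptions made for illustration.

```python
import math

def spectral_flatness(energies):
    """Geometric mean / arithmetic mean of positive band energies, in (0, 1]."""
    geo = math.exp(sum(math.log(e) for e in energies) / len(energies))
    arith = sum(energies) / len(energies)
    return min(1.0, geo / arith)   # clamp against floating-point overshoot

def split_budget(total_bits, energies):
    """Return (bits for the fixed pattern, bits for amplitude-dependent allocation)."""
    ratio = spectral_flatness(energies)   # smoother spectrum -> larger fixed share
    fixed_bits = total_bits * ratio
    return fixed_bits, total_bits - fixed_bits

smooth = split_budget(100, [2.0, 2.0, 2.0, 2.0])      # nearly all bits to fixed pattern
tonal = split_budget(100, [100.0, 0.01, 0.01, 0.01])  # nearly all bits to adaptive part
```

For the tonal example almost the entire budget goes to the amplitude-dependent allocation, which is what lets the coder spend more bits on the band holding the concentrated energy.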
A variety of different bit allocation techniques have been proposed, and models simulating the human auditory mechanism have also been refined, such that perceptually higher encoding efficiency can be achieved provided that the encoding device capability is improved. These techniques in general use a method of finding a real-number bit allocation reference value which realizes, as faithfully as possible, the signal-to-noise ratio characteristics found by calculations, and using an integer value approximating the reference value as the number of allocated bits.
In Japanese Laid-Open Patent application 7-500482, there is disclosed a method of separating perceptually critical tonal components, that is, signal components having the signal energy concentrated in the vicinity of a specified frequency, from the spectral signals, and encoding these signal components separately from the remaining spectral components. This enables audio signals to be efficiently encoded with a high compression ratio without substantially deteriorating the psychoacoustic sound quality.
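The step from real-number reference values to integer numbers of allocated bits may be sketched as follows. The rounding rule is not specified in the source; this sketch assumes a largest-remainder rule, in which each reference value is rounded down and the bits left over in the budget are given to the bands with the largest fractional parts. The function name is hypothetical, and the sketch does not handle the case in which clamping negative reference values makes the rounded-down total exceed the budget.

```python
import math

def round_to_budget(reference, total_bits):
    """Integer bits per band approximating real-valued reference values."""
    clipped = [max(0.0, r) for r in reference]       # no negative bit counts
    bits = [math.floor(r) for r in clipped]
    remaining = total_bits - sum(bits)
    # Bands ordered by how much was lost in rounding down, largest loss first.
    order = sorted(range(len(bits)), key=lambda i: clipped[i] - bits[i], reverse=True)
    for i in order[:max(0, remaining)]:
        bits[i] += 1
    return bits

reference = [5.7, 4.2, 3.6, 2.5]    # hypothetical real-number reference values
allocated = round_to_budget(reference, total_bits=16)
```

Here the rounded-down values use 14 bits, and the two leftover bits go to the first and third bands, giving 6, 4, 4 and 2 bits.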
In constructing an actual codestring, it suffices to encode the quantization fineness information and the normalization coefficient information with pre-set numbers of bits from one area for normalization and quantization to another and to encode the normalized and quantized spectral signals.
There is also known a high-efficiency encoding system in which the number of bits specifying the quantization fineness information differs with the frequency bands, as disclosed in MPEG standard ISO/IEC 11172-3:1993 (E), 1993. The standard is set so that the number of bits specifying the quantization fineness information is decreased with increasing frequency.
There is also known a method of determining the quantization fineness information from the normalization coefficient information in a decoding device instead of directly encoding the quantization fineness information. With this method, however, since the relation between the normalization coefficient information and the quantization fineness information is fixed at the time the standard is set, it becomes impossible to introduce in the future a quantization fineness control based on a more advanced perceptual model. Moreover, if there is a certain range in the compression ratio to be realized, it becomes necessary to set the relation between the normalization coefficient information and the quantization fineness information for each compression ratio.
There is also known a method of encoding quantized spectral signals using variable length codes, as discussed in D. A. Huffman, `A Method for the Construction of Minimum-Redundancy Codes`, Proc. I.R.E., 40, p.1098 (1952), for realizing more efficient encoding.
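The construction of variable length codes for quantized spectral values may be sketched as follows. This is a minimal illustration of Huffman's procedure using only the Python standard library; the code table is built from the symbol counts of the data itself, whereas a practical coder would normally use pre-set tables. The function name and the sample data are hypothetical.

```python
import heapq
from collections import Counter

def huffman_code(symbols):
    """Return a dict mapping each symbol to its prefix-free bit string."""
    freq = Counter(symbols)
    if len(freq) == 1:                        # degenerate case: one distinct symbol
        return {next(iter(freq)): "0"}
    # Heap entries: (count, tie-breaker, partial code table for that subtree).
    heap = [(count, i, {sym: ""}) for i, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        c1, _, t1 = heapq.heappop(heap)       # two least frequent subtrees
        c2, _, t2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in t1.items()}
        merged.update({s: "1" + code for s, code in t2.items()})
        heapq.heappush(heap, (c1 + c2, tie, merged))
        tie += 1
    return heap[0][2]

quantized = [0, 0, 0, 0, 1, 1, -1, 2]         # hypothetical quantized spectral values
table = huffman_code(quantized)
total_bits = sum(len(table[s]) for s in quantized)
```

In this example the most frequent value, 0, receives a one-bit code, and the eight values are encoded in 14 bits in total, compared with 16 bits for a fixed two-bit code.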
The signals encoded as described above can also be encrypted and circulated as in the case of the PCM signals. In this case, a person who has not acquired key signals cannot reproduce original signals. There is also a method of converting the PCM signals into random signals for compression encoding instead of encrypting the coded bitstring. In this case, too, a person who has not acquired key signals cannot reproduce any other signal than noise.
With these scrambling methods, the signals reproduced in the absence of the key signals or by usual reproducing means become noise, such that the contents of the software cannot be understood. The result is that the scrambling methods cannot be used for the purpose of distributing a disc having music recorded thereon with lower sound quality, so that a hearer can purchase the key only for the music pieces that meet his or her taste and reproduce the same music pieces with high sound quality, or for the purpose of allowing the hearer to tentatively hear the music software before newly purchasing a disc having the same music pieces recorded thereon with high sound quality.
Moreover, it has so far been difficult to encrypt high-efficiency encoded signals in such a manner that lowering of the compression efficiency is evaded while the resulting codestring remains meaningful for usual reproducing means. That is, if a codestring obtained on high-efficiency encoding is scrambled, not only is noise produced on reproduction of the codestring, but the reproducing means occasionally cannot operate at all if the codestring obtained on scrambling does not conform to the standard for the original high-efficiency encoded signals.
Conversely, if the PCM signals are scrambled prior to high-efficiency encoding and the information volume is diminished by exploiting, for example, the psychoacoustic characteristics of the human auditory system, the scrambled PCM signals cannot necessarily be reproduced exactly when the high-efficiency encoded signals are decoded, which renders it difficult to descramble the signals correctly. Thus it has been necessary to select a compression method which enables correct descrambling despite the lowered efficiency.