1. Field of the Invention
The present invention relates to a method for coding signals for allowing users to play back preview (trial) data, and for implementing high-quality recording/playback operations by adding a small amount of data if a user decides to purchase corresponding data.
2. Description of the Related Art
According to known software distribution methods, audio-visual data is broadcast as encrypted signals or recorded in encrypted form on recording media, and only users who have purchased a certain key are allowed to view or listen to the data. In one encryption method, the initial value of a random number sequence is given as a key signal for a bit string of a pulse code modulation (PCM) audio signal, and the bit string obtained by performing an exclusive OR of the generated 0/1 random number sequence and the PCM bit string is transmitted or recorded in a recording medium. With this method, only users who have obtained the key signal can correctly play back the audio signal; those who have not obtained the key signal hear only noise. A more complicated encryption method, for example, the Data Encryption Standard (DES), may also be employed.
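The exclusive-OR scheme described above can be illustrated with a minimal sketch. The function name xor_scramble and the sample bytes are hypothetical, and Python's general-purpose random module merely stands in for the random number generator; it is not cryptographically secure and is used here only to show the principle.

```python
import random

def xor_scramble(pcm_bytes: bytes, key: int) -> bytes:
    """XOR each PCM byte with a pseudo-random byte stream seeded by `key`."""
    # The key plays the role of the initial value of the random number sequence.
    rng = random.Random(key)
    return bytes(b ^ rng.randrange(256) for b in pcm_bytes)

# Stand-in for the bit string of a PCM audio signal (hypothetical values).
pcm = bytes([0, 1, 2, 3, 250, 251, 252, 253])

scrambled = xor_scramble(pcm, key=1234)       # what is transmitted or recorded
restored = xor_scramble(scrambled, key=1234)  # same key: the XOR cancels out
```

Because an exclusive OR applied twice with the same sequence cancels out, the same function serves for both scrambling and descrambling. A user without the key signal cannot regenerate the sequence and obtains only noise-like data.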
Details of DES are described in Federal Information Processing Standards Publication 46, Specifications for the DATA ENCRYPTION STANDARD, Jan. 15, 1977.
Methods for compressing audio signals and then broadcasting corresponding data or recording the data in a recording medium are available. Accordingly, recording media, such as magneto-optical disks, are widely used for recording coded audio signals. There are various techniques for coding audio signals with high efficiency. For example, in a block-less frequency-band division technique, i.e., a so-called “sub-band coding (SBC)”, an audio signal in the time domain is divided into a plurality of frequency bands and coded without dividing them into blocks. In a block frequency-band division technique, i.e., a so-called “transform coding”, a signal in the time domain is transformed (spectrum transform) into a signal in a frequency domain so as to be divided into a plurality of frequency bands. The signal components are then coded in each band. Another high-efficiency coding technique, which is a combination of the above-described sub-band coding and transform coding, has also been considered. In this case, for example, after sub-band division is performed in the above-described SBC, signal components in each sub band are transformed into signal components in the frequency domain, and are then coded in each band.
Filters used in the above-described high-efficiency coding methods include quadrature mirror filters (QMF), details of which are described in R. E. Crochiere, Digital coding of speech in subbands, Bell Syst. Tech. J., vol. 55, No. 8, 1976.
An equal-bandwidth filtering technique is described in ICASSP 83, BOSTON, Polyphase Quadrature filters—A new subband coding technique, Joseph H. Rothweiler.
As the above-described spectrum transform, for example, an input audio signal is formed into blocks in predetermined time units (frames), and discrete Fourier transform (DFT), discrete cosine transform (DCT), or modified DCT (MDCT) is performed on the signal components in each block, thereby transforming a time-domain signal into a frequency-domain signal.
Details of MDCT are described in ICASSP 1987, Subband/Transform Coding Using Filter Bank Designs Based on Time Domain Aliasing Cancellation, J. P. Princen, A. B. Bradley, Univ. of Surrey, Royal Melbourne Inst. of Tech.
In the spectrum transform using the above-described DFT or DCT, when the spectrum transform is performed in a time block consisting of M samples, M items of independent real-number data are obtained. Generally, in order to reduce distortion at the connections between time blocks, M1 samples overlap between adjacent blocks, and thus, on average, in DFT or DCT, M items of real-number data are quantized and coded for (M-M1) samples.
In contrast, in the spectrum transform using the above-described MDCT, M items of independent real-number data are obtained from 2M samples, in which M samples overlap with each of the two adjacent blocks. Accordingly, in MDCT, on average, M items of real-number data are quantized and coded for M samples. In a decoding apparatus, coded data obtained by performing MDCT is inverse-transformed in each block, and the resulting waveform components are added together in the overlapping portions, where they interfere with each other, so as to reconstruct a waveform signal.
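The overlap and reconstruction property of MDCT can be checked numerically. The following is a minimal sketch, assuming a sine window and a direct (slow) evaluation of the transform; the function names, the block size M = 8, and the test signal are illustrative choices and are not taken from any cited reference.

```python
import math

def sine_window(m):
    """Window of length 2M satisfying w[n]^2 + w[n+M]^2 = 1."""
    return [math.sin(math.pi * (n + 0.5) / (2 * m)) for n in range(2 * m)]

def mdct(block, m):
    """Forward MDCT: 2M input samples give M real coefficients."""
    return [sum(block[n] * math.cos(math.pi / m * (n + 0.5 + m / 2) * (k + 0.5))
                for n in range(2 * m)) for k in range(m)]

def imdct(coeffs, m):
    """Inverse MDCT: M coefficients give 2M time-aliased samples."""
    return [2.0 / m * sum(coeffs[k] * math.cos(math.pi / m * (n + 0.5 + m / 2) * (k + 0.5))
                          for k in range(m)) for n in range(2 * m)]

def analyze_synthesize(signal, m):
    """Transform overlapping blocks (hop of M samples) and overlap-add the inverses."""
    w = sine_window(m)
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - 2 * m + 1, m):
        block = [signal[start + n] * w[n] for n in range(2 * m)]
        coeffs = mdct(block, m)   # only M values per hop of M samples
        rec = imdct(coeffs, m)
        for n in range(2 * m):
            out[start + n] += rec[n] * w[n]
    return out

M = 8
signal = [math.sin(0.3 * n) + 0.5 * math.cos(1.1 * n) for n in range(6 * M)]
rebuilt = analyze_synthesize(signal, M)
# Samples covered by two overlapping blocks are reconstructed; the first and
# last M samples are covered by only one block and are excluded.
max_err = max(abs(signal[n] - rebuilt[n]) for n in range(M, 5 * M))
```

In this sketch each block of 2M samples yields only M coefficients, yet the interior of the signal is recovered to within floating-point error once the inverse-transformed blocks are added together.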
Generally, the spectrum frequency resolution is enhanced as the time block for spectrum transform becomes longer, thereby allowing energy to be concentrated in specific spectral components. As described above, in MDCT, the spectrum transform is performed with an increased block length by overlapping samples between adjacent blocks, and the number of spectral signal components remains the same as the original number of samples. By using such MDCT, coding can be performed with higher efficiency than by using DFT or DCT. Also, by allowing a sufficiently long overlapping portion between adjacent blocks, inter-block distortion of the waveform signal can be reduced.
By quantizing signal components divided into sub bands by using a filter or spectrum transform, bands in which quantizing noise is generated can be controlled, and high-efficiency coding can be performed by utilizing the masking effect. Before performing quantizing, if signal components in each band are normalized by the maximum of the absolute values of the signal components in the corresponding band, higher efficiency coding can be performed.
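Normalization before quantization can be sketched as follows. This is an illustrative example only: the uniform quantizer, the function names, and the sample values are assumptions and do not correspond to any particular coding standard.

```python
def quantize_band(components, bits):
    """Normalize a band by its peak magnitude, then quantize uniformly.

    `bits` must be at least 2 in this sketch.
    """
    scale = max(abs(c) for c in components) or 1.0  # normalizing coefficient
    levels = (1 << (bits - 1)) - 1                  # e.g. codes -7..7 for 4 bits
    codes = [round(c / scale * levels) for c in components]
    return scale, codes

def dequantize_band(scale, codes, bits):
    """Invert quantize_band up to the quantizing error."""
    levels = (1 << (bits - 1)) - 1
    return [q / levels * scale for q in codes]

band = [4.0, -3.0, 1.0, 0.25]             # hypothetical spectral components
scale, codes = quantize_band(band, bits=4)
restored = dequantize_band(scale, codes, bits=4)
# scale is 4.0 and codes is [7, -5, 2, 0]
```

Because the band is scaled by its own maximum, the available quantizing steps always span the full range of the band, so a quiet band is represented as precisely as a loud one for the same number of bits.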
Signal components are divided into frequency bands with bandwidths that take into account, for example, human acoustic characteristics. That is, generally, an audio signal is divided into a plurality of bands (for example, 25 bands), which are referred to as the “critical bands”, so that the bandwidth becomes greater in the higher bands. Then, data in each band is coded according to a predetermined bit distribution or an adaptive bit allocation. For example, coefficient data obtained by the above-described MDCT processing in each block is coded with the number of adaptively allocated bits.
The following two bit allocation techniques are known.
One technique is disclosed in Adaptive Transform Coding of Speech Signals, R. Zelinski and P. Noll, IEEE Transactions on Acoustics, Speech, and Signal Processing, Vol. ASSP-25, No. 4, August 1977. In this technique, bit allocation is performed according to the magnitude of the signal in each band, and thus, the quantizing noise spectrum becomes smooth and the noise energy is minimized. However, since the masking effect is not employed, the actual sound is not acoustically optimal.
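Magnitude-driven bit allocation can be sketched with a common textbook rule, in which each band receives the average number of bits plus half the base-2 logarithm of its energy relative to the geometric mean of all band energies. This rule is an assumption chosen for illustration and is not necessarily the exact procedure of the cited paper; the function name and the energies are hypothetical.

```python
import math

def allocate_bits(band_energies, avg_bits):
    """Allocate more bits to bands with larger energy (all energies must be > 0)."""
    logs = [0.5 * math.log2(e) for e in band_energies]
    mean_log = sum(logs) / len(logs)            # log of the geometric mean
    # Real-number reference value for each band.
    reference = [avg_bits + l - mean_log for l in logs]
    # Integer approximation, never negative.
    return [max(0, round(r)) for r in reference]

bits = allocate_bits([256.0, 16.0, 4.0, 1.0], avg_bits=4)
# bits is [6, 4, 3, 2]
```

Since the rule depends only on signal magnitude, a band that would be masked by a louder neighbor still receives bits, which is the acoustic shortcoming noted above.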
The other technique is disclosed in ICASSP 1980, The Critical band coder: digital encoding of the perceptual requirements of the auditory system, M. A. Krasner, MIT. In this method, by utilizing the masking effect, fixed bit allocation is performed by obtaining a signal-to-noise (S/N) ratio required for each band. However, because the bit allocation is fixed, a precise value cannot be obtained even when the characteristic is measured with a sinusoidal wave input.
In order to overcome the above drawbacks, the following high-efficiency coding apparatus has been proposed. Fixed bit allocation is partially performed on some blocks, and adaptive bit allocation is partially performed so that bits determined by the magnitudes of the signal components in the other blocks are allocated to the corresponding blocks. The division ratio of the two types of bit allocations is determined by an input signal, and the division ratio of the fixed bit allocation becomes higher as the signal spectrum becomes smoother.
According to the above-described coding apparatus, many bits can be allocated to blocks containing specific spectral components, such as sinusoidal waves, in which energy is concentrated, thereby making it possible to considerably improve the overall S/N ratio characteristics. Generally, the human acoustic characteristics are extremely sensitive to signals having sharp spectral components. Accordingly, an improved S/N ratio by using this method is effective not only in enhancing precise measurements, but also in improving the sound quality.
Many other bit allocation techniques have been proposed, and acoustic models are becoming increasingly precise. Accordingly, if the performance of a coding apparatus becomes higher, even higher efficiency coding is possible. In these methods, the bit-allocation real-number reference value is determined so that the calculated S/N ratio can be faithfully achieved, and the integer approximating the reference value is used as the number of allocation bits.
In International Publication No. WO94/28633 (corresponding to U.S. Pat. No. 5,717,821) filed by the present inventors, another coding method has been proposed in which tone components that are particularly important in an acoustic sense, i.e., signal components in which energy is concentrated, are extracted from a spectrum signal, and are separately coded from the other spectral components. According to this coding method, audio signals can be efficiently coded with a high compression ratio with very little degradation.
In forming code strings, quantizing-precision information and normalizing-coefficient information are coded with a predetermined number of bits in each band, and the resulting normalized and quantized spectrum signal is coded.
A high-efficiency coding method in which the number of bits representing the quantizing precision differs according to the band is described in ISO/IEC 11172-3: 1993(E), 1993. In this standard, the number of bits indicating the quantizing-precision information becomes smaller as the band becomes higher.
Instead of directly coding quantizing precision information, the quantizing-precision information may be determined from the normalizing-coefficient information in a decoding apparatus. According to this method, however, the relationship between the normalizing-coefficient information and the quantizing-precision information is determined when the standard is set, which makes it impossible to introduce the quantizing precision based on more precise acoustic models in the future. Additionally, if the compression ratio has a range, the relationship between the normalizing-coefficient information and the quantizing-precision information has to be determined according to each range.
Another known coding method is disclosed in D. A. Huffman: A Method for the Construction of Minimum-Redundancy Codes, Proc. I.R.E., 40, p. 1098 (1952). In this method, a quantized spectrum signal is coded more efficiently by using variable-length codes.
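Variable-length coding of a quantized spectrum can be sketched as follows. The example builds a Huffman code table from symbol frequencies using the Python standard library; the function name and the quantized values are hypothetical.

```python
import heapq
from collections import Counter

def huffman_code(symbols):
    """Return a dict mapping each symbol to its variable-length bit string."""
    freq = Counter(symbols)
    if len(freq) == 1:                       # degenerate case: one symbol only
        return {next(iter(freq)): "0"}
    # Heap entries: (count, tie-breaker, partial code table).
    heap = [(count, i, {sym: ""}) for i, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        c1, _, t1 = heapq.heappop(heap)      # two least frequent subtrees
        c2, _, t2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in t1.items()}
        merged.update({s: "1" + code for s, code in t2.items()})
        heapq.heappush(heap, (c1 + c2, tie, merged))
        tie += 1
    return heap[0][2]

# Hypothetical quantized spectrum in which small values dominate.
quantized = [0] * 8 + [1] * 4 + [-1] * 2 + [2, -2]
table = huffman_code(quantized)
total_bits = sum(len(table[q]) for q in quantized)
# total_bits is 30, compared with 48 bits for a fixed 3-bit code
```

Frequent values such as 0 receive short codes and rare values receive long ones, so the total number of bits falls below that of a fixed-length code whenever the distribution of quantized values is uneven.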
The signal coded as described above can be encrypted and distributed in the same manner as PCM signals, in which case those who have not obtained the corresponding key are unable to play back the original signal. Alternatively, instead of encrypting a coded bit string, a PCM signal may be converted into a random signal, which is then coded for compression. In this case as well, users who have not obtained the corresponding key are unable to play back the original signal.
In this so-called scrambling method, however, the content of the software data cannot be checked without the key. Also, if the user plays back the data with regular playback means, only noise is heard. Accordingly, this method cannot be used for, for example, the following application. A disk in which music with a relatively low audio quality is recorded is distributed, and after listening to the disk, the user purchases the key only for a music piece that he or she likes and is able to play back that music piece with high audio quality. Alternatively, after listening to the distributed disk, the user is able to purchase a new disk in which music is recorded at high quality.
When encrypting signals subjected to high-efficiency coding, it is very difficult to maintain the compression efficiency while providing code strings that are meaningful for regular playback means. That is, when a scrambled code string is played back, as described above, only noise is heard, and playback means may not operate at all if the scrambled code string is not compatible with the original high-efficiency code standard. Also, if a scrambled PCM signal is coded with high efficiency by decreasing the amount of information by utilizing the acoustic characteristics, the scrambled PCM signal cannot always be reproduced when the coded signal is decoded. Thus, it is difficult to descramble the signal. Accordingly, a method that allows the signal to be precisely descrambled must be employed at the expense of compression efficiency.
U.S. Pat. No. 6,081,784 or Japanese Unexamined Patent Application Publication No. 10-135944 filed by the present inventors discloses the following audio coding method. In this method, among spectral signal components coded from a music signal, signal components only in higher bands are encrypted, thereby enabling users to play back a preview file (trial file) without a corresponding key. More specifically, in this method, signal components only in higher bands are encrypted, and also, high-band bit allocation information is replaced by dummy data, true bit allocation information being recorded in a position ignored by decoders. According to this method, the users are able to enjoy music pieces that please them with high audio quality after listening to the preview file.
In this method, however, since the security is uniquely dependent on encryption, if the data is deciphered, high-quality music can be illegally played back.
In order to overcome this drawback, International Publication No. WO02/065449 discloses the following method. Part of the information to be recorded in a recording medium is replaced by dummy data so that it can be played back at relatively low quality, and when it becomes necessary to play back the information with high quality, the dummy data is replaced by true data, thereby eliminating the possibility of the data being deciphered. Additionally, data can be played back by regular playback devices regardless of whether the data is recorded with high quality or low quality. According to this method, the data can be played back with high quality after checking the content while enhancing the security compared to a method employing an encryption key.
According to the above method, however, dummy data must be replaced by high quality data, and the amount of such data is large, though it is smaller than a preview file. Accordingly, it takes time to transmit the required data, thereby increasing the overall time required before the data can be played back with high quality.
Thus, in International Application No. PCT/JP03/04526 or Japanese Patent Application No. 2002-107084, the present inventors have proposed that part of the dummy data be contained in preview data, and the amount of true data in a high quality file required for replacing the dummy data when the data is played back with high quality is reduced, thereby decreasing the overall time. To implement this, when the number of coding units indicating the number of bands to be coded is contained in an original code string, a small value is written in the original code string as dummy data so that the code string is played back in accordance with the dummy data, and part of the true coding information is recorded in a position ignored by a decoder. With this method, the amount of data required for achieving a high-quality playback operation can be reduced.
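The idea of writing a small dummy value for the number of coding units can be illustrated with a toy model. The dictionary layout, the field names num_units and units, and all function names below are hypothetical; an actual code string is a packed bit field defined by the coding standard, and this sketch does not reproduce the format of the cited applications.

```python
def make_preview(frame, preview_units):
    """Return (preview_frame, restore_data) for one frame.

    The preview keeps every coding unit but advertises only `preview_units`.
    """
    preview = dict(frame, num_units=preview_units)   # dummy band count
    restore = {"num_units": frame["num_units"]}      # small true value
    return preview, restore

def decode(frame):
    """Toy decoder: plays back only the first `num_units` coding units."""
    return frame["units"][: frame["num_units"]]

def restore_frame(preview, restore):
    """Replace the dummy value with the true value after purchase."""
    return dict(preview, **restore)

original = {"num_units": 4, "units": ["low", "mid", "high", "very_high"]}
preview, restore = make_preview(original, preview_units=2)

narrow = decode(preview)                         # ["low", "mid"]
full = decode(restore_frame(preview, restore))   # all four units
```

In this model the units beyond the dummy count remain in the preview frame in a position that the decoder ignores, so only the small true count has to be delivered to enable full-band playback.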
If high quality data is encrypted and then sold, network transactions are possible.
According to the above-described method, the true coding information (true band information) has to be recorded in a high-quality file. If the bandwidth is changed in every frame when coding the data, such a change becomes noticeable as noise. Thus, the true band information is normally set to the same value in every frame. Consequently, when an attempt is made to decipher the key information of the high-quality file, it is highly likely that the same band information appears at regular positions. Such an analysis can be performed automatically by using, for example, a computer, thereby increasing the possibility of the high-quality file being deciphered.