1. Field of the Invention
The present invention relates to a signal reproducing method and device, a signal recording method and device, and a code sequence generating method and device. For example, the invention relates to a signal reproducing method and device, a signal recording method and device, and a code sequence generating method and device for coding a signal so as to enable trial viewing and, if a trial viewer decides to buy an item concerned, enabling high-quality reproduction and recording by adding data having a small amount of information.
2. Description of the Related Art
The application is based on and claims priority (under the Paris Convention, Article 4) from Japanese Patent Application Nos. 2002-067481 (filed Mar. 12, 2002), 2002-107083 (filed Apr. 9, 2002), and 2002-114784 (filed Apr. 17, 2002), the disclosures of all of which are incorporated herein by reference in their entireties.
Because of the spread of communication network technologies such as the Internet, the improvement of information compression technologies, the increase in the degree of integration (integration density) of information recording media, and other factors, a marketing form is now available in which digital contents formed by various kinds of multimedia data, such as audio, still images, moving images, and movies consisting of audio and moving images, are delivered to viewers over communication networks for a fee.
For example, stores that sell package media such as CDs (compact discs) and MDs (MiniDiscs) (trademark), that is, recording media on which digital contents are recorded in advance, can sell not only package media but also digital contents themselves by installing an information terminal such as what is called an MMK (multimedia KIOSK), in which a large number of digital contents as typified by music data are stored.
A user inserts a recording medium he has brought, such as an MD, into the MMK, selects the title of a digital content he wants to buy by referring to a menu picture or the like, and pays the requested price of the content. The payment method may be insertion of cash, use of digital money, or electronic payment using a credit card or a prepaid card. The MMK records the selected digital content data on the recording medium the user inserted by performing prescribed processing.
A marketer of digital contents can deliver digital contents to users over the Internet, for example, as well as sell digital contents to users using MMKs as described above.
It has become possible to distribute contents more effectively by employing the above-described method of marketing not only package media on which contents are recorded in advance but also digital contents themselves.
JP-A-2001-103047, JP-A-2001-325460, etc. disclose techniques that enable distribution of digital contents while protecting their copyrights. These techniques make it possible to deliver a digital content in such a manner that portions other than a portion for trial listening are encrypted, and allow only a user who has bought a corresponding decryption key to listen to the entire content. In one known encryption method, an encrypted bit string is obtained by EXCLUSIVE-ORing the bit string of PCM (pulse code modulation) digital audio data to be delivered with a 0/1 random number series that is generated by giving, as a key signal, an initial value of the random number series. Digital contents that have been encrypted in this manner are distributed to users by being recorded on recording media using MMKs or the like or by being delivered over networks. A user who has acquired encrypted digital content data can listen to only the non-encrypted, trial-allowed portion unless he obtains the key. If he reproduces an encrypted portion without decrypting it, he hears only noise.
As described above, the encryption method is known in which a bit string obtained by EXCLUSIVE-ORing the bit string of a PCM acoustic signal with a 0/1 random number series generated by giving initial values of a random number series as a key signal for the bit string of the PCM acoustic signal is transmitted or recorded on a recording medium. This method makes it possible to allow only a person who has acquired the key signal to reproduce the acoustic signal correctly and to cause a person who has not acquired the key signal to be able to reproduce only noise. Naturally, it is possible to use, as an encryption method, a more complex method such as what is called DES (data encryption standard). The content of the DES is disclosed in Federal Information Processing Standards Publication 46, Specifications for the DATA ENCRYPTION STANDARD, Jan. 15, 1977.
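The EXCLUSIVE-OR scrambling described above can be illustrated with a minimal sketch. The generator below is a toy 32-bit linear congruential generator chosen only for illustration; it is an assumption of this example, not the generator used in the cited publications, and it is not cryptographically secure. The names `keystream_bits` and `xor_scramble` are hypothetical.

```python
def keystream_bits(seed, count):
    """Yield `count` pseudo-random bits from a toy 32-bit LCG (illustrative, NOT secure)."""
    state = seed & 0xFFFFFFFF
    for _ in range(count):
        state = (1664525 * state + 1013904223) & 0xFFFFFFFF
        yield (state >> 31) & 1  # use the most significant bit as the 0/1 series


def xor_scramble(samples, key, bits_per_sample=16):
    """XOR each unsigned PCM sample with key-derived bits.

    Applying the function twice with the same key restores the input,
    because (s ^ m) ^ m == s.
    """
    stream = keystream_bits(key, len(samples) * bits_per_sample)
    out = []
    for s in samples:
        mask = 0
        for _ in range(bits_per_sample):
            mask = (mask << 1) | next(stream)
        out.append(s ^ mask)
    return out


pcm = [0, 1000, 2000, 3000, 65535]           # unsigned 16-bit PCM samples
scrambled = xor_scramble(pcm, key=12345)      # what is transmitted or recorded
restored = xor_scramble(scrambled, key=12345) # holder of the key signal
```

A listener without the key signal would have to play back `scrambled` as is, which carries no audible trace of `pcm`; this is the all-or-nothing property discussed in this section.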
Incidentally, methods for broadcasting an audio signal after compressing it, or for recording a compressed audio signal on a recording medium, have become widespread, and magneto-optical discs capable of recording a coded audio, speech, or like signal are widely used.
Various methods for high-efficiency coding of audio data are known, examples of which are subband coding (SBC) in which an audio signal on the time axis is coded by dividing it into a plurality of frequency bands without dividing it into blocks and blocked frequency band division coding (what is called transform coding) in which a signal on the time axis is spectrum-converted into a signal on the frequency axis which is then divided into a plurality of frequency bands and coded on a band-by-band basis. Another method is available in which a signal is subjected to subband coding and a resulting signal in each band is spectrum-converted into a signal on the frequency axis and coded in each spectrum conversion band.
Among filters used in the above methods is a QMF (quadrature mirror filter), which is disclosed in R. E. Crochiere: “Digital Coding of Speech in Subbands,” Bell Syst. Tech. J., Vol. 55, No. 8, 1976. Joseph H. Rothweiler: “Polyphase Quadrature Filters—A New Subband Coding Technique,” ICASSP 83, Boston, for example, discloses a filter division technique using filters having the same bandwidth.
An example of the above-mentioned spectrum conversion is a method in which an input audio signal is blocked into unit frames of a predetermined duration and is subjected to discrete Fourier transform (DFT), discrete cosine transform (DCT), modified discrete cosine transform (MDCT), or the like on a unit frame basis. The details of MDCT are described in J. P. Princen, A. B. Bradley (Univ. of Surrey Royal Melbourne Inst. of Tech.), et al., “Subband/Transform Coding Using Filter Bank Designs Based on Time Domain Aliasing Cancellation,” ICASSP 1987.
Where the above-mentioned DFT or DCT is used as a method for spectrum-converting a waveform signal, transforming each time block of M samples yields M independent real number data per block. To reduce connection distortion between adjoining time blocks, adjoining blocks are usually overlapped with each other by N/2 samples, so that each block has N overlap samples (N/2 samples on each side). Therefore, in DFT or DCT, on average, M independent real number data are quantized and coded for (M-N) samples.
In contrast, where the above-mentioned MDCT is used as a spectrum conversion method, if transform is performed for each time block including M samples, M independent real number data are obtained from 2M samples because each block is overlapped with each of the two adjacent blocks by M/2 samples (M samples in total). Therefore, in MDCT, on average, M independent real number data are quantized and coded for M samples.
In a decoding device, a waveform signal can be reconstructed by combining together waveform components, while interfering with each other, obtained by inverse-converting individual blocks of a code sequence that was generated by using MDCT.
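The reconstruction by mutually interfering (overlapping) blocks can be sketched as follows. This is a minimal illustration, not an implementation taken from the cited paper: it assumes the common formulation in which each windowed block spans 2M samples and advances by M samples, and it assumes a sine window, which satisfies the Princen-Bradley condition. The function names and the test signal are hypothetical.

```python
import math


def mdct(block, M):
    """Transform 2M windowed samples into M coefficients."""
    return [sum(block[n] * math.cos(math.pi / M * (n + 0.5 + M / 2) * (k + 0.5))
                for n in range(2 * M))
            for k in range(M)]


def imdct(coeffs, M):
    """Transform M coefficients into 2M time-aliased samples.

    The 2/M scaling assumes the same window is applied again before overlap-add.
    """
    return [2.0 / M * sum(coeffs[k] * math.cos(math.pi / M * (n + 0.5 + M / 2) * (k + 0.5))
                          for k in range(M))
            for n in range(2 * M)]


def analyze_synthesize(signal, M):
    """MDCT every block, invert, and overlap-add; len(signal) must be a multiple of M."""
    window = [math.sin(math.pi * (n + 0.5) / (2 * M)) for n in range(2 * M)]
    padded = [0.0] * M + list(signal) + [0.0] * M  # zero padding covers the edges
    out = [0.0] * len(padded)
    n_coeffs = 0
    for start in range(0, len(padded) - M, M):     # hop of M samples
        block = [padded[start + n] * window[n] for n in range(2 * M)]
        coeffs = mdct(block, M)
        n_coeffs += len(coeffs)
        rec = imdct(coeffs, M)
        for n in range(2 * M):
            out[start + n] += rec[n] * window[n]   # aliasing cancels in the overlap
    return out[M:-M], n_coeffs


signal = [math.sin(0.3 * i) + 0.5 * math.cos(1.1 * i) for i in range(32)]
reconstructed, n_coeffs = analyze_synthesize(signal, M=8)
max_error = max(abs(a - b) for a, b in zip(signal, reconstructed))
```

Each block on its own is not invertible (M coefficients cannot represent 2M samples); only the sum of adjacent inverse-transformed blocks recovers the waveform. Apart from one extra block needed for the padded edges, M coefficients are produced for every M new input samples.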
In general, the frequency resolution of a spectrum is increased and energy is concentrated in a particular spectrum component by elongating the transform time block. By performing transform using MDCT in which transform is performed for a long block that is overlapped with each of the adjacent blocks by a half of its length and the number of resulting spectrum signals is not greater than the number of original time-domain samples, coding can be made more efficient than in the case of using DFT or DCT. Inter-block distortion of a waveform signal can be reduced by giving a sufficiently long overlap to adjoining blocks.
Bands where quantization noise occurs can be controlled by quantizing a signal that has been divided into bands by filtering or spectrum conversion in the above-described manner, and more efficient coding can be performed for the human auditory sense by utilizing such features as a masking effect. Even more efficient coding can be performed by, for example, normalizing a signal component of each band by a maximum value of its absolute values before performing quantization.
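Normalization by the maximum absolute value followed by quantization can be sketched as below. The uniform symmetric quantizer, the function names, and the example values are assumptions made for illustration; an actual coder would typically also code the scale itself with a limited number of bits.

```python
def quantize_band(coeffs, bits):
    """Normalize a band by its maximum absolute value, then quantize uniformly.

    Returns the normalization coefficient (scale) and integer codes in
    the range [-levels, +levels], where levels = 2**(bits - 1) - 1 (bits >= 2).
    """
    scale = max(abs(c) for c in coeffs) or 1.0  # avoid dividing by zero for silence
    levels = (1 << (bits - 1)) - 1
    codes = [round(c / scale * levels) for c in coeffs]
    return scale, codes


def dequantize_band(scale, codes, bits):
    """Invert quantize_band up to the quantization error."""
    levels = (1 << (bits - 1)) - 1
    return [q / levels * scale for q in codes]


band = [0.02, -0.5, 0.3, 0.1]        # hypothetical spectrum values of one band
bits = 4
scale, codes = quantize_band(band, bits)
decoded = dequantize_band(scale, codes, bits)
max_err = max(abs(a - b) for a, b in zip(band, decoded))
```

Because the codes always span the full integer range regardless of the band's level, a quiet band is represented with the same relative accuracy as a loud one for a given number of bits.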
In quantizing each of the frequency components obtained by frequency band division, the frequency division widths may be determined by, for example, taking into consideration the properties of the human auditory sense. That is, an audio signal may be divided into a plurality of bands (e.g., 25 bands) in such a manner that the bandwidth becomes greater as the frequency increases (such bands are generally called critical bands).
Where band division is performed in this manner, data in the respective bands may be coded either with prescribed numbers of bits assigned to the respective bands or with the numbers of bits for the respective bands determined adaptively.
For example, when coefficient data obtained by MDCT are coded with bit allocation, the number of bits allocated to the MDCT coefficient data in each band obtained by block-by-block MDCT is determined adaptively. The following two bit allocation methods are known.
R. Zelinski, P. Noll, et al.: “Adaptive Transform Coding of Speech Signals,” IEEE Transactions on Acoustics, Speech, and Signal Processing, Vol. ASSP-25, No. 4, August 1977 describes a method in which bit allocation is performed on the basis of the signal magnitude in each band. This method can produce a flat quantization noise spectrum and minimize the noise energy. However, since the masking effect of the auditory sense is not utilized, this method is not an optimum one in terms of reducing noise that can actually be heard by the human ear.
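Magnitude-based allocation of this kind is often written in the textbook form b_k = B + (1/2) log2(E_k / G), where E_k is the energy of band k, G is the geometric mean of the band energies, and B is the average number of bits per band. The sketch below uses that form as an assumption; it is not claimed to be the exact procedure of the cited paper, and the function name and energies are hypothetical.

```python
import math


def energy_based_allocation(band_energies, avg_bits):
    """Return real-valued bit counts b_k = avg_bits + 0.5 * log2(E_k / geometric_mean).

    Results may be fractional or negative; a practical coder would clip
    and round them to integers.
    """
    log_e = [math.log2(e) for e in band_energies]
    mean_log = sum(log_e) / len(log_e)          # log2 of the geometric mean
    return [avg_bits + 0.5 * (le - mean_log) for le in log_e]


energies = [16.0, 4.0, 1.0, 0.25]               # hypothetical band energies
alloc = energy_based_allocation(energies, avg_bits=4)

# Model the quantization noise of band k as E_k * 2**(-2 * b_k).
noise = [e * 2.0 ** (-2.0 * b) for e, b in zip(energies, alloc)]
```

With this allocation the modeled noise is the same in every band, which corresponds to the flat quantization noise spectrum mentioned above; nothing in the formula accounts for masking.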
M. A. Krasner (Massachusetts Institute of Technology): “The Critical Band Coder: Digital Encoding of Speech Signals Based on the Perceptual Requirements of the Auditory System,” ICASSP 1980 describes a method in which fixed bit allocation is performed by obtaining the signal-to-noise ratios necessary for the respective bands by utilizing auditory masking. However, because the bit allocation is fixed, the measured characteristic values are not very good even when the characteristic is measured with a sine wave input.
To solve the above problems, a high-efficiency coding device has been proposed in which all bits that can be used for bit allocation are divided into bits for a fixed bit allocation pattern that is predetermined for each small block and bits for bit allocation that depends on the signal magnitude of each block. The division ratio is determined depending on a signal relating to the input signal in such a manner that the bits for the fixed bit allocation pattern are given a larger proportion as that signal has a smoother spectrum.
Where the energy is concentrated in a particular spectrum component as in the case of sine wave input, this method can greatly increase the total signal-to-noise ratio because a large number of bits can be allocated to a block including the particular spectrum component. In general, the human auditory sense is very sensitive to a signal having a steep spectrum component. Therefore, increasing the signal-to-noise ratio by such a method is effective in improving not only measured characteristic values but also the quality of sound actually heard by a person.
Various other bit allocation methods have been proposed. Sophisticated models of the auditory sense and improvement in the ability of coding devices have made it possible not only to obtain better measured characteristic values but also to perform higher-efficiency coding for the human auditory sense. In these methods, in general, bit allocation reference values (real numbers) that realize a calculated signal-to-noise characteristic as faithfully as possible are determined, and integers approximating them are determined and set as the numbers of bits to be allocated.
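One simple way to turn real-valued reference values into integer bit counts is largest-remainder rounding under a fixed bit budget. The sketch below is only one possible approximation rule, chosen for illustration; the source does not specify which rule the proposed methods use, and the function name and values are hypothetical.

```python
import math


def round_allocation(real_bits, budget):
    """Approximate real-valued bit counts by integers that sum to `budget`.

    Assumes the sum of the floored values does not exceed `budget` and that
    the shortfall is at most one bit per band.
    """
    clipped = [max(0.0, b) for b in real_bits]  # negative reference values get 0 bits
    ints = [math.floor(b) for b in clipped]
    leftover = budget - sum(ints)
    # Give the remaining bits to the bands with the largest fractional parts.
    order = sorted(range(len(clipped)),
                   key=lambda i: clipped[i] - ints[i],
                   reverse=True)
    for i in order[:max(0, leftover)]:
        ints[i] += 1
    return ints


reference = [5.7, 4.2, 3.6, 2.5]                # hypothetical reference values
integer_bits = round_allocation(reference, budget=16)
```

In this example the two spare bits go to the first and third bands, whose reference values are furthest above their floored counts.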
Japanese Patent Application No. 152865/1993 or WO 94/28633 of the present inventors describes a method in which tone components that are particularly important in terms of the auditory sense, that is, components whose energy is concentrated in the neighborhood of a particular frequency, are separated from generated spectrum signals and coded separately from the other spectrum components. This method makes it possible to code an audio signal or the like effectively at a high compression ratio while causing almost no deterioration for the human auditory sense.
In generating an actual code sequence, quantization accuracy information and normalization coefficient information are coded with a prescribed number of bits in each band where normalization and quantization are performed, and then the normalized and quantized spectrum signals are coded. ISO/IEC 11172-3:1993(E) describes a high-efficiency coding method in which the number of bits for representing quantization accuracy information varies from one band to another. More specifically, according to this standard, the number of bits for representing quantization accuracy information decreases as the band becomes higher in frequency.
A method is known in which quantization accuracy information is determined based on normalization coefficient information, for example, in a decoding device instead of coding quantization accuracy information directly. However, in this method, the relationship between normalization coefficient information and quantization accuracy information is determined at the time of establishment of a standard, and hence it is impossible to introduce, in the future, a control that employs quantization accuracy that is based on a more advanced auditory model. Further, where compression ratios in a certain range are to be realized, it is necessary to set a relationship between normalization coefficient information and quantization accuracy information for each compression ratio.
A method for coding quantized spectrum signals more efficiently is to use variable-length codes, as disclosed in, for example, D. A. Huffman: “A Method for the Construction of Minimum-Redundancy Codes,” Proc. I.R.E., Vol. 40, p. 1098, 1952.
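Variable-length coding of quantized spectrum values can be illustrated with a minimal Huffman code construction. The function name and the sample values are hypothetical, and an actual coder would normally use predetermined code tables rather than building a table per signal.

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Build a prefix code (symbol -> bit string) from symbol frequencies."""
    freq = Counter(symbols)
    if len(freq) == 1:                       # degenerate case: one distinct symbol
        return {next(iter(freq)): "0"}
    # Heap entries: (count, tie-breaker, partial code table).
    heap = [(count, i, {sym: ""}) for i, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        c1, _, t1 = heapq.heappop(heap)      # two least frequent subtrees
        c2, _, t2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in t1.items()}
        merged.update({s: "1" + code for s, code in t2.items()})
        heapq.heappush(heap, (c1 + c2, tie, merged))
        tie += 1
    return heap[0][2]


quantized = [0, 0, 0, 0, 1, 1, -1, 2]        # hypothetical quantized spectrum values
table = huffman_code(quantized)
total_bits = sum(len(table[q]) for q in quantized)
```

Here the most frequent value 0 receives a 1-bit code, so the eight values take 14 bits in total instead of the 16 bits needed with a fixed 2-bit code.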
It is also possible to distribute a signal that has been coded by any of the above-described methods by encrypting it in the same manner as in the case of a PCM signal. Where this scrambling method is employed, a person who has not acquired a key signal cannot reproduce the original signal. Another method is known in which, instead of encrypting a coded bit sequence, a PCM signal is converted into a random signal and then coded for compression. Where this scrambling method is employed as well, a person who has not acquired a key signal can reproduce only noise.
The marketing of contents data can be promoted by distributing trial listening data of the contents data. Examples of trial listening data are data that are reproduced with lower sound quality than the original data and data that enable reproduction of part (e.g., a climax portion) of the original data. If a user likes the reproduced trial listening data, he attempts to buy a decryption key to enable reproduction of the original sound or to buy a new recording medium on which the original audio data are recorded.
However, with the above scrambling methods, either none of the data can be reproduced or all the data are reproduced as noise. Therefore, the above scrambling methods cannot be used for the purpose of distributing, for trial listening, a recording medium on which sound is recorded with relatively low quality. Even if data that have been scrambled by any of the above methods are distributed to a user, he cannot recognize an outline of the entire data.
In the conventional methods, in encrypting a signal that has been subjected to high-efficiency coding, it is usually very difficult to produce a code sequence that is meaningful to common reproducing devices without lowering the compression efficiency. That is, where a code sequence generated by high-efficiency coding is scrambled in the above-described manner, only noise is generated if the code sequence is reproduced without descrambling it. If a scrambled code sequence does not comply with the standard of the original high-efficiency codes, reproduction processing may not be performed at all.
Conversely, where high-efficiency coding is performed after a PCM signal is scrambled, such coding becomes irreversible if the amount of information is reduced by utilizing the properties of the auditory sense. Therefore, even if such high-efficiency codes are decoded, a scrambled PCM signal cannot be reproduced correctly. That is, it is very difficult to descramble such a signal correctly.
Therefore, conventionally, a method that allows correct descrambling, although it lowers the compression efficiency, is employed.
In view of the above problems, the present inventors proposed, in JP-A-10-135944, an audio coding method in which data obtained by encrypting only codes corresponding to high-frequency bands among codes obtained by converting music data, for example, into spectrum data are distributed as trial listening data so that even a user not having a key can decode and reproduce a non-encrypted, narrow-band signal. In this method, high-frequency-side codes are encrypted, high-frequency-side bit allocation information is replaced by dummy data, and true high-frequency-side bit allocation information is recorded at such positions that a reproduction decoder does not read (i.e., disregards) information during reproduction processing.
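The arrangement described for JP-A-10-135944 can be sketched roughly as follows. Everything concrete here is an assumption made for illustration: the dictionary-based frame layout, the field name `hidden_alloc` (standing in for a position that a reproduction decoder disregards), the use of 0 as the dummy bit allocation value, and the repeating-key XOR that stands in for a real cipher.

```python
from itertools import cycle


def make_trial_frame(frame, first_hidden_band, key):
    """Encrypt high-band codes and replace their bit allocation with dummy values."""
    ks = cycle(key)                              # toy repeating-key stream, NOT secure
    alloc = list(frame["bit_alloc"])
    codes = [list(band) for band in frame["codes"]]
    for b in range(first_hidden_band, len(alloc)):
        codes[b] = [c ^ next(ks) for c in codes[b]]
        alloc[b] = 0                             # dummy: decoder treats the band as empty
    return {"bit_alloc": alloc, "codes": codes,
            "hidden_alloc": frame["bit_alloc"][first_hidden_band:]}


def trial_decode(frame):
    """A decoder without the key reads only bands whose bit allocation is nonzero."""
    return [band if bits > 0 else []
            for bits, band in zip(frame["bit_alloc"], frame["codes"])]


def restore_frame(trial, first_hidden_band, key):
    """With the key, recover the true bit allocation and the high-band codes."""
    ks = cycle(key)
    alloc = trial["bit_alloc"][:first_hidden_band] + list(trial["hidden_alloc"])
    codes = [list(band) for band in trial["codes"]]
    for b in range(first_hidden_band, len(alloc)):
        codes[b] = [c ^ next(ks) for c in codes[b]]
    return {"bit_alloc": alloc, "codes": codes}


original = {"bit_alloc": [4, 4, 3, 2],
            "codes": [[7, 1], [3, 5], [2, 6], [1, 0]]}
key = [0x5A, 0xC3]                               # hypothetical key bytes
trial = make_trial_frame(original, first_hidden_band=2, key=key)
narrow_band = trial_decode(trial)
full = restore_frame(trial, first_hidden_band=2, key=key)
```

The trial frame still decodes as a valid narrow-band signal, while the full-band signal is recoverable only through the cipher applied to the high-band codes.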
This method allows a user to have trial listening data distributed, reproduce those data, buy a chargeable key for decoding trial listening data he likes into original data, and enjoy desired music or the like with high sound quality by reproducing it correctly in all bands.
According to the technique disclosed in JP-A-10-135944, a user not having a key can decode only a narrow-band signal from data that are distributed free of charge. However, the security relies solely on the encryption. Therefore, if the encryption is cracked, a user can reproduce music with high sound quality without paying a charge, and the distributor of the music data (contents provider) cannot collect a legitimate charge.