With the increasing spread of the Internet, music piracy has also drastically increased. At many locations on the Internet, pieces of music or, in general, audio signals can be downloaded. Copyrights are respected in only very few cases. In particular, the author's permission to offer his work is very rarely obtained. Fees due to the author for lawful copying are rarely paid. Apart from that, uncontrolled copying of works takes place which, in most cases, also happens without regard to copyrights.
When music is lawfully purchased from a music provider via the Internet, the provider usually produces a header into which copyright information as well as, for example, a customer ID are introduced, the customer ID uniquely identifying the present purchaser. It is further known to introduce copy allowance information into that header, which signals the various types of copy rights, for example, that copying of the current piece is completely forbidden, that copying of the current piece is allowed only once, that copying of the current piece is totally free, etc.
The customer has a decoder that reads in the header and that, in compliance with the allowed actions, for example, allows only one copy and refuses further copies.
This concept for consideration of copyrights, however, only works for customers who behave legally.
Illegal customers usually show considerable creativity in "cracking" pieces of music that are provided with a header. This reveals the disadvantage of the described procedure for the protection of copyrights: such a header can be removed easily. Alternatively, an illegal user could also modify individual entries in the header, for example, to change the entry "copying forbidden" to an entry "copying totally free". It is also possible that an illegal customer removes his own customer ID from the header and then offers the piece of music on his own or another homepage on the Internet. From that moment onwards, it is no longer possible to identify the illegal customer, since he has removed his customer ID. Attempts to prevent such violations of the copyright will, therefore, inevitably be useless, since the copy information has been removed from the piece of music or has been modified, and since the illegal customer who has done so can no longer be identified and called to account. If, instead, information could be introduced securely into the audio signal itself, then government authorities who prosecute copyright violations could trace suspicious pieces of music on the Internet and, for example, could establish the user identification of such illegal pieces in order to put a stop to the illegal users.
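The weakness of the header concept can be illustrated by a minimal sketch. All names used here (`Header`, `Piece`, `CopyPermission`, `may_copy`) are hypothetical and are not taken from any actual provider format. It is further assumed, for illustration only, that a decoder treats a piece without a header as unrestricted.

```python
from dataclasses import dataclass, replace
from enum import Enum
from typing import Optional


class CopyPermission(Enum):
    FORBIDDEN = "copying forbidden"
    ONCE = "copying allowed once"
    FREE = "copying totally free"


@dataclass(frozen=True)
class Header:
    copyright_notice: str
    customer_id: Optional[str]
    permission: CopyPermission


@dataclass(frozen=True)
class Piece:
    header: Optional[Header]  # the header travels next to the audio, not inside it
    audio: bytes


def may_copy(piece: Piece, copies_made: int) -> bool:
    """Compliant decoder: decides from the header alone whether a copy is allowed."""
    if piece.header is None:
        return True  # assumption: without a header there is nothing to enforce
    if piece.header.permission is CopyPermission.FREE:
        return True
    if piece.header.permission is CopyPermission.ONCE:
        return copies_made == 0
    return False


original = Piece(Header("(c) author", "customer-0042", CopyPermission.ONCE), b"\x00\x01")

# An illegal user rewrites two header entries; the audio data are untouched.
tampered = replace(
    original,
    header=replace(original.header, customer_id=None, permission=CopyPermission.FREE),
)

# Or the header is simply removed.
stripped = replace(original, header=None)
```

In this sketch the compliant decoder refuses a second copy of `original`, but it permits any number of copies of `tampered` and `stripped`, and neither of the latter still carries the customer ID.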
From WO 97/33391, an encoding method for introducing an inaudible data signal into an audio signal is known. There, the audio signal into which the inaudible data signal is to be introduced is converted into the frequency domain in order to determine the masking threshold of the audio signal using a psychoacoustic model. The data signal to be introduced into the audio signal is multiplied by a pseudo noise signal in order to create a frequency-spread data signal. The frequency-spread data signal is then weighted with the psychoacoustic masking threshold, such that the energy of the frequency-spread data signal will always be below the masking threshold. Finally, the weighted data signal is superimposed on the audio signal, whereby an audio signal is created in which the data signal is inaudibly introduced. On the one hand, the data signal can be used to establish the range of a transmitter. On the other hand, the data signal can be used for the identification of audio signals in order to easily identify possible pirate copies, since every sound carrier, for example, a compact disc, is provided with an individual identification ex works. A further described application of the data signal is the remote control of audio devices, analogous to the "VPS" method on television.
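The sequence of spreading, weighting and superimposing described above can be sketched as follows. This is a simplified illustration and not the method of WO 97/33391 itself: it operates on time-domain samples, and the per-sample list `threshold` is a hypothetical stand-in for a masking threshold that a psychoacoustic model would supply as a function of frequency. The function names and the parameter `margin` are likewise assumptions.

```python
import math
import random


def pn_sequence(length, seed):
    """Pseudo noise sequence of +1/-1 chips, reproducible from the seed."""
    rng = random.Random(seed)
    return [rng.choice((-1.0, 1.0)) for _ in range(length)]


def spread(bits, pn, chips_per_bit):
    """Multiply each data bit (mapped to +1/-1) by its run of PN chips."""
    chips = []
    for i, bit in enumerate(bits):
        symbol = 1.0 if bit else -1.0
        run = pn[i * chips_per_bit:(i + 1) * chips_per_bit]
        chips.extend(symbol * c for c in run)
    return chips


def weight(chips, threshold, margin=0.5):
    """Scale every chip so that its magnitude stays below the threshold."""
    return [margin * t * c for c, t in zip(chips, threshold)]


def embed(audio, weighted):
    """Superimpose the weighted spread signal on the audio signal."""
    return [a + w for a, w in zip(audio, weighted)]


def despread(signal, pn, chips_per_bit):
    """Correlate each bit interval with the PN sequence and take the sign."""
    bits = []
    for start in range(0, len(signal), chips_per_bit):
        end = start + chips_per_bit
        corr = sum(s * c for s, c in zip(signal[start:end], pn[start:end]))
        bits.append(1 if corr > 0 else 0)
    return bits


bits = [1, 0, 1, 1]
chips_per_bit = 64
pn = pn_sequence(len(bits) * chips_per_bit, seed=1)
audio = [math.sin(0.05 * n) for n in range(len(pn))]
threshold = [0.01 + 0.05 * abs(a) for a in audio]  # hypothetical threshold values

weighted = weight(spread(bits, pn, chips_per_bit), threshold)
marked = embed(audio, weighted)
```

Applying `despread` to `weighted` alone returns the embedded bits. When it is applied to `marked`, the audio signal itself acts as interference in the correlation, which is why a practical system would need a sufficient number of chips per bit and a receiver that knows the pseudo noise sequence.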
This method is highly secure against music pirates since, on the one hand, they are probably not aware that the piece of music they are copying is marked. On the other hand, it is almost impossible to extract the data signal, which is inaudibly present in the audio signal, without an authorised decoder.
Audio signals that come from a compact disc are present as 16-bit PCM samples. A music pirate could, for example, manipulate the sampling rate or the levels or phases of the samples to make the data signal unreadable, i.e., undecodable, whereby the copyright information would also be removed from the audio signal. This, however, will not be possible without significant quality losses. Data that are introduced into audio signals in such a way can therefore, by analogy with bank notes, also be referred to as "watermarks".
The method described in WO 97/33391 for introducing an inaudible data signal into an audio signal operates on audio samples that are present as time-domain samples. It is therefore necessary that audio pieces, i.e., pieces of music, radio plays, etc., are present as a sequence of time-domain samples in order to be provided with a watermark. This has the disadvantage that the method cannot be used for already-compressed data streams that have been processed, for example, according to one of the MPEG methods. This means that a provider of pieces of music who wants to provide the pieces of music with a watermark prior to shipment to the customer has to store the pieces of music as a sequence of PCM samples. As a result, the music provider needs a very high storage capacity. It would be desirable, however, to use the very effective audio compression methods already for storing the audio data at the provider.
A provider of audio data of the above-described type could, of course, simply compress all pieces of music, for example, by using the standard MPEG-2 AAC (ISO/IEC 13818-7), and then fully decompress an audio piece again before it is to be provided with a watermark, in order to once more have a sequence of audio samples that will then be fed into a known apparatus for introducing an inaudible data signal. This requires a significant effort in that, prior to the introduction of information into the audio signal, a full decompression or decoding is necessary. Such a decoding costs time and money. A much more serious drawback, however, is that tandem encoding effects occur in such a procedure.
A further disadvantage of this procedure is that, because the watermark is introduced into the PCM data, there is no certainty as to whether the watermark is still present after an audio compression. When PCM data provided with watermarks are encoded at a relatively low bit rate, the encoder introduces a lot of quantizing noise due to the relatively low bit rate, which will, in an extreme case, lead to the fact that no watermark can be decoded anymore. It is also problematic that, with this procedure, the bit rate of the audio encoder that encodes the PCM data provided with watermarks is not known in advance, which is why no reliable control of the ratio between watermark energy and noise energy due to the quantizing noise is possible.
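The dependence of this ratio on the later encoder can be made concrete with a small calculation. The sketch below assumes the common approximation that a uniform quantizer with step size Δ introduces a noise energy of Δ²/12 per value; the watermark energy and the step sizes are hypothetical numbers chosen only for illustration.

```python
import math


def quantization_noise_energy(step):
    """Approximate noise energy per value of a uniform quantizer (step**2 / 12)."""
    return step * step / 12.0


def watermark_to_noise_db(watermark_energy, step):
    """Ratio of watermark energy to quantizing noise energy in decibels."""
    return 10.0 * math.log10(watermark_energy / quantization_noise_energy(step))


watermark_energy = 1.0 / 12.0  # fixed when the watermark is embedded (hypothetical)

# The step size is chosen later by the encoder; a lower bit rate means a larger step.
for step in (0.25, 1.0, 4.0):
    print(f"step {step}: {watermark_to_noise_db(watermark_energy, step):.1f} dB")
```

Since the watermark energy is fixed at embedding time while the step size is only determined by the later encoder, the resulting ratio cannot be controlled by whoever embeds the watermark.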
It is known that audio encoding methods according to one of the MPEG standards are not lossless encoding methods, but lossy encoding methods. Bit savings in comparison to direct transmission of audio samples in the time domain are achieved, to a large part, by making use of psychoacoustic masking effects. Particularly, for a block of, for example, 2048 audio samples, the psychoacoustic masking threshold will be established as a function of frequency, whereupon, after a time-frequency transformation of the audio samples, the quantizing of the spectral values forming the short-term spectrum will be carried out under consideration of this psychoacoustic masking threshold. In other words, the quantizer step size is controlled such that the noise energy introduced by quantizing is smaller than or equal to the psychoacoustic masking threshold. In areas of the audio signal where the masking index, i.e., the ratio of audio signal energy to the psychoacoustic masking threshold, is very small, as, for example, in very noisy areas of the audio signal, the spectral values need to be quantized only roughly, without audible interferences occurring after a subsequent decoding. In other areas where the audio signal is very tonal, it has to be quantized more finely, such that only a relatively small noise energy results from the quantizing, since the masking index is very large.
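The control of the quantizer step size by the masking threshold can be sketched as follows. The sketch assumes a plain uniform quantizer and the usual approximation that its noise energy is Δ²/12 for a step size Δ, so that Δ is chosen as the square root of twelve times the allowed noise energy. The spectral values and the two threshold values are hypothetical, and actual MPEG encoders use non-uniform quantizers with scale factors.

```python
import math


def step_for_threshold(threshold_energy):
    """Step size whose approximate noise energy (step**2 / 12) equals the threshold."""
    return math.sqrt(12.0 * threshold_energy)


def quantize(values, step):
    """Uniform quantizing: round every value to the nearest multiple of step."""
    return [round(v / step) * step for v in values]


def noise_energy(values, quantized):
    """Mean squared quantizing error."""
    return sum((v - q) ** 2 for v, q in zip(values, quantized)) / len(values)


spectrum = [0.83, -0.41, 0.27, -0.96, 0.12, 0.58, -0.73, 0.35]  # hypothetical values

noisy_band_threshold = 1e-2  # small masking index: much noise is tolerated
tonal_band_threshold = 1e-4  # large masking index: little noise is tolerated

coarse = step_for_threshold(noisy_band_threshold)
fine = step_for_threshold(tonal_band_threshold)

for name, step in (("noisy band", coarse), ("tonal band", fine)):
    measured = noise_energy(spectrum, quantize(spectrum, step))
    print(f"{name}: step {step:.4f}, measured noise energy {measured:.2e}")
```

The value Δ²/12 is only an average under the assumption of a uniformly distributed error. The hard bound for an individual value is an error of Δ/2, i.e., an error energy of Δ²/4.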
It becomes clear from the above that information of the original audio signal is lost due to the quantizing procedure. This does not matter when the quantized audio signal is decoded again, since the noise energy due to the quantizing has been distributed in such a way that it remains below the psychoacoustic masking threshold and will, therefore, be inaudible when an ideal psychoacoustic model has been used. These considerations, however, always apply only to a certain short-term spectrum or to a block of, for example, 2048 consecutive audio samples, respectively. After the decoding, however, the block of audio samples no longer contains any information about how the block partitioning was performed. When the known apparatus for introducing information has been used which, in most cases, has a certain delay compared to an audio encoder that does not introduce information, it can therefore not be assumed that the same block partitioning will take place by chance. Instead, the block partitioning, the short-term spectrum creation and the quantizing will take place in a totally different block raster. A renewed decoding will then usually lead to clearly audible interferences, since it does not refer to the same short-term spectrum, but to different short-term spectra. This appearance of audible interferences through two encoding/decoding stages due to their different partitioning of the stream of audio samples into blocks is referred to as the tandem encoding effect.
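The influence of the block raster can be illustrated with a toy model. The following sketch is not an MPEG encoder: it quantizes time-domain samples directly, and the step size of each block is derived from the peak value of the block (rounded up to a power of two) instead of from a psychoacoustic model. All names and values are hypothetical. What it shows is only that a second encoding/decoding stage adds no error when it uses the same block raster, but does add error when the raster is shifted.

```python
import math


def block_step(peak, levels=16):
    """Step size of a block: next power of two at or above the peak, divided by levels."""
    if peak == 0.0:
        return 0.0
    mantissa, exponent = math.frexp(peak)  # peak == mantissa * 2**exponent
    if mantissa == 0.5:                    # peak is itself a power of two
        exponent -= 1
    return math.ldexp(1.0, exponent) / levels


def encode_decode(signal, block, offset=0):
    """Quantize block by block; the block raster starts at sample index offset."""
    out = []
    bounds = sorted({0, len(signal), *range(offset, len(signal), block)})
    for start, end in zip(bounds, bounds[1:]):
        segment = signal[start:end]
        step = block_step(max(abs(v) for v in segment))
        out.extend(v if step == 0.0 else round(v / step) * step for v in segment)
    return out


def added_error(before, after):
    """Error energy added by a further encoding/decoding stage."""
    return sum((a - b) ** 2 for a, b in zip(before, after))


loud = [0.9, -0.7, 0.5, -0.3, 0.8, -0.6, 0.4, -0.2]
quiet = [v / 10 for v in loud]
signal = loud + quiet + loud + quiet

first = encode_decode(signal, block=8)                    # first stage
same_raster = encode_decode(first, block=8)               # second stage, same raster
shifted_raster = encode_decode(first, block=8, offset=4)  # second stage, shifted raster

print(added_error(first, same_raster))     # no additional error
print(added_error(first, shifted_raster))  # additional error
```

With the shifted raster, quiet samples share a block with loud samples and are therefore quantized again with the coarser step size of the loud samples, which adds noise that the first stage had avoided.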
It should be noted that, in general, introducing the inaudible data signal introduces noise energy into the audio signal, which already includes noise energy because the quantizing procedure is not infinitely fine. Introducing the inaudible data signal therefore tends to lead to a deterioration of the audio quality unless special precautions are taken. In this connection, a further introduction of noise energy due to the tandem encoding effects previously described is even less desirable, since this quality loss appears systematically without any benefit, while small quality deteriorations due to the watermark are more acceptable, since the watermark also provides an advantage. Tandem encoding effects, however, only cause interferences and have no advantage at all.
U.S. Pat. No. 5,687,191 discloses a concept for transmitting hidden data after data compression. An audio signal is converted into sub-band samples by a sub-band encoder, wherein each sub-band filter generates a sequence of temporal samples whose spectral bandwidth is the same as the bandwidth of the respective sub-band filter. A data stream with such quantized sub-band samples is unpacked and demultiplexed in order to perform an inverse quantizing, such that sub-band samples are present again. Further, a pseudo noise spread sequence is filtered by a sub-band filter bank to obtain, for every filter of the sub-band filter bank, a sequence of temporal sub-band samples having a bandwidth determined by the respective sub-band filter. The data to be transported are subjected to a forward error correction and to a power control ensuring that the auxiliary data signal is below the quantization noise floor of the audio sub-band samples. The auxiliary data values processed in this way are then combined with the respective sub-band values of the pseudo noise spread sequence via respective modulators and then XORed with the unpacked sub-band values of the audio signal. The combined sub-band values obtained in this way are then quantized again and packed in order to obtain an output data stream.