In many technical applications, it is desirable to include extra information in a signal representing useful data or “main data”, for example an audio signal, a video signal, graphics, or a measurement quantity. In many cases, the extra information should be bound to the main data (for example, audio data, video data, still image data, measurement data, or text data) in such a way that it is not perceivable by a user of said data. In some cases it is also desirable to include the extra data such that they are not easily removable from the main data.
This is particularly true in applications in which it is desirable to implement digital rights management. However, it is sometimes simply desired to add substantially imperceptible side information to the useful data. For example, in some cases it is desirable to add side information to audio data such that the side information provides information about the source of the audio data, the content of the audio data, rights related to the audio data, and so on.
For embedding extra data into useful data or “main data”, a concept called “watermarking” may be used. Watermarking concepts have been discussed in the literature for many different kinds of useful data, like audio data, still image data, video data, text data, and so on.
In the following, some references are given in which watermarking concepts are discussed. For further details, the reader's attention is also drawn to the wide field of textbook literature and publications related to watermarking.
DE 196 40 814 C2 describes a coding method for introducing a non-audible data signal into an audio signal, and a method for decoding a data signal that is included in an audio signal in non-audible form. The coding method comprises converting the audio signal into the spectral domain, determining the masking threshold of the audio signal, and providing a pseudo-noise signal. The coding method also comprises providing the data signal and multiplying the pseudo-noise signal by the data signal in order to obtain a frequency-spread data signal. Finally, the coding method comprises weighting the spread data signal with the masking threshold and superimposing the weighted data signal on the audio signal.
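The sequence of steps described above (spreading, weighting, superimposing) can be illustrated by the following minimal sketch. It is not the method of DE 196 40 814 C2 itself: the audio signal is assumed to be already available as a list of real-valued spectral coefficients, the function names are hypothetical, and `toy_masking_threshold` is a placeholder that takes a fixed fraction of each coefficient's magnitude where a real system would apply a psychoacoustic model.

```python
import random


def pn_sequence(length, seed=1):
    """Pseudo-noise chips in {-1, +1}; the seed stands in for a shared key (assumption)."""
    rng = random.Random(seed)
    return [rng.choice((-1, 1)) for _ in range(length)]


def spread(bits, chips):
    """Multiply each data bit (mapped to -1/+1) by the whole chip sequence."""
    out = []
    for bit in bits:
        symbol = 1 if bit else -1
        out.extend(symbol * chip for chip in chips)
    return out


def toy_masking_threshold(spectrum, ratio=0.1):
    """Placeholder threshold: a fixed fraction of each coefficient's magnitude.

    This is NOT a psychoacoustic model; it only marks where one would be used.
    """
    return [ratio * abs(coeff) for coeff in spectrum]


def embed(spectrum, bits, chips, ratio=0.1):
    """Weight the spread data signal with the threshold and add it to the spectrum."""
    spread_signal = spread(bits, chips)
    if len(spread_signal) != len(spectrum):
        raise ValueError("spectrum length must equal len(bits) * len(chips)")
    threshold = toy_masking_threshold(spectrum, ratio)
    return [coeff + t * s for coeff, t, s in zip(spectrum, threshold, spread_signal)]


# Example: two data bits spread over eight spectral coefficients.
spectrum = [10.0, -4.0, 2.0, 8.0, -6.0, 3.0, 1.0, -9.0]
chips = pn_sequence(4)
marked = embed(spectrum, [1, 0], chips)
```

In this sketch the change applied to each coefficient never exceeds the placeholder threshold for that coefficient, which mirrors the role of the weighting step.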
In addition, WO 93/07689 describes a method and apparatus for automatically identifying a program broadcast by a radio station or by a television channel, or recorded on a medium, by adding an inaudible encoded message to the sound signal of the program, the message identifying the broadcasting channel or station, the program and/or the exact date. In an embodiment discussed in said document, the sound signal is transmitted via an analog-to-digital converter to a data processor, which enables frequency components to be split up and enables the energy in some of the frequency components to be altered in a predetermined manner to form an encoded identification message. The output of the data processor is connected via a digital-to-analog converter to an audio output for broadcasting or recording the sound signal. In another embodiment discussed in said document, an analog bandpass filter is employed to separate a band of frequencies from the sound signal, so that the energy in the separated band may be altered in the same way to encode the sound signal.
U.S. Pat. No. 5,450,490 describes apparatus and methods for including a code having at least one code frequency component in an audio signal. The abilities of various frequency components in the audio signal to mask the code frequency component to human hearing are evaluated and based on these evaluations an amplitude is assigned to the code frequency component. Methods and apparatus for detecting a code in an encoded audio signal are also described. A code frequency component in the encoded audio signal is detected based on an expected code amplitude or on a noise amplitude within a range of audio frequencies including the frequency of the code component.
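The idea of assigning a code amplitude according to how well the audio components can mask it may be sketched as follows. This is a toy model under stated assumptions and not the evaluation used in U.S. Pat. No. 5,450,490: each audio component is assumed to mask at a level that lies `offset_db` below its own level and falls off linearly by `slope_db_per_khz` with frequency distance. Both numbers and the function name are purely illustrative.

```python
import math


def masked_code_amplitude(components, code_freq_hz,
                          slope_db_per_khz=20.0, offset_db=10.0):
    """Return the largest code amplitude that stays under the toy masking level.

    components: list of (frequency_hz, amplitude) pairs of the audio signal.
    The slope and offset are illustrative assumptions, not measured values.
    """
    best_db = -math.inf
    for freq_hz, amplitude in components:
        if amplitude <= 0.0:
            continue  # a silent component masks nothing
        level_db = 20.0 * math.log10(amplitude)
        distance_khz = abs(freq_hz - code_freq_hz) / 1000.0
        masked_db = level_db - offset_db - slope_db_per_khz * distance_khz
        best_db = max(best_db, masked_db)
    if best_db == -math.inf:
        return 0.0  # no masker available: do not insert the code component
    return 10.0 ** (best_db / 20.0)


# Example: a strong tone near the code frequency permits a larger code amplitude
# than the same tone located farther away.
near = masked_code_amplitude([(1000.0, 1.0)], code_freq_hz=1000.0)
far = masked_code_amplitude([(1000.0, 1.0)], code_freq_hz=1500.0)
```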
WO 94/11989 describes a method and apparatus for encoding/decoding broadcast or recorded segments and monitoring audience exposure thereto. Methods and apparatus for encoding and decoding information in broadcast or recorded segment signals are described. In an embodiment described in the document, an audience monitoring system encodes identification information in the audio signal portion of a broadcast or a recorded segment using spread spectrum encoding. The monitoring device receives an acoustically reproduced version of the broadcast or recorded signal via a microphone, decodes the identification information from the audio signal portion despite significant ambient noise, and stores this information, automatically providing a diary for the audience member, which is later uploaded to a centralized facility. A separate monitoring device decodes additional information from the broadcast signal, which is matched with the audience diary information at the central facility. This monitor may simultaneously send data to the centralized facility using a dial-up telephone line, and receive data from the centralized facility through a signal encoded using a spread spectrum technique and modulated with a broadcast signal from a third party.
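Why spread spectrum encoding allows decoding despite significant ambient noise can be illustrated by a minimal correlation-based sketch. It is a generic direct-sequence example and not the decoder of WO 94/11989: the chip length, the noise level, the seeds, and all function names are assumptions chosen for illustration, and perfect synchronization between sender and receiver is assumed.

```python
import random


def pn_sequence(length, seed=1):
    """Pseudo-noise chips in {-1, +1}, known to both encoder and decoder."""
    rng = random.Random(seed)
    return [rng.choice((-1, 1)) for _ in range(length)]


def transmit(bits, chips, noise_sigma, seed=2):
    """Spread each bit over the chips and add Gaussian noise (stand-in for ambient noise)."""
    rng = random.Random(seed)
    signal = []
    for bit in bits:
        symbol = 1 if bit else -1
        signal.extend(symbol * chip + rng.gauss(0.0, noise_sigma) for chip in chips)
    return signal


def despread(received, chips):
    """Correlate each block with the chips; the sign of the correlation gives the bit."""
    n = len(chips)
    bits = []
    for start in range(0, len(received) - n + 1, n):
        block = received[start:start + n]
        correlation = sum(r * c for r, c in zip(block, chips))
        bits.append(1 if correlation > 0 else 0)
    return bits


# Example: noise with twice the amplitude of the embedded signal per sample.
chips = pn_sequence(256)
sent = [1, 0, 1, 1]
received = transmit(sent, chips, noise_sigma=2.0)
recovered = despread(received, chips)
```

Summing over 256 chips makes the signal contribution to the correlation grow with the number of chips, while the noise contribution grows only with its square root, which is why the bits remain recoverable in this sketch even though the noise dominates each individual sample.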
WO 95/27349 describes apparatus and methods for including codes in audio signals and decoding. An apparatus and methods for including a code having at least one code frequency component in an audio signal are described. The abilities of various frequency components in the audio signal to mask the code frequency component to human hearing are evaluated, and based on these evaluations, an amplitude is assigned to the code frequency components. Methods and apparatus for detecting a code in an encoded audio signal are also described. A code frequency component in the encoded audio signal is detected based on an expected code amplitude or on a noise amplitude within a range of audio frequencies including the frequency of the code component.
However, when inserting watermark information into a time/frequency spectrogram of an audio signal, it is difficult to hide the watermark information below the masking threshold, or to find an optimal tradeoff between assigning as much energy as possible to the watermark information (thus improving extractability at the decoder side) and keeping the embedded watermark information inaudible when the watermarked audio signal is reproduced.