In the following, different watermarking systems are briefly reviewed.
A watermarking system can be viewed as a communication system. Let the bit-wise information to be transmitted be represented by a watermark signal “wm”, which is the desired signal. The signal wm is ‘embedded’ into a host signal “a” by adding the two signals (the watermark signal wm and the host signal a), obtaining a watermarked signal “awm”. With respect to the watermark, the host signal can be seen as an additive distortion. This means that awm deviates from its ideal value wm, corrupting the decoding process (if the original host signal a is not known at the decoder). The signal awm is further affected by a transmission channel, in that the channel introduces distortions. Examples of transmission channels are the compression of the signal awm with an audio codec such as AAC as well as the playback of the signal awm with a loudspeaker, its propagation in air, and its pickup with a microphone.
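The additive model described above can be illustrated with a minimal sketch. The sample values and variable names are purely hypothetical and chosen for illustration; no channel distortion is modelled.

```python
# Hypothetical sample values, chosen only for illustration.
host = [0.8, -1.3, 0.4, 2.1]          # host signal a
watermark = [0.1, -0.1, -0.1, 0.1]    # watermark signal wm (the desired signal)

# Embedding: the watermarked signal awm is the sum of host and watermark.
watermarked = [a + w for a, w in zip(host, watermark)]

# A decoder that does not know the host sees awm deviate from wm by
# exactly the host signal, i.e. the host acts as additive distortion.
deviation_from_wm = [y - w for y, w in zip(watermarked, watermark)]

# A decoder that does know the host can simply subtract it.
recovered_wm = [y - a for y, a in zip(watermarked, host)]
```

In this toy example the deviation is much larger than the watermark itself, which is why the host signal dominates the decoding error when it is unknown at the decoder.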
A characteristic of watermarking systems is that one part of the distortion, namely the host signal, is known at the transmitter. If this information is exploited during embedding, the method is called informed embedding or watermarking with side information (see also Ingemar J. Cox, Ed., Digital watermarking and steganography, The Morgan Kaufmann series in multimedia information and systems. Morgan Kaufmann, Burlington, 2nd edition, 2008). In principle, weighting the watermark wm according to power levels given by a perceptual model is already a case of informed embedding. However, this information is used merely to scale the watermark in order to make it imperceptible, whereas the host signal is still treated as an unknown noise source when the watermark is generated prior to the weighting. In certain cases, it is possible to create the watermark signal in a way that compensates for the host-signal-induced distortion so that only channel-induced distortion corrupts the decoding. Such methods are called host-interference rejecting methods (see also Chen and Wornell, “Quantization index modulation: A class of provably good methods for digital watermarking and information embedding,” IEEE Transactions on Information Theory, May 2001, vol. 47).
In the non-pre-published EP 10154964.0-1224, differential encoding was introduced in combination with BPSK (binary phase shift keying) signaling to obtain a system which is robust with respect to movement of the decoding device (for example if the signal is picked up by a microphone), potential frequency mismatch between the local oscillators on the transmit (Tx) and receive (Rx) sides, and potential phase rotations introduced by a frequency-selective channel, such as propagation in a reverberant environment.
The robustness comes from the fact that the information is coded in the phase difference between two adjacent symbols, so that the system is virtually unaffected by a slowly drifting phase rotation of the modulation constellation.
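The principle of coding information in the phase difference between adjacent symbols can be sketched as follows. This is a generic illustration of differential BPSK, not the implementation of the cited application; the function names and the channel model (an unknown initial phase plus a slow linear drift) are assumptions made for the example.

```python
import cmath

def diff_encode(bits, start=1.0):
    """Differentially encode bits as BPSK symbols: a 1 flips the phase, a 0 keeps it."""
    symbols = [complex(start)]  # reference symbol
    for b in bits:
        symbols.append(symbols[-1] * (-1 if b else 1))
    return symbols

def diff_decode(received):
    """Recover bits from the phase difference between adjacent symbols."""
    bits = []
    for prev, cur in zip(received, received[1:]):
        # A negative real part of cur * conj(prev) means the phase changed by about pi.
        bits.append(1 if (cur * prev.conjugate()).real < 0 else 0)
    return bits

def drifting_channel(symbols, phase0=0.7, drift=0.05):
    """Hypothetical channel: unknown initial phase plus a slow drift per symbol (radians)."""
    return [s * cmath.exp(1j * (phase0 + drift * k)) for k, s in enumerate(symbols)]

bits = [1, 0, 1, 1, 0, 0, 1]
decoded = diff_decode(drifting_channel(diff_encode(bits)))
```

Because the decoder only looks at the product of a symbol with the conjugate of its predecessor, the common phase offset cancels and only the small per-symbol drift remains, which does not change the sign of the decision variable as long as the drift per symbol stays well below a quarter turn.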
Although the method described in EP 10154964.0-1224 uses information about the host signal a by scaling the watermark signal wm in order to make it imperceptible, the host signal a is still an additional source of unknown noise from the communication system's perspective. In other words, the watermark signal wm (prior to the perceptually motivated scaling) is generated regardless of any knowledge of the host signal a.
Several watermarking systems use some kind of informed embedding method, but only a few belong to the group of host-interference rejecting methods. Examples of the latter are low-bit modulation (LBM) (Mitchell D. Swanson; Bin Zhu; Ahmed H. Tewfik, “Data hiding for video-in-video,” IEEE International Conference on Image Processing, 1997, vol. 2, pp. 676-679; Brian Chen and Gregory W. Wornell, “Quantization index modulation methods for digital watermarking and information embedding of multimedia,” Journal of VLSI Signal Processing, vol. 27, pp. 7-33, 2001) and quantization index modulation (QIM), which was introduced in Chen and Wornell, “Quantization index modulation: A class of provably good methods for digital watermarking and information embedding,” IEEE Transactions on Information Theory, May 2001, vol. 47, and in Brian Chen and Gregory Wornell, “System, method, and product for information embedding using an ensemble of non-intersecting embedding generators,” 1999, WO99/60514A.
In QIM, one or more parameters of a signal representation must first be chosen, e.g., the complex coefficients of a time-frequency representation. The chosen parameters are then quantized according to the information to be embedded. Each information-carrying symbol is linked with a certain quantizer; alternatively, a whole message is linked with a sequence of quantizers. Depending on the information to be transmitted, the signal is quantized with the quantizer or sequence of quantizers associated with that information. For instance, if the parameter to be quantized is a positive real number, the quantizer used to embed a 0 could be defined by the quantization levels 0, 2, 4, 6, ..., whereas the quantizer for a 1 could be defined by the levels 1, 3, 5, .... If the current value of the host signal is 4.6, the embedder changes the value to 4 in the case of a bit 0 and to 5 in the case of a bit 1. At the receiver, the distance between the received signal representation and all possible quantized representations is calculated, and the decision is made according to the minimum distance. In other words, the receiver attempts to identify which of the available quantizers has been used. By doing so, host-interference rejection can be achieved.
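The scalar example above (even levels for a 0, odd levels for a 1) can be sketched in a few lines. The function names and the `step` parameter are assumptions introduced for illustration; `step` is the spacing between levels of one quantizer, so `step=2.0` gives the lattices 0, 2, 4, ... and 1, 3, 5, ....

```python
def qim_embed(value, bit, step=2.0):
    """Quantize value with the quantizer associated with bit."""
    offset = bit * step / 2.0  # bit 0 -> 0, 2, 4, ...; bit 1 -> 1, 3, 5, ...
    return step * round((value - offset) / step) + offset

def qim_decode(received, step=2.0):
    """Minimum-distance decision: which quantizer has a level closest to received?"""
    distances = {
        bit: abs(received - qim_embed(received, bit, step)) for bit in (0, 1)
    }
    return min(distances, key=distances.get)

embedded_0 = qim_embed(4.6, 0)  # host value 4.6 carrying a 0
embedded_1 = qim_embed(4.6, 1)  # host value 4.6 carrying a 1
```

With these definitions the host value 4.6 is moved to 4.0 for a 0 and to 5.0 for a 1, and the decoder recovers the bit as long as the channel perturbs the value by less than half the distance between the two lattices (0.5 in this example). The host value itself does not enter the decoding error, which is the host-interference rejection property.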
Of course, quantizing certain signal parameters may introduce perceivable distortion to the host signal. In order to prevent this, the quantization error may be partly added back to the signal, which is referred to as distortion-compensated QIM (DC-QIM) (see also Antonius Kalker, “Quantization index modulation (QIM) digital watermarking of multimedia signals,” 2001, WO03/053064). The portion of the quantization error that is added back acts as an additional source of distortion at the receiver. Although it has been shown that DC-QIM is optimal for the AWGN (additive white Gaussian noise) channel and regular QIM is near-optimal (see also Chen and Wornell, “Quantization index modulation: A class of provably good methods for digital watermarking and information embedding,” IEEE Transactions on Information Theory, May 2001, vol. 47), the methods have certain drawbacks. They allow for high bit rates but are especially sensitive to amplitude scaling attacks (see also Fabricio Ourique; Vinicius Licks; Ramiro Jordan; Fernando Perez-Gonzalez, “Angle QIM: A novel watermark embedding scheme robust against amplitude scaling distortions,” IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), March 2005).
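A simplified sketch of distortion compensation and of the amplitude scaling problem is given below. It is an assumption-laden illustration, not the scheme of the cited references: the parameter `alpha` (the fraction of the way the value is moved toward the quantizer level) and all function names are hypothetical, and the quantizer step is kept fixed rather than being rescaled with `alpha`.

```python
def quantize(value, bit, step=2.0):
    """Nearest level of the quantizer associated with bit (0: even, 1: odd levels)."""
    offset = bit * step / 2.0
    return step * round((value - offset) / step) + offset

def dc_qim_embed(value, bit, alpha=0.7, step=2.0):
    """Move value only a fraction alpha toward the quantizer level.

    The remaining (1 - alpha) share of the quantization error stays in the
    signal, which lowers embedding distortion but is seen as noise by the decoder.
    """
    return value + alpha * (quantize(value, bit, step) - value)

def decode(received, step=2.0):
    """Minimum-distance decision between the two quantizers."""
    return min((0, 1), key=lambda bit: abs(received - quantize(received, bit, step)))

compensated = dc_qim_embed(4.6, 1)      # moved less than plain QIM would move it
scaled = 0.8 * quantize(4.6, 1)         # plain QIM value 5.0 after a gain of 0.8
```

In this example the compensated value still decodes as a 1 although it was moved by only 0.28 instead of 0.4. The scaled value, however, lands on a level of the other quantizer and decodes as a 0, which illustrates why a fixed quantizer grid is sensitive to amplitude scaling.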
Another method, derived from QIM, is Angle QIM (AQIM), which was proposed in the article of Fabricio Ourique; Vinicius Licks; Ramiro Jordan; Fernando Perez-Gonzalez, “Angle QIM: A novel watermark embedding scheme robust against amplitude scaling distortions,” IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), March 2005. There, the information is embedded by quantizing angular coordinates. By doing so, robustness against amplitude scaling can be achieved. However, this method does not provide differential modulation and is therefore not robust against phase drift.
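The idea of quantizing an angle instead of an amplitude can be sketched for the simplest case of a pair of samples interpreted in polar coordinates. This two-dimensional reduction, the function names, and the angular step of pi/8 are assumptions made for illustration and are not taken from the cited article.

```python
import math

def quantize_angle(theta, bit, step=math.pi / 8):
    """Nearest angle on the lattice associated with bit (lattices offset by step / 2)."""
    offset = bit * step / 2.0
    return step * round((theta - offset) / step) + offset

def aqim_embed(x1, x2, bit, step=math.pi / 8):
    """Keep the radius of (x1, x2) and quantize only its angle."""
    radius = math.hypot(x1, x2)
    theta = quantize_angle(math.atan2(x2, x1), bit, step)
    return radius * math.cos(theta), radius * math.sin(theta)

def aqim_decode(y1, y2, step=math.pi / 8):
    """Minimum-distance decision on the received angle; the radius is ignored."""
    theta = math.atan2(y2, y1)
    return min((0, 1), key=lambda bit: abs(theta - quantize_angle(theta, bit, step)))

y1, y2 = aqim_embed(3.0, 4.0, 1)
decoded_after_gain = aqim_decode(0.5 * y1, 0.5 * y2)  # amplitude scaled by 0.5
```

Since a positive gain changes only the radius, the decoded bit is unaffected by amplitude scaling. A rotation of the pair by half an angular step, on the other hand, maps one lattice onto the other and flips the decision, which corresponds to the sensitivity to phase drift noted above.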
Other watermarking systems exist in which the information is embedded into the phase of the audio signal. The methods presented in the articles of W. Bender, D. Gruhl, N. Morimoto, and Aiguo Lu, “Techniques for data hiding,” IBM Syst. J., vol. 35, no. 3-4, pp. 313-336, 1996 and S. Kuo, J. D. Johnston, W. Turin, and S. R. Quackenbush, “Covert audio watermarking using perceptually tuned signal independent multiband phase modulation,” IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2002, vol. 2, pp. 1753-1756 are non-blind methods and are therefore limited to only a small number of applications. In the article of Michael Arnold, Peter G. Baum, and Walter Voesing, “A phase modulation audio watermarking technique,” pp. 102-116, 2009, a blind phase modulation audio watermarking technique is proposed which is called Adaptive Spread Phase Modulation (ASPM). Additionally, these phase modulation methods do not have the host-interference rejection property and do not take differential coding into account.
Many more watermarking methods exist, including spread spectrum and echo-hiding methods. However, as already stated in EP 10154964.0-1224, these methods may not be applicable to certain tasks of interest, e.g., transmitting a watermark over an acoustic path in a reverberant environment.