Watermarking is a technique for embedding a cryptographic signature into digital content in order to detect copying or alteration of the content. This is accomplished using coding techniques that hide data within the image or audio content in a manner that is not normally perceptible. Embedding an imperceptible, cryptographically secure signal, or watermark, is thus seen as a mechanism for proving ownership or detecting tampering.
Embedding a digital signal into an audio recording or image in a way that renders the signal imperceptible has received significant attention. For example, with respect to audio watermarking, U.S. Pat. No. 5,319,735 to Preuss et al. entitled “Embedding Signaling,” the disclosure of which is incorporated by reference herein, discloses a digital information hiding technique for audio that uses spread spectrum modulation. Further, L. Boney et al., “Digital watermarks for audio signals,” Proc. of Multimedia 1996, Hiroshima, 1996, the disclosure of which is incorporated by reference herein, discloses making explicit use of the MPEG-1 Psychoacoustic Model to obtain frequency masking values that achieve good imperceptibility. More recently, R. J. Ruiz et al., “Digital watermarking of speech signals for the National Gallery of the Spoken Word,” ICASSP, Turkey, 2000, the disclosure of which is incorporated by reference herein, proposed a speech watermarking method for application to digital speech libraries. These methods have been applied extensively to music, but they embed information across the very wide audio band spanned by human hearing. As a result, a potential attacker need only low-pass filter the resulting signal to remove most of the watermarking information.
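The vulnerability of a wide-band watermark to low-pass filtering can be illustrated with a minimal sketch. It is not the method of any of the cited references; it assumes a generic direct-sequence spread-spectrum scheme, a single sinusoid as a stand-in host signal, and a moving average as a crude low-pass filter. All function names and parameter values (make_pn, make_host, embed, correlate, moving_average, alpha, width) are hypothetical and chosen only for illustration.

```python
import math
import random


def make_pn(n, seed=1):
    """Pseudo-random +/-1 chip sequence (assumed to be the shared key)."""
    rng = random.Random(seed)
    return [rng.choice((-1.0, 1.0)) for _ in range(n)]


def make_host(n, amplitude=0.5, cycles=40):
    """Toy narrow-band host signal: one low-frequency sinusoid."""
    return [amplitude * math.sin(2 * math.pi * cycles * i / n) for i in range(n)]


def embed(host, pn, bit, alpha=0.1):
    """Add the chip sequence, scaled by alpha and signed by the message bit."""
    sign = 1.0 if bit else -1.0
    return [x + alpha * sign * c for x, c in zip(host, pn)]


def correlate(signal, pn):
    """Normalized correlation with the chip sequence; its sign decodes the bit."""
    return sum(s * c for s, c in zip(signal, pn)) / len(pn)


def moving_average(signal, width=8):
    """Crude low-pass filter: causal running mean over `width` samples."""
    out = []
    acc = 0.0
    for i, s in enumerate(signal):
        acc += s
        if i >= width:
            acc -= signal[i - width]  # drop the sample leaving the window
        out.append(acc / width)
    return out


if __name__ == "__main__":
    n = 16384
    pn = make_pn(n)
    marked = embed(make_host(n), pn, bit=1)
    print(f"correlation before filtering: {correlate(marked, pn):.4f}")
    print(f"correlation after filtering:  {correlate(moving_average(marked), pn):.4f}")
```

Under these assumptions, the correlation before filtering should sit near alpha, while after filtering it should fall to roughly alpha divided by width, because the chip sequence is spread evenly across the band and the filter discards most of that band. The low-frequency host passes through the filter almost unchanged, which is why the attack costs the attacker little in audio quality.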
While considerable attention has been devoted to spread-spectrum signaling for image and audio watermarking applications, there has been only limited study of embedding data signals in speech, e.g., the above-mentioned R. J. Ruiz et al. reference. Speech differs from music in both its acoustic characteristics and its watermarking requirements. Although acoustically rich, speech is an unusually narrow-band signal that uses only a small portion of the human perceptual range. In addition, typical speech reproduction hardware, although often the same as that used with music, includes much lower bit rate channels such as telephone or compressed-voice “vocoder” channels.
Therefore, it would be highly advantageous to provide watermarking techniques for encoding a digital message into a speech signal such that the resulting watermarked signal is robust to speech channels.