Data embedding is a form of steganography concerned with inserting a secret message or data into an innocuous cover message, such as an image, video, audio, or computer code. Digital data embedding in audio signals has many applications. These include covert communication by securely hiding encoded or encrypted information in audio signals, copyright protection of transmitted audio signals, and embedding of information for describing, modifying, and tracking audio signals. By providing different access levels to the embedded data, the quality of the audio signal and the ability to hear the hidden message can be controlled. Transmission of battlefield information via an auxiliary or cover audio signal could play an essential role in the security and safety of personnel and resources.
Most work in data hiding has concentrated on hiding a small amount of information, such as copyright data or a watermark, in images and video segments. However, the general requirements, challenges, and principles of hiding data in audio are the same as those for embedding information in video. Robustness of the hidden data, for example, is a key requirement for successful embedding and retrieval of the data. In other words, standard signal processing operations, such as noise removal and signal enhancement, must not result in loss or degradation of the embedded information. Additionally, for covert communication, the embedded information must withstand channel noise and intentional attacks on, or jamming of, the signal. Also important in covert communication is the ability of the hidden information to remain hidden from pirates during their intentional or unintentional attempts at detection. One measure of the effectiveness of data embedding is the probability of detection of the hidden data. Clearly, the more robust the host medium (image, video, or audio) is to attacks and common operations, the more effective the embedding.
Additional requirements specific to embedding data in audio signals vary with the application. In general, the embedded data must be perceptually undetectable, that is, inaudible. While this may not be strictly required or even needed for watermarking of audio for browsers on the Internet, covert communication calls for the hidden message to be truly imperceptible. Tamper resistance of the hidden message, on the other hand, is more crucial in battlefield covert communication than in protecting ownership of the cover audio. Additionally, extraction of the hidden message must not require access to the host (cover) audio. Clearly, the lack of the original host signal that was used to embed the message makes it difficult to extract the hidden data and to judge its quality and quantity. For covert communication, however, this challenge must be met even at the cost of degraded quality of the message-embedded audio. Other requirements, such as robustness to transmission channel noise and to linear and nonlinear filtering, are also important in hiding data in audio. Security requirements in covert communication dictate that an unauthorized user must not be able to detect the presence of hidden data without the key used to insert the data. This may require encryption of the data prior to its insertion in the host audio.
Some of the most common techniques for hiding data in images exploit the properties of the human visual system. The least significant bits of an image may be altered in accordance with the data to be embedded, for example. The technique in this case relies on the low sensitivity of the human visual system to contrast. Variations of this technique for watermarking include embedding a pseudorandom noise sequence that appears as quantization noise, and modifying the Discrete Cosine Transform (DCT) or wavelet transform coefficients. Other methods exploit imperceptible changes in brightness levels to add tags, identification strings, and similar marks. More recently, spread spectrum techniques, in which the watermark to be embedded in an image is spread throughout the spectrum of the image, have been widely considered. For video, the blue color channel has been used to embed a watermark, because the human visual system is least sensitive to modifications in the blue band.
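As an illustration of the least-significant-bit approach described above, the following is a minimal sketch, not a reconstruction of any particular published scheme. It assumes the cover image is available as a flat list of 8-bit grayscale values and that the message is a list of bits; the function names `embed_lsb` and `extract_lsb` and the sample values are hypothetical.

```python
def embed_lsb(pixels, bits):
    """Return a copy of `pixels` whose first len(bits) values carry `bits` in their LSB."""
    if len(bits) > len(pixels):
        raise ValueError("message is longer than the cover")
    stego = list(pixels)
    for i, bit in enumerate(bits):
        # Clear the least significant bit, then set it to the message bit.
        stego[i] = (stego[i] & ~1) | bit
    return stego


def extract_lsb(stego, n_bits):
    """Read back the LSB of the first n_bits values."""
    return [value & 1 for value in stego[:n_bits]]


# Hypothetical 8-bit grayscale values and message bits, for illustration only.
cover = [52, 55, 61, 66, 70, 61, 64, 73]
message = [1, 0, 1, 1, 0, 0, 1, 0]

stego = embed_lsb(cover, message)
recovered = extract_lsb(stego, len(message))
print(stego)      # [53, 54, 61, 67, 70, 60, 65, 72]
print(recovered)  # [1, 0, 1, 1, 0, 0, 1, 0]
```

Each value changes by at most one gray level, which is why the alteration is hard to see. The same property makes the scheme fragile: requantization or mild filtering is enough to destroy the low-order bits, which is one reason robustness is treated as a separate requirement.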
The notion of creating an imperceptible data-embedded image based on the threshold of the human visual system has been extended by several researchers to embedding data in host audio. In general, the procedure exploits the frequency and temporal masking properties of the human auditory system (HAS) to modify the cover audio in such a way that changes due to the embedded data are inaudible. Other methods of watermarking host audio replace spectral components in the high, middle, or other pre-selected frequency bands in accordance with the sequence to be embedded. In addition, several techniques involving the use of spread spectrum noise sequences have been reported. Of these, the methods employing the psychoacoustical masking properties of the HAS in some form appear to better meet the challenges and requirements of audio data embedding.
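The spread spectrum idea mentioned above can be sketched as follows. This is a simplified illustration under stated assumptions, not the method of any cited work. Each message bit is spread over many samples by a key-dependent pseudorandom sequence of +1/-1 values, added to the host at low amplitude, and recovered by correlation without access to the original host. The function names, the sine-wave host, and the parameters `chips_per_bit` and `alpha` are all hypothetical, and no psychoacoustic shaping of the embedded signal is attempted.

```python
import math
import random


def chips(key, n):
    """Pseudorandom sequence of +1.0/-1.0 values derived from a shared key."""
    rng = random.Random(key)
    return [1.0 if rng.random() < 0.5 else -1.0 for _ in range(n)]


def embed(host, bits, key, chips_per_bit, alpha):
    """Add each bit, spread over chips_per_bit samples, to the host at amplitude alpha."""
    if len(host) < chips_per_bit * len(bits):
        raise ValueError("host is too short for the message")
    pn = chips(key, chips_per_bit * len(bits))
    marked = list(host)
    for k, bit in enumerate(bits):
        sign = 1.0 if bit else -1.0
        for i in range(k * chips_per_bit, (k + 1) * chips_per_bit):
            marked[i] += alpha * sign * pn[i]
    return marked


def extract(marked, n_bits, key, chips_per_bit):
    """Recover bits by correlating with the key's sequence; the host is not needed."""
    pn = chips(key, chips_per_bit * n_bits)
    bits = []
    for k in range(n_bits):
        start = k * chips_per_bit
        corr = sum(marked[i] * pn[i] for i in range(start, start + chips_per_bit))
        bits.append(1 if corr > 0 else 0)
    return bits


# Hypothetical host: a 440 Hz tone sampled at 44.1 kHz with unit amplitude.
chips_per_bit = 16384
message = [1, 0, 1, 1]
host = [math.sin(2 * math.pi * 440 * n / 44100)
        for n in range(chips_per_bit * len(message))]

marked = embed(host, message, key=12345, chips_per_bit=chips_per_bit, alpha=0.05)
recovered = extract(marked, len(message), key=12345, chips_per_bit=chips_per_bit)
print(recovered == message)
```

The correlation works because the host is statistically independent of the key's sequence, so its contribution grows only with the square root of the number of chips while the embedded term grows linearly. A longer spreading sequence therefore permits a smaller `alpha`, at the cost of a lower data rate.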