Covert speech communication is concerned with transmitting vital audio information via an innocuous cover audio signal in a secure and robust manner. It is an application of the art and science of steganography, or data embedding, which has been gaining importance in the field of information technology. While cryptography conceals the contents of the information being transmitted, steganography conceals the very existence of covert information in the cover medium, be it audio, image, or video. In encryption, the message itself, an audio signal for instance, is altered so that the resulting data are unintelligible. Although persons without the encryption key cannot decipher the signal, transmitting encrypted information generally arouses suspicion that hidden information is present. For battlefield communication in particular, hiding the existence of information is therefore crucial. Steganography uses a host medium as a wrapper, or carrier, and keeps the covert information intact, whereas cryptography modifies it.
Steganography, in general, relies on the imperfection of the human auditory and visual systems. Image and video steganography exploit the low visual sensitivity to changes in luminance: for example, only changes greater than one part in 30 in random patterns, or one part in 240 in uniform levels of gray, are perceived [1]. Audio steganography takes advantage of the psychoacoustical masking phenomenon of the human auditory system (hereinafter, HAS). Psychoacoustical, or auditory, masking is a perceptual property of the HAS in which the presence of a strong tone renders a weaker tone in its temporal or spectral neighborhood imperceptible [2]. This property arises from the low differential range of the HAS, even though its dynamic range extends 80 dB below ambient level [2]. In temporal masking, a faint tone goes undetected when it appears immediately before or after a strong tone. Frequency masking occurs when the human ear cannot perceive frequency components at a lower power level because they lie in the vicinity of tone-like or noise-like components at a higher level. Additionally, a weak pure tone is masked by wide-band noise if the tone occurs within a critical band. Note that the masked sound is merely inaudible in the presence of the louder sound; faint as it may be, it is still present. This inaudibility of weaker sounds is used in different ways for embedding information. In the case of embedding in phase or amplitude, for example, the phase or amplitude of a frequency-masked sample in the spectral domain is altered in accordance with the information bit to be embedded [3-5]. Instead of modifying the host sample, the present work inserts tones at low power to conceal information.
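As an illustration of the tone-insertion idea, the following minimal Python sketch hides one bit in a host frame by adding one of two low-power tones and recovers the bit by comparing the power found at the two candidate frequencies. It is not the method of the present work: the two frequencies (700 Hz and 2300 Hz), the fixed -30 dB power ratio, the frame parameters, and all function names are assumptions made for this example. A practical scheme would presumably set the tone power from the masking threshold of each host frame rather than from a fixed ratio.

```python
import math


def make_tone(n, sample_rate, freq, amp):
    """Return n samples of a sine tone (used here as a stand-in host frame)."""
    return [amp * math.sin(2 * math.pi * freq * t / sample_rate) for t in range(n)]


def frame_power(frame):
    """Mean-square power of a frame."""
    return sum(x * x for x in frame) / len(frame)


def tone_power(frame, sample_rate, freq):
    """Estimate the power of one frequency component by correlating
    the frame with a cosine and a sine at that frequency."""
    n = len(frame)
    w = 2 * math.pi * freq / sample_rate
    re = sum(x * math.cos(w * t) for t, x in enumerate(frame))
    im = sum(x * math.sin(w * t) for t, x in enumerate(frame))
    amp = 2 * math.hypot(re, im) / n   # estimated sinusoid amplitude
    return amp * amp / 2               # power of a sinusoid with that amplitude


def embed_bit(frame, sample_rate, bit, f0=700.0, f1=2300.0, rel_db=-30.0):
    """Add a tone at f1 (bit 1) or f0 (bit 0), rel_db below the frame power.
    f0, f1 and rel_db are hypothetical values chosen for illustration."""
    freq = f1 if bit else f0
    target_power = frame_power(frame) * 10 ** (rel_db / 10)
    amp = math.sqrt(2 * target_power)  # sinusoid amplitude giving target_power
    w = 2 * math.pi * freq / sample_rate
    return [x + amp * math.sin(w * t) for t, x in enumerate(frame)]


def detect_bit(frame, sample_rate, f0=700.0, f1=2300.0):
    """Decide the bit from whichever candidate frequency carries more power."""
    p0 = tone_power(frame, sample_rate, f0)
    p1 = tone_power(frame, sample_rate, f1)
    return 1 if p1 > p0 else 0
```

With a 20 ms frame at 16 kHz (320 samples) and a 1000 Hz host tone, all three frequencies complete a whole number of cycles in the frame, so the host contributes essentially nothing at the two candidate frequencies and the decision is clean. With real speech as the host, the host's own energy at those frequencies would have to be accounted for before the comparison becomes reliable.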