It is often desirable to transmit low to medium speed data signals over audio channels, such as telephone, radio and television channels, carrying analog voice and/or music signals. Such data signals may be used to convey, for example, a serial number, the name of a song being played, copyright information, royalty billing codes and virtual reality cues. Such data signals also may be used to identify particular programs and/or program sources. Programs may include television programs, radio programs, laser video disks, tapes, interactive programs and/or games, and/or the like; program sources may include program originators, networks, local stations, syndicators, cable companies, and/or the like; and the broadcast of such programs may include the transmission of programs over the air, over a cable, via a satellite, within a household, within a VCR, a disc player, a computer and/or the like.
Such data signals are referred to herein as ancillary codes. When ancillary codes are used to identify programs and/or program sources, these ancillary codes may be detected by program monitoring systems in order to verify the broadcasts of selected programs, by audience metering systems in order to meter the viewing habits of an audience and/or by like systems.
In a program monitoring system that responds to ancillary codes in the program, the ancillary codes, which are inserted into the program signals, are in the form of identification codes that identify the corresponding broadcast programs. When monitoring the broadcast of programs, therefore, the program monitoring system senses the identification codes in order to verify that the encoded programs are broadcast. The program monitoring system also usually determines the geographical regions in which these programs are broadcast, the times at which these programs are broadcast, and the stations, cables and channels over which these programs are broadcast.
In an audience metering system that employs ancillary codes, an ancillary code is typically added to the possible channels to which a receiver may be tuned. When the ancillary code appears at the output of the receiver, the channel tuned by the receiver, as well as program identification codes, if any, are identified. It should be apparent that a unique ancillary code may be added by a program source to some or all of the programs broadcast to households.
When an ancillary code is added to a program signal, it must be done in such a way that the ancillary code is imperceptible to the audience of the program. A variety of techniques have been employed in attempts to attain this imperceptibility.
One popular technique for adding data to an audio channel involves the transmission of data in the under-utilized portions of the frequency spectrum below and/or above the voice band available on a telephone line, such that the data is imperceptible to listeners. Spread spectrum whitening techniques are applied to the data to maintain interference at a low level.
An example of a technique that places the information in the lower frequency region of the voice band is disclosed in U.S. Pat. No. 4,425,661 to Moses et al. Another technique, described in U.S. Pat. No. 4,672,605 to Hustig et al., involves the use of a spread spectrum signal having most of its energy in the higher audio frequency region and above the voice band. Yet another technique, described in U.S. Pat. No. 4,425,642 to Moses et al., involves spread spectrum processing of the data throughout the channel spectrum, such that the spectral energy of the data possesses a pseudo random noise characteristic which, when added to the voice channel, causes only an imperceptible increase in white noise.
Although systems such as those described above are typically sufficient for the particular purposes for which they were designed, they suffer certain deficiencies inherent to the use of spread spectrum processing. Specifically, the use of spread spectrum whitening techniques alone results in extremely low data throughput rates on an audio channel, due to the large spreading gain that must be achieved. In addition, although such techniques make limited use of certain "masking" characteristics of the audio signal with which the data is to be transmitted, they do not make full use of such characteristics, as further described below, thereby limiting the processing gain which might otherwise be achieved.
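The throughput cost of spreading described above can be illustrated with a minimal direct-sequence sketch. It is an illustration only and is not the method of any of the patents cited above: the chip count, the embedding amplitude, the seeds, the Gaussian stand-in for the host audio, and the function names `pn_sequence`, `spread`, `embed` and `despread` are all assumptions chosen for the example. Each data bit is multiplied by a long pseudo-random chip sequence, added to the host signal at a low level, and recovered by correlating against the same chip sequence.

```python
import random

def pn_sequence(length, seed=1):
    """Pseudo-random +/-1 chip sequence shared by encoder and decoder (hypothetical helper)."""
    rng = random.Random(seed)
    return [rng.choice((-1, 1)) for _ in range(length)]

def spread(bits, chips):
    """Replace every data bit with one full chip sequence, inverted for a zero bit."""
    out = []
    for bit in bits:
        symbol = 1 if bit else -1
        out.extend(symbol * c for c in chips)
    return out

def embed(host, spread_data, amplitude):
    """Add the spread data to the host signal at a low level."""
    return [h + amplitude * d for h, d in zip(host, spread_data)]

def despread(signal, chips):
    """Recover one bit per chip-sequence length from the sign of the correlation."""
    n = len(chips)
    bits = []
    for start in range(0, len(signal) - n + 1, n):
        corr = sum(s * c for s, c in zip(signal[start:start + n], chips))
        bits.append(1 if corr > 0 else 0)
    return bits

# Illustrative parameters (assumptions, not values from the cited patents).
CHIPS_PER_BIT = 2048
AMPLITUDE = 0.25  # data is embedded well below the unit-variance host

message = [1, 0, 1, 1, 0, 0, 1, 0]
chips = pn_sequence(CHIPS_PER_BIT)

# Gaussian noise stands in for the host audio in this sketch.
host_rng = random.Random(7)
host = [host_rng.gauss(0.0, 1.0) for _ in range(len(message) * CHIPS_PER_BIT)]

received = embed(host, spread(message, chips), AMPLITUDE)
recovered = despread(received, chips)
```

In this sketch every bit occupies 2048 samples, so, assuming one chip per sample, a channel sampled at 44.1 kHz would carry only about 21 bits per second. Shortening the chip sequence raises the rate but reduces the correlation margin that lets the low-level data be recovered from beneath the host signal.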
Other techniques for enabling the simultaneous transmission of audio and data in a single channel include (i) using a start pulse created by taking a subband to zero energy level, and then using the following short period of digitized audio as the serial number, and (ii) using subbands to carry a digital message by forcing the subband energy to zero or leaving it at the actual level in order to create "marks" and "spaces" (i.e., "ones" and "zeros"). The primary deficiencies of the former technique include poor noise immunity and the fact that it is not practical in situations in which many bytes of data must be stored and processed. The primary deficiencies of the latter technique also include poor noise immunity, as well as an extremely slow data throughput rate.
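The second of these techniques can be sketched as follows. This is a simplified illustration under stated assumptions, not an implementation of any particular prior system: the frame length, the subband bins, the detection threshold, the convention that untouched energy is a "mark" and zeroed energy is a "space", and the names `dft`, `idft`, `encode`, `decode` and `tone` are all hypothetical. A naive discrete Fourier transform is used so that the example needs only the standard library.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform (adequate for short frames)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Naive inverse transform, returning the real part of each sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]

def encode(host, bits, frame, band):
    """One bit per frame: leave the subband alone for 1, force it to zero for 0."""
    out = []
    for i, bit in enumerate(bits):
        segment = host[i * frame:(i + 1) * frame]
        if not bit:
            spectrum = dft(segment)
            for k in band:
                spectrum[k] = 0          # positive-frequency bin
                spectrum[frame - k] = 0  # mirrored bin, keeps the output real
            segment = idft(spectrum)
        out.extend(segment)
    return out

def decode(signal, nbits, frame, band, threshold):
    """Read a 1 when the subband energy of a frame exceeds the threshold."""
    bits = []
    for i in range(nbits):
        spectrum = dft(signal[i * frame:(i + 1) * frame])
        energy = sum(abs(spectrum[k]) ** 2 for k in band)
        bits.append(1 if energy > threshold else 0)
    return bits

# Illustrative parameters (assumptions for this sketch only).
FRAME = 64
BAND = range(10, 15)
THRESHOLD = 1.0

def tone(bin_index, amplitude, length):
    """Sinusoid that completes bin_index cycles per frame."""
    return [amplitude * math.sin(2 * math.pi * bin_index * t / FRAME)
            for t in range(length)]

message = [1, 0, 1, 1, 0, 0, 1, 0]
n = len(message) * FRAME
host = [a + b for a, b in zip(tone(5, 1.0, n), tone(12, 0.5, n))]

encoded = encode(host, message, FRAME, BAND)
clean = decode(encoded, len(message), FRAME, BAND, THRESHOLD)

# A weak in-band interferer refills the zeroed subband.
interference = tone(12, 0.1, n)
noisy = decode([s + i for s, i in zip(encoded, interference)],
               len(message), FRAME, BAND, THRESHOLD)
```

On the clean channel the message is recovered, but with the threshold used here a weak interfering tone inside the subband is enough for every "space" to be read as a "mark", which illustrates the poor noise immunity noted above. The scheme also carries only one bit per frame, consistent with the slow throughput.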
Thomas et al., in U.S. Pat. No. 5,425,100, disclose a multi-level encoding system including a plurality of encoders, each associated with a different level in a multi-level broadcast signal distribution system. The disclosure of Thomas et al., in U.S. Pat. No. 5,425,100, is herein incorporated by reference in its entirety.
The commonly used "AMOL" system taught by Haselwood et al., in U.S. Pat. No. 4,025,851, hereby incorporated by reference in its entirety, adds an ancillary data signal, in the form of a source identification code, to selected horizontal lines in the vertical blanking interval of a broadcast television signal. Monitoring equipment, which is located in selected regions throughout the United States, verifies that the programs are broadcast by detecting the source identification codes. The monitoring equipment stores, for later retrieval, these detected source identification codes together with the times at which they were detected and the channels on which they were detected.
U.S. Pat. No. 5,243,423, to DeJean et al., hereby incorporated by reference in its entirety, teaches an audience measurement and program monitoring system in which an ancillary signal is transmitted over video lines of the raster of a broadcast television signal. In order to reduce the perceptibility of the ancillary signal, the video lines over which the ancillary signal is transmitted are varied in a pseudo-random sequence. Alternatively, the ancillary signal may be modulated at relatively low modulation levels by converting the ancillary signal to a spread spectrum ancillary signal. The encoded broadcast program is then identified by decoding the ancillary code near a monitored receiver.
The application of digital data compression methodologies to signals has a substantial impact on the usefulness of the encoding methods discussed above. For example, some video compression schemes delete the vertical blanking interval. Accordingly, any ancillary codes injected into the vertical blanking interval may be removed by such compression of the video signals. Digitization may also act to remove spread spectrum ancillary codes and other signals relying on low signal amplitudes for their concealment. Additionally, ancillary codes transmitted in a high frequency portion of a video signal band may be deleted by compression algorithms that "clip" the upper frequencies.
Although adding an ancillary code to the normally visible portion of the active video signal permits the ancillary code to avoid removal by compression schemes in most cases, and although adding the ancillary code at a frequency in the low energy density portion of the video signal increases the likelihood that the ancillary code will be imperceptible even though the ancillary code is added to the active video, under certain conditions the ancillary code may still be perceptible. For example, if the intensity of the luminance that is modulated onto the video (i.e., luminance) carrier, or the intensity of the color that is modulated onto the chrominance subcarrier, is smaller than the ancillary code at the time the ancillary code is modulated onto a frequency between the video carrier and the chrominance subcarrier, the ancillary code will not be masked by the video carrier or the chrominance subcarrier of the video signal. Thus, the ancillary code may have sufficient relative amplitude to be perceived as noise by the audience of the program.
It is known in the art that every audio signal generates a perceptual concealment function which masks audio distortions existing simultaneously with the signal. Accordingly, any distortion, or noise, introduced into the transmission channel will, if properly distributed or shaped, be masked by the audio signal itself. Such masking may be partial or complete, leading either to increased quality compared to a system without noise shaping, or to near-perfect signal quality that is equivalent to a signal without noise. In either case, such "masking" occurs as a result of the inability of the human perceptual mechanism to distinguish between two signal components, one belonging to the audio signal and the other belonging to the noise, in the same spectral, temporal or spatial locality. An important effect of this limitation is that the perceptibility of the noise by a listener can be zero, even if the signal-to-noise ratio is at a measurable level. Ideally, the noise level at all points in the audio signal space is exactly at the level of just-noticeable distortion, which limit is typically referred to as the "perceptual entropy envelope."
Hence, the main goal of noise shaping is to minimize the perceptibility of distortions by advantageously shaping them in time or frequency so that as many of their components as possible are masked by the audio signal itself. See Nikil Jayant et al., Signal Compression Based on Models of Human Perception, 81 Proc. of the IEEE 1385 (1993). A schematic representation of time-frequency domain masking is shown in FIGS. 1A-1C, in which a short sinusoidal tone 10 produces a masking threshold 12. See John G. Beerends and Jan A. Stemerdink, A Perceptual Audio Quality Measure Based on a Psychoacoustic Sound Representation, 40 J. Audio Engineering Soc'y 963, 966 (1992).
"Perceptual coding" techniques employing the above-discussed principles are presently used in signal compression and are based on three types of masking: frequency domain, time domain and noise level. The basic principle of frequency domain masking is that when certain strong signals are present in the audio band, other lower level signals, close in frequency to the stronger signals, are masked and not perceived by a listener. Time domain masking is based on the fact that certain types of noise and tones are not perceptible immediately before and after a larger signal transient. Noise masking takes advantage of the fact that a relatively high broadband noise level is not perceptible if it occurs simultaneously with various types of stronger signals.
Perceptual coding forms the basis for precision audio sub-band coding (PASC), as well as other coding techniques used in compressing audio signals for mini-disc (MD) and digital compact cassette (DCC) formats. Specifically, such compression algorithms take advantage of the fact that certain signals in an audio channel will be masked by other stronger signals to remove those masked signals in order to be able to compress the remaining signal into a lower bit-rate channel.
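The frequency domain masking principle described above can be made concrete with a toy threshold model. The sketch below is an assumption-laden illustration and not a psychoacoustic model taken from the cited references: it places frequencies on a critical-band (Bark) scale and lets the threshold fall off linearly in decibels on either side of the masker. The offset and slope values, and the names `masking_threshold_db` and `is_masked`, are placeholders chosen for the example.

```python
def masking_threshold_db(masker_level_db, masker_bark, probe_bark,
                         offset_db=10.0, lower_slope=25.0, upper_slope=10.0):
    """Level below which a probe at probe_bark is hidden by a single masker.

    The offset and the slopes (dB per Bark) are illustrative placeholders.
    """
    distance = probe_bark - masker_bark
    if distance < 0:
        drop = lower_slope * -distance  # steep fall-off below the masker
    else:
        drop = upper_slope * distance   # shallower fall-off above the masker
    return masker_level_db - offset_db - drop

def is_masked(probe_level_db, probe_bark, maskers, **model):
    """True when the probe lies under the highest threshold of any masker.

    maskers is a list of (level_db, bark) pairs.
    """
    threshold = max(masking_threshold_db(level, bark, probe_bark, **model)
                    for level, bark in maskers)
    return probe_level_db < threshold

# One strong 70 dB component at 10 Bark (hypothetical values).
maskers = [(70.0, 10.0)]

near = is_masked(50.0, 10.5, maskers)   # threshold 55 dB, so hidden
far = is_masked(50.0, 14.0, maskers)    # threshold 20 dB, so audible
below = is_masked(50.0, 9.0, maskers)   # threshold 35 dB, so audible
```

Under these assumed values a 50 dB component close in frequency to the 70 dB masker falls below the threshold, while the same component farther away in frequency does not. A compression algorithm of the kind described above would discard the former and retain the latter.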
Another deficiency of the prior art techniques for simultaneously transmitting data with audio signals is that if the signals are transmitted through a channel which implements a lossy compression algorithm, such as the MPEG compression algorithm, the data, or at least portions thereof, will likely be removed, as most such compression algorithms divide the audio channel into a plurality of subbands and then encode and transmit only the strongest signal within each subband. Regardless of which of the previously-described techniques is used, it is highly unlikely that the data will ever be the strongest signal in a subband; therefore, it is unlikely that any portion of the data will be transmitted. Moreover, with respect to the spread spectrum techniques, even assuming the data happens to be the strongest signal in one or two subbands, because the information is spread throughout the signal spectrum, the information contained in such subbands will comprise only a small portion of the total information carried by the data and therefore is likely to be useless.
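The effect described above can be caricatured in a few lines. The sketch below is not an MPEG implementation; it merely models the behavior as characterized in the preceding paragraph, keeping only the strongest component in each subband. The component list, the subband width, and the name `strongest_per_subband` are hypothetical.

```python
def strongest_per_subband(components, subband_width):
    """Keep only the largest-amplitude component in each subband.

    components is a list of (frequency_bin, amplitude, label) tuples.
    """
    kept = {}
    for comp in components:
        band = comp[0] // subband_width
        if band not in kept or comp[1] > kept[band][1]:
            kept[band] = comp
    return sorted(kept.values())

# Hypothetical spectrum: strong audio components plus low-level data
# components spread across every subband.
components = [
    (3, 1.00, "audio"), (5, 0.05, "data"),
    (9, 0.80, "audio"), (14, 0.05, "data"),
    (18, 0.02, "audio"), (21, 0.05, "data"),
    (27, 0.60, "audio"), (30, 0.05, "data"),
]

survivors = strongest_per_subband(components, subband_width=8)
data_survivors = [c for c in survivors if c[2] == "data"]
```

In this example the data component survives only in the one subband where the audio happens to be very quiet. Because the data was spread across all four subbands, the single surviving fragment carries only a quarter of it.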
Accordingly, what is needed is a system for simultaneously transmitting ancillary codes and audio signals that utilizes the advantages of perceptual coding techniques and which is capable of transmitting ancillary codes through a lossy compressed channel.