Analog signals and digital bit stream signals that carry content such as voice, picture, and facsimile patterns may use electric currents, electromagnetic radiation (radio and light waves), sound waves, and other transmission and storage means as carriers for the content. A telephone system, for example, may use numerous carriers in a single connection as a sender's voice signal travels through telephone lines, fiber optic cables, cell phone transmission antennae, and sound speakers. Regardless of the carrier, certain intervals of the signal may represent content, while other intervals or characteristics of the signal may represent nothing more than the presence of the carrier with no content included or superimposed. At times it is beneficial to separate the parts of a signal containing content from the parts of a signal lacking content.
Voice activity detection (VAD) and data compression are examples of techniques that depend upon separating the content part(s) of a signal from the non-content parts. Speakerphone and cell phone systems use VAD to switch signal transmission on and off depending on the presence of voice activity or the direction of speech flow. VAD may also be used in microphones and digital recorders for dictation and transcription, in noise suppression systems, as well as in speech synthesizers, speech-enabled applications, and speech recognition products. VAD may be used to save data storage space and transmission bandwidth by preventing the recording and transmission of undesirable signals or digital bit streams that do not contain voice activity.
VAD usually relies on measurements of one or more attributes of a signal to estimate when voice activity is present in an interval of the signal. For example, the energy level is an attribute of a signal that may be measured using the root mean square voltage levels of the signal to estimate which intervals of the signal contain voice activity. The same energy level measurements may be used in different ways to estimate the presence of voice activity. U.S. Pat. No. 6,249,757 to Cason, for example, is directed to a VAD system that uses two signal filters to provide the difference between a noise floor and the total energy in a communications signal. The signal is partitioned into frames for spectral analysis. Voice activity is detected if the difference between the noise floor and the total energy exceeds a threshold. U.S. Pat. No. 6,023,674 to Mekuria is directed to a periodicity detector that extracts pitch frequencies from a signal and determines speech pitch tracks using a non-linear signal processing block.
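The energy-based estimate described above can be illustrated with a minimal sketch. It is not the two-filter design of the Cason patent: it assumes a fixed, known noise floor and a fixed threshold, and the names `frame_rms`, `detect_voice`, `frame_len`, `noise_floor`, and `threshold` are hypothetical, chosen only for illustration.

```python
import math


def frame_rms(samples, frame_len):
    """Return the root mean square level of each full, non-overlapping frame."""
    levels = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        levels.append(math.sqrt(sum(x * x for x in frame) / frame_len))
    return levels


def detect_voice(samples, frame_len=160, noise_floor=0.01, threshold=0.05):
    """Flag a frame as voice when its RMS level exceeds the noise floor
    by more than the threshold. Both values are assumed constants here;
    a practical system would estimate the noise floor from the signal."""
    return [level - noise_floor > threshold
            for level in frame_rms(samples, frame_len)]
```

For a signal made of one silent frame followed by one frame alternating between 0.5 and -0.5, `detect_voice` would flag only the second frame. Trailing samples that do not fill a frame are ignored in this sketch.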
There are numerous ways to estimate the presence of voice activity in a signal using measurements of the energy and/or other attributes of the signal. Energy level estimation, zero-crossing estimation, and echo canceling are known methods to estimate or to assist in estimating the presence of voice activity in a signal. Tone analysis by a dual-tone multi-frequency (DTMF) tone detection mechanism may be used to assist in estimating the presence of voice activity by ruling out DTMF tones that would otherwise cause false voice activity detections. Signal slope analysis, signal mean variance analysis, correlation coefficient analysis, pure spectral analysis, and other methods may also be used to estimate voice activity. Each VAD method has disadvantages for detecting voice activity, depending on the application in which it is implemented and the signal being processed.
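As one illustration of the methods listed above, zero-crossing estimation can be sketched as the fraction of adjacent sample pairs whose signs differ. The function name `zero_crossing_rate` is hypothetical, and treating a sample of exactly zero as non-negative is an assumption of this sketch rather than a requirement of the method.

```python
def zero_crossing_rate(frame):
    """Return the fraction of adjacent sample pairs whose signs differ.

    A sample equal to zero is treated as non-negative (an assumption
    made for this sketch).
    """
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)
```

A detector built on this measure would still need a decision rule, for example comparing the rate in each frame against a threshold, often in combination with an energy measurement.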
Data compression is another technique that relies upon detection of signal content. Data compression is increasingly used to minimize the number of bits needed to store or transmit digital data. For example, JPEG and MPEG standards for the digital representation of images and movies allow a wide variety of data compression schemes to represent empty or repetitive parts of a picture with a compact marker. This typically saves a large percentage of the storage space or transmission bandwidth that an uncompressed image would have required.
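The idea of replacing a repetitive part of the data with a compact marker can be illustrated with simple run-length encoding. This is a generic sketch, not the scheme specified by the JPEG or MPEG standards, and the function names `run_length_encode` and `run_length_decode` are hypothetical.

```python
def run_length_encode(values):
    """Collapse each run of equal values into a (value, count) pair."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return [(v, n) for v, n in runs]


def run_length_decode(pairs):
    """Expand (value, count) pairs back into the original sequence."""
    out = []
    for v, n in pairs:
        out.extend([v] * n)
    return out
```

A long run of identical values, such as an empty region of a picture, shrinks to a single pair, while data with few repeats may grow rather than shrink under this scheme.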
Detecting intervals of voice activity in a carrier signal using VAD and detecting compressible parts of a signal for data compression, such as Silence Compressed Record, are two examples of applications that use signal content detection. There are many other applications in which the present invention could be used, for example distinguishing communication patterns in random radio waves, searching for patterns in random data, and synchronizing communication between computing devices.