1. Technical Field
This invention relates generally to digital coding systems. More particularly, this invention relates to classification systems for speech coding.
2. Related Art
Telecommunication systems include both landline and wireless radio systems. Wireless telecommunication systems use radio frequency (RF) communication. Currently, the frequencies available for wireless systems are centered in frequency ranges around 900 MHz and 1900 MHz. The expanding popularity of wireless communication devices, such as cellular telephones, is increasing the RF traffic in these frequency ranges. Reduced bandwidth communication would permit more data and voice transmissions in these frequency ranges, enabling the wireless system to allocate resources to a larger number of users.
Wireless systems may transmit digital or analog data. Digital transmission, however, has greater noise immunity and reliability than analog transmission. Digital transmission also provides more compact equipment and the ability to implement sophisticated signal processing functions. In the digital transmission of speech signals, an analog-to-digital converter samples an analog speech waveform. The digitally converted waveform is compressed (encoded) for transmission. The encoded signal is received and decompressed (decoded). After digital-to-analog conversion, the reconstructed speech is played in an earpiece, loudspeaker, or the like.
The analog-to-digital converter uses a large number of bits to represent the analog speech waveform. This large number of bits requires a relatively large bandwidth for transmission. Speech compression reduces the number of bits that represent the speech signal, thus reducing the bandwidth needed for transmission. However, speech compression may degrade the quality of the decompressed speech. In general, a higher bit rate results in higher quality, while a lower bit rate results in lower quality.
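The relationship between sample size, bit rate, and compression can be illustrated with a short calculation. The figures below (8 kHz sampling, 16-bit samples, an 8 kbit/s coded stream) are assumptions chosen for illustration and are not taken from this description; the function names are likewise hypothetical.

```python
def pcm_bit_rate(sample_rate_hz: int, bits_per_sample: int) -> int:
    """Bit rate of an uncompressed digitized speech signal, in bits per second."""
    return sample_rate_hz * bits_per_sample


def compression_ratio(raw_bps: int, coded_bps: int) -> float:
    """How many times smaller the coded stream is than the raw stream."""
    return raw_bps / coded_bps


# Assumed figures, not from the description:
# 8 kHz sampling, 16 bits per sample, 8 kbit/s coded bit rate.
raw = pcm_bit_rate(8000, 16)
ratio = compression_ratio(raw, 8000)
print(raw, ratio)  # 128000 16.0
```

Under these assumed figures, the coded signal needs one sixteenth of the bits of the uncompressed signal.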
Modern speech compression techniques (coding techniques) produce decompressed speech of relatively high quality at relatively low bit rates. One coding technique attempts to represent the perceptually important features of the speech signal without preserving the actual speech waveform. Another coding technique, a variable-bit rate encoder, varies the degree of speech compression depending on the part of the speech signal being compressed. Typically, perceptually important parts of speech (e.g., voiced speech, plosives, or voiced onsets) are coded with a higher number of bits. Less important parts of speech (e.g., unvoiced parts or silence between words) are coded with a lower number of bits. The resulting average of the varying bit rates can be relatively lower than a fixed bit rate providing decompressed speech of similar quality. These speech compression techniques lower the amount of bandwidth required to digitally transmit a speech signal.
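As a rough sketch of why a variable-bit rate encoder can average below a fixed rate, the following computes the mean rate over a run of frames. The per-class rates and the mix of frames are assumed values for illustration only; the description does not specify them.

```python
# Hypothetical per-frame bit rates in bits per second (assumed, not from the description).
RATE_BPS = {"voiced": 8000, "unvoiced": 4000, "silence": 1000}


def average_bit_rate(frame_classes):
    """Mean bit rate over a sequence of equally long frames."""
    rates = [RATE_BPS[c] for c in frame_classes]
    return sum(rates) / len(rates)


# Assumed mix: 4 voiced frames, 2 unvoiced frames, 4 silence frames.
frames = ["voiced"] * 4 + ["unvoiced"] * 2 + ["silence"] * 4
print(average_bit_rate(frames))  # 4400.0
```

With this assumed mix, the average is 4400 bits per second, compared with 8000 bits per second if every frame were coded at the highest rate.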
These low bit rate speech coding systems may provide suitable speech quality. However, the coded signal quality typically is unacceptable for music because of the low bit rate that speech codecs use for this type of signal. Music may be provided by a service or similar feature that plays music while a party is waiting. A radio, stereo, other electronic equipment, a live performance, and the like also may provide music when close enough to be picked up and transmitted by a communication system.
If a music signal is to be transmitted, the speech coding system should switch to higher bit rates to accommodate the music signal. However, current speech coding systems do not effectively classify when a music signal is present. Typically, a voice activity detector (VAD) is used to differentiate speech and music from noise. However, a VAD does not effectively differentiate between speech and music. As a result, most music signals are transmitted at lower bit rates or a combination of lower and higher bit rates.
The invention provides a speech coding system with a music classifier that provides a classification of an input or speech signal. The classification may indicate that the input signal is noise, speech, or music. The music classifier analyzes or determines signal properties of the input signal. The music classifier compares the signal properties to thresholds to determine the classification of the input signal.
In one aspect, the speech coding system with a music classifier comprises an encoder disposed to receive an input signal. The encoder provides a bitstream based upon a speech coding of a portion of the input signal. The speech coding has a bit rate. The encoder provides a classification of the input signal. The classification comprises at least music. The encoder adjusts the bit rate in response to the classification of the input signal.
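One way to picture the bit rate adjustment in this aspect is a lookup from classification to coding rate. This is a minimal sketch, not the encoder itself: the `Classification` enum, the `adjust_bit_rate` function, and the rate values are all hypothetical. The only property taken from the description is that music is coded at a higher rate than speech; giving noise the lowest rate is an added assumption.

```python
from enum import Enum


class Classification(Enum):
    NOISE = "noise"
    SPEECH = "speech"
    MUSIC = "music"


# Hypothetical rates in bits per second. The description gives no numbers;
# it only indicates that music should be coded at higher bit rates.
BIT_RATE_BPS = {
    Classification.NOISE: 1000,
    Classification.SPEECH: 4000,
    Classification.MUSIC: 8000,
}


def adjust_bit_rate(classification: Classification) -> int:
    """Return the coding bit rate selected for a classified input signal."""
    return BIT_RATE_BPS[classification]


for label in Classification:
    print(label.value, adjust_bit_rate(label))
```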
In a method of classifying music in a speech coding system, one or more first signal parameters are determined in response to an input signal. The first signal parameters are compared to at least one noise threshold. When the first signal parameters are not beyond the noise threshold, the input signal is classified as noise. When the first signal parameters are beyond the noise threshold, one or more second signal parameters are determined in response to the input signal. The second signal parameters are compared to at least one music threshold. When the second signal parameters are beyond the music threshold, the input signal is classified as speech. When the second signal parameters are not beyond the music threshold, the input signal is classified as music.
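The two-stage decision in this method can be sketched as follows. The sketch makes several assumptions that the description does not state: each stage uses a single scalar parameter, "beyond" is read as "greater than", and the second parameter is supplied as a callable so that it is computed only when the noise test is passed. The name `classify` and its arguments are hypothetical.

```python
from typing import Callable


def classify(
    first_param: float,
    noise_threshold: float,
    second_param_fn: Callable[[], float],
    music_threshold: float,
) -> str:
    """Classify an input signal as 'noise', 'speech', or 'music'.

    Assumption: "beyond" a threshold is interpreted as strictly greater than it.
    """
    # Stage 1: a first parameter that is not beyond the noise threshold means noise.
    if not first_param > noise_threshold:
        return "noise"
    # Stage 2: the second parameter is determined only after the noise test is passed.
    second_param = second_param_fn()
    if second_param > music_threshold:
        return "speech"
    return "music"


print(classify(0.1, 0.5, lambda: 0.8, 0.3))  # noise
print(classify(0.9, 0.5, lambda: 0.8, 0.3))  # speech
print(classify(0.9, 0.5, lambda: 0.1, 0.3))  # music
```

Passing the second parameter as a callable mirrors the order of the method, where the second signal parameters are determined only for input that is not classified as noise.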
Other systems, methods, features and advantages of the invention will be or will become apparent to one with skill in the art upon examination of the following figures and detailed description. It is intended that all such additional systems, methods, features and advantages be included within this description, be within the scope of the invention, and be protected by the accompanying claims.