In many applications there is a strong interest in identifying and classifying audio signals. One such classification is automatically identifying whether an audio signal is a speech audio signal, a music audio signal, or silence. Whereas a human listener can easily discriminate between speech and music audio signals, for example by listening to a segment of only a few seconds, automatic identification or discrimination has proved to be a technically difficult problem.
Such identification of whether an audio signal is music or speech is particularly beneficial in wireless communication system apparatus. Audio signal processing within such apparatus can apply different encoding and decoding algorithms to the signal depending on whether the signal is speech, music or silence. The chosen algorithm can then better address the characteristics of the audio signal in question and thus process the signal so as not to lose intelligibility in a speech audio signal, not to significantly degrade the fidelity of a music audio signal, and not to consume significant network resources in communicating silence.
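The per-class codec selection described above can be sketched as a simple dispatch step. This is a minimal illustrative sketch only: the class labels, codec functions and payload shapes below are hypothetical placeholders, not part of any specific wireless standard or codec.

```python
# Hypothetical sketch: route each classified audio frame to a codec suited
# to its class. All codec functions here are illustrative stand-ins.

def encode_speech(frame):
    # Placeholder for a speech-optimised codec (e.g. a CELP-style encoder).
    return b"S" + bytes(len(frame) // 4)

def encode_music(frame):
    # Placeholder for a music-optimised codec (e.g. a transform encoder).
    return b"M" + bytes(len(frame) // 2)

def encode_silence(frame):
    # Silence needs almost no payload: send only a minimal descriptor.
    return b"Z"

CODECS = {"speech": encode_speech, "music": encode_music, "silence": encode_silence}

def encode_frame(frame, label):
    """Dispatch a frame to the codec matching its classification label."""
    return CODECS[label](frame)
```

The point of the dispatch is that silence frames produce an almost empty payload, while speech and music frames go to encoders tuned for their respective signal characteristics.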
Automated classification of speech and music audio signals has been attempted many times previously. These approaches often require complex analysis using pattern recognition apparatus, such as neural networks, to attempt to classify whether the signal is speech or music. However, such processing-heavy approaches are not suitable for communications equipment, and particularly for portable devices, where processing capacity carries power consumption and cost penalties.
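By contrast with the processing-heavy approaches above, a low-complexity discriminator can be built from cheap per-frame features such as short-time energy and zero-crossing rate (ZCR). The sketch below is illustrative only: the thresholds are assumed values, not tuned parameters from any particular system.

```python
# Hypothetical low-complexity discriminator sketch using two cheap frame
# features: short-time energy and zero-crossing rate (ZCR). Unvoiced speech
# tends to show bursts of very high ZCR; music typically varies less.

def frame_features(frame):
    """Compute mean energy and zero-crossing rate of one audio frame."""
    energy = sum(x * x for x in frame) / len(frame)
    zcr = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    ) / (len(frame) - 1)
    return energy, zcr

def classify_frame(frame, silence_threshold=1e-4, zcr_threshold=0.3):
    """Classify a frame as 'silence', 'speech' or 'music'.

    The thresholds are illustrative assumptions; a practical system would
    tune them and smooth decisions over many frames.
    """
    energy, zcr = frame_features(frame)
    if energy < silence_threshold:
        return "silence"
    # Illustrative rule: very high ZCR suggests unvoiced speech fricatives.
    return "speech" if zcr > zcr_threshold else "music"
```

A per-frame rule like this is noisy on its own; real systems accumulate such features over a window of frames before committing to a classification, which keeps the computation far cheaper than a neural-network classifier.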