In most audio classification schemes, music genres are classified in two steps. First, features are extracted from an audio signal. Second, a generic classification method is formed based on the features extracted from several different sets of audio clips, each set associated with a respective genre. The classification method thereby learns feature combinations that correspond to a respective genre. Different methods can use different selections of features, with varying results.
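The two-step scheme described above can be illustrated with a minimal sketch. Everything in it is an assumption chosen for brevity rather than a method named in this document: the two features (zero-crossing rate and RMS energy), the nearest-centroid classifier, the helper names, and the synthetic sine-wave "clips" with placeholder genre labels.

```python
import math


def extract_features(samples):
    """Step 1: map a clip to a small feature vector.

    The two features (zero-crossing rate, RMS energy) are illustrative
    assumptions, not features prescribed by the text.
    """
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0)
    )
    zcr = crossings / (len(samples) - 1)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return (zcr, rms)


def train_centroids(labeled_clips):
    """Step 2: learn one mean feature vector per genre label.

    labeled_clips is an iterable of (label, samples) pairs.
    """
    sums, counts = {}, {}
    for label, samples in labeled_clips:
        feats = extract_features(samples)
        acc = sums.setdefault(label, [0.0] * len(feats))
        for i, value in enumerate(feats):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {
        label: tuple(v / counts[label] for v in acc)
        for label, acc in sums.items()
    }


def classify(samples, centroids):
    """Assign the genre whose centroid is closest in feature space."""
    feats = extract_features(samples)
    return min(centroids, key=lambda label: math.dist(feats, centroids[label]))


def sine(cycles, n=400):
    """Hypothetical stand-in for an audio clip: a sine with `cycles` periods."""
    return [math.sin(2 * math.pi * cycles * i / n) for i in range(n)]


# Placeholder labels; real training data would be clips of known genre.
training = [
    ("genre_a", sine(4)), ("genre_a", sine(6)),
    ("genre_b", sine(40)), ("genre_b", sine(60)),
]
centroids = train_centroids(training)
print(classify(sine(5), centroids), classify(sine(50), centroids))
```

Because the classifier only sees the feature vectors, its behavior depends entirely on which features were extracted and which clips were used to form the centroids.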
Using a general classification method can have disadvantages. It can be hard to determine from the classification which individual feature or set of features of the audio signals actually corresponds to a genre. In addition, general classification methods can depend on their training data, and it can be difficult to predict how the classifier will perform using new training data. Another potential disadvantage is that a general classification method can fail to use certain features that would prove more successful in a more specialized solution directed to a single genre.
One area where audio classification can be applied is within an audio matching system. Audio matching identifies a recorded audio sample by comparing it to a set of reference samples. To make the comparison, the audio sample can be transformed to a time-frequency representation (e.g., by using a short-time Fourier transform (STFT)). Using the time-frequency representation, interest points that characterize the time and frequency locations of peaks or other distinct patterns of a spectrogram can be extracted from the audio sample. Descriptors can be computed as functions of sets of interest points. Descriptors of the audio sample can then be compared to descriptors of reference samples to determine the identity of the audio sample.
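The matching pipeline described above (time-frequency transform, interest points, descriptors, comparison) can be sketched as follows. This is a simplified illustration under stated assumptions, not the matching system itself: the frame size, hop, Hann window, per-frame peak picking, pairwise (frequency, frequency, time-offset) descriptors, and overlap score are all choices made for the example, and every function name is hypothetical.

```python
import cmath
import math


def stft_magnitudes(samples, frame_size=64, hop=32):
    """Magnitude spectrogram via a Hann-windowed STFT (naive DFT per frame)."""
    window = [
        0.5 - 0.5 * math.cos(2 * math.pi * n / frame_size)
        for n in range(frame_size)
    ]
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        chunk = [samples[start + n] * window[n] for n in range(frame_size)]
        frames.append([
            abs(sum(
                chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n in range(frame_size)
            ))
            for k in range(frame_size // 2 + 1)
        ])
    return frames


def interest_points(spectrogram, threshold):
    """(frame, bin) locations of spectral peaks above an assumed threshold."""
    points = []
    for t, frame in enumerate(spectrogram):
        for k in range(1, len(frame) - 1):
            if frame[k] > threshold and frame[k - 1] < frame[k] > frame[k + 1]:
                points.append((t, k))
    return points


def descriptors(points, max_dt=3):
    """Descriptors as functions of point pairs: (bin1, bin2, frame offset)."""
    out = set()
    for i, (t1, f1) in enumerate(points):
        for t2, f2 in points[i + 1:]:
            if 0 < t2 - t1 <= max_dt:
                out.add((f1, f2, t2 - t1))
    return out


def match_score(query, reference):
    """Fraction of query descriptors also present in the reference."""
    return len(query & reference) / len(query) if query else 0.0


def make_tone(bin_index, length=256, frame_size=64):
    """Hypothetical test signal: a sine centered on one STFT bin."""
    return [
        math.sin(2 * math.pi * bin_index * n / frame_size)
        for n in range(length)
    ]


def fingerprint(samples):
    return descriptors(interest_points(stft_magnitudes(samples), threshold=1.0))


query = fingerprint(make_tone(8))
print(match_score(query, fingerprint(make_tone(8))))   # same tone
print(match_score(query, fingerprint(make_tone(20))))  # different tone
```

Using offsets between pairs of peaks, rather than absolute frame positions, keeps the descriptors unchanged when the sample starts at a different point in the recording than the reference does.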
Certain genres of music can create distinct problems within an audio matching environment. Techno music, or electronically generated music that consists mostly of beat, can sometimes make audio matching difficult. Techno songs are typically not melodic and may match other non-melodic audio signals if both the techno song and the reference sample happen to match on the features used for melody detection.