1. Field of Art
The present invention generally relates to the field of digital audio, and more specifically, to ways of accurately extracting discrete notes from a continuous signal.
2. Description of the Related Art
A prerequisite for audio analysis is the conversion of portions of an audio signal (e.g., a song) into representations of their notes or “chromae,” i.e., a set of frequencies of interest, along with magnitudes quantifying the relative strengths of the frequencies. For example, a portion of an audio signal could be converted into a representation of the 12 semitones in an octave. The conversion of an audio signal portion into its chromae enables more meaningful analysis of the audio signal than would be possible using the signal data alone.
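By way of illustration only, the following minimal sketch shows one possible form of such a representation: a 12-element vector with one magnitude per semitone of the octave. The names `pitch_class` and `chroma_vector`, the A4 = 440 Hz equal-tempered tuning, and the example frequency/magnitude pairs are assumptions introduced for this example and are not taken from the description above.

```python
import math

# Semitone names for the 12 pitch classes of an octave (index 0 = C).
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def pitch_class(freq_hz, a4_hz=440.0):
    """Map a frequency to the index (0-11) of its nearest semitone.

    Assumes equal temperament with A4 = a4_hz (MIDI note 69).
    """
    midi_note = round(69 + 12 * math.log2(freq_hz / a4_hz))
    return midi_note % 12


def chroma_vector(components):
    """Fold (frequency, magnitude) pairs into a 12-element chroma vector."""
    chroma = [0.0] * 12
    for freq_hz, magnitude in components:
        chroma[pitch_class(freq_hz)] += magnitude
    return chroma


# Hypothetical frequency/magnitude pairs approximating an A major triad.
components = [(220.0, 1.0), (277.18, 0.6), (329.63, 0.8), (440.0, 0.5)]
chroma = chroma_vector(components)

for name, magnitude in zip(PITCH_CLASSES, chroma):
    print(f"{name:2s} {magnitude:.2f}")
```

In this sketch the two A components (220 Hz and 440 Hz) fall into the same semitone entry, so the vector records the relative strength of each semitone independent of octave.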
Conventional techniques for extracting the chromae from an audio signal typically use a Discrete Fourier Transform (DFT) of the audio signal to produce a set of frequencies whose periods are integer fractions of the signal length, and then map the frequencies of the DFT to the frequencies of the chromae of interest. Such a technique suffers from several shortcomings. First, the frequencies used in the DFT typically do not match the frequencies of the desired chromae, which leads to a “smearing” of the extracted chromae when they are mapped from the frequencies used by the DFT to the frequencies of the chromae. The smearing is especially pronounced for lower-frequency sounds, where adjacent chromae may be spaced more closely than adjacent DFT frequencies. Second, computing the DFT for short portions of the audio signal requires attenuating the signal at the beginning and end of the audio sample, a process called “windowing,” to avoid artifacts caused by the non-periodicity of the audio sample. The windowing process further reduces the quality of the extracted chromae. As a result of the smearing and windowing associated with the DFT, the values in the chromae lose accuracy, and analyses that use the chromae suffer from correspondingly diminished accuracy.
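The frequency mismatch described above can be made concrete with a short numerical sketch. The sample rate (44,100 Hz), window size (4,096 samples), A4 = 440 Hz equal-tempered tuning, and the helper names below are assumptions chosen for illustration; they are not parameters stated in this description.

```python
SAMPLE_RATE = 44100   # Hz (assumed for illustration)
WINDOW_SIZE = 4096    # samples per DFT window (assumed for illustration)
A4_HZ = 440.0         # tuning reference (assumed for illustration)


def semitone_frequency(midi_note):
    """Equal-tempered frequency of a MIDI note number (A4 = note 69)."""
    return A4_HZ * 2.0 ** ((midi_note - 69) / 12.0)


def bin_frequency(k, sample_rate=SAMPLE_RATE, n=WINDOW_SIZE):
    """Center frequency of DFT bin k for an n-sample window."""
    return k * sample_rate / n


def nearest_dft_bin(freq_hz, sample_rate=SAMPLE_RATE, n=WINDOW_SIZE):
    """Index of the DFT bin whose center frequency is closest to freq_hz."""
    return round(freq_hz * n / sample_rate)


# DFT bins are evenly spaced in frequency; semitones are spaced geometrically.
bin_spacing = SAMPLE_RATE / WINDOW_SIZE

print(f"DFT bin spacing: {bin_spacing:.2f} Hz")
for midi_note in range(33, 45):  # the twelve semitones from A1 through G#2
    f = semitone_frequency(midi_note)
    k = nearest_dft_bin(f)
    print(f"note {midi_note}: {f:7.2f} Hz -> bin {k:2d} ({bin_frequency(k):7.2f} Hz)")
```

Under these assumed parameters the bin spacing is roughly 10.8 Hz, while A1 (55 Hz) and the next semitone are only about 3.3 Hz apart, so both land in the same DFT bin. A longer window narrows the bins, but at the cost of time resolution.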