1. Field of the Invention
The present invention relates to a signal processing apparatus, a signal processing method, and a program therefor. More particularly, the present invention relates to a signal processing apparatus, signal processing method, and program suitable for use in the decomposition of an audio signal into its respective pitch components.
2. Description of the Related Art
In the past, a number of musical pitch analysis techniques have been proposed for use in automatic notation, wherein a musical score is automatically generated according to an input audio signal, or for use in the detection of the musical characteristics of an input audio signal.
Musical pitch analysis is a type of processing whereby a digital audio (i.e., musical) signal sampled at a given sampling frequency is analyzed by decomposition into information about each musical pitch C, C#, D, D#, E, F, F#, G, G#, A, A#, and B corresponding to the solfège syllables (do re mi, etc.).
The twelve musical pitches C, C#, D, D#, E, F, F#, G, G#, A, A#, and B constitute a single octave. Hereinafter, octaves are designated octave O1, O2, O3, etc., in order from low (i.e., low-frequency) octaves to high (i.e., high-frequency) octaves. In addition, the pitch C of octave O1 is designated C1, while the pitch A# of octave O2 is designated A#2, for example.
The pitches of a given octave are related to the pitches of lower octaves in that the frequency of each pitch in the given octave is a power-of-two multiple of the frequency of the corresponding pitch in each lower octave. In other words, pitches are distributed logarithmically with respect to frequency (equivalently, frequency increases exponentially with pitch). For example, if the pitch A3 (being the pitch A of the octave O3) is taken to have a frequency (i.e., a center frequency) of 440 Hz, then the pitch A4 (being the pitch A of the octave O4) has a frequency double that, namely 880 Hz. Furthermore, the difference in frequency (i.e., center frequency) between adjacent pitches such as C and C# increases with higher octaves. For example, in the low octave O2 (127.1 Hz to 254.2 Hz), the difference between C2 and C#2 is approximately 8 Hz, while in the high octave O6, the difference between C6 and C#6 is approximately 123 Hz.
Moreover, the frequency band (i.e., the bandwidth) of each pitch in a given octave is twice as wide as that of the corresponding pitch in the next lower octave.
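The octave relationships described above can be illustrated with a short numerical sketch. It assumes twelve-tone equal temperament with A3 taken as 440 Hz, as in the example above, and further assumes that each pitch band extends half a semitone to either side of its center frequency. That band definition, and the names center_frequency, band_edges, and bandwidth, are illustrative assumptions rather than part of the described techniques.

```python
# Illustrative sketch only: equal temperament and half-semitone band edges
# are assumptions made for this example.
A3_HZ = 440.0  # reference used in the text: pitch A of octave O3
PITCH_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def center_frequency(name, octave):
    """Center frequency in Hz of pitch `name` in octave `octave`."""
    semitones_from_a3 = (
        (octave - 3) * 12 + PITCH_NAMES.index(name) - PITCH_NAMES.index("A")
    )
    return A3_HZ * 2.0 ** (semitones_from_a3 / 12.0)


def band_edges(name, octave):
    """Assumed lower and upper band edges: half a semitone around the center."""
    center = center_frequency(name, octave)
    half_semitone = 2.0 ** (1.0 / 24.0)
    return center / half_semitone, center * half_semitone


def bandwidth(name, octave):
    """Width in Hz of the assumed band for one pitch."""
    low, high = band_edges(name, octave)
    return high - low


if __name__ == "__main__":
    for octave in (2, 6):
        c = center_frequency("C", octave)
        c_sharp = center_frequency("C#", octave)
        print(f"O{octave}: C = {c:.1f} Hz, C# - C = {c_sharp - c:.1f} Hz, "
              f"bandwidth of C = {bandwidth('C', octave):.1f} Hz")
```

Under these assumptions, octave O2 spans the band from the lower edge of C2 to the upper edge of B2, and both the spacing between adjacent pitches and the bandwidth of each pitch double from one octave to the next.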
Established techniques for musical pitch analysis of audio signals include techniques using the short-time Fourier transform (hereinafter referred to as STFT techniques) as well as techniques using wavelet transforms (hereinafter referred to as wavelet transform techniques). In addition, there also exist techniques like that proposed in the present application, which use octave division and band pass filtering (hereinafter referred to as octave division techniques). (See JP-A-2005-275068, for example.)
STFT techniques analyze the frequency components of an audio signal using equally-spaced frequency bands. Because pitches are distributed logarithmically with respect to frequency, as described above, the analysis tends to be less precise at low frequencies, where adjacent pitches are closely spaced.
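The low-frequency limitation of equally-spaced bands can be made concrete by comparing the STFT bin spacing with the spacing between adjacent pitches. The following is a minimal sketch, assuming equal temperament with A3 at 440 Hz; the sampling frequency of 44.1 kHz and the transform length of 4096 samples are hypothetical values chosen only for illustration.

```python
# Illustrative sketch only: the sampling frequency and transform length
# below are hypothetical and are not taken from the text.
SAMPLE_RATE_HZ = 44100.0
FFT_SIZE = 4096
A3_HZ = 440.0  # reference used in the text: pitch A of octave O3


def stft_bin_spacing(sample_rate_hz, fft_size):
    """Spacing in Hz between adjacent, equally spaced STFT bins."""
    return sample_rate_hz / fft_size


def semitone_gap(center_hz):
    """Distance in Hz from a pitch to the next higher pitch (equal temperament)."""
    return center_hz * (2.0 ** (1.0 / 12.0) - 1.0)


def bins_per_semitone(center_hz, sample_rate_hz, fft_size):
    """How many STFT bins fit between a pitch and the next higher pitch."""
    return semitone_gap(center_hz) / stft_bin_spacing(sample_rate_hz, fft_size)


# C2 lies 21 semitones below A3, and C6 lies 27 semitones above A3.
C2_HZ = A3_HZ * 2.0 ** (-21 / 12.0)
C6_HZ = A3_HZ * 2.0 ** (27 / 12.0)

if __name__ == "__main__":
    for label, center in (("C2", C2_HZ), ("C6", C6_HZ)):
        bins = bins_per_semitone(center, SAMPLE_RATE_HZ, FFT_SIZE)
        print(f"{label}: {bins:.2f} STFT bins per semitone")
```

With these assumed parameters the bin spacing is about 10.8 Hz, so C2 and C#2 are separated by less than one bin, while C6 and C#6 are separated by more than ten bins.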
With wavelet transform techniques, it is possible to estimate pitch with an ideal time resolution and frequency resolution by using a basis function able to extract one-twelfth of an octave (i.e., a single musical pitch). However, wavelet transform techniques involve a very large amount of computation.
In contrast, octave division techniques make it possible to conduct musical pitch analysis without loss of precision at low frequencies, and with lower computational complexity than wavelet transform techniques.