When making acoustic recordings, multiple sound sources are often present simultaneously. These can be different speech signals, noise (e.g., from fans), or similar signals. For further analysis of the signals it is useful to separate these interfering signals. Separation of signals can be used, for example, for speech recognition or acoustic scene analysis. Harmonic signals can be separated in the human auditory system based on their fundamental frequency. See A. Bregman. Auditory Scene Analysis. MIT Press, 1990, which is incorporated by reference herein in its entirety. Note that speech in general contains many voiced and hence harmonic segments.
In conventional approaches the input signal is split into different frequency bands via band-pass filters. In a later stage, for each band at each instant in time, an evidence value is calculated that indicates how likely it is that the band originates from a given fundamental frequency. A simple yes/no decision can be interpreted as using binary evidence values. By doing so a three-dimensional description of the signal is obtained with the following axes: fundamental frequency, frequency band, and time. A similar kind of representation is also found in the human auditory system. See G. Langner, H. Schulze, M. Sams, and P. Heil, The topographic representation of periodicity pitch in the auditory cortex, Proc. of the NATO Adv. Study Inst. on Comp. Hearing, pages 91-97, 1998, which is incorporated by reference herein in its entirety.
Based on these previously calculated evidence values, groups of bands with a common fundamental frequency can be formed. Each group therefore contains the harmonics that emanate from one fundamental frequency and hence belong to one sound source. In this way the separation of the sound sources can be accomplished.
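The grouping step described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the evidence values for one time frame are held in a dictionary keyed by candidate fundamental frequency, and that each band is assigned to the candidate with the highest evidence if that evidence reaches a threshold. The function name group_bands_by_f0, the parameter min_evidence, and the example numbers are hypothetical and are not taken from the cited approaches.

```python
def group_bands_by_f0(evidence, min_evidence=0.5):
    """Group frequency bands by the fundamental frequency they most likely stem from.

    evidence: dict mapping a candidate fundamental frequency (Hz) to a list of
              evidence values, one per frequency band, for a single time frame.
    min_evidence: hypothetical threshold below which a band stays unassigned.
    Returns a dict mapping fundamental frequency -> list of band indices.
    """
    candidates = sorted(evidence)
    num_bands = len(evidence[candidates[0]])
    groups = {}
    for band in range(num_bands):
        # Pick the candidate fundamental with the strongest evidence for this band.
        best_f0 = max(candidates, key=lambda f0: evidence[f0][band])
        if evidence[best_f0][band] >= min_evidence:
            groups.setdefault(best_f0, []).append(band)
    return groups


# Illustrative (made-up) evidence values for four bands and two candidates.
example_evidence = {
    100.0: [0.9, 0.1, 0.8, 0.2],
    150.0: [0.2, 0.7, 0.1, 0.3],
}
example_groups = group_bands_by_f0(example_evidence)
```

In this sketch each resulting group stands for one sound source in the given time frame; a band whose best evidence stays below the threshold is left out of every group.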
One problem with conventional approaches is that calculating an evidence value that a harmonic originates from a given fundamental is especially difficult if the frequency of the harmonic under investigation is high compared to the sampling frequency. If the bandwidth of the band-pass filters used to analyze a signal is chosen such that, for high frequencies, two or more harmonics fall into one band, this filter band shows an amplitude modulation at the fundamental frequency underlying the harmonics. This effect is also known as unresolved harmonics. See H. Helmholtz, Die Lehre von den Tonempfindungen, Vieweg, Braunschweig, 1863, which is incorporated by reference herein in its entirety.
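The modulation of a band containing unresolved harmonics can be checked numerically with a small sketch. It assumes an idealized band that contains exactly two adjacent harmonics of equal amplitude, for which cos(2πn·f0·t) + cos(2π(n+1)·f0·t) = 2·cos(π·f0·t)·cos(2π(n+½)·f0·t). The slow cosine factor oscillates at f0/2, while its magnitude, which is the envelope of the band signal, repeats once per fundamental period 1/f0. All names and numeric values below are illustrative assumptions.

```python
import math


def two_harmonic_band(f0, n, fs, num_samples):
    """Samples of an idealized band holding harmonics n and n+1 of f0 (equal amplitude)."""
    return [
        math.cos(2 * math.pi * n * f0 * k / fs)
        + math.cos(2 * math.pi * (n + 1) * f0 * k / fs)
        for k in range(num_samples)
    ]


def beat_envelope(f0, fs, num_samples):
    """Magnitude of the slow factor 2*cos(pi*f0*t), i.e. the envelope of the band signal."""
    return [abs(2 * math.cos(math.pi * f0 * k / fs)) for k in range(num_samples)]


# Illustrative values: 100 Hz fundamental, harmonics 20 and 21 (2000 Hz and 2100 Hz).
F0 = 100.0
FS = 16000.0
N = 20
PERIOD = int(round(FS / F0))  # samples per fundamental period

band = two_harmonic_band(F0, N, FS, 3 * PERIOD)
env = beat_envelope(F0, FS, 3 * PERIOD)

# The band signal stays within the envelope, and the envelope repeats every PERIOD samples.
bounded = all(abs(b) <= e + 1e-9 for b, e in zip(band, env))
periodic = all(abs(env[k] - env[k + PERIOD]) < 1e-9 for k in range(2 * PERIOD))
```

Real filter bands contain harmonics of unequal amplitude and possibly more than two of them, so the envelope shape differs in practice, but its repetition rate remains tied to the fundamental.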
For low frequencies it is less practical to design the filters wide enough to contain at least two harmonics, because the resulting bandwidth would be large relative to the center frequency. Hence, under conventional approaches, a different procedure has to be chosen for low frequencies than for high frequencies. Therefore, one problem with conventional approaches is how to combine the results of these two procedures.
FIG. 1 shows a known approach of separating frequency bands, wherein low frequency and high frequency evidence value procedures are applied to the bands based on a threshold frequency fT. This approach chooses the results from one procedure 4 for all bands below a given frequency fT and takes those of the other procedure 5 for all remaining bands. See G. Hu and D. Wang, Monaural speech segregation based on pitch tracking and amplitude modulation, IEEE Trans. on Neural Networks, 2004, which is incorporated by reference herein in its entirety.
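The threshold-based selection of FIG. 1 can be sketched as follows. The sketch assumes that both procedures have already produced one evidence value per band for a given fundamental frequency and time frame, and that each band is characterized by its center frequency. The function name combine_by_threshold, its parameters, and the example values are hypothetical and serve only to illustrate the hard switch at fT.

```python
def combine_by_threshold(center_freqs, low_evidence, high_evidence, f_threshold):
    """Select per-band evidence from one of two procedures using a hard threshold.

    center_freqs:  center frequency of each band in Hz.
    low_evidence:  evidence values from the low-frequency procedure (procedure 4).
    high_evidence: evidence values from the high-frequency procedure (procedure 5).
    f_threshold:   threshold frequency fT in Hz.
    """
    return [
        low if fc < f_threshold else high
        for fc, low, high in zip(center_freqs, low_evidence, high_evidence)
    ]


# Illustrative (made-up) values for four bands with fT = 1000 Hz.
centers = [200.0, 800.0, 1500.0, 3000.0]
low = [0.9, 0.8, 0.4, 0.1]
high = [0.2, 0.3, 0.7, 0.6]
combined = combine_by_threshold(centers, low, high, 1000.0)
```

With such a hard switch, the result of the procedure that is not selected for a band is discarded entirely, even for bands close to fT where both procedures may carry useful information.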
What is needed is a more efficient method for separating signal sources, such as acoustic sounds, in an input signal. What is further needed is a way to apply a similar evidence value calculation procedure to both resolved and unresolved harmonics.