Well-known techniques for separating a desired sound signal (hereinafter referred to as “target signal”) from a mixed input signal containing one or more sound signals and noises include the spectral subtraction method and methods using comb filters. The former, however, can separate only steady noises from the mixed signal. The latter is applicable only to a steady-state target signal whose fundamental frequency does not change. These methods are therefore difficult to use in real applications.
Another known method for separating target signals is as follows. First, a mixed input signal is multiplied by a window function and subjected to a discrete Fourier transform to obtain a spectrum. Local peaks are then extracted from the spectrum and plotted on a frequency-time (f-t) map. On the assumption that those local peaks are candidate points that compose the frequency components of the target signal (hereinafter referred to as “frequency component candidate points”), the local peaks are connected in the time direction to regenerate the frequency spectrum of the target signal. More specifically, a local peak at a certain time is first compared with a local peak at the next time on the f-t map. The two points are then connected if continuity is observed between them in terms of frequency, power and/or sound source direction, and the target signal is regenerated from the connected points.
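The peak-connecting procedure described above can be illustrated with a minimal sketch. It is not taken from any particular prior-art implementation: the function names (`hann`, `amplitude_spectrum`, `local_peaks`, `ft_map`, `connect_peaks`), the frame length, the hop size, the amplitude floor and the `max_bin_jump` tolerance are all assumptions chosen for illustration, and continuity is judged on frequency alone, whereas the method described above may also use power and/or sound source direction.

```python
import cmath
import math


def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def amplitude_spectrum(frame):
    """DFT magnitude for bins 0 .. n/2 - 1 (direct O(n^2) sum)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(frame)))
        for k in range(n // 2)
    ]


def local_peaks(amp, floor):
    """Bins that exceed `floor` and are larger than both neighbours."""
    return [
        k for k in range(1, len(amp) - 1)
        if amp[k] > floor and amp[k] > amp[k - 1] and amp[k] > amp[k + 1]
    ]


def ft_map(signal, frame_len, hop, floor):
    """One list of local-peak bins per analysis frame (the f-t map)."""
    win = hann(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + i] * win[i] for i in range(frame_len)]
        frames.append(local_peaks(amplitude_spectrum(frame), floor))
    return frames


def connect_peaks(frames, max_bin_jump=1):
    """Link peaks of adjacent frames whose bins differ by at most max_bin_jump.

    Assumption: frequency proximity is the only continuity criterion here.
    """
    tracks, active = [], []
    for t, peaks in enumerate(frames):
        used, still_active = set(), []
        for track in active:
            last_bin = track[-1][1]
            candidates = [k for k in peaks
                          if k not in used and abs(k - last_bin) <= max_bin_jump]
            if candidates:
                k = min(candidates, key=lambda c: abs(c - last_bin))
                track.append((t, k))
                used.add(k)
                still_active.append(track)
        for k in peaks:
            if k not in used:  # an unmatched peak starts a new track
                track = [(t, k)]
                tracks.append(track)
                still_active.append(track)
        active = still_active
    return tracks


# Hypothetical test input: two steady tones at 1000 Hz and 2500 Hz.
fs, frame_len, hop = 8000, 64, 32
signal = [math.sin(2 * math.pi * 1000 * i / fs)
          + 0.5 * math.sin(2 * math.pi * 2500 * i / fs) for i in range(256)]
tracks = connect_peaks(ft_map(signal, frame_len, hop, floor=1.0))
for track in tracks:
    print([k * fs / frame_len for _, k in track])
```

With two clean, steady tones the connection step is unambiguous. Once noise peaks fall within `max_bin_jump` of a target peak, several candidates compete for each track, and the nearest-bin rule used in this sketch has no principled way to choose among them.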
With such methods, it is difficult to determine the continuity of two local peaks in the time direction. In particular, when the signal-to-noise (S/N) ratio is low, the local peaks of the target signal and those of the noise or other signals are located very close to one another. The problem then becomes worse, because there are many possible connections between the candidate local peaks under such a condition.
Furthermore, the amplitude spectrum spreads in a hill-like shape (leakage) under the influence of integration over a finite time range and of time variation in the frequency and/or amplitude. In conventional signal analysis, the frequencies and amplitudes of local peaks in the amplitude spectrum are taken as the frequencies and amplitudes of the target signal in the mixed input signal, so accurate frequencies and amplitudes cannot be obtained. Moreover, if the mixed input signal includes several signals whose center frequencies are adjacent to each other, only one local peak may appear in the amplitude spectrum, and the amplitudes and frequencies of the individual signals cannot then be estimated accurately.
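The two effects described above can be reproduced numerically. The following is a minimal sketch under stated assumptions: a 64-sample frame, a rectangular window, unit-amplitude cosines, and frequencies expressed in units of DFT bins. The helper names `amplitude_spectrum`, `tone` and `local_peaks` and the floor of 0.1 are hypothetical choices for illustration only.

```python
import cmath
import math


def amplitude_spectrum(x):
    """DFT magnitude, scaled so a bin-centred unit cosine reads 1.0."""
    n = len(x)
    return [
        2.0 * abs(sum(v * cmath.exp(-2j * math.pi * k * i / n)
                      for i, v in enumerate(x))) / n
        for k in range(n // 2)
    ]


def tone(freq_bins, n, amp=1.0):
    """Cosine whose frequency is given in DFT bins (cycles per frame)."""
    return [amp * math.cos(2 * math.pi * freq_bins * i / n) for i in range(n)]


def local_peaks(amp, floor):
    """Bins that exceed `floor` and are larger than both neighbours."""
    return [
        k for k in range(1, len(amp) - 1)
        if amp[k] > floor and amp[k] > amp[k - 1] and amp[k] > amp[k + 1]
    ]


n = 64

# Case 1: the tone lies exactly on bin 10, so there is no leakage.
on_bin = amplitude_spectrum(tone(10.0, n))

# Case 2: the tone lies between bins (10.3), so its energy leaks into
# neighbouring bins and the peak under-reads the true amplitude of 1.0.
off_bin = amplitude_spectrum(tone(10.3, n))
peak_bin = max(range(n // 2), key=lambda k: off_bin[k])

# Case 3: two unit tones at 10.2 and 10.6 bins are closer than one bin,
# so their leakage hills merge in the amplitude spectrum.
mixed = amplitude_spectrum(
    [a + b for a, b in zip(tone(10.2, n), tone(10.6, n))])
mixed_peaks = local_peaks(mixed, floor=0.1)

print("on-bin peak amplitude :", round(on_bin[10], 3))
print("off-bin peak bin/amp  :", peak_bin, round(off_bin[peak_bin], 3))
print("peaks for two tones   :", mixed_peaks,
      [round(mixed[k], 3) for k in mixed_peaks])
```

In the second case the peak is reported at bin 10 with an amplitude noticeably below 1.0, and in the third case a single local peak stands in for two tones, with a frequency and an amplitude that match neither of them.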
Signals in the real world are generally not steady, but they frequently exhibit quasi-steady periodicity, meaning that their periodic characteristics vary continuously (such a signal will be referred to as a “quasi-steady signal” hereinafter). While the Fourier transform is very useful for analyzing steady periodic signals, various problems arise when the discrete Fourier transform is applied to the analysis of such quasi-steady signals.
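One such problem can be sketched by comparing a steady tone with a tone whose frequency glides within a single analysis frame. The example below is illustrative only: the 64-sample frame, the rectangular window, the linear sweep from bin 10 to bin 14 and the helper name `amplitude_spectrum` are assumptions, and a linear sweep is merely one simple instance of a quasi-steady signal.

```python
import cmath
import math


def amplitude_spectrum(x):
    """DFT magnitude, scaled so a bin-centred unit cosine reads 1.0."""
    n = len(x)
    return [
        2.0 * abs(sum(v * cmath.exp(-2j * math.pi * k * i / n)
                      for i, v in enumerate(x))) / n
        for k in range(n // 2)
    ]


n = 64

# Steady signal: unit cosine fixed at bin 12 for the whole frame.
steady = [math.cos(2 * math.pi * 12 * i / n) for i in range(n)]

# Quasi-steady signal (assumed linear sweep): the instantaneous frequency
# rises from bin 10 at the start of the frame to bin 14 at the end.
f0, f1 = 10.0, 14.0
sweep = [
    math.cos(2 * math.pi * (f0 * (i / n) + 0.5 * (f1 - f0) * (i / n) ** 2))
    for i in range(n)
]

steady_spec = amplitude_spectrum(steady)
sweep_spec = amplitude_spectrum(sweep)

# Number of bins carrying a substantial share of the amplitude.
steady_wide = sum(1 for a in steady_spec if a > 0.3)
sweep_wide = sum(1 for a in sweep_spec if a > 0.3)

print("steady: max amplitude", round(max(steady_spec), 3),
      "bins above 0.3:", steady_wide)
print("sweep : max amplitude", round(max(sweep_spec), 3),
      "bins above 0.3:", sweep_wide)
```

Both signals have unit amplitude, yet the swept tone spreads over several bins and none of them reaches the true amplitude, so reading a single peak from the spectrum misstates both the frequency and the amplitude of the signal.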
Therefore, there is a need for a sound separating method and apparatus capable of separating a target signal from a mixed input signal containing one or more sound signals and/or unsteady noises.