The present invention relates to a method and an apparatus for separating a sound-source signal and a method and a device for detecting the pitch of the sound-source signal. More particularly, the present invention relates to a method and an apparatus for separating one audio signal from among audio signals from a plurality of sound sources with stereomicrophones, and a method and a device for detecting the pitch of the audio signal.
Techniques for separating a target sound-source signal from an audio signal that is a mixture of a plurality of sound-source signals are known. For example, as shown in FIG. 26, voices emitted from three persons SPA, SPB, and SPC are picked up as an audio signal by acoustic-to-electrical conversion means, such as left and right stereomicrophones MCL and MCR, and the audio signal from a target person is separated from the picked-up audio signal.
For example, Japanese Unexamined Patent Application Publication No. 2001-222289 discloses a known sound-source signal separating technique that uses an audio signal separating circuit, and a microphone employing the audio signal separating circuit. In the disclosed technique, a plurality of mixed signals, each being a linear sum of a plurality of mutually independent sound-source signals, are divided into frames, and, on a per-frame basis, the mixed signals are multiplied by the inverse of the mixing matrix that minimizes the zero-lag correlation among the signals separated by the separating circuit. An original voice signal is thus separated from the mixed signals.
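The frame-wise, zero-lag decorrelation idea described above can be illustrated with a minimal sketch. This is not the circuit of the cited publication. It assumes two channels and restricts the unmixing matrix to a plane rotation, which fully undoes only an orthogonal mixture, because zero-lag decorrelation alone does not determine a general inverse mixing matrix. The function names `frame_signal`, `zero_lag_cov`, and `decorrelate_frame` are hypothetical.

```python
import math


def frame_signal(x, frame_len):
    """Split a sample sequence into consecutive, non-overlapping frames."""
    return [x[i:i + frame_len]
            for i in range(0, len(x) - frame_len + 1, frame_len)]


def zero_lag_cov(a, b):
    """Covariance of two equal-length frames at zero lag time."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / n


def decorrelate_frame(left, right):
    """Rotate one stereo frame so that its two outputs are uncorrelated.

    Assumption for illustration: the unmixing matrix is the rotation
    [[cos t, sin t], [-sin t, cos t]], with t chosen so that the zero-lag
    covariance of the two outputs vanishes.
    """
    c_ll = zero_lag_cov(left, left)
    c_rr = zero_lag_cov(right, right)
    c_lr = zero_lag_cov(left, right)
    theta = 0.5 * math.atan2(2.0 * c_lr, c_ll - c_rr)
    c, s = math.cos(theta), math.sin(theta)
    out1 = [c * l + s * r for l, r in zip(left, right)]
    out2 = [-s * l + c * r for l, r in zip(left, right)]
    return out1, out2
```

Each pair of frames obtained from `frame_signal` for the left and right channels would be passed to `decorrelate_frame` independently, so the rotation angle is re-estimated on a per-frame basis.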
Japanese Unexamined Patent Application Publication No. 7-28492 discloses a sound-source signal estimating device for estimating a target sound source. The sound-source signal estimating device is intended for use in extracting a target audio signal under a noisy environment.
To separate a sound-source signal, the pitch of the target sound is determined. As a technique for detecting pitch, Japanese Unexamined Patent Application Publication No. 2000-181499 discloses an audio signal analysis method, an audio signal analysis device, an audio signal processing method, and an audio signal processing apparatus. According to the disclosure, an input signal is divided into frames each having a predetermined duration, a frequency analysis is performed on each frame, and a harmonic component assessment is performed based on the result of the frequency analysis of each frame. A harmonic component assessment is likewise performed on the inter-frame difference in amplitude of the frequency analysis results. The pitch of the input signal is thus detected using the results of the harmonic component assessment.
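The per-frame frequency analysis followed by a harmonic component assessment can be sketched as follows. This is a simplified illustration, not the method of the cited publication: it assumes a plain harmonic-sum score over discrete Fourier transform magnitudes, applies no window function, and omits the assessment of inter-frame amplitude differences. The names `dft_magnitudes` and `detect_pitch` and the parameters `f_min`, `f_max`, and `num_harmonics` are hypothetical.

```python
import cmath
import math


def dft_magnitudes(frame, num_bins):
    """Magnitudes of the first num_bins DFT bins of one frame (naive DFT)."""
    n = len(frame)
    mags = []
    for k in range(num_bins):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(frame))
        mags.append(abs(acc))
    return mags


def detect_pitch(frame, sample_rate, f_min, f_max, num_harmonics=4):
    """Estimate the pitch of one frame, in Hz, by a harmonic-sum assessment.

    Each candidate fundamental bin k is scored by summing the spectral
    magnitudes at bins k, 2k, ..., num_harmonics * k; the best-scoring
    candidate is returned as a frequency.
    """
    n = len(frame)
    bin_hz = sample_rate / n
    k_min = max(1, math.ceil(f_min / bin_hz))
    # Keep the highest harmonic of every candidate at or below Nyquist.
    k_max = min(int(f_max / bin_hz), (n // 2) // num_harmonics)
    if k_max < k_min:
        raise ValueError("no candidate pitch fits the requested range")
    mags = dft_magnitudes(frame, k_max * num_harmonics + 1)

    def harmonic_score(k):
        return sum(mags[h * k] for h in range(1, num_harmonics + 1))

    best_k = max(range(k_min, k_max + 1), key=harmonic_score)
    return best_k * bin_hz
```

The resolution of this sketch is limited to the bin spacing `sample_rate / len(frame)`, and the search range `f_min` to `f_max` should be chosen narrow enough to limit confusion between a fundamental and its subharmonics.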
More microphones than sound sources are required to separate a plurality of sound-source signals, and the use of a plurality of microphones has in fact been studied. For example, Japanese Unexamined Patent Application Publication No. 2001-222289 discloses that separating a sound-source signal from three or more sound sources using two microphones is difficult. Japanese Unexamined Patent Application Publication No. 7-28492 discloses a technique for extracting an audio signal from a target sound source using a plurality of microphones (a microphone array). According to these disclosed techniques, more microphones than sound sources are required to separate a target sound-source signal from a mixed signal of a plurality of sound-source signals.
With the known techniques, it is therefore difficult to separate three or more sound-source signals using the stereomicrophones of a mobile audio-visual (AV) device, such as a video camera.
When the pitch of a target sound is determined prior to the separation of the sound-source signals, the pitch detection is preferably suited to the separation of the sound-source signals.