Field of the Invention
The present invention relates to audio signal processing for sound source separation or noise reduction.
Description of the Related Art
In recent years, digital still cameras as well as digital video cameras have become able to shoot moving images, which has increased the opportunities to record sound. It is difficult to check the recorded sound during shooting. When the recorded sound is reproduced after shooting, it may contain noise, or a sound other than a voice may be so loud that it masks the voice the user wants to hear. For this reason, techniques have been proposed for separating a desired sound component from an undesired sound component or for removing a noise component.
For example, a beamformer is known that processes a plurality of microphone signals and uses the directivity of the sound source to enhance and extract a desired sound.
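As an illustration of how a plurality of microphone signals can enhance a desired sound, the following is a minimal sketch of a delay-and-sum beamformer, which is one common beamformer type and is not necessarily the one meant above. The function name `delay_and_sum`, the use of integer sample delays, and the assumption that the delays are already known are all illustrative choices and do not come from the text.

```python
import numpy as np

def delay_and_sum(signals, delays):
    """Align each microphone signal on the desired source and average.

    signals: array of shape (n_mics, n_samples).
    delays:  integer delay (in samples) with which the desired source
             reaches each microphone (assumed known in this sketch).
    """
    signals = np.asarray(signals, dtype=float)
    n_mics, n_samples = signals.shape
    out = np.zeros(n_samples)
    for sig, d in zip(signals, delays):
        # Advance the signal by d samples to undo its delay.
        # np.roll wraps around circularly, which is a simplification.
        out += np.roll(sig, -int(d))
    # Averaging keeps the aligned source and attenuates uncorrelated noise.
    return out / n_mics
```

After alignment the desired sound adds coherently while noise that is uncorrelated between microphones partly cancels, which is why the approach needs more than one microphone.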
As a method of removing wind noise mixed in during shooting, a method has also been proposed that performs nonnegative matrix factorization on a plurality of audio signals picked up in parallel. In this method, bases of one of the audio signals that are highly correlated with bases of another of the audio signals are extracted as noise components, and the noise components are reduced.
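The basis comparison described above can be sketched as follows. This is only an illustration under stated assumptions: the basis matrices are taken as already computed (one column per basis), the correlation measure is a Pearson correlation between basis vectors, and the function name `correlated_bases` and the threshold value are hypothetical and are not taken from the proposed method.

```python
import numpy as np

def correlated_bases(W_a, W_b, threshold=0.9):
    """Find bases of channel A that are highly correlated with a basis of channel B.

    W_a, W_b: nonnegative basis matrices of shape (n_freq, rank), one basis per column.
    Returns the indices of the flagged columns of W_a and the full correlation matrix.
    """
    # Center and normalize each column so the dot product is a Pearson correlation.
    A = W_a - W_a.mean(axis=0)
    B = W_b - W_b.mean(axis=0)
    A = A / (np.linalg.norm(A, axis=0) + 1e-12)
    B = B / (np.linalg.norm(B, axis=0) + 1e-12)
    corr = A.T @ B  # shape (rank_a, rank_b)
    # A basis of A is flagged if any basis of B matches it closely enough.
    flagged = np.flatnonzero(corr.max(axis=1) >= threshold)
    return flagged, corr
```

The flagged bases would then be treated as noise components and attenuated before resynthesis; that reduction step is not shown here.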
However, some digital cameras include only one microphone and therefore cannot use the above techniques, which rely on a plurality of microphone signals, to separate the desired sound or to remove non-stationary noise such as wind noise.
As a single-channel sound source separation technique, a method using nonnegative matrix factorization is known and is described in detail in literature 1. However, to reduce noise using the signals separated by nonnegative matrix factorization, the separated signals must be clustered into noise signals and desired sound signals using a prelearned dictionary or the like.
Literature 1: Paris Smaragdis and Judith C. Brown, “Non-Negative Matrix Factorization for Polyphonic Music Transcription,” 2003 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, Oct. 19-22, 2003
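For reference, single-channel nonnegative matrix factorization approximates a nonnegative magnitude spectrogram V by the product of a basis matrix W and an activation matrix H. The following minimal sketch assumes the widely used multiplicative update rules for the Euclidean cost; the function name `nmf`, the random initialization, and the iteration count are illustrative assumptions and are not taken from literature 1.

```python
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Factor a nonnegative matrix V (n_freq x n_frames) as V ~ W @ H.

    W: (n_freq, rank) spectral bases. H: (rank, n_frames) activations.
    Uses multiplicative updates for the Euclidean (Frobenius) cost.
    """
    V = np.asarray(V, dtype=float)
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    # Random positive initialization (an assumption of this sketch).
    W = rng.random((n_freq, rank)) + eps
    H = rng.random((rank, n_frames)) + eps
    for _ in range(n_iter):
        # Each update keeps the factors nonnegative; eps avoids division by zero.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H
```

The factorization itself does not say which of the resulting components are noise and which belong to the desired sound, which is why a separate clustering step is needed.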