1. Field
Embodiments relate to a sound source signal processing apparatus and method that perform beamforming using a microphone array.
2. Description of the Related Art
Telephone communication, voice recording, or motion picture capturing using portable digital devices has been popularized.
Various digital devices, such as consumer electronics devices, portable phones, digital camcorders, and in-car voice recognition apparatuses, use a microphone to acquire a voice.
An environment in which a sound source is recorded, or a voice signal is input through such digital devices, is often not quiet. Instead, the environment may often include various noises and surrounding interference sounds.
A microphone exhibiting high directivity, i.e., a unidirectional microphone, may be used, or the distance between the microphone and a speaker may be decreased, to better capture the voice of the speaker in such an environment. When the distance between the microphone and the speaker is increased, surrounding noises or reverberations as well as the voice of the speaker may enter the microphone, resulting in a low signal-to-noise ratio (SNR).
For this reason, beamformer technology has been developed that forms a beam in a specified direction using two or more microphones arranged in an array, instead of reducing the distance between the microphone and the speaker.
The beamformer finds the direction of a sound using a time difference between signals reaching the respective microphones arranged in the array and intensifies only a voice signal located in the specified direction or removes unnecessary interference noise. In this case, at least two microphones are arranged in the array, and the positions of the respective microphones and the distance between the microphones are preset.
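The direction-finding step described above can be illustrated with a minimal sketch for two microphones. It is not taken from the embodiments: the function names, the brute-force cross-correlation search, and the speed of sound of 343 m/s are assumptions chosen for illustration only.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def estimate_delay_samples(x1, x2, max_lag):
    """Return the lag (in samples) at which x2 best matches a shifted x1.

    A positive result means the signal reached microphone 2 later
    than microphone 1. Brute-force cross-correlation, for clarity only.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n in range(len(x1)):
            m = n + lag
            if 0 <= m < len(x2):
                score += x1[n] * x2[m]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def arrival_angle_deg(delay_samples, sample_rate, mic_distance):
    """Convert an inter-microphone delay to an angle measured from broadside."""
    tau = delay_samples / sample_rate             # delay in seconds
    s = SPEED_OF_SOUND * tau / mic_distance       # sine of the arrival angle
    s = max(-1.0, min(1.0, s))                    # guard against rounding
    return math.degrees(math.asin(s))


# Hypothetical example: the same pulse arrives 2 samples later at microphone 2.
mic1 = [0, 0, 1, 3, -2, 1, 0, 0, 0, 0]
mic2 = [0, 0, 0, 0, 1, 3, -2, 1, 0, 0]
lag = estimate_delay_samples(mic1, mic2, max_lag=4)
angle = arrival_angle_deg(lag, sample_rate=16000, mic_distance=0.1)
```

Because the positions of the microphones and the distance between them are preset, the measured delay alone is enough to recover the angle in this simplified far-field model.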
Using such beamformer technology, efficiency in sound separation or speaker localization to remove or separate a noise source from the speaker may be improved, and noise or reverberation having no directivity may be reduced through post filtering.
That is, voice signals from long distances are acquired using the microphone array to emphasize or suppress voice signals input in a specified direction and to remove sound in the other directions.
The beamformer serves as a spatial filter that passes only a signal in a specified spatial region. The width of the beam formed in the direction in which the beamformer is directed determines the resolution performance of the beamformer. Here, the beam width is expressed as a half power beam width (HPBW), i.e., the angular width at which the response is reduced by approximately 3 dB relative to the directed direction. The beam width of a delay-and-sum beamformer is as follows.
$$\mathrm{HPBW}_{\theta} \cong 2\sin^{-1}\left(\sqrt{\frac{3}{2}}\,\frac{c}{\pi N d f}\right)$$
Here, N indicates the number of microphones constituting the microphone array, d indicates the distance between adjacent microphones, f indicates the frequency of the signal, and c indicates the speed of the signal. The resolution performance is proportional to the size of the microphone array and to the frequency. That is, a large microphone array and a high-frequency target sound source provide high resolution performance. The distance d between the microphones constituting the microphone array may satisfy the following conditions to prevent spatial aliasing.
$$f_u = \frac{c}{2d}, \qquad d \le \frac{c}{2f} = \frac{\lambda}{2}$$
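Here, λ indicates the wavelength of a signal, c indicates the speed of the signal, and f_u indicates the upper frequency at which spatial aliasing is still avoided for the distance d.
That is, the direction of a signal can be distinguished only when the phase difference caused by the time delay between neighboring microphones stays within a single 2π cycle (between −π and π).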
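The spatial aliasing condition above can be evaluated numerically with a short sketch. The function names and the default speed of sound of 343 m/s are assumptions for illustration and do not appear in the embodiments.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def max_spacing(f_max, c=SPEED_OF_SOUND):
    """Largest microphone distance d (m) satisfying d <= c / (2 f) at f_max."""
    return c / (2.0 * f_max)


def upper_frequency(d, c=SPEED_OF_SOUND):
    """Upper frequency f_u = c / (2 d) (Hz) for a given microphone distance d."""
    return c / (2.0 * d)


# Hypothetical example: a target band reaching up to 4000 Hz.
d_limit = max_spacing(4000.0)         # half of the shortest wavelength
f_u = upper_frequency(d_limit)        # recovers the 4000 Hz limit
```

The two functions are inverses of each other: choosing the distance from the highest frequency of interest and then computing the upper frequency returns the same limit.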
Conversely, when the size of the microphone array is not sufficiently large, the beam becomes wide at low frequencies, and the beamformer may not exhibit an effect with respect to a low frequency band signal.
In particular, for the beamformer technology to be properly applied to a voice signal having a frequency of 1000 Hz or less, the number of microphones in the microphone array may be increased. However, an increase in the number of microphones leads to an increase in manufacturing costs. Also, if the number of microphones is increased, the size of the microphone array is increased, with the result that the installation space may be insufficient.
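The dependence of the beam width on the number of microphones at low frequencies can be illustrated by evaluating the half power beam width expression given earlier. The sketch below is an assumption-laden illustration: it reads the constant factor in that expression as the square root of 3/2, assumes a speed of sound of 343 m/s, uses a hypothetical microphone distance of 4 cm, and reports 180 degrees when the expression has no solution (the response never falls by 3 dB).

```python
import math


def hpbw_deg(n_mics, spacing, freq, c=343.0):
    """Half power beam width (degrees) of a delay-and-sum beamformer.

    Evaluates 2 * asin(sqrt(3/2) * c / (pi * N * d * f)). The sqrt(3/2)
    factor and the default c are assumptions made for this sketch.
    """
    arg = math.sqrt(1.5) * c / (math.pi * n_mics * spacing * freq)
    if arg >= 1.0:
        # No half power point exists: the beam spans the full front half-plane.
        return 180.0
    return 2.0 * math.degrees(math.asin(arg))


# Hypothetical example: a 1000 Hz signal and a 4 cm microphone distance.
widths = {n: hpbw_deg(n, spacing=0.04, freq=1000.0) for n in (2, 4, 8)}
```

Under these assumptions, two microphones give no usable directivity at 1000 Hz, and the beam narrows only as microphones are added, which is the cost and size trade-off described above.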