The present technology relates to a signal processing device, a signal processing method, and a program, and specifically relates to a signal processing device, method, and program that enable noise arising during voice recording to be removed with high accuracy.
Known apparatuses for recording voice (including apparatuses for recording moving pictures) include a video camera, a digital camera with a function of capturing moving pictures, a smartphone, an IC recorder and the like. During operation of these apparatuses, sound generated by the apparatus body sometimes contaminates the recorded voice.
For example, zoom driving sound, autofocus driving sound, aperture stop driving sound and the like occur in capturing a moving picture. These sounds are caused by the driving of components inside the apparatus and have various acoustic characteristics depending on how the components are driven and controlled.
Moreover, in recent years, a piezoelectric element that deforms in response to an applied voltage is often used to drive lenses for autofocusing and zooming. Driving sound produced by the piezoelectric element sometimes has characteristics different from those of existing driving sound.
Noise caused by such driving sound is sometimes called sudden noise. Sudden noise that contaminates the recorded voice is extremely grating on the ears, and therefore a measure for reducing the sound, a measure for removing the noise, or the like is desired.
Some measures against the sudden noise have been proposed.
For example, a technology has been proposed in which, in response to a drive signal having been transmitted, a combined voice signal is generated from the voice signal in a period prior to the timing when the drive signal is transmitted, and the combined voice signal is combined with the voice signal in a period posterior to that timing (for example, Japanese Patent Laid-Open No. 2011-002723, which is hereinafter referred to as Patent Literature 1).
Moreover, a technology has also been proposed in which a frequency component characteristic of the driving of an optical element is extracted from the voice output from a microphone within a certain period from a drive command, a section where that component has a certain level or more is detected, and prediction and interpolation are performed based on the voice before and after the section (for example, Japanese Patent Laid-Open No. 2012-114842, which is hereinafter referred to as Patent Literature 2). Thereby, driving noise accompanying the driving of an imaging optical system can be removed with high accuracy.
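For illustration only, the general flow described for Patent Literature 2 (extracting a characteristic frequency component, detecting a section where it has a certain level or more, and interpolating from the voice before and after the section) may be sketched as follows. This is a minimal sketch under stated assumptions and is not the method of Patent Literature 2 itself: the function names, the frame length, the threshold, and the single characteristic frequency are all hypothetical, the Goertzel algorithm is merely one possible way of measuring one frequency component, and simple linear interpolation stands in for the prediction and interpolation mentioned above.

```python
import math


def goertzel_power(frame, freq_hz, sample_rate):
    """Power of a single frequency component in one frame (Goertzel algorithm)."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq_hz / sample_rate)
    s_prev, s_prev2 = 0.0, 0.0
    for x in frame:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2


def detect_noise_section(signal, freq_hz, sample_rate, frame_len, threshold):
    """Return (start, end) sample indices covering the frames whose power at
    freq_hz is at or above threshold, or None if no frame qualifies.
    The threshold and frame length are hypothetical tuning parameters."""
    hits = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        if goertzel_power(frame, freq_hz, sample_rate) >= threshold:
            hits.append(start)
    if not hits:
        return None
    return hits[0], hits[-1] + frame_len


def interpolate_section(signal, start, end):
    """Replace signal[start:end] with a straight line between the samples
    just before and just after the section (a simplified stand-in for
    prediction-based interpolation)."""
    out = list(signal)
    left = out[start - 1] if start > 0 else 0.0
    right = out[end] if end < len(out) else 0.0
    span = end - start + 1
    for i in range(start, end):
        t = (i - start + 1) / span
        out[i] = left + (right - left) * t
    return out
```

In this sketch, the detected section is bounded by whole frames, so a shorter frame length gives finer boundaries at the cost of coarser frequency resolution.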