1. Field of the Invention
The present invention relates to a speech processing apparatus, a speech processing method, and a speech processing program.
2. Description of Related Art
A sound emitted in a room is repeatedly reflected by walls and installed objects, which generates reverberation. When reverberation is added, the frequency characteristics differ from those of the original speech, and thus the speech recognition rate of a speech recognition apparatus may be lowered. In addition, since previously-uttered speech is superimposed on currently-uttered speech, the articulation rate may decrease. Therefore, reverberation reducing techniques for reducing reverberation components in speech recorded under reverberant environments have been developed.
For example, Japanese Patent No. 4396449 (Patent Document 1) describes a reverberation removing method of acquiring a transfer function of a reverberation space using an impulse response of a feedback path, which is adaptively identified by an inverse filter processing unit, and reconstructing a sound source signal by dividing a reverberant speech signal by the magnitude of the transfer function. In the reverberation removing method described in Patent Document 1, the impulse response indicating the reverberation characteristic is estimated. Here, since the reverberation time is relatively long, ranging from 0.2 seconds to 2.0 seconds, the computational load increases excessively and the processing delay becomes marked. Accordingly, the method has not been widely applied to speech recognition.
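The division step described above can be illustrated with a minimal sketch. It is not the implementation of Patent Document 1: the impulse response is assumed to be already known rather than adaptively identified, the whole signal is processed as a single frame, and the function names, the toy signals, and the `floor` parameter are hypothetical choices made only for illustration.

```python
import cmath

def dft(x, n):
    """Naive n-point DFT of x (zero-padded to n); O(n^2), for illustration only."""
    padded = list(x) + [0.0] * (n - len(x))
    return [sum(padded[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def convolve(s, h):
    """Linear convolution: models the room adding reverberation to the source."""
    out = [0.0] * (len(s) + len(h) - 1)
    for i, sv in enumerate(s):
        for j, hv in enumerate(h):
            out[i + j] += sv * hv
    return out

def divide_by_transfer_magnitude(reverberant, impulse_response, floor=1e-3):
    """Divide the reverberant spectrum by |H(f)|; `floor` (an assumption) avoids
    division by values close to zero."""
    n = len(reverberant)
    X = dft(reverberant, n)
    H = dft(impulse_response, n)
    return [x / max(abs(hk), floor) for x, hk in zip(X, H)]

# Toy data (assumed): a short source signal and a three-tap room response.
source = [1.0, -0.5, 0.25, 0.0, 0.75, -1.0]
room = [1.0, 0.5, 0.25]
reverberant = convolve(source, room)

estimate = divide_by_transfer_magnitude(reverberant, room)
source_spectrum = dft(source, len(reverberant))
# The magnitude spectrum of the estimate should match that of the source.
max_error = max(abs(abs(e) - abs(s)) for e, s in zip(estimate, source_spectrum))
```

Even in this toy case the spectrum length must cover the full impulse response, which suggests why reverberation times of up to 2.0 seconds lead to a heavy computational load and a marked processing delay.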
H-G. Hirsch and Harald Finster, "A New Approach for the Adaptation of HMMs to Reverberation and Background Noise," Speech Communication, Elsevier, 2008, pp. 244-263 (Non-patent Document 5) describes a method of preparing in advance a plurality of acoustic models obtained under reverberant environments having different reverberation times and searching for the acoustic model having the highest likelihood in the environment in which a speech is recorded. The reverberation time is the time taken for the reverberation intensity, relative to its maximum value, to attenuate to a predetermined intensity. In the method described in Non-patent Document 5, speech recognition is performed using the acoustic model found by the search.
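The search for the acoustic model having the highest likelihood can be sketched as follows. This is only an illustration under simplifying assumptions and not the method of Non-patent Document 5 itself: each acoustic model is replaced by a single one-dimensional Gaussian instead of an HMM, and the reverberation times, model parameters, and feature values are hypothetical.

```python
import math

def log_likelihood(features, mean, var):
    """Sum of Gaussian log-densities of scalar features under one model."""
    return sum(-0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)
               for x in features)

def select_model(features, models):
    """Return the reverberation time whose model gives the highest likelihood."""
    return max(models, key=lambda rt: log_likelihood(features, *models[rt]))

# Hypothetical models prepared in advance, keyed by reverberation time in seconds.
# Each value is (mean, variance) of a stand-in Gaussian.
models = {0.2: (0.0, 1.0), 0.6: (1.0, 1.5), 2.0: (3.0, 2.0)}

# Hypothetical features extracted from the recorded speech.
features = [0.9, 1.2, 0.7, 1.4, 1.1]

best_rt = select_model(features, models)
```

In this sketch every prepared model is evaluated against the recorded speech, so the cost of the search grows with the number of reverberation conditions prepared in advance.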