1. Technical Field of the Invention
The present invention relates to technology for processing a voice signal.
2. Description of the Related Art
A technology for converting voice characteristics is proposed, for example, in Japanese Patent Application Laid-Open Publication No. 2014-002338 (hereinafter referred to as “JP 2014-002338”). This reference discloses a technology for converting voice characteristics of a voice signal that is a processing target (hereinafter referred to as “target signal”) into distinguishing (non-modal or non-harmonic) voice characteristics such as gruffness or hoarseness. In the technology disclosed in JP 2014-002338, a spectrum of a target voice signal that has been adjusted to a fundamental frequency of an object signal is divided into a plurality of bands (hereinafter referred to as “unit bands”), with a harmonic frequency residing at the center of each unit band, and the components of each unit band are then reallocated along a frequency axis. Next, amplitude and phase are adjusted for each unit band such that the amplitude and phase of the harmonic frequency in each reallocated unit band correspond to the amplitude and phase of the target signal.
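By way of illustration only, the division into unit bands and the per-band adjustment described above may be sketched as follows. This is a minimal sketch and not the implementation of JP 2014-002338: the function names `unit_bands` and `band_gain`, the 100 Hz fundamental frequency, and the representation of the adjustment as a single complex gain per unit band are all assumptions made for this example.

```python
import cmath


def unit_bands(f0, n_harmonics):
    """Return (low, center, high) in Hz for each unit band.

    Assumption for this sketch: band k is centered on the k-th harmonic
    k * f0, and its edges are the midpoints to the neighboring harmonics.
    """
    return [((k - 0.5) * f0, k * f0, (k + 0.5) * f0)
            for k in range(1, n_harmonics + 1)]


def band_gain(source_harmonic, target_harmonic):
    """Complex gain that maps the source harmonic onto the target harmonic.

    Multiplying every component of a unit band by this gain makes the
    amplitude and phase at the band's harmonic frequency equal to those
    of the target harmonic (hypothetical simplification).
    """
    return target_harmonic / source_harmonic


# Three unit bands for a hypothetical fundamental frequency of 100 Hz.
bands = unit_bands(f0=100.0, n_harmonics=3)

# Source harmonic: amplitude 0.5, phase 0.
# Target harmonic: amplitude 1.0, phase pi/2.
g = band_gain(cmath.rect(0.5, 0.0), cmath.rect(1.0, cmath.pi / 2))
```

In this sketch each unit band shares an edge with its neighbor at the midpoint between adjacent harmonic frequencies, and each band receives its own gain, computed independently of the gains of the adjacent bands.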
In the technology disclosed in JP 2014-002338, the amplitude and phase of each unit band are adjusted after a plurality of unit bands has been defined such that an intermediary point between a harmonic frequency and the next adjacent harmonic frequency on the frequency axis constitutes a boundary. A drawback of this technique is that the amplitude and phase become discontinuous at the boundary of each unit band (i.e., at the intermediary point between adjacent harmonic frequencies). When a voice in which harmonic components predominate over non-harmonic components is generated, intensity is sufficiently low at the intermediary point between harmonic frequencies of the generated voice, and any discontinuity in the amplitude and phase of the non-harmonic components at that point will hardly be perceived by a listener. However, where a voice in which non-harmonic components predominate is generated, such as a gruff or hoarse voice, the discontinuity in amplitude and phase at the intermediary point between harmonic frequencies becomes apparent, with the result that the listener may perceive the voice as acoustically unnatural.
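The dependence of the discontinuity on the intensity at the boundary may be illustrated numerically. The following is a minimal sketch under stated assumptions: the function name `boundary_jump`, the two gain values, and the two boundary intensities (0.01 for a voice with predominant harmonic components, 0.5 for a gruff or hoarse voice) are hypothetical values chosen for this example and do not appear in JP 2014-002338.

```python
import cmath


def boundary_jump(gain_lower, gain_upper, component):
    """Return (amplitude jump, phase jump) across a unit-band boundary.

    `component` is the nonzero complex spectral value at the boundary
    before adjustment. It is scaled by the lower band's gain just below
    the boundary and by the upper band's gain just above it.
    """
    below = gain_lower * component
    above = gain_upper * component
    return abs(above) - abs(below), cmath.phase(above / below)


# Hypothetical gains of two adjacent unit bands.
g1 = cmath.rect(1.0, 0.0)
g2 = cmath.rect(2.0, cmath.pi / 3)

# Low intensity at the midpoint (harmonic components predominate).
jump_weak = boundary_jump(g1, g2, 0.01 + 0j)

# High intensity at the midpoint (non-harmonic components predominate).
jump_strong = boundary_jump(g1, g2, 0.5 + 0j)
```

In this sketch the phase jump is the same in both cases, but the amplitude step at the boundary scales with the intensity of the component located there, which is consistent with the discontinuity being more apparent for a voice that has a predominance of non-harmonic components.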