Frequency Modulation (FM) synthesis is a technique for generating complex sound spectra, such as those of synthesized musical instrument and vocal sounds. Such synthesized sounds typically comprise formants which, in some conventional FM techniques, are approximated as harmonics of a modulation frequency. When the formant frequency and modulation frequency are static (i.e., do not change over time), the harmonics of the modulation frequency are also static. However, FM synthesis of the human voice, with its wide prosodic and expressive variations in pitch and timbre, requires changes in the underlying modulation frequency, in one or more of the formant frequencies, or in both.
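The static case described above can be illustrated with a minimal two-operator FM sketch. It is not taken from any particular synthesizer: the function names `nearest_harmonic` and `fm_voice`, the modulation index, and the rule of snapping the carrier to the harmonic nearest the formant are all assumptions chosen for illustration.

```python
import math

def nearest_harmonic(formant_hz, mod_hz):
    """Index of the modulation-frequency harmonic closest to the formant.

    Hypothetical helper: snapping to the nearest harmonic is an assumed
    way of approximating a formant by a harmonic of the modulation frequency.
    """
    return max(1, round(formant_hz / mod_hz))

def fm_voice(formant_hz, mod_hz, index, duration_s, sample_rate=48000):
    """Two-operator FM with static formant and modulation frequencies."""
    carrier_hz = nearest_harmonic(formant_hz, mod_hz) * mod_hz
    samples = []
    for i in range(round(duration_s * sample_rate)):
        t = i / sample_rate
        # The modulator deviates the carrier phase; sidebands fall on
        # harmonics of mod_hz because carrier_hz is a multiple of mod_hz.
        modulator = index * math.sin(2 * math.pi * mod_hz * t)
        samples.append(math.sin(2 * math.pi * carrier_hz * t + modulator))
    return samples

# Example: a 2300 Hz formant over a 110 Hz modulation frequency.
harmonic = nearest_harmonic(2300.0, 110.0)
tone = fm_voice(2300.0, 110.0, index=2.0, duration_s=0.01)
```

Because both frequencies are fixed here, the selected harmonic never changes during the tone, which corresponds to the static case in the text.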
FIG. 1A illustrates a fast Fourier transform (FFT)-based spectrograph 100 with a sampling window having a width of 4096 samples and a 48 kHz sampling rate. The spectrograph is a representation of sound produced using a conventional FM synthesis technique (e.g., one in which each formant is approximated by a single harmonic oscillator) to synthesize a sequence of phonemes in a human-voice timbre. Specifically, the sequence of phonemes in this example is a vowel alteration of the sounds “ee-oo-ee-oo.” In this example, the vowel alteration creates excursions in the underlying modulation frequency and/or formant frequencies that manifest as artifacts 102 in spectrograph 100. These artifacts are perceived by the listener as audible clicking sounds.
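For reference, the window parameters stated above fix the frequency and time resolution of the analysis. The short calculation below is only a worked example of that arithmetic; the function name `fft_resolution` is hypothetical.

```python
def fft_resolution(sample_rate_hz, window_samples):
    """Return (bin width in Hz, window duration in ms) for an FFT window."""
    bin_width_hz = sample_rate_hz / window_samples
    window_ms = 1000.0 * window_samples / sample_rate_hz
    return bin_width_hz, window_ms

# Parameters stated for FIG. 1A: 4096-sample window at 48 kHz.
bin_width_hz, window_ms = fft_resolution(48000, 4096)
print(f"bin width: {bin_width_hz} Hz, window: {window_ms:.1f} ms")
```

With these values each FFT bin spans 11.71875 Hz and each window covers roughly 85.3 ms of audio.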
FIG. 1B similarly illustrates a fast Fourier transform (FFT)-based spectrograph 104 with a sampling window having a width of 4096 samples and a 48 kHz sampling rate. Here, however, the spectrograph represents a synthesis of a human voice undergoing vibrato, in which one or more formant frequencies in the generated sound vary periodically with time. For example, formant 106, as shown in FIG. 1B, varies at approximately 3 Hz with an amplitude 108. For small vibrato amplitudes, no artifacts are introduced. However, for large vibrato amplitudes, artifacts 110 are introduced. The artifacts 110 are an example of what are referred to herein as “type-1” artifacts, understood to mean artifacts originating from changes to (or shifts in) the frequency of the signal generated by a harmonic oscillator. Similar problems occur in conventional methods when attempting to synthesize portamento and glissando sound effects.
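One way to picture why vibrato depth matters is to count how often the harmonic nearest a vibrating formant changes. The sketch below is an illustrative assumption rather than the mechanism stated in the source: it supposes that the oscillator is snapped to the nearest harmonic of the modulation frequency and that each change of harmonic is a candidate point for a type-1 artifact. The function name `harmonic_switches` and all numeric values other than the approximately 3 Hz vibrato rate are hypothetical.

```python
import math

def harmonic_switches(center_hz, depth_hz, mod_hz,
                      vibrato_hz=3.0, duration_s=1.0, control_rate=1000):
    """Count changes of the nearest harmonic while the formant undergoes vibrato."""
    switches = 0
    previous = None
    for i in range(round(duration_s * control_rate)):
        t = i / control_rate
        # Formant frequency varying sinusoidally around its center.
        formant_hz = center_hz + depth_hz * math.sin(2 * math.pi * vibrato_hz * t)
        # Assumed rule: the oscillator tracks the nearest harmonic of mod_hz.
        harmonic = max(1, round(formant_hz / mod_hz))
        if previous is not None and harmonic != previous:
            switches += 1  # oscillator frequency would shift here
        previous = harmonic
    return switches

# Formant centered on the 21st harmonic of a 110 Hz modulation frequency.
small = harmonic_switches(2310.0, depth_hz=20.0, mod_hz=110.0)
large = harmonic_switches(2310.0, depth_hz=200.0, mod_hz=110.0)
```

Under these assumptions the shallow vibrato stays within a single harmonic, while the deep vibrato crosses several harmonic boundaries per cycle, which is consistent with artifacts appearing only at large vibrato amplitudes.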
Accordingly, there is a need for FM synthesis techniques that produce artifact-free (sometimes herein called “glitch-free”) sound when the modulation frequency and/or one or more formant frequencies vary over time.