This invention relates to speech synthesis, and more particularly to mitigation of amplitude quantization or other artifacts in synthesized speech signals.
One recent approach to computer-implemented speech synthesis makes use of a neural network to process a series of phonetic labels derived from text to produce a corresponding series of waveform sample values. In some such approaches, the waveform sample values are quantized, for example, to 256 levels of a μ-law non-uniform division of amplitude.