Digital encoding of speech signals and/or decoding of digital signals to provide intelligible speech signals are important for many electronic products providing secure communications capabilities, communications via digital links or speech output signals derived from computer instructions.
Many digital voice systems suffer from poor perceptual quality in the synthesized speech. Insufficient characterization of input speech basis elements, bandwidth limitations and subsequent reconstruction of synthesized speech signals from encoded digital representations all contribute to perceptual degradation of synthesized speech quality. Moreover, some information carrying capacity is lost: the nuances, intonations and emphases imparted by the speaker carry subtle but significant messages that are lost to varying degrees through corruption in the encoding and subsequent decoding of speech signals transmitted in digital form.
In particular, auto-regressive linear predictive coding (LPC) techniques model the speech system with a transfer function having all poles and no zeroes. These prior art coding techniques, and especially those utilizing linear predictive coding analysis, tend to neglect all resonance contributions from the nasal cavities (which essentially provide the "zeroes" in the transfer function describing the human speech apparatus) and result in reproduced speech having an artificially "tinny" or "nasal" quality.
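The all-pole character of conventional LPC can be illustrated with a minimal sketch. It assumes the textbook autocorrelation method with the Levinson-Durbin recursion and the convention H(z) = G / A(z), where A(z) = 1 + a[1]z^-1 + ... + a[p]z^-p. The function names (`autocorrelation`, `levinson_durbin`, `all_pole_filter`) and the example coefficients are hypothetical and chosen only for illustration; they are not drawn from any particular prior art coder.

```python
def autocorrelation(signal, max_lag):
    """Return r[0..max_lag], where r[k] = sum over n of signal[n] * signal[n + k]."""
    n = len(signal)
    return [sum(signal[i] * signal[i + k] for i in range(n - k))
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations for A(z) = 1 + a[1]z^-1 + ... + a[order]z^-order.

    Returns (a, err): a[0] is always 1.0 and err is the residual energy.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        # Correlation left unexplained by the current (i - 1)-th order model.
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient for stage i
        updated = a[:]
        for j in range(1, i):
            updated[j] = a[j] + k * a[i - j]
        updated[i] = k
        a = updated
        err *= 1.0 - k * k
    return a, err


def all_pole_filter(a, excitation):
    """Synthesis filter 1 / A(z): y[n] = x[n] - sum over k of a[k] * y[n - k].

    Only poles are modeled; there is no numerator polynomial, so zeroes
    (such as those attributed to nasal resonances) cannot be represented.
    """
    out = []
    for n, x in enumerate(excitation):
        y = x
        for k in range(1, len(a)):
            if n - k >= 0:
                y -= a[k] * out[n - k]
        out.append(y)
    return out


if __name__ == "__main__":
    # Hypothetical second-order resonance with pole radius 0.9 (illustration only).
    true_a = [1.0, -1.2, 0.81]
    impulse = [1.0] + [0.0] * 399
    response = all_pole_filter(true_a, impulse)
    estimated, residual = levinson_durbin(autocorrelation(response, 2), 2)
    print(estimated, residual)
```

Because the synthesis filter has no numerator polynomial, any spectral zero present in the original speech can only be approximated by additional poles, which is one way to view the limitation described above.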
Standard techniques for digitally encoding and decoding speech generally rely on signal processing analyses which require significant bandwidth to realize high quality real-time communication.
What are needed are apparatus and methods for rapidly and accurately characterizing speech signals in a fashion lending itself to digital representation thereof, as well as synthesis methods and apparatus for providing speech signals from digital representations, which provide high fidelity and conserve digital bandwidth.