The present invention relates generally to the field of digital transmission systems for transmitting speech signals and, more specifically, to the field of conditioning speech signals so that they can be used as inputs to computerized systems. More particularly, the present invention extracts amplitude and frequency information from an analog speech signal and converts that information to digital data, and vice versa.
Ordinary sampled speech signals, particularly those sampled at a low sample rate, have an unnatural sound and are fatiguing to listen to over long periods of time. It is therefore desirable to process these types of signals in such a way that the components, generally amplitude and frequency components, that would add to the quality of the reproduced speech signals are made a part of the sampled signal. It is also desirable to extract the primary information (amplitude and frequency) from the speech signal without retaining the speech signal itself or any modified form of the speech signal.
A publication of interest for its teachings of conditioning speech signals is entitled "Automatic Conditioning of Speech Signals", IEEE Trans. on Audio, etc., June 1968, pp. 169-179, by H. Ellwarth et al. In that paper, it is disclosed that a speech signal may be digitized and the amplitude clipped with the result that the remaining signal will still bear a resemblance to the original signal such that the signal may be processed to produce recognizable speech.
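The effect of amplitude clipping described above can be illustrated with a minimal Python sketch. It is not taken from the cited paper; the function names, the 8000 Hz sample rate, and the 200 Hz test tone are assumptions chosen for illustration. The sketch reduces each sample to its sign and shows that the zero-crossing pattern, which carries the frequency information, is unaffected by the original amplitude.

```python
import math

SAMPLE_RATE = 8000  # Hz; assumed value for illustration only


def infinite_clip(samples):
    """Keep only the sign of each sample (+1 or -1), discarding amplitude."""
    return [1 if s >= 0 else -1 for s in samples]


def zero_crossings(samples):
    """Count sign changes between neighbouring samples of the clipped signal."""
    clipped = infinite_clip(samples)
    return sum(1 for a, b in zip(clipped, clipped[1:]) if a != b)


def make_tone(freq_hz, amplitude, n_samples):
    """Hypothetical test signal: a sine tone sampled half a sample off the zeros."""
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * (n + 0.5) / SAMPLE_RATE)
        for n in range(n_samples)
    ]


quiet = make_tone(200, 0.05, SAMPLE_RATE)
loud = make_tone(200, 0.90, SAMPLE_RATE)

# Both tones clip to the same two-level waveform, so the crossing count
# (roughly twice the frequency over one second) is identical.
print(zero_crossings(quiet), zero_crossings(loud))
```

Because the clipped waveform is the same for the quiet and the loud tone, amplitude must be conveyed separately if the dynamic range of the original speech is to be recovered.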
A patent of interest for its teaching on clipped speech signal processing is U.S. Pat. No. 4,477,925, entitled "Clipped Speech-Linear Predictive Coding Speech Processor", by J. M. Avery et al., which patent is assigned to NCR Corporation, the assignee of the present application. In this patent, there is disclosed a system and a method which analyze sampled clipped speech signals for the purpose of identifying the original utterance. A sampler generates, from the clipped signal, a plurality of discrete binary values. A processor is used to analyze the sampled binary values and to compare them against stored digitized signals corresponding to a known spoken utterance. Comparisons are made using linear predictive coefficients of an autocorrelation function of the utterances.
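The general idea of deriving linear predictive coefficients from the autocorrelation of a clipped signal can be sketched as follows. This is an illustrative sketch only and is not the method of the cited patent: the function names are hypothetical, the Levinson-Durbin recursion is the standard textbook way of solving the autocorrelation normal equations, and the comparison against stored utterances is not shown.

```python
def autocorrelation(x, max_lag):
    """Unnormalized autocorrelation r[0..max_lag] of a finite sequence."""
    return [
        sum(x[n] * x[n + lag] for n in range(len(x) - lag))
        for lag in range(max_lag + 1)
    ]


def levinson_durbin(r, order):
    """Solve the LPC normal equations for the given autocorrelation values.

    Returns predictor coefficients a[1..order] such that
    x[n] is approximated by sum(a[k] * x[n - k]).
    Assumes r[0] > 0 and that the prediction error never reaches zero.
    """
    a = [0.0] * (order + 1)
    error = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / error  # reflection coefficient for this stage
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        error *= 1.0 - k * k
    return a[1:]


def lpc_from_clipped(samples, order):
    """Clip to binary values (+1/-1), then compute LPC coefficients."""
    clipped = [1 if s >= 0 else -1 for s in samples]
    return levinson_durbin(autocorrelation(clipped, order), order)
```

A recognizer built on this idea would compute such a coefficient vector for each frame of the unknown utterance and compare it with vectors stored for known utterances; the choice of distance measure is left open here.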
Another patent of interest is U.S. Pat. No. 4,015,088, entitled "Real-Time Speech Analyzer", by J. J. Dubnowski. The analyzer disclosed in this patent analyzes digital signal representations of a speech signal, which signal is threshold center-clipped and infinite peak-clipped to form a signal comprising three logic states. An autocorrelation function of the formed signal is determined by a circuit which then employs circuitry for continuously determining the pitch period of the applied speech signal.
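A three-state clipped signal and an autocorrelation-based pitch estimate of the kind described above can be sketched in Python as follows. The sketch is a software illustration under stated assumptions, not the circuit of the cited patent: the function names, the clipping threshold of 30% of the frame peak, and the lag search range are all hypothetical choices.

```python
import math


def three_level_clip(samples, threshold):
    """Center-clip, then infinite peak-clip: each sample becomes +1, 0, or -1."""
    return [1 if s > threshold else -1 if s < -threshold else 0 for s in samples]


def autocorrelation_at(x, lag):
    """Unnormalized autocorrelation of x at a single lag."""
    return sum(x[n] * x[n + lag] for n in range(len(x) - lag))


def estimate_pitch_period(samples, min_lag, max_lag, clip_fraction=0.3):
    """Return the lag (in samples) with the largest autocorrelation.

    clip_fraction is an assumed threshold, expressed relative to the frame peak.
    """
    peak = max(abs(s) for s in samples)
    clipped = three_level_clip(samples, clip_fraction * peak)
    return max(
        range(min_lag, max_lag + 1),
        key=lambda lag: autocorrelation_at(clipped, lag),
    )


# Hypothetical test frame: fundamental plus second harmonic, period 80 samples
# (100 Hz at an assumed 8000 Hz sample rate).
frame = [
    math.sin(2 * math.pi * n / 80) + 0.5 * math.sin(4 * math.pi * n / 80)
    for n in range(400)
]
period = estimate_pitch_period(frame, min_lag=20, max_lag=160)
```

Because the clipped samples take only the values +1, 0, and -1, each product in the autocorrelation sum reduces to a sign comparison, which is what makes this form of the computation attractive for simple hardware.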
Another patent of interest for showing the state of the art of speech processing using clipping is U.S. Pat. No. 3,974,336, entitled "Speech Processing System", by E. M. O'Brien. In this patent, it is taught that speech signals can be quantized in amplitude and time and a square wave squelch signal of relatively high frequency can be added and the sum amplified, clipped and quantized in time. By utilizing the proper detection circuitry and the squelch signal, the speech signal can be separated and cleanly removed from noise signals which occur between words.
Another patent of interest for its showing of a method for digitizing clipped speech and for squelching noise between words is U.S. Pat. No. 4,271,332, entitled "Speech Signal A/D Converter Using an Instantaneously-Variable Bandwidth Filter", by James C. Anderson.
Another patent of interest for its teaching is U.S. Pat. No. 4,070,550, entitled "Quantized Pulse Modulated Nonsynchronous Clipped Speech Multichannel Coded Communication System", by R. H. Miller, Jr. et al. In FIG. 5 of the patent, the process of speech clipping and digitizing an analog input waveform to impart amplitude and frequency information to a facsimile digital signal is shown in block and waveform illustrations.
As previously mentioned, speech signals sampled at low data rates have a very flat sound, with noise between words, which is unnatural and fatiguing to listen to. It would therefore be desirable to have a system which utilizes the advantages of digital speech signal processing but which adds amplitude information and frequency information to the reconstructed signal so as to restore its naturalness and dynamic range.