1. Field of the Invention
The present invention relates to the digital recording and playback of speech over the public telephone network. The electronic circuit and methods of the present invention enable performance of the following functions: analog-to-digital conversion; digital-to-analog conversion; bit compression to 24 kilobits per second; placement and receipt of telephone calls while monitoring the line for dial tone, busy, ringing, speech and silence; detection of Dual Tone Multi-Frequency (DTMF, trademark of the American Telephone and Telegraph Company) signals; pause removal; loudness control; and speech rate control, both speed-up and slow-down. Applications for the circuit and methods of the present invention are voice store-and-forward, voice response, and other voice message switching via digital means.
2. Description of the Prior Art
The circuit and method of the present invention will enable a particular digital coding of speech which will be called Block-Scaled Pulse Code Modulation (BSPCM). A survey paper on the multitudinous prior art methods of digital speech coding is the article "Speech Coding" by James L. Flanagan et al. appearing in the IEEE Transactions on Communications, Volume COM-27, No. 4, April 1979, at page 710. Generally the range of a speech signal varies slowly compared to its sampling rate. As a consequence, encoding the step size as well as the individual samples allows a desirable reduction in total bit rate. Because the quantization noise level then rises and falls with the signal level, the signal masks the noise. There are several prior art adaptive coding schemes for speech which are based on this principle. The primary techniques are adaptive differential pulse code modulation (ADPCM) and continuously variable slope delta modulation (CVSD). One of the major disadvantages of adaptive coding schemes is that the step size of each sample is computed from the step size and magnitude of preceding samples. When a bit error occurs, or if the preceding information is missing, the step size information may be lost for the following syllable or word. The decoded speech may thus be badly distorted until a quiescence in the speech waveform resets the step size to its minimum value. Also, the data cannot, in general, be edited or spliced, and signal processing operations (gain control, mixing, thresholding, etc.) cannot be performed on the data directly.
The prior art technique of pulse code modulation (PCM) encoding is especially desirable for editing, signal processing, and signal detection because each encoded sample is independent of preceding samples, and each encoded sample is directly proportional to the amplitude of the signal. Unfortunately, PCM encoding generally requires about twice the bit rate of adaptive coding schemes for equivalent quality voice reproduction. It will later be seen that the coding scheme of the present invention is also PCM-type because it is not adaptive: each sample is proportional to the signal magnitude and is not a function of the preceding history of the signal. But, like the adaptive coding schemes, the coding scheme of the present invention will accomplish a reduction in bit rate by adjusting the step size to follow the local dynamic range of the signal. Like PCM, the Block-Scaled Pulse Code Modulation encoding scheme of the present invention will permit splicing, editing, and signal processing functions to be readily performed on the signal.
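The principle described above, one shared step size per block with each sample coded independently of its predecessors, may be illustrated by the following minimal sketch. It is not the encoding format of the present invention: the block size, the number of bits per sample, the linear step-size rule, and the function names `bspcm_encode` and `bspcm_decode` are all assumptions chosen for illustration.

```python
def bspcm_encode(samples, block_size=8, bits=3):
    """Encode integer samples as a list of (step, codes) blocks.

    Assumed layout: every block carries one step size shared by all of its
    samples, and each sample is a small signed code proportional to amplitude.
    """
    max_code = (1 << (bits - 1)) - 1  # largest code magnitude, e.g. 3 for 3 bits
    blocks = []
    for start in range(0, len(samples), block_size):
        block = samples[start:start + block_size]
        peak = max(abs(s) for s in block)
        # Ceiling division so that the block peak never exceeds max_code steps.
        step = max(1, -(-peak // max_code))
        codes = [round(s / step) for s in block]
        blocks.append((step, codes))
    return blocks


def bspcm_decode(blocks):
    """Rebuild samples; each block decodes without reference to any other."""
    out = []
    for step, codes in blocks:
        out.extend(code * step for code in codes)
    return out


samples = [0, 10, -20, 30, 1000, -2000, 3000, 400]
blocks = bspcm_encode(samples, block_size=4, bits=3)
# A block can be decoded alone, as after a splice that drops the first block.
spliced = bspcm_decode(blocks[1:])
```

Because no block depends on its predecessor, discarding or reordering blocks does not disturb the decoding of the remainder, and a gain change could be applied in this sketch by scaling the stored step sizes alone.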
It is also an aspect of the present invention that the speech output signal will be processed for control of loudness, speech rate, pause, and the mixing of speech. The particular algorithms utilized to effect processing of the digitized speech are not new. In particular, the so-called "cut and splice" method of slowing or speeding speech without altering pitch, as is used in the present invention, is not new. Electromechanical versions of this algorithm have been known at least since N. B. Kuchenmeister, German Pat. No. 386,983 "Improvements Relating to the Reproduction of Sounds from Records", June 26, 1930. The principle of repeating short segments of speech to slow the speaking rate, or discarding them to speed the speaking rate, has also been used to synchronize sound tracks of movie projectors and to produce fast-talking recordings for the blind. A reference to such methods is contained in "Time Compressed Speech: An Anthology and Bibliography in Three Volumes" by Sam Duker, published by the Scarecrow Press, Metuchen, N.J., 1974. All such speech output signal-processing algorithms and methods have, to the best knowledge of the inventors, not been previously applied to Block-Scaled Pulse Code Modulation (BSPCM) digitally encoded speech.
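The cut and splice principle referred to above, repeating short segments to slow the speaking rate and discarding them to speed it, may be sketched as follows. The function name `cut_and_splice`, the segment length, and the fractional-position rule for choosing which segments to repeat or drop are illustrative assumptions; no smoothing is applied at the splice points.

```python
def cut_and_splice(samples, segment_len, rate):
    """Change speaking rate by whole segments.

    rate > 1 discards segments (faster); rate < 1 repeats segments (slower).
    Samples inside each segment are played unmodified, so pitch is unchanged.
    """
    if segment_len <= 0 or rate <= 0:
        raise ValueError("segment_len and rate must be positive")
    segments = [samples[i:i + segment_len]
                for i in range(0, len(samples), segment_len)]
    out = []
    position = 0.0  # fractional index into the list of input segments
    while int(position) < len(segments):
        out.extend(segments[int(position)])
        position += rate
    return out


speech = list(range(8))                   # stand-in for decoded speech samples
faster = cut_and_splice(speech, 2, 2.0)   # keeps every second segment
slower = cut_and_splice(speech, 2, 0.5)   # plays every segment twice
```

In practice the segment length would be chosen on the order of a few pitch periods; that choice is outside the scope of this sketch.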
It is another aspect of the present invention that Dual Tone Multi-Frequency (DTMF) signals received on and detected from the telephone interface will be utilized for telephone line supervision. The algorithms for detecting telephone signals such as dial tone, ringback, busy, speech, or silence are old in the art. Additionally, the machine placement of telephone calls by the conversion of digitally encoded DTMF signal tones to analog audio is now being accomplished by certain voice response units and office information systems interfacing across telephone lines. Normally, however, such placement of telephone calls by a machine is not full duplex upon the two-wire telephone transmission system. Rather, the machine initiates an outgoing call via DTMF signals and only then, after placement of the outgoing call, monitors the telephone line for the status of the call. The present invention performs status monitoring of the outgoing call for dial tone, ringback, busy, speech and silence simultaneously with the progress of the dial-out. Thus, the placement of telephone calls by the apparatus of the present invention is fully automated, with telephone line status being continually monitored along the way. Such automation permits, for example, the apparatus of the present invention to call back automatically at a later time if a telephone line is initially found to be busy.
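By way of illustration only, DTMF detection of the kind referred to above is commonly performed with the Goertzel algorithm, which measures signal power at each of the eight standard DTMF frequencies. The source does not name a detection algorithm, so the following is a sketch of a well-known technique rather than the method of the present invention; the 8000 Hz sampling rate, the `min_ratio` threshold, and the function names are assumptions.

```python
import math

ROWS = (697, 770, 852, 941)        # standard DTMF row frequencies, Hz
COLS = (1209, 1336, 1477, 1633)    # standard DTMF column frequencies, Hz
KEYS = ("123A", "456B", "789C", "*0#D")


def goertzel_power(samples, freq, sample_rate):
    """Squared magnitude of the signal's spectrum at a single frequency."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / sample_rate)
    s1 = s2 = 0.0
    for x in samples:
        s0 = x + coeff * s1 - s2
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2


def detect_dtmf(samples, sample_rate=8000, min_ratio=0.1):
    """Return the key whose row and column tones dominate, else None."""
    energy = sum(x * x for x in samples)
    if energy == 0:
        return None  # silence
    # Assumed threshold: each tone must hold a fair share of the total energy.
    floor = min_ratio * energy * len(samples)
    row_p = [goertzel_power(samples, f, sample_rate) for f in ROWS]
    col_p = [goertzel_power(samples, f, sample_rate) for f in COLS]
    r = max(range(4), key=row_p.__getitem__)
    c = max(range(4), key=col_p.__getitem__)
    if row_p[r] < floor or col_p[c] < floor:
        return None
    return KEYS[r][c]


def two_tone(f1, f2, n=400, sample_rate=8000):
    """Test signal: the sum of two equal-amplitude sinusoids."""
    return [math.sin(2 * math.pi * f1 * t / sample_rate) +
            math.sin(2 * math.pi * f2 * t / sample_rate) for t in range(n)]


key = detect_dtmf(two_tone(770, 1336))        # the tone pair for key "5"
dial_tone = detect_dtmf(two_tone(350, 440))   # not a DTMF pair
```

The same per-frequency power measurement could be applied to other supervisory tones such as dial tone or busy, but the classification of those signals is not shown here.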