The present invention relates to the modification of the time duration of an audible waveform and more particularly to the compression of a voiced audio waveform to fit within predetermined time boundaries, while preserving the intelligibility and quality of the information contained in the waveform.
In the prior art there are two general methods of speeding up (time compressing) a recorded sample of speech: (1) increasing the speed of playback, which raises all the frequencies by a factor equal to the speed-up ratio, and (2) sampling the speech in short segments and reassembling only a portion of the segments. In the second method, a chopping technique may be used to remove some of the short sample segments. If the gaps in the recorded speech are removed, the speech rate may be increased with no great loss in intelligibility, provided the chops are small. This latter method does not have the disadvantage of shifting the frequencies of the speech spectrum as the speech is accelerated.
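By way of illustration only, the two prior art methods described above may be sketched as follows. The function names `speed_up` and `chop_compress`, and the parameters `segment_len`, `keep` and `period`, are hypothetical and are not taken from any cited reference; faster playback is modelled simply as reading the stored samples at a higher rate.

```python
def speed_up(samples, ratio):
    """Method (1): faster playback, modelled as reading samples `ratio` times faster.

    Every frequency in the result is raised by the factor `ratio`.
    """
    n_out = int(len(samples) / ratio)
    return [samples[int(i * ratio)] for i in range(n_out)]


def chop_compress(samples, segment_len, keep, period):
    """Method (2): keep the first `keep` of every `period` segments, drop the rest.

    Kept segments are copied unchanged, so their frequency content is not shifted.
    """
    out = []
    for idx, start in enumerate(range(0, len(samples), segment_len)):
        if idx % period < keep:
            out.extend(samples[start:start + segment_len])
    return out


recording = list(range(120))        # stand-in for 120 recorded samples
faster = speed_up(recording, 2.0)   # half as many samples
chopped = chop_compress(recording, segment_len=10, keep=3, period=4)
```

In this sketch `chopped` retains three of every four segments and therefore plays in three quarters of the original time, while the samples inside each retained segment are untouched.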
Such a chopping technique has been used to either compress or expand an audio waveform to a desired length by indiscriminately chopping out or duplicating portions of the waveform. The sound produced from a waveform which has had indiscriminately selected portions chopped out or duplicated is generally of poor quality. This is due to the step transients which arise in the implementation used, for example pulse code modulation (PCM) techniques, and which produce a clicking sound where the chopped portions of the speech segments are joined together.
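The step transient described above can be made concrete with a small numerical sketch. It assumes a pure tone with a period of 50 samples as a stand-in for speech, and the helper `splice_jump` is a hypothetical name introduced only for this illustration. It measures the amplitude step across the join left when a span of samples is deleted.

```python
import math


def splice_jump(samples, cut_start, cut_end):
    """Size of the step left at the join after deleting samples[cut_start:cut_end]."""
    return abs(samples[cut_end] - samples[cut_start - 1])


# Test tone: amplitude 1, period 50 samples (an assumption for illustration).
tone = [math.sin(2 * math.pi * i / 50) for i in range(200)]

# Largest step between neighbouring samples in the unmodified tone.
normal_step = max(abs(b - a) for a, b in zip(tone, tone[1:]))

# Indiscriminate cut: 25 samples (half a period) are removed.
arbitrary_jump = splice_jump(tone, 10, 35)

# Cut of exactly one period: the waveform resumes where it left off.
aligned_jump = splice_jump(tone, 10, 60)
```

With these values the indiscriminate cut leaves a step many times larger than any step in the original tone, which is the kind of discontinuity heard as a click, whereas the period-aligned cut leaves a step no larger than the normal sample-to-sample change.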
In the prior art, there are a number of speech compression patents, both analog and digital, which present alternative methods for achieving time compression of an audio signal. U.S. Pat. No. 3,504,352 to Stromswold et al. discloses a time compression system in which an analog input signal is stored in an analog memory rather than a digital memory, with the only digital aspect of the invention being directed to the timing sequence for reading information from the analog memory. U.S. Pat. No. 3,803,363 to Lee discloses a time expansion or compression system for audio data in which the audio information is converted to digital form using standard A/D technology. Segments of the speech are removed by reading the signals into a memory at one rate and out at another rate. U.S. Pat. No. 3,104,284 to French et al. discloses a system for modifying the time duration of an audible waveform by expanding or compressing the audio waveform to fit within a predetermined time boundary. A standard A/D conversion technique is utilized for subtracting certain segments of the speech from the speech communication system wherein a speech signal is converted to binary signal form. Redundant portions of the binary signal are extracted and converted to delta modulated form. The binary signal, absent the redundant portion, and the delta modulated redundant signal are transmitted to a remote location, decoded and recombined to form an analog speech signal. There is, however, no teaching of a high speed voice replay system utilizing delta modulation techniques.
According to the present invention, an audio compression system is disclosed utilizing delta modulation techniques. An audio signal is delta modulation encoded and stored, with the encoded signal being checked for positive and negative zero crossovers to determine which portions of a speech segment should be deleted. The gain factors of the segments of the undeleted speech signal are matched such that step transients and the attendant clicking sounds are eliminated.
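As a minimal sketch of the encoding and crossover detection summarized above, and not the disclosed implementation, the following assumes a linear delta modulator with a fixed step size and locates crossovers on the decoded estimate rather than on the stored bit stream. The names `delta_encode`, `delta_decode` and `zero_crossings` are hypothetical, the test tone stands in for speech, and the gain matching step is not shown.

```python
import math


def delta_encode(samples, step):
    """One bit per sample: 1 if the input is at or above the running estimate."""
    bits, estimate = [], 0.0
    for x in samples:
        bit = 1 if x >= estimate else 0
        estimate += step if bit else -step
        bits.append(bit)
    return bits


def delta_decode(bits, step):
    """Rebuild the staircase estimate by integrating the bit stream."""
    out, estimate = [], 0.0
    for bit in bits:
        estimate += step if bit else -step
        out.append(estimate)
    return out


def zero_crossings(signal):
    """List (index, direction): +1 for positive-going, -1 for negative-going."""
    crossings = []
    for i in range(1, len(signal)):
        if signal[i - 1] < 0 <= signal[i]:
            crossings.append((i, +1))
        elif signal[i - 1] >= 0 > signal[i]:
            crossings.append((i, -1))
    return crossings


step = 0.1  # assumed fixed step; larger than the tone's steepest per-sample change
tone = [math.sin(2 * math.pi * i / 100) for i in range(300)]
bits = delta_encode(tone, step)
decoded = delta_decode(bits, step)
crossings = zero_crossings(decoded)
```

Deleting a span that begins and ends at crossovers of the same direction would leave the waveform near zero on both sides of the join, which is one way to read the selection of portions to be deleted. Because the decoded staircase dithers around the input, several closely spaced crossovers may be reported near each true crossing.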