The present invention relates to methods and apparatus for encoding and constructing signals, and it is particularly, but not exclusively, concerned with the encoding of speech signals or waveforms.
Electrical waveforms derived from human speech are extremely complex in character, having significant components extending from below 300 Hz to above 3 kHz and a wide dynamic range. Such waveforms may be digitized by such known methods as pulse-code modulation, delta modulation or the use of vocoders. These techniques are discussed by L. S. Moye in a paper entitled "Digital Transmission of Speech at Low Bit Rates", Electrical Communication, Volume 47, Number 4, 1972.
It is known that if a speech waveform is infinitely clipped, that is, converted into a square wave with zero crossings corresponding to those of the original waveform, the clipped wave is intelligible when converted back to sound, but severely distorted. In an effort to improve both the intelligibility and naturalness of infinitely clipped speech, the speech waveform has been differentiated before clipping. Although this yields speech of high intelligibility, the number of zero crossings in the resulting square waveform is greatly increased.
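By way of illustration only, infinite clipping and the effect of prior differentiation on the number of zero crossings may be sketched as follows. The sketch is not taken from the cited prior art: the function names, the 8 kHz sampling rate, the two-tone test signal and the use of a first difference as a stand-in for differentiation are all assumptions made for the example.

```python
import math

def infinite_clip(samples):
    """Keep only the sign of each sample: +1 if non-negative, otherwise -1."""
    return [1 if s >= 0 else -1 for s in samples]

def differentiate(samples):
    """First difference, used here as a discrete stand-in for differentiation."""
    return [b - a for a, b in zip(samples, samples[1:])]

def count_zero_crossings(clipped):
    """Count the sign changes between adjacent samples of a clipped wave."""
    return sum(1 for a, b in zip(clipped, clipped[1:]) if a != b)

# Hypothetical test signal (an assumption, not real speech): a strong 200 Hz
# tone plus a weaker 1800 Hz tone, sampled at 8 kHz for 0.1 second.
rate = 8000
signal = [
    math.sin(2 * math.pi * 200 * n / rate)
    + 0.2 * math.sin(2 * math.pi * 1800 * n / rate)
    for n in range(800)
]

plain = count_zero_crossings(infinite_clip(signal))
diffed = count_zero_crossings(infinite_clip(differentiate(signal)))
print(plain, diffed)
```

For this test signal the differenced waveform yields many more zero crossings than the undifferentiated one, because differencing emphasises the higher-frequency component.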
The recording or transmission of the square waveform resulting from infinite clipping of speech is equivalent to the signalling of a sequence of time intervals (between successive zero crossings in such a wave), since the amplitude is purely arbitrary. Such intervals have each been converted into a number representing the duration of the interval (see U.K. Patent Specifications Nos. 1,282,641 and 1,296,199 and U.S. Pat. No. 3,684,829, equivalent to the former British specification), but subsequent reconstruction of speech from this sequence of numbers, although an easy matter, is not successful. The speech sounds so reconstructed are known to be of poor quality, and the successive time intervals must be reproduced quite exactly if still further serious deterioration of the reconstructed speech waveform is not to occur. Each specifying number must therefore have many binary digits. Allowing for a typical average figure of about one thousand such numbers per second to specify the speech, the binary rate (bits/second) needed to represent the speech waveform is as high as with conventional methods of digital encoding, yet the resultant speech quality is poorer.
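The equivalence between a clipped wave and a sequence of interval durations, and the resulting bit-rate arithmetic, may be illustrated by the following minimal sketch. It does not reproduce the encoders of the cited specifications. The function names are hypothetical, durations are counted in samples, and the figure of 12 bits per number is an assumption chosen only to show the calculation, since the text above says no more than that each number needs many binary digits.

```python
def run_lengths(clipped):
    """Durations, in samples, of each constant-sign stretch of a clipped wave.

    The first and last stretches are cut off by the ends of the record, so
    only the inner values are true intervals between zero crossings.
    """
    if not clipped:
        return []
    lengths = []
    count = 1
    for prev, cur in zip(clipped, clipped[1:]):
        if cur == prev:
            count += 1
        else:
            lengths.append(count)
            count = 1
    lengths.append(count)
    return lengths

def rebuild(lengths, first_sign=1):
    """Rebuild the square wave from its durations; the amplitude is arbitrary."""
    wave = []
    sign = first_sign
    for n in lengths:
        wave.extend([sign] * n)
        sign = -sign
    return wave

def bit_rate(numbers_per_second, bits_per_number):
    """Bits per second needed to send the interval numbers uncompressed."""
    return numbers_per_second * bits_per_number

clipped = [1, 1, 1, -1, -1, 1, -1, -1, -1, -1]
lengths = run_lengths(clipped)           # [3, 2, 1, 4]
restored = rebuild(lengths, clipped[0])  # identical to clipped

# About one thousand numbers per second (from the text above) at an assumed
# 12 bits per number gives 12000 bits/second.
rate = bit_rate(1000, 12)
```

On these assumptions the interval representation loses nothing relative to the clipped wave itself, but every additional binary digit of timing precision adds about one thousand bits per second to the rate.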
Attempts to improve speech quality by differentiation before encoding result in more zero crossings; about 1500 to 2000 per second on average. Therefore more numbers per second are required to specify the speech. Improved quality is bought at the cost of still higher bit rates.
Techniques of non-linear coding are known (see the above-mentioned Patent Specifications) which reduce the set of distinct numbers required for specifying interval durations, but even when these techniques are applied the bit rate remains high for relatively poor speech quality.