The present invention relates to a system for digitizing human speech and in particular to a system that provides excellent speech reproduction utilizing a minimum number of resolution bits.
In general, speech digitizers are adapted to digitize human speech by sampling an incoming speech signal at a predetermined rate and generating a parallel digital bit stream in accordance with the amplitude characteristics of the speech signal at each sample point. The digital bit stream is then either transmitted to a remote location and converted back to speech by a "de-digitizer" or stored in a digital memory for future conversion. The primary function of a speech digitization system is the fabrication of phrases and sentences from a prestored vocabulary of selected words. This is readily accomplished with a digital system by selectively addressing appropriate memory locations in the digital word memory. Thus, it can be seen that speech digitization systems are ideally suited for applications that call for a human-sounding voice and require only a limited vocabulary.
The restriction on vocabulary results primarily from the memory capacity required to digitally store an appreciable number of words. For example, assuming a system having eight bits of resolution and operating at a sample rate of 6000 samples/second, the system generates 48,000 bits for each second of speech, so that a single one-syllable word lasting half a second may require as many as 24,000 bits of storage. Accordingly, it can be readily appreciated that with conventional speech digitization systems, the storage of a substantial word vocabulary becomes prohibitive.
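By way of illustration only, the storage figure above can be checked with a short calculation. The function name and the half-second word duration are assumptions made for this sketch (the duration is the value implied by 24,000 bits at eight bits and 6000 samples per second); the lower resolutions in the loop are included only to show that storage scales linearly with the number of resolution bits.

```python
def storage_bits(bits_per_sample: int, sample_rate_hz: int, duration_s: float) -> int:
    """Bits needed to store one utterance: resolution x sample rate x duration."""
    return round(bits_per_sample * sample_rate_hz * duration_s)


if __name__ == "__main__":
    # Assumed word duration of 0.5 s, inferred from the 24,000-bit figure.
    for bits in (8, 4, 2):
        print(bits, "bits of resolution:", storage_bits(bits, 6000, 0.5), "bits per word")
```

Under these assumptions, halving the number of resolution bits halves the memory required for each stored word.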
The present invention seeks to minimize these storage constraints by providing a speech digitization system that utilizes a minimum number of resolution bits without sacrificing the quality of the speech produced. In particular, the present invention is adapted to produce high quality human speech using as few as two bits of resolution. As will subsequently be described in greater detail, this is accomplished by providing a companded digitization system that maximizes the information content of each bit in the digital bit stream. This results in part from the inclusion of a novel amplitude function generator that is adapted to maintain substantial average duty cycles on all bits in the digital bit stream even at very low audio input levels. Consequently, the present invention is capable of producing speech quality comparable to systems having twice the number of resolution bits.
In a first embodiment of the present invention, the filtered voice signal is provided to three comparator circuits: a polarity comparator, an upper limit comparator, and a lower limit comparator. The polarity comparator circuit determines whether the signal is positive or negative. The upper and lower limit comparator circuits are adapted to compare the magnitude of the incoming vocal signal to a reference level that is varied in accordance with the amplitude function signal, which in turn is generated in accordance with the preceding duty cycle of the digital amplitude bit stream. In particular, as the percentage duty cycle of the digital output signal increases, the magnitude of the amplitude function increases. Similarly, as the percentage duty cycle of the digital output signal decreases, the magnitude of the amplitude function also decreases. Thus, it will be seen that the reference levels of the upper and lower limit comparator circuits are expanded and compressed (i.e., "companded") in accordance with variations in the amplitude of the incoming vocal signal.
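The comparator arrangement of the first embodiment may be sketched in software as follows. This is a minimal illustration and not the disclosed circuit: the function name is hypothetical, and the way the upper and lower limit comparator outputs are combined into a single amplitude bit (a logical OR) is an assumption of the sketch.

```python
from typing import Tuple


def encode_sample(x: float, reference: float) -> Tuple[int, int]:
    """Return (polarity_bit, amplitude_bit) for one sample of the voice signal.

    `reference` stands in for the companded reference level set by the
    amplitude function signal; it is assumed to be non-negative.
    """
    # Polarity comparator: is the signal positive or negative?
    polarity = 1 if x >= 0.0 else 0
    # Upper limit comparator: signal above the positive reference level.
    above_upper = x > reference
    # Lower limit comparator: signal below the negative reference level.
    below_lower = x < -reference
    # Assumption: the amplitude bit is set when either limit is exceeded.
    amplitude = 1 if (above_upper or below_lower) else 0
    return polarity, amplitude


if __name__ == "__main__":
    for sample in (0.8, 0.2, -0.2, -0.9):
        print(sample, encode_sample(sample, reference=0.5))
```

Because the reference level tracks the amplitude function, the same two-bit code represents a large excursion in a loud passage and a small excursion in a quiet one.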
Significantly, it will additionally be seen that the novel manner in which the amplitude function signal is generated by the present invention greatly improves the signal-to-noise (S/N) ratio of the system by increasing the information content of the digital amplitude output signal at high and low audio input levels. Specifically, in the preferred embodiment of the present invention, the amplitude function signal is developed so as to maintain substantial duty cycles (i.e., around 50%) on the digital amplitude output signal.
In general, this is accomplished by averaging the digital amplitude output signal and providing the resulting analog signal to one of the inputs of a buffer amplifier circuit that has its other input connected to a bias network. The presence of the bias network serves to center the duty cycle spread of the digital amplitude output signal around 50%. More particularly, due to the closed loop feedback configuration of the system and the amplitude bias injection, the amplitude function produced at the output of the buffer amplifier circuit maintains an average duty cycle of substantially 50% at the digital amplitude output. In addition, the gain of the buffer amplifier circuit is selected in the preferred embodiment so that the upper and lower duty cycle limits of the digital amplitude output signal will be approximately 60% and 40%, respectively. Thus, it will be seen that the duty cycle of the digital amplitude output signal is maintained within a range wherein its information content is statistically maximized.
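The closed loop behavior described above can be illustrated with a small simulation. The sketch below rests on several assumptions that are not taken from the disclosure: the averaging is modeled as a one-pole low-pass filter, the buffer amplifier as a gain applied to the difference between the averaged signal and a bias of 0.4, the reference level as clamped at zero, and the voice signal as Gaussian noise. All names and numeric values are hypothetical.

```python
import random


def simulate_duty_cycle(input_rms: float, n_samples: int = 20000,
                        bias: float = 0.4, gain: float = 5.0,
                        alpha: float = 0.01, seed: int = 0) -> float:
    """Return the settled duty cycle of the amplitude bit for a given input level."""
    rng = random.Random(seed)
    averaged = 0.5          # running average of the amplitude bit stream
    ones = 0
    counted = 0
    for n in range(n_samples):
        # Buffer amplifier stage (assumed form): amplify the averaged bit
        # stream relative to the bias; the reference cannot go negative.
        amplitude_function = max(0.0, gain * (averaged - bias))
        x = rng.gauss(0.0, input_rms)            # stand-in for the voice signal
        bit = 1 if abs(x) > amplitude_function else 0
        # One-pole averaging of the digital amplitude output (assumed form).
        averaged += alpha * (bit - averaged)
        if n >= n_samples // 2:                  # skip the settling period
            ones += bit
            counted += 1
    return ones / counted


if __name__ == "__main__":
    for rms in (0.01, 0.1, 1.0):
        print("input rms", rms, "duty cycle", round(simulate_duty_cycle(rms), 3))
```

In this sketch a rising duty cycle raises the reference level, which in turn lowers the duty cycle, so the amplitude bit stays active over a hundredfold range of input level instead of falling silent at low levels. With the assumed bias and gain the settled duty cycle remains near the 40% to 60% band over that range.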
In a second embodiment of the present invention, a system having four bits of resolution is disclosed that is capable of producing remarkable speech reproduction. In other words, the second embodiment produces a four-bit parallel digital output signal each time the vocal input signal is sampled, rather than a two-bit signal as in the first embodiment. The same type of amplitude function generator is utilized, however, to maximize the information content of the digital output. The amplitude function signal in this embodiment is provided to a multiplying digital-to-analog (D/A) converter that is adapted to convert the four-bit digital approximation signal to an analog signal which is then scaled in accordance with the amplitude function signal. The resulting signal is then provided to the reference input of a comparator circuit which has its other input connected to receive the sampled incoming audio signal. The comparator output is supplied to the input of a successive approximation register (SAR) that is adapted to produce a four-bit digital approximation of the sampled audio signal. The four-bit digital approximation is then latched through a quad latch buffer to produce the four-bit parallel digital amplitude output signal that is stored in memory.
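The successive approximation loop of the second embodiment may be sketched as follows. The offset-binary coding, the bipolar scaling of the multiplying D/A converter, and the function names are assumptions of this sketch; the disclosure summarized above does not specify them.

```python
def multiplying_dac(code: int, amplitude_function: float) -> float:
    """Convert a four-bit code to an analog level scaled by the amplitude function.

    Assumed offset-binary mapping: code 8 corresponds to zero, code 0 to
    -amplitude_function, and code 15 to +7/8 of amplitude_function.
    """
    return amplitude_function * (code - 8) / 8.0


def sar_convert(sample: float, amplitude_function: float) -> int:
    """Four-bit successive approximation of one sampled audio value."""
    code = 0
    for bit in (3, 2, 1, 0):                 # most significant bit first
        trial = code | (1 << bit)
        # Comparator: sampled audio against the scaled D/A output.
        if sample >= multiplying_dac(trial, amplitude_function):
            code = trial                     # keep the bit
    return code


if __name__ == "__main__":
    # The same code results when signal and amplitude function scale together.
    print(sar_convert(0.5, 1.0), sar_convert(0.25, 0.5))
```

Because the D/A output is multiplied by the amplitude function, a sample and a reference that shrink together yield the same four-bit code, which is the sense in which the conversion is companded.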
Additional objects and advantages of the present invention will become apparent from a reading of the detailed description of the preferred embodiment which makes reference to the following set of drawings in which: