1. Field of the Invention
The present invention relates to a variable coding rate speech coding apparatus and a speech coding rate selector used with a portable telephone, an internet telephone, etc.
2. Description of the Related Art
High-efficiency speech coding apparatuses have been proposed for compressing the data transmitted by a portable telephone or the like. Portable telephones based on the code division multiple access (CDMA) system have come into commercial use; they make the speech coding rate variable so as to keep the average coding rate as low as possible and thereby accommodate more subscribers.
The speech coding apparatus with variable coding rate uses a speech detector to determine whether a speaker is speaking, and employs a higher coding rate while the speaker is speaking (hereinafter referred to as a “voiced period”) so as to maintain higher speech quality. Conversely, the apparatus employs a lower coding rate while the speaker is silent (hereinafter referred to as an “unvoiced period”), thereby reducing the average coding rate. The section of the apparatus that selects the speech coding rate in this manner is designated a speech coding rate selector (Related literature: TIA/EIA/IS-96B: Speech Service Option Standard for Wideband Spread Spectrum Systems).
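The effect of the two rates on the average coding rate can be illustrated with a short calculation. The sketch below is only an illustration: the function name `average_coding_rate`, the bit rates, and the 40% talk fraction are hypothetical values chosen for the example and are not taken from the related literature.

```python
def average_coding_rate(voiced_fraction, high_rate_bps, low_rate_bps):
    """Time-weighted average of the voiced-period and unvoiced-period rates."""
    if not 0.0 <= voiced_fraction <= 1.0:
        raise ValueError("voiced_fraction must lie in [0, 1]")
    return (voiced_fraction * high_rate_bps
            + (1.0 - voiced_fraction) * low_rate_bps)


# Hypothetical figures: the speaker talks 40% of the time,
# 8000 bps is used in voiced periods and 1000 bps in unvoiced periods.
avg = average_coding_rate(0.4, 8000.0, 1000.0)
print(f"average coding rate: {avg:.0f} bps")
```

Under these assumed figures the average falls well below the voiced-period rate, which is why misjudging unvoiced periods as voiced raises the average rate.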
In designing such a speech coding rate selector, the performance of the speech detector in distinguishing the voiced period from the unvoiced period is an important factor. The speech detector is required to accurately detect the voice of a speaker (hereinafter referred to as “speech”) among the diverse acoustic signals entering through the microphone of a portable telephone or the like. The biggest obstacle to detecting speech is the variety of ambient noises that reach the microphone in the environment where the portable telephone is used. Such ambient noises include, for example, engine noise and the noise of wind hitting the windows in a traveling car, and the noise of running trains in station premises or the like. These noises enter the speech detector and frequently cause it to misjudge them as speech. For this reason, when a portable telephone is used in an environment with loud ambient noise, the speech detector erroneously determines an unvoiced period to be a voiced period, resulting in an excessively high speech coding rate. This has caused uncomfortable sound to be produced at the receiver side, and has also reduced the subscriber capacity of the entire portable telephone system or increased the power consumed by the portable telephone terminal.
Conversely, there have been cases where a speaker's speech is misjudged as ambient noise in an environment with loud ambient noise. The low coding rate mode of the speech coding apparatus with variable coding rate cannot code speech while maintaining sufficiently high speech quality. In some cases, moreover, the speech gain is suppressed during an unvoiced period to make ambient noise less audible. Hence, misjudging speech as ambient noise causes the speech coding apparatus with variable coding rate to operate at the low coding rate, leading to markedly deteriorated speech quality.
Hitherto, in order to solve the problems described above, a method has been proposed in which a noise eliminator or a noise suppressor (hereinafter referred to as “noise eliminators or the like”) is installed in a stage preceding the speech detector, and this method has proved effective to a certain extent. Many of these noise eliminators or the like, however, require a large-scale circuit or heavy arithmetic processing such as the fast Fourier transform (FFT). This has frequently hindered attempts to reduce the size and power consumption of portable telephone terminals.
Accordingly, an object of the present invention is to provide a speech coding rate selector and a speech coding apparatus that do not require a large-scale circuit or arithmetic processing.
To this end, a speech coding rate selector in accordance with the present invention has: a speech input unit for receiving an input speech; a short-term power arithmetic unit for computing the power of input speech at a predetermined time unit; an ambient noise power estimating unit for estimating the power of an ambient noise superimposed on an input speech; a rate selection threshold value arithmetic unit for computing a power threshold value group for selecting a speech coding rate from the result of the ambient noise power estimation; a power comparator that compares the power determined by the short-term power arithmetic unit with the threshold value group determined by the rate selection threshold value arithmetic unit to select one appropriate rate from among a plurality of speech coding rates; an ambient noise property inferring unit for inferring the property of an ambient noise superimposed on an input speech; and a comparison power corrector for correcting an output value of the short-term power arithmetic unit if an ambient noise inferred by the ambient noise property inferring unit proves to exhibit great time-dependent variation in power.
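The cooperation of the units listed above may be easier to follow as a minimal sketch. The text does not specify how the noise power is estimated, how the threshold value group is derived, how fluctuation of the noise is inferred, or what form the correction takes, so everything concrete below is an assumption made for illustration: the rate labels, the minimum-tracking noise estimate with a slow upward drift, the fixed threshold factors, the max/min fluctuation test, and the fixed scaling used as the correction.

```python
from collections import deque

# Hypothetical rate labels, lowest rate first.
RATES = ("eighth", "quarter", "half", "full")


def short_term_power(frame):
    """Short-term power: mean squared amplitude over one frame."""
    return sum(s * s for s in frame) / len(frame)


class RateSelector:
    """Illustrative rate selector; all constants are assumed values."""

    def __init__(self, threshold_factors=(2.0, 4.0, 8.0), history=8,
                 fluctuation_limit=4.0, correction=0.5):
        self.noise_power = None                    # ambient noise power estimate
        self.threshold_factors = threshold_factors
        self.recent = deque(maxlen=history)        # powers of non-full-rate frames
        self.fluctuation_limit = fluctuation_limit
        self.correction = correction

    def _noise_fluctuates(self):
        # Assumed property test: large max/min spread over recent frames.
        if len(self.recent) < self.recent.maxlen:
            return False
        return max(self.recent) > self.fluctuation_limit * max(min(self.recent), 1e-9)

    def select(self, frame):
        raw = short_term_power(frame)
        # Assumed noise estimate: follow drops at once, drift upward slowly.
        if self.noise_power is None:
            self.noise_power = raw
        else:
            self.noise_power = min(raw, max(self.noise_power * 1.01, 1e-9))
        # Comparison power corrector: scale the power down when the
        # noise is judged to vary strongly over time (assumed correction).
        power = raw * self.correction if self._noise_fluctuates() else raw
        # Rate selection threshold value group derived from the noise estimate.
        thresholds = [f * self.noise_power for f in self.threshold_factors]
        # Power comparator: the number of thresholds exceeded picks the rate.
        index = sum(power > t for t in thresholds)
        if index < len(RATES) - 1:
            self.recent.append(raw)
        return RATES[index]


selector = RateSelector()
quiet, loud = [0.01] * 160, [0.5] * 160
print([selector.select(f) for f in (quiet, quiet, loud)])
```

None of these steps uses a transform such as the FFT; each needs only a few multiplications and comparisons per frame, which is the property the stated object calls for.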
The speech coding apparatus has: a speech input unit for receiving input speech; a speech coding rate selector for selecting an appropriate speech coding rate according to the power of input speech; a speech analyzer for processing input speech to estimate a transfer function of a speaker's oral cavity; a speech coding unit that makes a synthesis filter based on the transfer function of the oral cavity according to the estimation result supplied by the speech analyzer and codes an excitation signal of the synthesis filter; and a gain suppressor that is inserted between the speech input unit and the speech coding unit and suppresses the gain of a signal supplied from the speech input unit to the speech coding unit in an unvoiced period according to the information from the speech coding rate selector.
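The role of the gain suppressor can be sketched as follows. This is a hedged illustration only: the function name `suppress_gain`, the use of a hypothetical "eighth" rate label to signal an unvoiced period, and the attenuation factor of 0.5 are assumptions; the text states only that the gain is suppressed in an unvoiced period according to information from the speech coding rate selector.

```python
def suppress_gain(frame, rate, unvoiced_rate="eighth", attenuation=0.5):
    """Attenuate a frame when the selected rate indicates an unvoiced period.

    `rate` is the label reported by the rate selector; `unvoiced_rate`
    and `attenuation` are assumed values chosen for illustration.
    """
    if rate == unvoiced_rate:
        # Unvoiced period: reduce the gain before the frame is coded.
        return [s * attenuation for s in frame]
    # Voiced period: pass the frame to the speech coding unit unchanged.
    return list(frame)


print(suppress_gain([1.0, -2.0], "eighth"))
print(suppress_gain([1.0, -2.0], "full"))
```

Because the suppressor acts on the selector's decision, a frame of speech misjudged as noise would be both attenuated and coded at the low rate, which is why accurate rate selection matters for this arrangement.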