1. Field of the Invention
The present invention relates to a variable coding rate speech coding apparatus and a speech coding rate selector used with a portable telephone, an internet telephone, etc.
2. Description of the Related Art
There has been proposed a high-efficiency speech coding apparatus for compressing data to be transmitted through a portable telephone or the like. A portable telephone based on the code division multiple access (CDMA) system has become commercially practical. This telephone makes the speech coding rate variable to control the average coding rate to be as low as possible thereby accommodating more subscribers.
The speech coding apparatus with a variable coding rate is adapted to determine the presence of speech by a speaker using a speech detector, and to employ a higher coding rate while the speaker is speaking (hereinafter referred to as a "voiced period") so as to maintain higher speech quality. On the other hand, the speech coding apparatus with the variable coding rate employs a lower coding rate while the speaker is silent (hereinafter referred to as an "unvoiced period"), thereby reducing the average coding rate. The section that selects the speech coding rate as mentioned above in the speech coding apparatus with the variable coding rate is designated a speech coding rate selector (Related literature: TIA/EIA/IS-96B: Speech Service Option Standard for Wideband Spread Spectrum Systems).
In designing the aforesaid speech coding rate selector, the performance of the speech detector for distinguishing the voiced period from the unvoiced period is an important factor. The speech detector is required to accurately detect the voice of a speaker (hereinafter referred to as "speech") among diverse acoustic signals entered through the microphone of a portable telephone or the like. The biggest obstacle in detecting speech is the variety of ambient noise coming into the microphone in the environment where the portable telephone is located. Such ambient noise includes, for example, engine noise and noise produced by the wind hitting the windows of a traveling car, and the noise of running trains on station premises or the like. This noise enters the speech detector as ambient noise, frequently causing the speech detector to misjudge it as speech. For this reason, when a portable telephone is used in an environment with loud ambient noise, the speech detector erroneously determines an unvoiced period to be a voiced period, resulting in an excessively high speech coding rate. This has caused uncomfortable sound to be produced at the receiver side, and has also reduced the subscriber capacity of the entire portable telephone system or increased the power consumed by the portable telephone terminal.
Conversely, there have been cases where a speaker's speech is misjudged as ambient noise in an environment with loud ambient noise. The low coding rate mode of the speech coding apparatus with the variable coding rate is incapable of performing coding while maintaining sufficiently high speech quality. In some cases, the speech gain is suppressed to reduce the audibility of ambient noise during an unvoiced period. Hence, misjudgment of speech as ambient noise causes the speech coding apparatus with a variable coding rate to operate at the low coding rate, leading to markedly deteriorated speech quality.
Hitherto, in order to solve the problems described above, there has been proposed a method in which a noise eliminator or a noise suppressor (hereinafter referred to as "noise eliminators or the like") is installed in a stage preceding the speech detector, and this method has proved to be effective to a certain extent. Many of these noise eliminators or the like, however, require large-scale circuitry or arithmetic processing such as the fast Fourier transform (FFT). This has frequently hindered attempts to reduce the size and power consumption of portable telephone terminals.
Accordingly, an object of the present invention is to provide a speech coding rate selector and a speech coding apparatus that do not require a large-scale circuit or arithmetic processing.
To this end, a speech coding rate selector in accordance with the present invention has: a speech input unit for receiving an input speech; a short-term power arithmetic unit for computing the power of input speech at a predetermined time unit; an ambient noise power estimating unit for estimating the power of an ambient noise superimposed on an input speech; a rate selection threshold value arithmetic unit for computing a power threshold value group for selecting a speech coding rate from the result of the ambient noise power estimation; a power comparator that compares the power determined by the short-term power arithmetic unit with the threshold value group determined by the rate selection threshold value arithmetic unit to select one appropriate rate from among a plurality of speech coding rates; an ambient noise property inferring unit for inferring the property of an ambient noise superimposed on an input speech; and a comparison power corrector for correcting an output value of the short-term power arithmetic unit if an ambient noise inferred by the ambient noise property inferring unit proves to exhibit great time-dependent variation in power.
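The chain of units recited above can be illustrated with a minimal sketch. It is not the claimed implementation: the rate labels, the noise tracking rule (a slowly rising estimate clamped to the current frame power), the threshold multipliers, the max/min fluctuation test, and the attenuation used as the "correction" are all assumptions chosen only to make the data flow concrete.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Sequence, Tuple

# Hypothetical rate labels, ordered from lowest to highest coding rate.
RATES: Tuple[str, ...] = ("low", "mid", "high")


def short_term_power(frame: Sequence[float]) -> float:
    """Short-term power arithmetic unit: mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


@dataclass
class RateSelector:
    # Every default below is an illustrative assumption, not a value from the text.
    noise_power: float = 1.0                     # running ambient noise power estimate
    noise_floor: float = 1e-6                    # keeps the estimate strictly positive
    rise: float = 1.05                           # maximum growth of the estimate per frame
    multipliers: Tuple[float, ...] = (4.0, 16.0) # thresholds = multiplier * noise power
    fluct_ratio: float = 8.0                     # max/min ratio treated as "great variation"
    correction: float = 0.5                      # factor applied to the comparison power
    history: Deque[float] = field(default_factory=lambda: deque(maxlen=8))

    def noise_fluctuates(self) -> bool:
        """Ambient noise property inferring unit (assumed max/min test)."""
        if len(self.history) < 2:
            return False
        lowest = min(self.history)
        return lowest > 0.0 and max(self.history) / lowest > self.fluct_ratio

    def select(self, frame: Sequence[float]) -> str:
        power = short_term_power(frame)
        # Ambient noise power estimating unit: rise slowly, fall immediately.
        self.noise_power = max(self.noise_floor,
                               min(power, self.noise_power * self.rise))
        # Rate selection threshold value arithmetic unit.
        thresholds = [m * self.noise_power for m in self.multipliers]
        # Comparison power corrector: only active for fluctuating noise.
        compared = power * self.correction if self.noise_fluctuates() else power
        # Power comparator: the number of thresholds exceeded picks the rate.
        index = sum(compared > t for t in thresholds)
        if index == 0:
            # Frames coded at the lowest rate are assumed to contain noise only.
            self.history.append(power)
        return RATES[index]
```

With these assumed defaults, a frame of power 6.25 against a noise estimate of about 1.05 selects the middle rate when the noise history is steady, and the lowest rate when the history shows a tenfold swing in noise power, because the corrected power 3.125 falls below the first threshold of 4.2.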
The speech coding apparatus has: a speech input unit for receiving input speech; a speech coding rate selector for selecting an appropriate speech coding rate according to the power of input speech; a speech analyzer for processing input speech to estimate a transfer function of a speaker's oral cavity; a speech coding unit that makes a synthesis filter based on the transfer function of the oral cavity according to the estimation result supplied by the speech analyzer and codes an excitation signal of the synthesis filter; and a gain suppressor that is inserted between the speech input unit and the speech coding unit and suppresses the gain of a signal supplied from the speech input unit to the speech coding unit in an unvoiced period according to the information from the speech coding rate selector.
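The role of the gain suppressor can be sketched in a few lines. The function name, the rate label that marks an unvoiced period, and the attenuation factor are hypothetical; the text states only that the gain is suppressed in an unvoiced period according to information from the speech coding rate selector.

```python
from typing import List, Sequence


def suppress_gain(frame: Sequence[float],
                  selected_rate: str,
                  unvoiced_rate: str = "low",
                  attenuation: float = 0.25) -> List[float]:
    """Attenuate a frame before coding when the selector reports an unvoiced period.

    The label "low" and the factor 0.25 are illustrative assumptions.
    """
    if selected_rate == unvoiced_rate:
        # Unvoiced period: scale every sample down to make ambient noise less audible.
        return [attenuation * x for x in frame]
    # Voiced period: pass the samples to the speech coding unit unchanged.
    return list(frame)
```

In this sketch the suppressor sits between the speech input unit and the speech coding unit, so the coder sees the attenuated samples only for frames that the selector assigned to the lowest rate.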