1. Field of the Invention
The present invention relates to a speech coding and decoding system for use in a digital data radio transmission technique for mobile communication such as PDC (Personal Digital Cellular Telecommunication Systems) or the like, and more particularly to a speech coding and decoding system which has a VOX (Voice Operated Transmitter) function of judging whether a signal from a transmitter is a speech sound (e.g., sound when speaking) or a no-speech sound (e.g., sound when not speaking, that is, the interval between speech), so as to create a background noise depending on the result of the judgment. Here, background noise means the background sounds other than the speech, whether they occur in a speech sound or in a no-speech sound. That is, a speech sound includes both the speech and background noise, whereas a no-speech sound consists only of background noise.
2. Description of the Related Art
This kind of speech coding and decoding system consists of a speech coding system for coding an input sound and transmitting the same and a speech decoding system for decoding the coded signal received from the speech coding system and reproducing speech. The structure of the speech coding system is shown in FIG. 5 and the structure of the speech decoding system is shown in FIG. 6.
With reference to FIG. 5, the conventional speech coding system 50 comprises a speech/no-speech sound judging unit 51 for judging whether an input speech signal belongs to a speech sound or a no-speech sound, so as to supply a discrimination signal, an LPC analyzing unit 52 for calculating an LPC (Linear Predictive Coding) parameter for the input speech signal, an efficient coding processing unit 53 for performing coding processing based on the LPC parameter, a unique word generating unit 54 for supplying a unique word control signal depending on the type of the input speech signal, and a switch controller 55 for performing a switching control upon receipt of the discrimination signal supplied from the speech/no-speech sound judging unit 51, the coded speech signal supplied from the efficient coding processing unit 53, and the unique word control signal supplied from the unique word generating unit 54. Of the above components, the unique word generating unit 54 supplies, as the unique word control signal, a preamble signal when the input speech signal belongs to a speech sound and a postamble signal when the input speech signal belongs to a no-speech sound. According to the result of discrimination between a speech sound and a no-speech sound indicated by the discrimination signal, the switch controller 55 supplies a preamble signal and a coded speech signal as a coded signal in the case of a speech sound, and supplies a postamble signal and a background noise as a coded signal in the case of a no-speech sound. However, the background noise is supplied only in the first frame, and thereafter only a postamble signal is supplied.
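The switching control of the switch controller 55 described above can be sketched as follows. This is only an illustrative sketch, not the circuit of FIG. 5: the names `PREAMBLE`, `POSTAMBLE`, `CodedFrame`, and `switch_controller` are hypothetical, and the sketch assumes that "the first frame" means the first frame of each no-speech interval.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical labels for the two unique word control signals.
PREAMBLE = "PREAMBLE"    # marks a speech sound
POSTAMBLE = "POSTAMBLE"  # marks a no-speech sound


@dataclass
class CodedFrame:
    unique_word: str
    payload: Optional[bytes]  # coded speech, one frame of background noise, or None


def switch_controller(is_speech_flags: List[bool],
                      coded_frames: List[bytes]) -> List[CodedFrame]:
    """Select the coded signal for each frame from the discrimination result."""
    output = []
    noise_sent = False  # True once the background noise frame has been sent
    for is_speech, coded in zip(is_speech_flags, coded_frames):
        if is_speech:
            # Speech sound: preamble plus the coded speech signal.
            output.append(CodedFrame(PREAMBLE, coded))
            # Assumption: a new no-speech interval sends its own noise frame.
            noise_sent = False
        elif not noise_sent:
            # First no-speech frame: postamble plus the background noise.
            output.append(CodedFrame(POSTAMBLE, coded))
            noise_sent = True
        else:
            # Later no-speech frames: postamble only.
            output.append(CodedFrame(POSTAMBLE, None))
    return output
```

In this sketch the payload of a no-speech frame is simply whatever the coder produced for that frame; how the background noise itself is coded is outside the scope of the example.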
With reference to FIG. 6, the conventional speech decoding system 60 comprises a speech/no-speech sound judging unit 61 receiving a coded signal supplied from the speech coding system 50 for judging whether the speech based on the coded signal belongs to a speech sound or a no-speech sound, an LPC decoding unit 63 for decoding the LPC parameter, and an efficient decoding processing unit 64 for supplying a speech signal decoded by use of the LPC parameter calculated by the LPC decoding unit 63. Of the above components, the speech/no-speech sound judging unit 61 judges whether the coded signal belongs to a speech sound or a no-speech sound based on the unique word control signal included in the input coded signal, that is, a preamble signal or a postamble signal. When the signal is judged to be a speech sound, the input coded signal is sent to the LPC decoding unit 63. When the signal is judged to be a no-speech sound and one frame of background noise is included in the input coded signal, the background noise is sent to a background noise storing unit 62 to be stored therein. After the background noise has been stored in the background noise storing unit 62, if no background noise is included in the input coded signal, the background noise stored in the background noise storing unit 62 is delivered to the LPC decoding unit 63.
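The routing performed on the decoding side can likewise be sketched as follows. The names `BackgroundNoiseStore` and `route_to_lpc_decoder` and the representation of a received frame as a `(unique_word, payload)` pair are hypothetical. The sketch also assumes that a newly received background noise frame is passed to the LPC decoding unit 63 in the same frame in which it is stored, which the description above does not state.

```python
from typing import List, Optional, Tuple

# Hypothetical labels for the two unique word control signals.
PREAMBLE = "PREAMBLE"    # marks a speech sound
POSTAMBLE = "POSTAMBLE"  # marks a no-speech sound


class BackgroundNoiseStore:
    """Stands in for the background noise storing unit 62."""

    def __init__(self) -> None:
        self.frame: Optional[bytes] = None


def route_to_lpc_decoder(
    received: List[Tuple[str, Optional[bytes]]]
) -> List[Optional[bytes]]:
    """Return, frame by frame, the data handed to the LPC decoding unit 63."""
    store = BackgroundNoiseStore()
    to_decoder = []
    for unique_word, payload in received:
        if unique_word == PREAMBLE:
            # Speech sound: the coded signal goes straight to the decoder.
            to_decoder.append(payload)
        else:
            if payload is not None:
                # No-speech frame carrying background noise: store it.
                store.frame = payload
            # Assumption: the stored noise is delivered for every no-speech
            # frame, including the frame in which it was just stored.
            to_decoder.append(store.frame)
    return to_decoder
```

Because the same stored frame is delivered throughout a no-speech interval, the decoded background noise in this sketch is a repetition of a single coded frame.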
Techniques concerned with the above conventional speech coding and decoding are disclosed in, for example, Japanese Patent Publication Laid-Open (Kokai) No. Heisei 7-115403 "Coding and Decoding Circuit of Unvoiced Area Information", No. 7-334197 "Speech Coding System", and No. 8-139688 "Speech Coding System".
The above publication No. 7-115403 discloses a technique in which a coding circuit comprises a frequency characteristic extracting unit for extracting a frequency characteristic from an analog-to-digital converted signal to make a pattern, and a minimum error pattern judging unit, having a noise pattern group, for selecting the noise pattern closest to the output pattern of the frequency characteristic extracting unit, while a decoding circuit comprises a noise characteristic convolutional operation circuit for performing a convolutional operation of the received and detected minimum error pattern and a white noise pattern, thereby eliminating the risk of deterioration in the quality of a reproduced signal and also eliminating unnecessary processing.
The above publication No. 7-334197 discloses a technique in which speech parameter processing means mounted on a speech coding system voids, among the speech parameters, a long predictive delay that depends on the past state, and sets the long predictive gain to a minimum quantized value before output, thereby enabling a decoding system to create a surrounding noise by interpolation, in a period in which no coded data is received, by use of the coded data received at a constant interval.
The above publication No. 8-139688 discloses a technique for controlling a background noise, comprising an acoustic weighting filter for providing an acoustically weighted speech signal after selecting, as its input, either a speech signal or the LPF output of the speech signal based on the VOX mode information, an electric power quantizer for supplying an electric power index obtained from the long-time average of the electric power at the time of a no-speech state based on the VOX mode information, an LPC analyzer for supplying the LPC controlled at an inherent value at the time of a no-speech state, an LPC quantizer for supplying a quantized LSP index and a quantized LPC in the case of fixing the LPC at the inherent value at the time of a no-speech state, and an adapted code book retrieval unit for controlling an adapted code book index at the inherent value at the time of a no-speech state, so as not to perform retrieval processing.
In the conventional speech coding and decoding system shown in FIGS. 5 and 6, the speech coding system discriminates between a speech sound and a no-speech sound in an input speech signal, thereafter calculates an LPC parameter regardless of whether the signal is a speech sound or a no-speech sound, and refers to a code book for the speech. As for noise other than the speech, vocal code data, and pitch data, it refers to a code book after filtering through a synthetic filter.
However, since the code book is created based on the characteristics of a speech sound, it is not suitable as a reference for the noise characteristic of a no-speech sound. Generally, the spectrum characteristic of a sound differs between a speech sound (e.g., a person talking) and a no-speech sound (e.g., a person not talking). Namely, a speech sound produces a spectrum having a plurality of mountain-shaped peaks, whereas a no-speech sound produces a flat spectrum. Since the conventional code book for use in spectrum coding is created based on the spectrum characteristic of a speech sound, it is not adequate for the noise of a no-speech sound, which has a different characteristic from the speech sound. If the noise of a no-speech sound is forcibly referred to the code book, it is coded into noise having a completely different characteristic from the original noise, and decoding of such noise will produce an incongruous background noise.
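The mismatch described above can be illustrated with a toy nearest-neighbor search. The following is a sketch under strong simplifying assumptions, not the coding method of the conventional system: the code book holds only four synthetic "mountain-shaped" spectra, the flat spectrum is an idealized no-speech noise, and the names `peaked_spectrum`, `nearest_entry`, and `demo_errors` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def peaked_spectrum(centers: Sequence[float], n_bins: int = 16,
                    width: float = 1.0) -> List[float]:
    """Toy mountain-shaped spectrum: Gaussian peaks, scaled to unit mean."""
    raw = [sum(math.exp(-((k - c) ** 2) / (2.0 * width ** 2)) for c in centers)
           for k in range(n_bins)]
    mean = sum(raw) / n_bins
    return [v / mean for v in raw]


def nearest_entry(codebook: List[List[float]],
                  target: List[float]) -> Tuple[int, float]:
    """Return the index and squared error of the closest code book entry."""
    errors = [sum((a - b) ** 2 for a, b in zip(entry, target))
              for entry in codebook]
    best = min(range(len(codebook)), key=errors.__getitem__)
    return best, errors[best]


def demo_errors() -> Tuple[int, float, float]:
    # Code book built only from speech-like (peaked) spectra.
    codebook = [peaked_spectrum(c)
                for c in ([2, 8], [3, 10], [4, 12], [5, 13])]
    speech_like = peaked_spectrum([3.2, 10.1])  # close to entry 1
    flat_noise = [1.0] * 16                     # idealized no-speech spectrum
    idx, speech_err = nearest_entry(codebook, speech_like)
    _, noise_err = nearest_entry(codebook, flat_noise)
    return idx, speech_err, noise_err
```

Calling `demo_errors()` returns the selected index for the speech-like input together with the two squared errors. The flat spectrum is still mapped to some peaked entry, but with a much larger error than the speech-like input, which corresponds to noise being coded with a characteristic different from the original.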
A sense of incongruity in the background noise after decoding may thus be caused by referring, for the noise of a no-speech sound, to a code book created based on the sound characteristic of a speech sound, which differs from that of the noise. Even in a speech coding and decoding system based on the CELP (Code-book Excited Linear Prediction) method having a VOX function of discriminating between a speech sound and a no-speech sound of a transmitter, a sense of incongruity is similarly felt in the decoded background noise.