1. Field of the Invention
The present invention relates to a coding apparatus for coding speech signals or the like at high efficiency, and particularly to a coding apparatus suitable for variable rate coding.
2. Description of the Related Art
Coding speech signals at high efficiency and a low bit rate is an important technique for making effective use of radio waves and reducing communication costs in mobile communication, such as car telephones and the like, and in in-house corporate communication. In recent years, a variable rate communication system using a code division multiple access (CDMA) method has been planned in the United States of America, and expectations have increased for multi-channel, high-quality services that make the best use of the characteristics of a variable rate. In addition, from the viewpoint of storage applications, variable rate speech coding makes effective use of storage media, since bits can be distributed effectively in accordance with the characteristics of speech. Against this background, variable rate speech coding has been actively studied and developed.
For fixed rate coding, the CELP (Code Excited Linear Prediction) method is known as a speech coding scheme capable of high quality speech synthesis at a bit rate of 8 kbps or less. The CELP method is also the mainstream approach in variable rate coding. In this case, one bit rate is selected for every frame of fixed length from among a plurality of coding bit rates, e.g., four, and coding is performed by a CELP method optimized for the selected bit rate. In addition, where the coding bit rate is as low as 1 kbps, a vocoder system using random noise as the drive signal is adopted in some cases, and generally a different coding scheme is used for each bit rate. In variable rate coding, the superiority of a method is decided by how far the average bit rate can be decreased while achieving target quality, and therefore the method for selecting a coding scheme for every frame is significant. To meet this demand, the following two methods have been proposed in the prior art.
As a first method, for example, there is the QCELP method by A. Dejaco et al. (reference 1: "QCELP: The North American CDMA Digital Cellular Variable Rate Speech Coding Standard", Proc. of the IEEE Workshop on Speech Coding for Telecommunications, pp. 5-6, Oct. 1993). This method adopts a system in which the frame power is extracted as a characteristic amount, and an encoder is selected on the basis of the characteristic amount. In addition, the VRPS method by E. Paksoy et al. (reference 2: "Variable Rate Speech Coding with Phonetic Segmentation", Proc. ICASSP 93, pp. II-155 to II-158, April 1993) adopts a system in which an encoder is selected on the basis of a weighted sum of seven characteristic amounts including low frequency speech energy, zero-crossing rate, and the like.
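The feature-based selection described above can be illustrated with a minimal sketch. It is not the QCELP or VRPS algorithm itself: the rate table, the thresholds, and the function names are hypothetical values chosen only to show how a frame power or a weighted sum of characteristic amounts can be mapped to one of several bit rates without running any encoder.

```python
import math

# Hypothetical rate table (kbps) and power thresholds (dB); illustrative only.
RATES_KBPS = [1.0, 2.0, 4.0, 8.0]
POWER_THRESHOLDS_DB = [-50.0, -40.0, -30.0]


def frame_power_db(frame):
    """Mean power of one frame in dB, assuming samples scaled to [-1, 1]."""
    mean_square = sum(x * x for x in frame) / len(frame)
    return 10.0 * math.log10(mean_square + 1e-12)  # small offset avoids log(0)


def select_rate_by_power(frame):
    """Select a bit rate from the frame power alone."""
    power = frame_power_db(frame)
    # Count how many thresholds the power reaches; that count indexes the rate table.
    index = sum(1 for t in POWER_THRESHOLDS_DB if power >= t)
    return RATES_KBPS[index]


def select_rate_by_weighted_sum(features, weights, thresholds):
    """Select a bit rate from a weighted sum of several characteristic amounts."""
    score = sum(w * f for w, f in zip(weights, features))
    index = sum(1 for t in thresholds if score >= t)
    return RATES_KBPS[index]
```

Because the decision uses only features of the input frame, the cost is small, but nothing in the selection checks the quality of the decoded speech.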
Although the coding scheme selection methods described above have the merit that they can be realized with a relatively small amount of calculation, the decoded speech does not always achieve the target quality defined by SNR or the like, and sometimes results in low quality. Further, when background noise is added to the input signal, the characteristic amounts cannot be extracted properly, so that an appropriate encoder is sometimes not selected. This sometimes leads to deterioration in the quality of synthesized speech.
As a second method, there is the FS-CELP (Finite State-CELP) method (reference 3: "Finite State CELP for variable rate speech coding", IEE Proc.-I, vol. 138, No. 6, pp. 603-610, Dec. 1991).
Although the encoder selection method of this reference has the merit that an encoder is selected such that target quality is achieved, coding must be performed by all of the encoders prepared in advance, so that there is a problem in that the calculation amount is extremely large.
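The trade-off of the second method can be sketched as follows. This is a simplified illustration, not the FS-CELP algorithm of reference 3: uniform quantizers of different bit depths stand in for the prepared encoders, SNR stands in for the target quality measure, and all names are hypothetical. The point shown is that every encoder is run on every frame before the selection is made.

```python
import math


def snr_db(original, decoded):
    """Signal-to-noise ratio of a decoded frame in dB."""
    signal = sum(x * x for x in original)
    noise = sum((x - y) ** 2 for x, y in zip(original, decoded))
    if noise == 0.0:
        return float("inf")
    if signal == 0.0:
        return float("-inf")
    return 10.0 * math.log10(signal / noise)


def make_quantizer(bits):
    """Toy stand-in for an encoder/decoder pair: a uniform quantizer."""
    step = 2.0 / (2 ** bits)

    def codec(frame):
        return [round(x / step) * step for x in frame]

    return codec


# (bits per sample, codec), ordered from the lowest to the highest rate.
ENCODERS = [(bits, make_quantizer(bits)) for bits in (2, 4, 6, 8)]


def select_closed_loop(frame, target_snr_db):
    """Run every encoder, then keep the lowest rate that meets the target SNR."""
    results = [(bits, snr_db(frame, codec(frame))) for bits, codec in ENCODERS]
    for bits, snr in results:
        if snr >= target_snr_db:
            return bits, snr
    return results[-1]  # no encoder met the target; fall back to the highest rate
```

The selected encoder meets the target whenever any encoder can, but the per-frame cost is the sum of the costs of all encoders.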
In addition, a hybrid method combining the first and second methods described above is reported by L. Cellario et al. (reference 4: "Variable Rate Speech Coding for UMTS", Proc. of the IEEE Workshop on Speech Coding for Telecommunications, pp. 1-2, Oct. 1993). In this hybrid method, firstly, the candidate encoders are restricted by using characteristic amounts obtained by analyzing the input speech, and secondly, each of the remaining encoders performs coding, and the encoder which minimizes a cost function is finally selected. Although this method provides an intermediate solution between the first and second methods, a plurality of encoders must still be operated, and therefore there remains a problem in that the calculation amount becomes large.
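The two-stage structure of the hybrid method can be sketched in the same simplified setting. The sketch below is an assumption-laden illustration rather than the method of reference 4: the power threshold, the toy quantizer encoders, and the cost function (rate plus a penalty for missing a target SNR) are all hypothetical.

```python
import math


def snr_db(original, decoded):
    """Signal-to-noise ratio of a decoded frame in dB."""
    signal = sum(x * x for x in original)
    noise = sum((x - y) ** 2 for x, y in zip(original, decoded))
    if noise == 0.0:
        return float("inf")
    if signal == 0.0:
        return float("-inf")
    return 10.0 * math.log10(signal / noise)


def make_quantizer(bits):
    """Toy stand-in for an encoder/decoder pair: a uniform quantizer."""
    step = 2.0 / (2 ** bits)
    return lambda frame: [round(x / step) * step for x in frame]


ENCODERS = {bits: make_quantizer(bits) for bits in (2, 4, 6, 8)}


def candidate_rates(frame):
    """Stage 1: restrict the encoder set using a cheap feature (frame power)."""
    mean_square = sum(x * x for x in frame) / len(frame)
    power_db = 10.0 * math.log10(mean_square + 1e-12)
    if power_db < -40.0:  # hypothetical threshold for a near-silent frame
        return [2, 4]
    return [4, 6, 8]


def cost(bits, snr, target_snr_db, penalty=1.0):
    """Hypothetical cost: rate plus a penalty for falling short of the target SNR."""
    return bits + penalty * max(0.0, target_snr_db - snr)


def select_hybrid(frame, target_snr_db):
    """Stage 2: run only the candidate encoders and keep the minimum-cost one."""
    best_bits, best_cost = None, float("inf")
    for bits in candidate_rates(frame):
        decoded = ENCODERS[bits](frame)
        c = cost(bits, snr_db(frame, decoded), target_snr_db)
        if c < best_cost:
            best_bits, best_cost = bits, c
    return best_bits
```

Fewer encoders run per frame than in the exhaustive case, but more than one still runs, which is the residual calculation cost noted above.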
As described above, in the conventional method in which an input signal is analyzed to extract a characteristic amount and an encoder is selected in accordance with the characteristic amount, the decoded speech does not always attain the target quality and is sometimes degraded. Where background noise is added to the input signal, the characteristic amounts cannot be extracted properly, so that a proper encoder cannot be selected, resulting in degradation in the quality of synthesized speech. The other method, in which all the prepared encoders perform coding and the encoder which minimizes the cost function is selected, and the hybrid method combining the two methods both have the problem that the calculation amount is extremely large.
In addition, in conventional CELP coding, if the coding bit rate is decreased, the number of quantization bits is decreased, making it difficult to express changes in pitch period and pitch waveform. Moreover, since pitch information is greatly damaged in the coding step, the degree to which it can be recovered is limited even if pitch recovery processing is performed with a post filter on the decoding side.
Further, if coded data transferred with a transfer path code added is stored or transferred as it is, the redundant bits of the transfer path code, which are completely unnecessary for storing or transferring the data, are stored or transferred together with it, so that there is a problem that the efficiency of use of the storage apparatus or transfer path is decreased.
Furthermore, depending on the compression coding method and the specifications of the reproducing apparatus, compression coded data that is unnecessary for transfer or storage is stored, and therefore there is a problem that the efficiency of use of the recording medium and the transfer path is decreased.
Further, since unnecessary coded data such as the transfer path codes and compression codes described above is decoded at every reproduction of data, the circuit scale and power consumption of the reproducing apparatus are increased.