As a method of encoding an audio signal with high efficiency at a medium or low bit rate, a method has been widely used which encodes the audio signal by separating it into a linear prediction (LP) filter and an excitation signal that drives the filter. A representative method of this kind is Code Excited Linear Prediction (CELP) (e.g., see Nonpatent Document 1; M. R. Schroeder and B. S. Atal: “Code excited linear prediction: High quality speech at very low bit rates,” Proc. of IEEE Int. Conf. on Acoust., Speech and Signal Processing, pp. 937-940, 1985).
CELP is a method of obtaining a synthesized audio signal by driving an LP filter, which is provided with LP coefficients indicative of the frequency characteristics of the input audio, with an excitation signal. The excitation signal is represented by the sum of an adaptive codebook (ACB) component indicative of the pitch period of the input audio and a fixed codebook (FCB) component composed of random numbers or pulses. The ACB and FCB components are multiplied by respective gains (the ACB gain and the FCB gain) before being summed.
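The CELP synthesis described above can be illustrated by the following minimal sketch. It is not taken from Nonpatent Document 1 or from any standard codec; the function name `celp_synthesize`, the sign convention A(z) = 1 + Σ a_k z^-k, and the zero initial filter state are assumptions made only for illustration.

```python
def celp_synthesize(acb, fcb, acb_gain, fcb_gain, lp_coeffs):
    """Synthesize one block of samples from CELP parameters (illustrative only).

    acb, fcb   : equal-length sequences holding the ACB and FCB components
    acb_gain   : gain applied to the ACB component
    fcb_gain   : gain applied to the FCB component
    lp_coeffs  : [a_1, ..., a_p] of A(z) = 1 + sum_k a_k z^-k (assumed convention)
    """
    # Excitation signal: gain-scaled sum of the two codebook components.
    excitation = [acb_gain * a + fcb_gain * c for a, c in zip(acb, fcb)]

    # Drive the all-pole LP synthesis filter 1/A(z) with the excitation.
    # The filter memory is assumed to be zero at the start of the block.
    synthesized = []
    for n, e in enumerate(excitation):
        s = e
        for k, a_k in enumerate(lp_coeffs, start=1):
            if n - k >= 0:
                s -= a_k * synthesized[n - k]
        synthesized.append(s)
    return synthesized


if __name__ == "__main__":
    # Toy example: one pulse in each codebook and a first-order LP filter.
    print(celp_synthesize([1, 0, 0, 0], [0, 1, 0, 0], 0.5, 2.0, [-0.5]))
```

In an actual codec the filter memory is carried over between subframes and the codebook components are selected by an analysis-by-synthesis search; both are omitted here for brevity.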
For example, consider an interconnection between a 3G mobile network and a cable packet network. The standard audio encoding methods used in the two networks differ from each other, which makes a direct connection between the 3G mobile network and the cable packet network difficult. As a solution to this problem, a tandem connection has been developed.
FIG. 5 shows an example of a configuration of a code conversion device for converting a code (first code string or sequence), obtained by encoding audio by a first audio encoding method (method 1), into a code (second code string or sequence) decodable by a second method (method 2). The conventional code conversion device based on the tandem connection will be described more specifically with reference to FIG. 5. Audio encoding and decoding methods are disclosed in Nonpatent Document 1, the 3GPP Specification (3rd Generation Partnership Project: Technical Specification), and the like (Nonpatent Document 2: “AMR speech codec; Transcoding functions” 3GPP TS 26.090 Chapter 4). The description assumes that a code string is input and output at every frame period (e.g., a period of 20 milliseconds), which is the processing unit of audio encoding and decoding.
An audio decoding device 1A shown in FIG. 5 receives a first code string through an input terminal 3, decodes an audio signal or a non-audio signal, such as noise, from the first code string by a first decoding method corresponding to the first encoding method, and outputs the decoded signal as a first decoded signal to both an audio encoding device 2A and an audio detection device 5.
The audio detection device 5 receives the first decoded signal output from the audio decoding device 1A, judges whether the first decoded signal specifies an audio section or a non-audio section, and outputs an audio detection result flag to the audio encoding device 2A on the basis of a result of the judgment. An audio detection method is described in detail in the 3GPP Specification or the like (Nonpatent Document 3: “AMR speech codec; Voice Activity Detector (VAD)” 3GPP TS 26.094 Chapter 3) and is therefore not described in detail here.
The audio encoding device 2A receives the first decoded signal output from the audio decoding device 1A and the audio detection result flag output from the audio detection device 5. The audio detection result flag indicates whether the first decoded signal specifies an audio section or a non-audio section. In accordance with the audio detection result flag, the audio encoding device 2A encodes the first decoded signal as an audio signal or as a non-audio signal by a second encoding method, and outputs the resulting code string as a second code string through an output terminal 4. This completes the description of FIG. 5.
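The signal flow of the tandem connection of FIG. 5 can be summarized by the following sketch. The names `tandem_transcode`, `decode_method1`, `encode_method2`, and `energy_vad` are hypothetical, and the energy-threshold detector is merely a stand-in chosen for brevity; it is not the voice activity detector of Nonpatent Document 3.

```python
def energy_vad(frame, threshold=1e-3):
    """Toy audio detection: compare mean frame energy with a fixed threshold.

    This is an assumed placeholder, not the 3GPP VAD algorithm.
    """
    if not frame:
        return False
    energy = sum(x * x for x in frame) / len(frame)
    return energy > threshold


def tandem_transcode(first_code, decode_method1, encode_method2, detect=energy_vad):
    """Convert one frame of the first code string into the second code string.

    decode_method1 : callable(code) -> decoded samples (hypothetical decoder)
    encode_method2 : callable(samples, is_audio) -> code (hypothetical encoder)
    """
    decoded = decode_method1(first_code)       # audio decoding device 1A
    is_audio = detect(decoded)                 # audio detection device 5
    return encode_method2(decoded, is_audio)   # audio encoding device 2A


if __name__ == "__main__":
    # Stub decoder and encoder used only to exercise the data flow.
    stub_decode = lambda code: [c / 10 for c in code]
    stub_encode = lambda samples, is_audio: ("AUDIO" if is_audio else "NON_AUDIO", len(samples))
    print(tandem_transcode([5, 5, 5, 5], stub_decode, stub_encode))
    print(tandem_transcode([0, 0, 0, 0], stub_decode, stub_encode))
```

The sketch makes visible that the detection step operates on the fully decoded signal, so the detector runs for every frame in addition to the decoder and the encoder.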
Details of the header and frame type information input to the audio decoding device 1A are known (Nonpatent Document 4: “AMR speech codec; frame structure” 3GPP TS 26.101 Chapter 4). Additionally, the methods described below for encoding and decoding noise are known (Nonpatent Document 5: “AMR speech codec; comfort noise aspects” 3GPP TS 26.092 Chapters 5 and 6).
As mentioned above, the conventional code conversion device uses the audio detection device to judge whether the signal decoded from the first code string specifies an audio section or a non-audio section. Including the audio detection device therefore causes a problem in that the code conversion device inevitably becomes large in size. Moreover, Nonpatent Documents 1 to 5 make no mention at all of a possibility of improving the code conversion device shown in FIG. 5.