The present invention relates generally to processing telecommunication signals. More particularly, the invention relates to a method and apparatus for improving the output signal quality of a transcoder that translates digital packets from one compression format to another compression format. Merely by way of example, the invention has been applied to voice transcoding between Code-Excited Linear Prediction (CELP) codecs, but it would be recognized that the invention has a much broader range of applicability. To this end, the class of applicable codecs is designated as being “common” codecs.
The process of converting from one voice compression format to another voice compression format can be performed using various techniques. The tandem coding approach is to fully decode the compressed signal back to a Pulse-Code Modulation (PCM) representation and then re-encode the signal. This requires a large amount of processing and incurs increased delays. More efficient approaches include transcoding methods where the compressed parameters are converted from one compression format to the other while remaining in the parameter space.
Many of the current standardized low bit rate speech coders are based on the Code-Excited Linear Prediction (CELP) model. Common parameters of a CELP coder are the linear prediction parameters, adaptive codebook lag and gain parameters, and fixed codebook index and gain parameters.
The similarities between CELP-based codecs allow one to take advantage of the processing redundancies inherent in them. FIG. 1 shows a block diagram for a typical prior art CELP decoder. The decoder receives as input a bitstream consisting of several parameters, commonly representing the fixed codebook index, fixed codebook gain, adaptive codebook gain, adaptive codebook (pitch) lag and the linear prediction (LP) parameters. The decoder constructs the fixed codeword, which is then scaled by the fixed codebook gain. The adaptive codeword, which is a previous excitation segment that has been delayed by the pitch lag and scaled by the adaptive codebook gain, is added to the fixed codebook contribution. The resulting excitation signal is then filtered by a short-term (LP) synthesis filter, producing synthesized speech. This speech is then post-filtered in order to reduce the perceptual significance of any synthesis artifacts and improve speech quality.
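The decoder structure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the decoder of any particular standard: the function name `celp_decode_subframe`, its argument layout, and the sign convention A(z) = 1 + sum of a_k z^-k are hypothetical choices made for the example, and the post-filter is omitted.

```python
def celp_decode_subframe(fixed_codeword, fixed_gain, pitch_lag, adaptive_gain,
                         lpc, past_excitation, synth_memory):
    """Decode one subframe (hypothetical helper, for illustration only).

    lpc holds a_1..a_p of A(z) = 1 + sum_k a_k z^-k (assumed convention).
    past_excitation must contain at least pitch_lag samples, oldest first.
    synth_memory holds the last p output samples, most recent first.
    """
    history = list(past_excitation)
    excitation = []
    for pulse in fixed_codeword:
        # Adaptive codeword sample: the excitation delayed by the pitch lag.
        adaptive = history[len(history) - pitch_lag]
        # Total excitation: scaled fixed contribution plus scaled adaptive one.
        sample = fixed_gain * pulse + adaptive_gain * adaptive
        excitation.append(sample)
        history.append(sample)

    # Short-term synthesis filter 1/A(z): s[n] = e[n] - sum_k a_k * s[n-k].
    memory = list(synth_memory)
    speech = []
    for e in excitation:
        s = e - sum(a * m for a, m in zip(lpc, memory))
        speech.append(s)
        memory = [s] + memory[:-1]

    # Return the updated states so the next subframe can continue from here.
    new_past = history[-len(past_excitation):]
    return speech, new_past, memory
```

Because the excitation history is extended sample by sample, the same loop also handles pitch lags shorter than the subframe length, in which case the adaptive codeword repeats periodically.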
FIG. 2 shows a block diagram for a typical prior art CELP encoder. The incoming speech signal is first pre-processed, for example, high-pass filtered to remove superfluous information such as very low frequency content. Next, the spectral shape information is extracted by linear prediction (LP) analysis. The LP parameters are often represented as Line Spectral Pairs (LSPs) and quantized. The speech signal is then filtered using the inverse LP synthesis filter to remove the spectral envelope contribution and produce the excitation signal. Both the pre-processed speech and the excitation are filtered with a perceptual weighting filter. The perceptually weighted speech is analyzed for periodicity, often using both an open-loop pitch lag search and a closed-loop (analysis-by-synthesis) pitch lag and pitch gain search. The pitch contribution is subtracted from the perceptually weighted speech to create a target signal for the fixed codebook search. The fixed codebook search consists of an analysis-by-synthesis algorithm, in which various codewords are evaluated to minimize the error between the synthesized codeword and the target signal.
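The LP analysis and inverse filtering steps can be illustrated as follows. The text does not say how the LP parameters are computed; the sketch below assumes the autocorrelation method with the Levinson-Durbin recursion, a common choice, and omits windowing, bandwidth expansion, LSP conversion and quantization. The function names and the convention A(z) = 1 + sum of a_k z^-k are assumptions made for the example.

```python
def autocorrelation(x, order):
    """Autocorrelation r[0..order] of one frame of samples."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x)))
            for k in range(order + 1)]


def levinson_durbin(r, order):
    """Solve for a_1..a_order of A(z) = 1 + sum_k a_k z^-k.

    Returns the coefficients and the final prediction error energy.
    Assumes r[0] > 0 (a non-silent frame).
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        updated = a[:]
        for j in range(1, i):
            updated[j] = a[j] + k * a[i - j]
        updated[i] = k
        a = updated
        err *= 1.0 - k * k
    return a[1:], err


def lp_residual(x, lpc):
    """Filter x through A(z) to remove the spectral envelope contribution."""
    residual = []
    for n in range(len(x)):
        acc = x[n]
        for k, a_k in enumerate(lpc, start=1):
            if n - k >= 0:
                acc += a_k * x[n - k]
        residual.append(acc)
    return residual
```

For a decaying first-order signal such as `[0.9 ** n for n in range(200)]`, the recursion recovers a first coefficient close to -0.9, and the residual is close to a single impulse at the first sample.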
Transcoding addresses the problem that occurs when two incompatible standard coders need to interoperate. The conventional prior art tandem coding solution, illustrated in FIG. 3, is to fully decode the signal from one compression format to PCM, and then to re-encode the PCM signal using the other compression format. This solution has the disadvantages that it is computationally complex and that it introduces quality degradations due to the full decode and full encode. Alternatively, a prior art transcoder, as shown in FIG. 4, may be used, which converts the bitstream from one compression format to a different compression format without fully decoding to PCM and then re-encoding the signal.
Some transcoding approaches involve converting parameters solely in the CELP domain. These methods have the advantage of reducing computational complexity. FIG. 5 shows an example of one prior art transcoding approach in which the source codec LSPs are directly translated and quantized to the destination codec format. The speech is then synthesized using the destination codec LSPs and the remaining CELP parameters are found using a searching algorithm. This technique does not improve the quality of the transcoded signal to the fullest extent and is not necessarily the best solution in some situations.
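The direct LSP translation step can be pictured as re-quantizing the decoded source LSP vector against the destination codec's quantizer. The sketch below is a toy under loud assumptions: it uses a plain nearest-neighbour search over a made-up codebook, whereas real codecs typically use split, multi-stage or predictive vector quantizers, and any difference in LP order or frame size between the two codecs is ignored. The name `quantize_lsp` and the codebook values are hypothetical.

```python
def quantize_lsp(lsp, codebook):
    """Return (index, entry) of the codebook vector closest to lsp.

    Closeness is plain squared error; the codebook is a hypothetical
    stand-in for the destination codec's LSP quantizer.
    """
    def squared_error(entry):
        return sum((c - v) ** 2 for c, v in zip(entry, lsp))

    best_index = min(range(len(codebook)),
                     key=lambda i: squared_error(codebook[i]))
    return best_index, codebook[best_index]


# Toy destination codebook (made-up values, normalized frequencies).
toy_codebook = [[0.10, 0.30], [0.20, 0.50], [0.40, 0.80]]

# LSPs as decoded from the source bitstream (made-up values).
source_lsp = [0.22, 0.48]

index, destination_lsp = quantize_lsp(source_lsp, toy_codebook)
```

The returned index is what would be packed into the destination bitstream, while the destination LSP vector is what the destination-side synthesis would then use.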
While smart transcoding techniques that map parameters from one CELP format to another in a fast manner have been developed, a transcoding solution that provides transcoded speech of a higher quality than the conventional tandem coding solution and that may be configured and tuned for specific source and destination codec pairs is highly desirable.