The present invention generally relates to techniques for processing information. More particularly, the invention provides a method and apparatus for converting CELP frames from one CELP-based standard to another CELP-based standard, and/or within a single standard but to a different mode. Further details of the present invention are provided throughout the present specification and more particularly below.
Coding is the process of converting a raw signal (voice, image, video, etc.) into a format amenable for transmission or storage. The coding usually results in a large amount of compression, but generally involves significant signal processing to achieve. The outcome of the coding is a bitstream (sequence of frames) of encoded parameters according to a given compression format. The compression is achieved by removing statistically and perceptually redundant information using various techniques for modeling the signal. Hence the encoded format is referred to as a "compression format" or "parameter space". The decoder takes the compressed bitstream and regenerates the original signal. In the case of speech coding, compression typically leads to information loss.
The process of converting between different compression formats and/or reducing the bit rate of a previously encoded signal is known as transcoding. This may be done to conserve bandwidth, or connect incompatible clients and/or server devices. Transcoding differs from the direct compression process in that a transcoder only has access to the compressed signal and does not have access to the original signal.
Transcoding can be done using brute force techniques such as "tandem" transcoding, which has a decompression process followed by a re-compression process. Since a large amount of processing is often required, and delays may be incurred, to decompress and then re-compress a signal, one can consider transcoding in the compression space or parameter space. Such transcoding aims at mapping between compression formats while remaining in the parameter space wherever possible. This is where the sophisticated algorithms of "smart" transcoding come into play. Although there have been advances in transcoding, it is desirable to further improve transcoding techniques. Further details of limitations of conventional techniques will be described more fully throughout the present specification and more particularly below.
According to the present invention, techniques for processing information are provided. More particularly, the invention provides a method and apparatus for converting CELP frames from one CELP-based standard to another CELP-based standard, and/or within a single standard but to a different mode. Further details of the present invention are provided throughout the present specification and more particularly below.
In a specific embodiment, the invention provides an apparatus for converting CELP frames from one CELP-based standard to another CELP-based standard, and/or within a single standard but to a different mode. The apparatus has a bitstream unpacking module for extracting one or more CELP parameters from a source codec. The apparatus also has an interpolator module coupled to the bitstream unpacking module. The interpolator module is adapted to interpolate between different frame sizes, subframe sizes, and/or sampling rates of the source codec and a destination codec. A mapping module is coupled to the interpolator module. The mapping module is adapted to map the one or more CELP parameters from the source codec to one or more CELP parameters of the destination codec. The apparatus has a destination bitstream packing module coupled to the mapping module. The destination bitstream packing module is adapted to construct at least one destination output CELP frame based upon at least the one or more CELP parameters from the destination codec. A controller is coupled to at least the destination bitstream packing module, the mapping module, the interpolator module, and the bitstream unpacking module. Preferably, the controller is adapted to oversee operation of one or more of the modules and is adapted to receive instructions from one or more external applications. The controller is adapted to provide status information to one or more of the external applications.
In an alternative specific embodiment, the invention provides a method for transcoding a CELP-based compressed voice bitstream from a source codec to a destination codec. The method includes processing a source codec input CELP bitstream to unpack one or more CELP parameters from the input CELP bitstream, and interpolating one or more of the unpacked CELP parameters from a source codec format to a destination codec format if one or more of a plurality of destination codec parameters, including a frame size, a subframe size, and/or a sampling rate of the destination codec format, differs from the corresponding one or more of a plurality of source codec parameters, including a frame size, a subframe size, or a sampling rate of the source codec format. The method includes encoding the one or more CELP parameters for the destination codec and processing a destination CELP bitstream by at least packing the one or more CELP parameters for the destination codec.
In an alternative specific embodiment, the invention provides a method for processing CELP-based compressed voice bitstreams from source codec to destination codec formats. The method includes transferring a control signal from a plurality of control signals from an application process and selecting one CELP mapping strategy from a plurality of different CELP mapping strategies based upon at least the control signal from the application. The method also includes performing a mapping process using the selected CELP mapping strategy to map one or more CELP parameters from a source codec format to one or more CELP parameters of a destination codec format.
Still further, the invention provides a system for processing CELP-based compressed voice bitstreams from source codec to destination codec formats. The system includes one or more memories. Such memories may include one or more codes for receiving a control signal from a plurality of control signals from an application process. One or more codes for selecting one CELP mapping strategy from a plurality of different CELP mapping strategies based upon at least the control signal from the application are also included. The one or more memories also include one or more codes for performing a mapping process using the selected CELP mapping strategy to map one or more CELP parameters from a source codec format to one or more CELP parameters of a destination codec format. Depending upon the embodiment, there may also be other computer codes for carrying out the functionality described herein, as well as outside of this specification, which may be combined with the present invention.
Numerous benefits are achieved using the present invention. Depending upon the embodiment, one or more of these benefits may be achieved.
To reduce the computational complexity of the transcoding process.
To reduce the delay through the transcoding process.
To reduce the amount of memory required by the transcoding.
To introduce dynamic rate control.
To support silence frames through an embedded voice activity detector.
To provide a framework where various parameter mapping strategies can be used.
To provide a generic transcoding architecture that can accommodate the diversity of current and future CELP-based codecs.
The transcoding invention may achieve one or more of these benefits. In a specific embodiment, the transcoding apparatus includes:
a source CELP parameter unpacking module that extracts CELP parameters from the input encoded CELP bitstream;
a CELP parameter interpolator that converts the input source CELP parameters into destination CELP parameters corresponding to the subframe size difference between the source and destination codecs, parameter interpolation being used if the subframe sizes of the source and destination codecs differ;
a destination CELP parameter mapping and tuning engine that converts CELP parameters from the interpolator module into the destination CELP codec parameters;
a destination CELP code packer that packs the mapped CELP parameters into destination CELP code frames;
an advanced feature manager that manages optional functions and features in CELP-to-CELP transcoding;
a controller that oversees the overall transcoding process;
a status reporting function that provides the status of the transcoding process.
The source CELP parameter unpacking module is a simplified CELP decoder without a formant filter and a post-filter.
The CELP parameter interpolator comprises a set of interpolators related to one or more of the CELP parameters.
The destination CELP parameter mapping and tuning module includes a parameter mapping strategy switching module and one or more of the following parameter mapping strategies: a module of CELP parameter direct space mapping, a module of analysis in excitation space mapping, and a module of analysis in filtered excitation space mapping.
The invention performs transcoding on a subframe-by-subframe basis. That is, as a frame (of source compressed information) is received by the transcoding system, the transcoder can begin operating on it and producing output subframes. Once a sufficient number of subframes have been produced, a frame (of compressed information according to the destination format) can be generated and can be sent to the communication channel if communication is the purpose. If storage is the purpose, the generated frame can be stored as desired. If the durations of the frames defined by the source and destination format standards are the same, then a single incoming frame will produce a single outgoing frame; otherwise, either buffering of input frames or generation of multiple output frames will be needed. If the subframes are of different durations, then interpolation between the subframe parameters will be required. Thus the transcoding operation consists of four operations: (1) bitstream unpacking, (2) subframe buffering and interpolation of source CELP parameters, (3) mapping and tuning to destination CELP parameters, and (4) code packing to produce output frame(s).
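By way of illustration only, the frame buffering behavior described above can be sketched as follows. The sketch assumes that frame durations are expressed in whole milliseconds; the function name `frames_ready` and the example durations are hypothetical and are not taken from any particular standard.

```python
def frames_ready(src_frame_ms, dst_frame_ms, n_src_frames):
    """For each arriving source frame, return how many destination frames
    can be emitted once that frame's subframes have been transcoded.

    Assumption: durations are positive whole milliseconds.
    """
    if src_frame_ms <= 0 or dst_frame_ms <= 0:
        raise ValueError("frame durations must be positive")
    emitted = []
    buffered_ms = 0  # transcoded signal duration not yet packed into a frame
    for _ in range(n_src_frames):
        buffered_ms += src_frame_ms
        # Emit as many whole destination frames as the buffer covers,
        # and carry the remainder over to the next source frame.
        count, buffered_ms = divmod(buffered_ms, dst_frame_ms)
        emitted.append(count)
    return emitted


if __name__ == "__main__":
    print(frames_ready(20, 20, 3))  # equal durations: one frame in, one frame out
    print(frames_ready(10, 30, 6))  # shorter source frames: input is buffered
    print(frames_ready(30, 10, 2))  # longer source frames: several outputs per input
```

In this sketch, equal durations give one output frame per input frame, shorter source frames are buffered until a destination frame is covered, and longer source frames yield several output frames each.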
On receipt of a frame, the transcoder unpacks the bitstream to produce the CELP parameters for each of the subframes contained within the frame (FIG. 10, block (1)). The parameters of interest are the LPC coefficients, the excitation (produced from the adaptive and fixed codewords), and the pitch lag. Note that for a low complexity solution that produces good quality, only decoding to the excitation is required and not full synthesis of the speech waveform. If subframe interpolation is needed, it is done at this point by the smart interpolation engine (FIG. 10, block (2)).
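As a minimal sketch of subframe interpolation, the following assumes that a scalar parameter (for example, one LSP component) is linearly interpolated between the centers of the source subframes and resampled at the centers of the destination subframes. The choice of linear interpolation, the function name `interpolate_subframes`, and the example subframe lengths are assumptions made for illustration; other parameters, such as the pitch lag, may call for different handling.

```python
import math


def interpolate_subframes(values, src_len, dst_len):
    """Resample one scalar parameter per source subframe onto the
    destination subframe grid by linear interpolation.

    values  : list with one value per source subframe
    src_len : source subframe length in samples
    dst_len : destination subframe length in samples
    """
    if not values:
        return []
    total = len(values) * src_len
    n_dst = total // dst_len  # whole destination subframes covered
    first_center = 0.5 * src_len
    last_center = (len(values) - 0.5) * src_len
    out = []
    for j in range(n_dst):
        t = (j + 0.5) * dst_len  # center of destination subframe j
        if t <= first_center:
            out.append(values[0])   # hold the first value before the first center
        elif t >= last_center:
            out.append(values[-1])  # hold the last value after the last center
        else:
            i = math.floor((t - first_center) / src_len)  # left neighbor index
            frac = (t - (i + 0.5) * src_len) / src_len
            out.append(values[i] * (1.0 - frac) + values[i + 1] * frac)
    return out


if __name__ == "__main__":
    # Three 40-sample source subframes mapped onto two 60-sample subframes.
    print(interpolate_subframes([40.0, 50.0, 60.0], 40, 60))
```

When the source and destination subframe lengths are equal, the sketch returns the input values unchanged, which matches the case in which no interpolation is needed.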
The subframes are now in a form amenable for processing by the destination parameter mapping and tuning module (FIG. 10, block (5)). The short-term LPC filter coefficients are mapped independently of the excitation CELP parameters. Simple linear mapping in the LSP pseudo-frequency space can be used to produce the LSP coefficients for the destination codec. The excitation CELP parameters can be mapped in a number of ways, each giving better quality output at the cost of greater computational complexity. Three such mapping strategies are described in this document and are part of the Parameter Mapping and Tuning Strategies module (FIG. 10, block (4)):
CELP parameter Direct Space Mapping (DSM);
Analysis in excitation space domain;
Analysis in filtered excitation space domain.
The selection of the mapping and tuning strategy is through the Mapping and Tuning Strategy Switching Module (FIG. 10, block (3)).
Since the three methods trade off quality for reduced computational load, they can be used to provide graceful degradation in quality in the case of the apparatus being overloaded by a large number of simultaneous channels. Thus the performance of the transcoder can adapt to the available resources. Alternatively, a transcoding system may be built using one strategy only, yielding a desired quality and performance. In such a case, the Mapping and Tuning Strategy Switching Module (FIG. 10, block (3)) would not be incorporated.
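The graceful degradation described above can be sketched, purely as an example, as a load-based selection among the three strategies. The sketch assumes that the strategies are listed in order of increasing quality and computational cost (direct space mapping being the cheapest), that load is expressed as a fraction between 0 and 1, and that the thresholds 0.6 and 0.85 are arbitrary; none of these values or names come from the specification.

```python
from enum import Enum


class Strategy(Enum):
    # Assumed ordering: lowest cost and quality first.
    DIRECT_SPACE_MAPPING = 1
    EXCITATION_ANALYSIS = 2
    FILTERED_EXCITATION_ANALYSIS = 3


def select_strategy(load, thresholds=(0.6, 0.85)):
    """Pick a mapping strategy from the current processing load.

    load       : fraction of available processing capacity in use (0.0 to 1.0)
    thresholds : hypothetical (medium, high) load boundaries
    """
    medium, high = thresholds
    if load >= high:
        return Strategy.DIRECT_SPACE_MAPPING          # cheapest under heavy load
    if load >= medium:
        return Strategy.EXCITATION_ANALYSIS           # intermediate cost
    return Strategy.FILTERED_EXCITATION_ANALYSIS      # best quality when idle


if __name__ == "__main__":
    for load in (0.3, 0.7, 0.9):
        print(load, select_strategy(load).name)
```

A system built around a single strategy would simply omit this selection step.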
A voice activity detector (operating in the parameter space) can also be employed at this point, if applicable to the destination standard, to reduce the outbound bandwidth.
The mapped parameters can then be packed into destination bitstream format frames (FIG. 10, block (7)) and output for transmission or storage.
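As one possible illustration of the packing step, the sketch below concatenates quantized parameter indices into a byte-aligned frame, most significant bit first. The field widths, the bit ordering, and the function name `pack_fields` are hypothetical; an actual destination codec defines its own bit allocation and ordering.

```python
def pack_fields(fields):
    """Pack (value, width_in_bits) pairs MSB-first into bytes,
    zero-padding the final byte.

    Assumption: each value is a non-negative quantizer index that
    fits within its stated width.
    """
    acc = 0
    nbits = 0
    for value, width in fields:
        if width <= 0 or not 0 <= value < (1 << width):
            raise ValueError(f"value {value} does not fit in {width} bits")
        acc = (acc << width) | value  # append the field after earlier fields
        nbits += width
    pad = (-nbits) % 8                # bits needed to reach a byte boundary
    acc <<= pad
    return acc.to_bytes((nbits + pad) // 8, "big")


if __name__ == "__main__":
    # Hypothetical frame: a 3-bit index followed by a 5-bit index.
    print(pack_fields([(0b101, 3), (0b11111, 5)]).hex())
```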
The invention covers the algorithms and methods used to perform smart transcoding between CELP-based speech coding standards. The invention also covers transcoding within a single standard in order to perform rate control (by transcoding to lower modes or introducing silence frames through an embedded Voice Activity Detector).
The whole procedure of transcoding is overseen by a Control module (FIG. 10, block (8)), which sends commands based on the status of transcoding and external instructions.
In order to adapt to different transcoding requirements, the apparatus of the present invention provides the capability of adding optional features and functions (FIG. 10, block (6)).
Other features and advantages of the present invention will be apparent from the following description taken in conjunction with the accompanying drawings, in which like reference characters designate the same or similar parts throughout the figures thereof.