To offer mobility and continuity, modern and innovative multimedia communication services must be able to function under a wide variety of conditions. The dynamism of the multimedia communication sector and the heterogeneous nature of networks, access points, and terminals have generated a proliferation of compression formats.
The present invention relates to optimization of the “multiple coding” techniques used when a digital signal or a portion of a digital signal is coded using more than one coding technique. The multiple coding may be simultaneous (effected in a single pass) or non-simultaneous. The processing may be applied to the same signal or to different versions derived from the same signal (for example with different bandwidths). Thus, “multiple coding” is distinguished from “transcoding”, in which each coder compresses a version derived from decoding the signal compressed by the preceding coder.
One example of multiple coding is coding the same content in more than one format and then transmitting it to terminals that do not support the same coding formats. In the case of real-time broadcasting, the processing must be effected simultaneously. In the case of access to a database, the codings could be effected one after another, and “offline”. In these examples, multiple coding is used to code the same signal with different formats using a plurality of coders (or possibly a plurality of bit rates or a plurality of modes of the same coder), each coder operating independently of the others.
Another use of multiple coding is encountered in coding structures in which a plurality of coders compete to code a signal segment, only one of the coders being finally selected to code that segment. That coder may be selected after processing the segment, or even later (delayed decision). This type of structure is referred to below as a “multimode coding” structure (referring to the selection of a coding “mode”). In these multimode coding structures, a plurality of coders sharing a “common past” code the same signal portion. The coding techniques used may be different or derived from a single coding structure. They will not be totally independent, however, except in the case of “memoryless” techniques. In the (routine) situation of coding techniques using recursive processing, the processing of a given signal segment depends on how the signal has been coded in the past. There is therefore some coder interdependency, when a coder has to take account in its memories of the output from another coder.
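By way of illustration only, the interdependency between coders sharing a “common past” can be sketched as follows. The sketch assumes two toy first-order predictive coders that differ only in their quantization step; the names `predictive_code` and `multimode_encode`, the clamped quantizer index, and the squared-error selection criterion are hypothetical and are not taken from any of the coders discussed here.

```python
from typing import List, Sequence, Tuple


def predictive_code(segment: Sequence[float], step: float, memory: float,
                    max_index: int = 2) -> List[float]:
    """Toy recursive coder: quantize the difference from the previous
    reconstructed sample, with the quantizer index clamped to +/- max_index."""
    prev = memory
    recon = []
    for x in segment:
        index = max(-max_index, min(max_index, round((x - prev) / step)))
        prev += index * step
        recon.append(prev)
    return recon


def squared_error(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def multimode_encode(signal: Sequence[float], seg_len: int,
                     steps: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Code each segment with every mode, keep the best one, and propagate
    the selected coder's output into the memory used by all modes."""
    memory = 0.0  # common past shared by every mode
    choices: List[float] = []
    output: List[float] = []
    for start in range(0, len(signal), seg_len):
        segment = signal[start:start + seg_len]
        # Every mode starts from the same memory, whichever mode produced it.
        trials = {s: predictive_code(segment, s, memory) for s in steps}
        best = min(steps, key=lambda s: squared_error(segment, trials[s]))
        choices.append(best)
        output.extend(trials[best])
        memory = trials[best][-1]  # memory update from the selected coder
    return choices, output


if __name__ == "__main__":
    modes, recon = multimode_encode([1.0] * 8, seg_len=4, steps=(0.05, 0.5))
    print(modes, recon)
```

In this toy run the small-step mode can code the second segment only because its memory was set by the output of the large-step mode selected for the first segment, which is the kind of dependency described above.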
The concept of “multiple coding” and conditions for using such techniques have been introduced in the various contexts referred to above. The complexity of implementation may prove insurmountable, however.
For example, in the situation of content servers that broadcast the same content with different formats adapted to the access conditions, networks, and terminals of different clients, this operation becomes extremely complex as the number of formats required increases. In the case of real-time broadcasting, as the various formats are coded in parallel, a limitation is rapidly imposed by the resources of the system.
The second use referred to above relates to multimode coding applications that select one coder from a set of coders for each signal portion analyzed. Selection requires the definition of a criterion, the more usual criteria aiming to optimize the bit rate/distortion trade-off. The signal being analyzed over successive time segments, a plurality of codings are evaluated in each segment. The coding with the lowest bit rate for a given quality or the best quality for a given bit rate is then selected. Note that constraints other than those of bit rate and distortion may be used.
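The two selection criteria just described can be sketched, by way of illustration only, as follows. The sketch assumes a toy uniform scalar quantizer standing in for each coding mode, the number of bits per sample standing in for the bit rate, and mean squared error standing in for the distortion; all function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def quantize(segment: Sequence[float], bits: int) -> List[float]:
    """Toy 'coder': uniform scalar quantizer on [-1, 1) with 2**bits levels."""
    levels = 2 ** bits
    step = 2.0 / levels
    decoded = []
    for x in segment:
        index = min(levels - 1, max(0, int((x + 1.0) / step)))
        decoded.append(-1.0 + (index + 0.5) * step)
    return decoded


def mse(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def evaluate(segment: Sequence[float],
             bit_depths: Sequence[int]) -> List[Tuple[int, float]]:
    """Code the segment with every mode; return (bits, distortion) pairs."""
    return [(b, mse(segment, quantize(segment, b))) for b in bit_depths]


def select_min_rate(segment: Sequence[float], bit_depths: Sequence[int],
                    max_distortion: float) -> Tuple[int, float]:
    """Lowest bit rate for a given quality (best quality if none qualifies)."""
    results = evaluate(segment, bit_depths)
    acceptable = [r for r in results if r[1] <= max_distortion]
    if acceptable:
        return min(acceptable, key=lambda r: r[0])
    return min(results, key=lambda r: r[1])


def select_best_quality(segment: Sequence[float], bit_depths: Sequence[int],
                        max_bits: int) -> Tuple[int, float]:
    """Best quality for a given bit rate budget."""
    results = evaluate(segment, [b for b in bit_depths if b <= max_bits])
    if not results:
        raise ValueError("no mode fits the bit rate budget")
    return min(results, key=lambda r: r[1])


if __name__ == "__main__":
    seg = [0.0, 0.3, -0.7, 0.9]
    print(select_min_rate(seg, [2, 4, 8], max_distortion=0.004))
    print(select_best_quality(seg, [2, 4, 8], max_bits=4))
```

Both functions code the segment with every candidate mode before deciding, so the cost of the selection grows with the number of modes evaluated.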
In such structures, the coding is generally selected a priori by analyzing the signal over the segment concerned (selection according to the characteristics of the signal). However, the difficulty of producing a robust classification of the signal for the purposes of this selection has led to the proposal for a posteriori selection of the optimum mode after coding all the modes, although this is achieved at the cost of high complexity.
Intermediate methods combining the above two approaches have been proposed with a view to reducing the computation cost. Such strategies are suboptimal, however, and offer worse performance than exploring all the modes. Exploring all the modes or a major portion of the modes constitutes a multiple coding application that is potentially highly complex and not readily compatible a priori with real-time coding, for example.
At present, most multiple coding and transcoding operations take no account of interaction between formats and between the format and its content. A few multimode coding techniques have been proposed but the decision as to the mode to use is generally effected a priori, either on the signal (by classification, as in the selectable mode vocoder (SMV) coder, for example) or as a function of the conditions of the network (as in adaptive multirate (AMR) coders, for example).
Various selection modes are described in the following documents, in particular decision controlled by the source and decision controlled by the network:
“An overview of variable rate speech coding for cellular networks”, Gersho, A.; Paksoy, E.; Wireless Communications, 1992. Conference Proceedings, 1992 IEEE International Conference on Selected Topics, 25-26 Jun. 1992 Page(s): 172-175;
“A variable rate speech coding algorithm for cellular networks”, Paksoy, E.; Gersho, A.; Speech Coding for Telecommunications, 1993. Proceedings, IEEE Workshop 1993, Page(s): 109-110; and
“Variable rate speech coding for multiple access wireless networks”, Paksoy E.; Gersho A.; Proceedings, 7th Mediterranean Electrotechnical Conference, 12-14 Apr. 1994 Page(s): 47-50 vol. 1.
In the case of a decision controlled by the source, the a priori decision is made on the basis of a classification of the input signal. There are many methods of classifying the input signal.
In the case of a decision controlled by the network, it is simpler to provide a multimode coder whose bit rate is selected by an external module rather than by the source. The simplest method is to provide a family of coders, each with a fixed bit rate that differs from those of the others, and to switch between them to obtain the required current mode.
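A minimal sketch of such switching is given below, by way of illustration only. The set of bit rates is hypothetical and does not correspond to any particular coder, and the rule used (the highest rate that fits a budget reported by the network) is an assumption, not a description of a standardized mechanism.

```python
from bisect import bisect_right
from typing import Sequence

# Hypothetical family of fixed bit rates in kbit/s, sorted in ascending order.
RATES_KBPS = (4.0, 8.0, 12.0, 16.0)


def select_rate(available_kbps: float,
                rates: Sequence[float] = RATES_KBPS) -> float:
    """Return the highest fixed rate that fits the budget supplied by the
    external (network) module, or the lowest rate if none fits."""
    position = bisect_right(rates, available_kbps)
    return rates[max(position - 1, 0)]


if __name__ == "__main__":
    for budget in (2.0, 10.0, 16.0, 100.0):
        print(budget, "->", select_rate(budget))
```

The decision depends only on the value supplied by the external module, so no analysis of the input signal is needed to select the mode.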
Work has also been done on combining a plurality of criteria for a priori selection of the mode to be used; see in particular the following documents:
“Variable-rate for the basic speech service in UMTS” Berruto, E.; Sereno, D.; Vehicular Technology Conference, 1993 IEEE 43rd, 18-20 May 1993 Page(s): 520-523; and
“A VR-CELP codec implementation for CDMA mobile communications” Cellario, L.; Sereno, D.; Giani, M.; Blocher, P.; Hellwig, K.; Acoustics, Speech, and Signal Processing, 1994, ICASSP-94, 1994 IEEE International Conference, Volume: 1, 19-22 Apr. 1994 Page(s): I/281-I/284 vol. 1.
All multimode coding algorithms using a priori coding mode selection suffer from the same drawback, related in particular to problems with the robustness of a priori classification.
For this reason techniques have been proposed using an a posteriori decision as to the coding mode. For example, in the following document:
“Finite state CELP for variable rate speech coding” Vaseghi, S. V.; Acoustics, Speech, and Signal Processing, 1990, ICASSP-90, 1990 International Conference, 3-6 Apr. 1990 Page(s): 37-40 vol. 1,
the coder can switch between different modes by optimizing an objective quality measurement with the result that the decision is made a posteriori as a function of the characteristics of the input signal, the target signal-to-quantization noise ratio (SQNR), and the current status of the coder. A coding scheme of this kind improves quality. However, the different codings are carried out in parallel and the resulting complexity of this type of system is therefore prohibitive.
Other techniques have been proposed combining an a priori decision and closed loop improvement. In the document:
“Multimode variable bit rate speech coding: an efficient paradigm for high-quality low-rate representation of speech signal” Das, A.; DeJaco, A.; Manjunath, S.; Ananthapadmanabhan, A.; Huang, J.; Choy, E.; Acoustics, Speech, and Signal Processing, 1999. ICASSP '99 Proceedings, 1999 IEEE International Conference, Volume: 4, 15-19 Mar. 1999 Page(s): 2307-2310 vol. 4,
the proposed system effects a first selection (open loop selection) of the mode as a function of the characteristics of the signal. This decision may be effected by classification. Then, if the performance of the selected mode is not satisfactory, on the basis of an error measurement, a higher bit rate mode is applied and the operation is repeated (closed loop decision).
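The combination of an open loop first selection and a closed loop improvement can be sketched, by way of illustration only, as follows. The sketch assumes a toy uniform quantizer for each mode, an energy threshold as the open loop classifier, and mean squared error as the error measurement; these choices and all names are hypothetical and are not taken from the document cited above.

```python
from typing import List, Sequence, Tuple


def quantize(segment: Sequence[float], bits: int) -> List[float]:
    """Toy 'coder': uniform scalar quantizer on [-1, 1) with 2**bits levels."""
    levels = 2 ** bits
    step = 2.0 / levels
    decoded = []
    for x in segment:
        index = min(levels - 1, max(0, int((x + 1.0) / step)))
        decoded.append(-1.0 + (index + 0.5) * step)
    return decoded


def mse(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def classify_open_loop(segment: Sequence[float], n_modes: int,
                       energy_threshold: float = 0.01) -> int:
    """Hypothetical classifier: low-energy segments start at the lowest
    rate mode, all other segments start one mode higher."""
    energy = sum(x * x for x in segment) / len(segment)
    start = 0 if energy < energy_threshold else 1
    return min(start, n_modes - 1)


def code_with_escalation(segment: Sequence[float], bit_depths: Sequence[int],
                         max_error: float) -> Tuple[int, float, int]:
    """Open loop choice of the first mode, then closed loop escalation to
    higher rate modes while the error measurement is not satisfactory.
    bit_depths must be sorted in ascending order.
    Returns (bits used, error, number of coding attempts)."""
    mode = classify_open_loop(segment, len(bit_depths))
    attempts = 0
    while True:
        attempts += 1
        error = mse(segment, quantize(segment, bit_depths[mode]))
        if error <= max_error or mode == len(bit_depths) - 1:
            return bit_depths[mode], error, attempts
        mode += 1  # try the next higher bit rate mode


if __name__ == "__main__":
    print(code_with_escalation([0.0, 0.3, -0.7, 0.9], [2, 4, 8], 0.001))
```

Each escalation codes the segment again, so the worst case still requires as many codings as there are modes above the one selected in open loop.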
Similar techniques are described in the following documents:
“Variable rate speech coding for UMTS” Cellario, L.; Sereno, D.; Speech Coding for Telecommunications, 1993. Proceedings, IEEE Workshop, 1993 Page(s): 1-2;
“Phonetically-based vector excitation coding of speech at 3.6 kbps” Wang, S.; Gersho, A.; Acoustics, Speech, and Signal Processing, 1989. ICASSP-89, 1989 International Conference, 23-26 May 1989 Page(s): 49-52 vol. 1; and
“A modified CS-ACELP algorithm for variable-rate speech coding robust in noisy environments” Beritelli, F.; IEEE Signal Processing Letters, Volume: 6 Issue: 2, Feb. 1999 Page(s): 31-34.
An open loop first selection is effected after classification of the input signal (phonetic or voiced/non-voiced classification), after which a closed loop decision is made:
either over the complete coder, in which case the whole speech segment is coded again;
or over a portion of the coding, as in the above references preceded by an asterisk (*), in which case the dictionary to be used is selected by a closed loop process.
All of the work referred to above seeks to solve the problem of the complexity of the optimum mode selection by the total or partial use of an a priori selection or preselection that avoids multiple coding or reduces the number of coders to be used in parallel.
However, no prior art technique has been proposed that reduces the complexity of the coding operations themselves.