Video signals, like speech or music signals, are encoded, for example, to enable their efficient transmission or storage.
Advanced digital video compression algorithms are now being adopted for applications including HD-DVD, video conferencing, and terrestrial and satellite broadcasting. State-of-the-art codecs such as MPEG-4 AVC offer significant improvements over previous standards, for example reducing the bit rate of an equivalent MPEG-2 bitstream by approximately 50%.
Future applications for encoding video signals may range from multimedia content delivery on mobile handsets to High Definition television broadcasting. To allow such diversity in video distribution, it will be necessary to have means of adapting the video signal to the capacities of the available channel and/or terminal.
Possible solutions include simulcasting, where multiple versions of the signal, with differing coding rates and coding methods, are broadcast or delivered over the same transmission medium. Such approaches are wasteful because they require a transmission channel with a much wider bandwidth than that of a single encoded signal.
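The bandwidth cost of simulcasting can be illustrated with simple arithmetic. The following is a minimal sketch only; the version names and bit rates are hypothetical values chosen for illustration and are not taken from any particular broadcast system.

```python
# Hypothetical bit rates (kbit/s) for three simulcast versions of one programme.
simulcast_rates_kbps = {"mobile": 384, "sd": 2000, "hd": 8000}


def simulcast_bandwidth(rates_kbps):
    """Total channel rate needed when every version is sent in parallel."""
    return sum(rates_kbps.values())


def overhead_vs_single(rates_kbps):
    """Extra rate compared with sending only the highest-rate version."""
    return simulcast_bandwidth(rates_kbps) - max(rates_kbps.values())


if __name__ == "__main__":
    print("total kbit/s:", simulcast_bandwidth(simulcast_rates_kbps))
    print("overhead kbit/s:", overhead_vs_single(simulcast_rates_kbps))
```

Under these assumed rates the channel must carry the sum of all versions, even though any one terminal uses only one of them.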
Another proposed solution is the use of scalable or embedded coding, where a common or core coding layer is supplemented by additional layers of enhanced coding. Low bandwidth channels and/or low ‘capacity’ terminals receive only the common or core layer to produce a video signal with a first quality output, while higher bandwidth channels and/or higher ‘capacity’ terminals receive both the common or core layer and at least one further enhanced layer of the coded signal to produce an improved quality output. However, these scalable or embedded coding systems have not been developed sufficiently for robust everyday usage, and standards relating to scalable video coding are generally considered not yet stable.
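The layered delivery described above may be sketched as follows. This is an illustrative sketch only: the `Layer` type, the `select_layers` helper and the layer rates are hypothetical, and the sketch assumes that each enhancement layer depends on all of the layers before it.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    rate_kbps: int


def select_layers(layers, capacity_kbps):
    """Return the names of the layers that fit within the given capacity.

    layers[0] is assumed to be the core layer; each later layer is assumed
    to be usable only if every earlier layer is also received, so selection
    stops at the first layer that does not fit.
    """
    selected = []
    used = 0
    for layer in layers:
        if used + layer.rate_kbps > capacity_kbps:
            break
        selected.append(layer.name)
        used += layer.rate_kbps
    return selected


# Hypothetical layer rates, for illustration only.
LAYERS = [Layer("core", 384), Layer("enh1", 1600), Layer("enh2", 6000)]

if __name__ == "__main__":
    for capacity in (500, 2500, 10000):
        print(capacity, select_layers(LAYERS, capacity))
```

A low capacity terminal in this sketch receives only the core layer, whereas a higher capacity terminal receives the core layer together with one or more enhancement layers.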
A third proposed solution is the use of transcoding. Transcoding is used where a first high quality bitstream is received by the terminal but the terminal is unable to process that bitstream to produce a video image sequence; the bitstream is instead converted into a form that the terminal can process. Numerous algorithms have been developed for the requantization (transrating) of video in the last decade. Some of these, such as the Cascaded Pixel Domain Transcoder (CPDT) and the Fast Pixel Domain Transcoder (FPDT), have been used successfully.
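The basic idea of requantization can be illustrated with a uniform scalar quantizer. The following is a simplified sketch under stated assumptions: the step sizes and coefficient levels are hypothetical, and practical codecs use more elaborate quantizers than the plain rounding shown here. It is not the CPDT or FPDT algorithm itself.

```python
def quantize(value, step):
    """Uniform scalar quantizer, rounding half away from zero."""
    sign = -1 if value < 0 else 1
    return sign * int(abs(value) / step + 0.5)


def dequantize(level, step):
    """Reconstruct a coefficient value from its quantization level."""
    return level * step


def requantize(levels, step_in, step_out):
    """Reconstruct each coefficient with the incoming step size, then
    quantize it again with a (typically coarser) outgoing step size."""
    return [quantize(dequantize(level, step_in), step_out) for level in levels]


if __name__ == "__main__":
    incoming_levels = [10, -7, 3, 1, 0]  # hypothetical quantized coefficients
    print(requantize(incoming_levels, step_in=4, step_out=10))
```

With a coarser outgoing step size the levels become smaller and more of them become zero, which is what lowers the bit rate, at the cost of additional quantization error.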
The CPDT architecture is built by cascading a decoder with an encoder. This adds significant complexity when implemented in the terminal and thus significantly increases the processing and memory requirements of the terminal.
The FPDT architecture builds on the CPDT architecture by using linearity assumptions to merge the decoder and encoder processes into a single decoder-encoder process loop. The merging of the decoder and encoder reduces the complexity, and thus the processing and memory requirements, of the CPDT architecture. However, the inaccuracy of the FPDT assumptions significantly limits the application of FPDT techniques, because FPDT cannot fully support modification of residual information, coding modes, etc.
Furthermore, CPDT and FPDT approaches may produce outputs of significantly lower quality than a full decode and recode process.
Advanced coding methods such as MPEG-4 AVC derive their performance benefits from the availability of a rich set of coding modes and options. These include variable block sizes, variable resolution motion estimation, multiple reference frames and intra prediction. The compression efficiency of these codecs is highest only when all modes are used. For example, when requantizing an MPEG-4 AVC bitstream with CPDT, the encoding decisions of the incoming bitstream are generally retained to reduce complexity. This implies that the transcoded video uses sub-optimal encoding parameters.