1. Field of the Invention
The present invention relates to a coding system such as a Slepian-Wolf coding system for use in, for example, distributed video coding, and to the encoding apparatus and decoding apparatus in the coding system.
2. Description of the Related Art
Distributed video coding (DVC) is a video compression method that has grown out of theoretical research by Slepian and Wolf and further work by Wyner, Ziv, and others. In one DVC method, the encoder carries out only intraframe coding, while the decoder carries out both intraframe and interframe decoding. This scheme greatly reduces the computational load on the encoder, for which reason distributed video coding has been attracting considerable attention.
An exemplary distributed coding system is outlined in FIG. 1, which is taken from Aaron et al., ‘Transform-Domain Wyner-Ziv Codec for Video’, Proc. SPIE Visual Communications and Image Processing, San Jose, Calif., 2004. In the encoder, a video image sequence is divided into key frames, to which conventional intraframe coding and decoding are applied, and so-called Wyner-Ziv frames, to which Slepian-Wolf coding and decoding processes are applied. In the encoding process, a discrete cosine transform (DCT) is used to transform each Wyner-Ziv frame to the coefficient domain, and the coefficients are grouped into bands. The coefficients in the k-th band are quantized by a quantizer with 2^(M_k) levels, the quantized coefficients (q_k) are expressed in fixed numbers of bits, and the bit planes are extracted and supplied to a turbo encoder that produces information bits and error-correcting bits, called parity bits. The parity bits are stored in a buffer for transmission to the decoder. The information bits are conventionally discarded.
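The quantization and bit-plane extraction steps can be illustrated with a minimal sketch. It assumes a uniform quantizer over a symmetric range whose number of levels is a power of two, so that each quantized index fits in a fixed number of bits; the function names, the `max_abs` range parameter, and the sample values are hypothetical and are not taken from Aaron et al.

```python
def quantize(coeffs, m_bits, max_abs):
    """Uniformly quantize coefficients in [-max_abs, max_abs) to 2**m_bits levels.

    The uniform step and symmetric range are assumptions made for illustration.
    """
    levels = 1 << m_bits
    step = 2.0 * max_abs / levels
    indices = []
    for c in coeffs:
        q = int((c + max_abs) // step)
        indices.append(min(max(q, 0), levels - 1))  # clamp to a valid index
    return indices


def bit_planes(indices, m_bits):
    """Split fixed-width indices into m_bits bit planes, most significant first."""
    return [[(q >> b) & 1 for q in indices] for b in range(m_bits - 1, -1, -1)]


# Hypothetical band of four coefficients, quantized to 2**2 = 4 levels.
band = [-7.5, -0.2, 0.3, 6.9]
q = quantize(band, m_bits=2, max_abs=8.0)
planes = bit_planes(q, m_bits=2)
print(q)       # quantized indices
print(planes)  # one list of bits per bit plane
```

Each bit plane would then be supplied to the systematic encoder as a separate sequence of information bits.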
To decode a Wyner-Ziv frame, the decoder generates a predicted image by interpolation or extrapolation from one or more key frames, applies a DCT to convert the predicted image to the coefficient domain, groups the coefficients into bands, and inputs the coefficients in each band as side information to a turbo decoder. The turbo decoder requests the parity bits it needs to detect and correct errors in the side information. If necessary, further parity bits can be requested and the decoding process can be repeated until a satisfactory result is obtained. Alternatively, the transmission of parity bits may be controlled at the encoder.
Finally, the decoded values and the side information are both used to reconstruct the coefficients of the Wyner-Ziv frame, and an inverse discrete cosine transform (IDCT) is carried out to recover the image.
FIG. 2 further illustrates the conventional Slepian-Wolf encoding and decoding processes in a slightly different form. The systematic encoder 11 in the encoding apparatus 10A may be any type of encoder that generates information bits and parity bits separately. A turbo encoder is one type of systematic encoder. The information bits are discarded; the parity bits are stored in a parity bit buffer 12. Some or all of the parity bits are sent to a parity bit transmitter 13 and transmitted to the decoding apparatus 10B at the command of a parity bit transmission controller 14.
In the decoding apparatus 10B, the transmitted parity bits are received by a parity bit receiver 15 and placed in a parity bit buffer 16, from which they are supplied to an error correcting decoder 17. An information bit predictor 18 supplies predicted information bits to the error correcting decoder 17. The error correcting decoder 17 uses the parity bits to carry out an error-correcting decoding process and outputs the resulting decoded bits.
The error correcting decoder 17 may use the maximum a-posteriori probability (MAP) decoding algorithm described by Sklar in Digital Communications: Fundamentals and Applications, Prentice-Hall, 2001. This algorithm, which is used in turbo coding and other coding methods, is a high-performance error-correcting decoding method in which the decoder uses the received parity bits, together with information bits that are predicted at the decoder, to calculate the probability that each information bit is 0 or 1.
The conventional coding and decoding operations are illustrated in FIGS. 3 and 4. The exemplary systematic encoder 11, represented schematically in FIG. 3, is a feedforward convolutional encoder with a constraint length of three and a coding rate of one-half that generates information bits (x) and parity bits (y). The corresponding decoding operation can be represented in a trellis diagram as in FIG. 4 with forward branch metric values α and backward branch metric values β.
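An encoder of this general type can be sketched as follows. Because FIG. 3 is not reproduced here, the parity tap positions (1 + D + D^2) are an assumption; they were chosen only because they yield the xy branch labels quoted in this description (00 and 11 leaving state a, 01 from state c to state a). The function name is hypothetical.

```python
def systematic_encode(bits):
    """Rate-1/2 systematic feedforward convolutional encoder, constraint length 3.

    Returns (information bits x, parity bits y). The parity taps 1 + D + D^2
    are an assumption, not read from FIG. 3.
    """
    s1 = s2 = 0  # the two most recent input bits (the encoder state)
    info, parity = [], []
    for u in bits:
        info.append(u)                # systematic output x is the input itself
        parity.append(u ^ s1 ^ s2)    # parity output y
        s1, s2 = u, s1                # shift the register
    return info, parity


x, y = systematic_encode([1, 0, 1, 1, 0])
print(x, y)
```

In a Slepian-Wolf encoder of the kind described above, only `y` would be buffered for transmission; `x` would be discarded.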
The forward branch metric values α are calculated from left to right in FIG. 4. At an arbitrary time k, the encoder may be in one of four states (a, b, c, d) representing the two most recent input data bits. The forward branch metric for state a at time k=n, for example, is calculated from two contributions, one for each branch entering state a: the probability that the xy bit values in FIG. 3 at time k=n were 00, taken together with the value of α at state a at time k=n−1; and the probability that the xy bit values at time k=n were 01, taken together with the value of α at state c at time k=n−1. The forward branch metrics for the other states (b, c, d) at time k=n, the branch metrics at time k=n+1, and so on are calculated similarly. These calculations proceed in sequence from left to right.
The backward branch metric values β are calculated from right to left in FIG. 4. The backward branch metric for state a at time k=n+1, for example, is calculated from two contributions, one for each branch leaving state a: the probability that the xy bit values at time k=n+1 were 00, taken together with the value of β at state a at time k=n+2; and the probability that the xy bit values at time k=n+1 were 11, taken together with the value of β at state b at time k=n+2. The backward branch metrics for the other states (b, c, d) at time k=n+1, the branch metrics at time k=n, and so on are calculated similarly. These calculations proceed in sequence from right to left.
After α and β have been obtained for all states (a, b, c, d) at all times k, these values are used for decoding as described by Sklar.
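The forward and backward calculations and their combination can be illustrated with a minimal sketch of a MAP decoder over a four-state trellis. This is a sketch under stated assumptions rather than the procedure given by Sklar: the parity taps (1 + D + D^2), the per-step normalization, the uniform end-state initialization, the function names, and the probability values 0.8 and 0.99 are all illustrative choices.

```python
def transitions():
    """Yield (state, input bit, parity bit, next state) for the 4-state trellis.

    A state (s1, s2) holds the two most recent input bits. The parity taps
    1 + D + D^2 are an assumption, not taken from FIG. 3.
    """
    for s1 in (0, 1):
        for s2 in (0, 1):
            for u in (0, 1):
                yield (s1, s2), u, u ^ s1 ^ s2, (u, s1)


def map_decode(p_x1, p_y1):
    """Return the posterior probability that each information bit equals 1.

    p_x1[k] and p_y1[k] are the probabilities that information bit k and
    parity bit k equal 1 (from the predicted bits and the received parity).
    """
    n = len(p_x1)
    states = [(0, 0), (1, 0), (0, 1), (1, 1)]  # a, b, c, d
    trans = list(transitions())

    def gamma(k, u, y):
        # Probability of the xy branch label at time k.
        px = p_x1[k] if u else 1.0 - p_x1[k]
        py = p_y1[k] if y else 1.0 - p_y1[k]
        return px * py

    # Forward pass (left to right); the encoder starts in state a.
    alpha = [{s: 0.0 for s in states} for _ in range(n + 1)]
    alpha[0][(0, 0)] = 1.0
    for k in range(n):
        for s, u, y, t in trans:
            alpha[k + 1][t] += alpha[k][s] * gamma(k, u, y)
        total = sum(alpha[k + 1].values())
        for t in states:
            alpha[k + 1][t] /= total  # normalize to avoid underflow

    # Backward pass (right to left); the final state is treated as unknown.
    beta = [{s: 0.0 for s in states} for _ in range(n + 1)]
    for s in states:
        beta[n][s] = 1.0
    for k in range(n - 1, -1, -1):
        for s, u, y, t in trans:
            beta[k][s] += gamma(k, u, y) * beta[k + 1][t]
        total = sum(beta[k].values())
        for s in states:
            beta[k][s] /= total

    # Combine alpha, the branch probability, and beta for each input value.
    post = []
    for k in range(n):
        num = [0.0, 0.0]
        for s, u, y, t in trans:
            num[u] += alpha[k][s] * gamma(k, u, y) * beta[k + 1][t]
        post.append(num[1] / (num[0] + num[1]))
    return post


# True bits 1,0,1,1,0 give parity 1,1,0,0,0; the prediction is wrong in bit 2.
predicted = [1, 0, 0, 1, 0]
parity = [1, 1, 0, 0, 0]
p_x1 = [0.8 if b else 0.2 for b in predicted]
p_y1 = [0.99 if b else 0.01 for b in parity]
decoded = [int(p > 0.5) for p in map_decode(p_x1, p_y1)]
print(decoded)
```

Both passes run over the whole sequence before any bit is decided, which is why the number of symbols must be known before decoding starts.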
Since the MAP decoding method proceeds by calculating forward and backward branch metrics as above, its implementation requires that a known number of symbols be processed. In conventional implementations the number of symbols is fixed, and the decoder processes that number of symbols as a single independent unit.
In video coding, however, the image format is not fixed: various video formats are in general use, including the common intermediate format (CIF, 352×288 pixels) and the quarter common intermediate format (QCIF, 176×144 pixels). The number of symbols to be decoded in a CIF frame is four times the number of symbols to be decoded in a QCIF frame.
Another cause of changes in the number of symbols is that the decoder may have to switch between pixel-by-pixel processing and processing of eight-by-eight blocks of pixels. The direct current (DC) component of the DCT, for example, comprises a single value for an eight-by-eight pixel block, so even for the same image format, the number of symbols per frame may change at different stages of the decoding process.
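The variation in symbol counts can be made concrete with a short calculation. The sketch below takes CIF as 352×288 pixels and QCIF as 176×144 pixels and assumes one symbol per pixel for pixel-by-pixel processing and one symbol per eight-by-eight block for a per-block value such as the DC component; the function name is hypothetical.

```python
FORMATS = {"CIF": (352, 288), "QCIF": (176, 144)}


def symbol_counts(width, height, block=8):
    """Return (symbols per frame at pixel level, symbols at one value per block)."""
    pixels = width * height
    blocks = (width // block) * (height // block)
    return pixels, blocks


counts = {name: symbol_counts(w, h) for name, (w, h) in FORMATS.items()}
for name, (pixels, blocks) in counts.items():
    print(name, pixels, blocks)
```

Under these assumptions the decoder would have to handle four different sequence lengths for just these two formats.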
It would be possible to design a multi-format MAP decoder with facilities for handling several different data formats with different numbers of symbols, but this scheme would require extra circuitry and would lack flexibility, as it would only be possible to decode data having one of the particular sizes anticipated by the design.
MAP decoders of the type shown by Sklar are often implemented by parallel processing, as disclosed by Viterbi et al. in ‘An Intuitive Justification and a Simplified Implementation of the MAP Decoder for Convolutional Codes’, IEEE Journal on Selected Areas in Communications, Vol. 16, No. 2, February 1998, but it is difficult to change the multiplicity of the parallel processing flexibly, and parallel processing reduces the error-correcting capability of the decoder.
There is a need for an encoder, a decoder, and a coding system that, while avoiding an increase in circuit size, can deal flexibly with different data formats and sizes, can make flexible changes in the multiplicity of parallel processing, and can provide improved error correcting capability.