The present application relates to information encoding for transmission over noisy channels and/or storage, and more particularly to error resilient coding.
The current rapid expansion of digital communication (speech, video, data) relies on increasingly economical digital signal processing. For example, video communication has general functionality as illustrated in FIG. 8a, and increasingly includes a link through the air interface (FIG. 8b) which can introduce noise and bit errors to the digital signal. Attempts to mitigate bit errors include the use of reversible codes as described in the following.
Commonly used video compression methods (e.g., MPEG) have block-based motion compensation to remove temporal redundancy (code only (macro)block motion vectors plus the corresponding quantized DCT residuals (texture) as in FIG. 8c) and use variable length coding (VLC) to increase coding efficiency. However, variable length coding often is highly susceptible to transmission channel errors, and a decoder easily loses synchronization with the encoder when uncorrectable errors arise. Further, the predictive nature of motion compensation makes matters much worse because the uncorrectable errors in one video frame quickly propagate across the entire video sequence and rapidly degrade the decoded video quality.
The typical approach to such uncorrectable errors includes the steps of: error detection (e.g., out-of-range motion vectors, invalid VLC table entry, or invalid number of residuals in a block), resynchronization of the decoder with the encoder, and error concealment by repetition of previously transmitted correct data in place of the uncorrectable data. For example, video compressed using MPEG1-2 has a resynchronization marker (start code) at the start of each slice of macroblocks (MBs) of a frame, and an uncorrectable error results in all of the data between correctly decoded resynchronization markers being discarded. This implies degradation in quality of the video stream.
VLC tables prove to be particularly sensitive to bit errors because bit errors can make one codeword be incorrectly interpreted to be another codeword of a different length, and the error is not detected. This makes the decoder lose synchronization with the encoder. Although the error may finally be detected due to an invalid VLC table entry, usually the location in the bitstream where the error is detected is not the same as the location where the error occurred. Hence, when the decoder detects an error, it has to seek the next resynchronization marker and discard all the data between this and the previous resynchronization marker. Thus, even a single bit error can sometimes result in a loss of a significant amount of data, and this is a problem of the known coding schemes.
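The loss of synchronization described above can be reproduced on a small scale. The following is a minimal sketch, assuming a hypothetical four-symbol prefix code (`CODE`) and a hypothetical helper `decode`; neither comes from any MPEG or H.263 table. Flipping a single bit produces a different but equally valid parse, so the decoder sees no error at the point of corruption.

```python
# Hypothetical toy prefix code, for illustration only (not a standard table).
CODE = {"1": "A", "01": "B", "001": "C", "000": "D"}

def decode(bits):
    """Decode a bit string by accumulating bits until a codeword matches."""
    out, cur = [], ""
    for b in bits:
        cur += b
        if cur in CODE:
            out.append(CODE[cur])
            cur = ""
    return "".join(out)

clean = "1010010001"          # encodes A B C D A
corrupt = "0" + clean[1:]     # same stream with the first bit flipped

print(decode(clean))          # ABCDA
print(decode(corrupt))        # CCDA: codeword boundaries shifted, no error flagged
```

In this toy case the corrupted stream decodes to four symbols instead of five, which illustrates why the point of detection (if any) generally differs from the point of corruption.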
Enhanced error concealment properties for motion compensated compression, such as MPEG, can be achieved by using data partitioning. Consider a “video packet” to consist of the data between two consecutive resynchronization markers. In a data partitioning approach, the motion data and the texture (DCT) data within each of the video packets are separately encoded in the bitstream. Another resynchronization word (Motion Resync. Word) is embedded between the motion data and the DCT data to signal the end of the motion data and the beginning of the DCT data. This data partitioning allows the decoder to use the motion data even if the DCT data is corrupted by undetectable errors. The advantages include partial recovery from an uncorrectable error in a packet of compressed video data with little additional overhead. Concealing errors by applying the decoded motion vectors for motion compensation results in a much better decoded video quality. This extends to video packets for intra-coded frames, in that the DCT dc coefficients can be separated from the other, less important texture data (DCT ac coefficients) by a DC resynchronization word.
When using data partitioning, the data within the video packet is organized as shown in FIGS. 6a-c: FIG. 6a shows the fields between two resynchronization markers, and FIGS. 6b-c illustrate the motion data field and the texture data field in more detail by an example. In particular, the first field (“Resynch Marker”) is a resynchronization marker, the second field (“MB No.”) is the number in the frame of the first macroblock (16×16 block of pixels) in the video packet, the third field (“QP”) is the default quantization parameter used to quantize the texture data (DCT coefficients) in the video packet, the fourth field (“Motion Data”) is the motion data, the fifth field (“Motion Resynch Word”) is the resynchronization marker between the motion data and the texture data, the sixth field (“DCT Data”) is the texture data, and the last field (“Resynch Marker”) is the ending resynchronization marker.
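As a rough illustration of the packet layout and of how partitioning supports concealment, the following sketch models the fields between the two resynchronization markers and the decoder's fallback choices. The class `VideoPacket`, the function `concealment_plan`, and the returned strings are hypothetical names chosen for this example; the rule that a packet with corrupted motion data is discarded entirely is an assumption based on the repetition-based concealment described earlier.

```python
from dataclasses import dataclass

@dataclass
class VideoPacket:
    """Fields between two resynchronization markers, in bitstream order."""
    mb_no: int        # number in the frame of the first macroblock
    qp: int           # default quantization parameter for the packet
    motion_data: str  # bits before the motion resynchronization word
    dct_data: str     # bits after the motion resynchronization word

def concealment_plan(motion_ok: bool, texture_ok: bool) -> str:
    """Pick a decoding strategy from the integrity of the two partitions."""
    if motion_ok and texture_ok:
        return "decode motion and texture"
    if motion_ok:
        # Texture is corrupted but the motion partition survived.
        return "motion compensate with decoded vectors, discard texture"
    # Assumed fallback: without motion data the packet is not usable.
    return "discard packet and repeat previously decoded data"

packet = VideoPacket(mb_no=0, qp=10, motion_data="1011", dct_data="0110")
print(concealment_plan(motion_ok=True, texture_ok=False))
```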
FIG. 6b shows the motion data field consisting of a COD field, an MCBPC field, and an MV field for each of the macroblocks in the packet. The COD field indicates whether the macroblock is coded or skipped (COD=0 macroblock is coded, COD=1 macroblock is skipped). The MCBPC field indicates (1) the mode of the macroblock and (2) which of the chrominance blocks in the macroblock are coded and which are skipped: the mode indicates whether the current macroblock is coded INTRA (no motion compensation), INTER (motion compensated with one 16×16 motion vector), or INTER4V (motion compensated with four 8×8 motion vectors). Of course, if COD indicates the macroblock is not coded, then the MCBPC field is not present. The MV field is the actual motion vector data; either one vector or four vectors. Again, if COD indicates that the macroblock is not coded, then the MV field is not present. FIG. 6c shows the texture (DCT Data) field as consisting of a CBPY field and a DQUANT field for each of the macroblocks followed by the DCT data for each of the macroblocks. The CBPY field indicates which of the luminance blocks of the macroblock are coded and which are skipped. The DQUANT field indicates the differential increment to the default quantizer value (QP) to compute the quantization value for the macroblock. The DCT fields are run-length-encoded quantized DCT coefficient values of the macroblock.
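The presence rules for the per-macroblock fields can be summarized in a few lines. This is a minimal sketch with hypothetical helper names (`motion_fields`, `macroblock_quantizer`); it assumes that an INTRA macroblock carries no motion vectors, which the text implies but does not state, and it treats DQUANT as a simple increment added to QP.

```python
# Number of motion vectors per macroblock mode.
# INTRA -> 0 is an assumption (no motion compensation is used in that mode).
VECTORS_PER_MODE = {"INTRA": 0, "INTER": 1, "INTER4V": 4}

def motion_fields(cod, mode=None):
    """List the fields that follow COD in the motion data for one macroblock."""
    if cod == 1:
        return []                      # skipped: neither MCBPC nor MV is present
    return ["MCBPC"] + ["MV"] * VECTORS_PER_MODE[mode]

def macroblock_quantizer(qp, dquant):
    """Quantization value for a macroblock: default QP plus the increment."""
    return qp + dquant

print(motion_fields(1))                # []
print(motion_fields(0, "INTER"))       # ['MCBPC', 'MV']
print(motion_fields(0, "INTER4V"))     # ['MCBPC', 'MV', 'MV', 'MV', 'MV']
print(macroblock_quantizer(10, -2))    # 8
```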
MPEG-4 has three kinds of VLCs to encode the DCT coefficients: Table B-16 for encoding INTRA macroblocks, Table B-17 for coding INTER macroblocks, and Table B-23 which is used for coding macroblocks if reversible variable length codes (RVLC) are used. In contrast, H.263 uses only one table for encoding both the INTER and INTRA MBs: Table 13/H263. Table 13/H263 is identical to Table B-17 of MPEG-4.
Normal VLCs (Tables B-16 and B-17 of MPEG-4 and Table 13/H263 of H.263) are all decoded using identical techniques, so consider the decoding of VLCs from Table 13/H263. The length of VLC codewords in Table 13/H263 varies from 3 to 13 bits. The last bit is always a sign bit and is not used in variable length decoding, so the number of decodable bits varies up to 12. Standard decoders typically carry out variable length decoding using two tables: DCT3DtabXval, which contains the entries of Table 13/H263, and DCT3DtabXlen, which contains the lengths of the corresponding codewords. Since the length of the VLC codeword is not known in advance, the fastest way to decode a VLC is to use 2^12 (4096) entries in DCT3DtabXval. With 2^12 elements in DCT3DtabXval, 12 bits can be read from the bitstream and used directly as an index into DCT3DtabXval to obtain the decoded values. The same 12 bits index into DCT3DtabXlen to obtain the length of the codeword. The initial bits in the bitstream corresponding to the length of the codeword are then discarded, and the process is repeated on the remaining bits plus the next bits in the bitstream, up to 12 bits. Note that there are only 102 entries in Table 13/H263. Hence the DCT3DtabXval and DCT3DtabXlen tables, stored in sequential memory with index addresses, would contain many duplicate entries. To conserve memory, one may split DCT3DtabXval/DCT3DtabXlen based on the number of leading zeros present in the 12 bits, but this increases index complexity.
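The two-table lookup can be sketched at reduced scale. The example below assumes a hypothetical four-symbol prefix code with a maximum codeword length of 3 bits, so each table has 2^3 = 8 entries instead of 2^12; `build_tables` and `decode` are illustrative names, and the tables merely play the roles of DCT3DtabXval and DCT3DtabXlen. It also makes the duplicate entries visible: the 1-bit codeword fills half of the table.

```python
# Hypothetical toy prefix code (not Table 13/H263).
CODEBOOK = {"1": "A", "01": "B", "001": "C", "000": "D"}
MAX_LEN = 3

def build_tables(codebook, max_len):
    """Expand a prefix code into value and length tables of 2**max_len entries."""
    val = [None] * (1 << max_len)
    length = [0] * (1 << max_len)
    for code, symbol in codebook.items():
        pad = max_len - len(code)
        base = int(code, 2) << pad
        for tail in range(1 << pad):       # every index that starts with `code`
            val[base + tail] = symbol
            length[base + tail] = len(code)
    return val, length

def decode(bits, val, length, max_len):
    """Read max_len bits, look up symbol and length, then advance by the length."""
    out, pos = [], 0
    while pos < len(bits):
        window = bits[pos:pos + max_len].ljust(max_len, "0")  # pad at stream end
        idx = int(window, 2)
        if length[idx] == 0 or pos + length[idx] > len(bits):
            raise ValueError(f"invalid or truncated codeword at bit {pos}")
        out.append(val[idx])
        pos += length[idx]
    return out

val, length = build_tables(CODEBOOK, MAX_LEN)
print(val)   # ['D', 'C', 'B', 'B', 'A', 'A', 'A', 'A']
print(decode("1010010001", val, length, MAX_LEN))  # ['A', 'B', 'C', 'D', 'A']
```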
Reversible variable length codes (RVLC) are designed such that they can be decoded in either the forward or backward direction; see FIGS. 1-4 illustrating error detection possibilities. These codes are useful when the data may be corrupted by errors. MPEG-4 video includes the option to use RVLC for the DCT data. The RVLC for MPEG-4 is formed by concatenating a variable-length code (VLC) with a fixed-length code (FLC) where the FLC part has 2 bits, including the sign bit. The VLC part either starts and ends with a 1 with all, if any, zeroes in between; or starts and ends with a 0 with exactly one 0 in between (the rest of the bits, if any, equal 1). That is, the VLC part is either 100 . . . 001 (with possibly no 0s) or 01 . . . 101 . . . 10 with position of the interior 0 anywhere among the 1s. The longest valid RVLC codeword is 15 bits plus a sign bit, so the VLC part can be as long as 14 bits. Contrarily, the VLC part can be as short as 2 bits (11 if starting with a 1) or 3 bits (000 if starting with a 0). Because of the way the RVLC is designed, the codebook is very sparse. Also, not all RVLC codewords with this structure are used; see the following table which shows codewords 01 1111 1101 1110 through 01 1111 1111 1100 are not used among the 14-bit VLC part. Codewords are defined for the 169 most commonly occurring events (combinations of last, level, and run), and an escape codeword for all other cases. The RVLC structure does not lend itself to the type of table lookup strategy that is used for regular VLCs. With the typical VLC lookup strategy, 15 bits (without the sign bit) would be read from the bitstream. A single table would require 2^15 = 32K (0x0000 to 0x7FFF in hexadecimal) entries, or the table could be partitioned according to the number of leading zeroes.
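The two permitted shapes of the VLC part can be checked mechanically. The sketch below uses hypothetical helper names (`has_rvlc_shape`, `vlc_part_length`) and covers only the VLC part, not the 2-bit FLC; it tests the shape only and does not consult the actual codebook, in which some codewords of this shape are unused. The boundary rule in `vlc_part_length` (stop at the second 1, or at the third 0) is an inference from the stated structure.

```python
def has_rvlc_shape(vlc):
    """True if vlc is 1 0...0 1, or 0 ... 0 with exactly one interior 0."""
    if len(vlc) < 2 or set(vlc) - {"0", "1"}:
        return False
    inner = vlc[1:-1]
    if vlc[0] == "1" and vlc[-1] == "1":
        return "1" not in inner            # only zeroes (possibly none) inside
    if vlc[0] == "0" and vlc[-1] == "0":
        return inner.count("0") == 1       # exactly one interior 0, rest are 1s
    return False

def vlc_part_length(bits):
    """Length of the VLC part at the start of bits, found by scanning once."""
    if bits[0] == "1":
        return bits.index("1", 1) + 1      # ends at the next 1
    zeros = 0
    for i, b in enumerate(bits):
        zeros += b == "0"
        if zeros == 3:                     # first 0, interior 0, last 0
            return i + 1
    raise ValueError("no complete VLC part found")

print(has_rvlc_shape("11"), has_rvlc_shape("000"), has_rvlc_shape("0110"))
# True True False
print(vlc_part_length("0100" + "11"))      # 4
# Both shapes are symmetric, so the same scan works on the reversed bits.
print(vlc_part_length(("11" + "0100")[::-1]))  # 4
```

Because each shape reads the same way from either end, the codeword boundary can be located by the same scan in both directions, which is what makes backward decoding possible.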
However, for the MPEG-4 RVLC, this doesn't help much: there are 24 codewords beginning with a 1, 22 codewords with two leading 0's, only two codewords with three or more leading 0's, and all other 122 codewords have exactly one leading 0. The lookup table for codewords with exactly one leading 0 would have to cover indices 010 00xx xxxx xxxx through 011 1111 0111 1101, requiring 0x3F7D − 0x2000 = 0x1F7D (= 8061 decimal) entries for the 122 codewords that begin with a single leading 0. Thus, the normal VLC decoding approach is very inefficient for RVLC decoding. Because RVLC codewords require so much memory to use the typical VLC decoding approach, the known MoMuSys software uses a brute force approach for decoding RVLC of MPEG-4 Table B-23 as follows. A test is made, comparing with each possible codeword (not counting the sign bit). When the match is found, the index into a densely-packed lookup table is hard-coded. The MoMuSys RVLC decode is implemented using one gigantic case statement. For efficiency, it is best to test for the shorter, more common codewords first. However, the worst-case cycles can be quite high, if all 169 cases are tested before finding a match. And this is not sufficiently efficient for effective use of RVLC decoding with MPEG-4.
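The table-size arithmetic and the cost of the sequential comparison can both be checked with a short sketch. The index bounds and the count of 122 codewords are taken from the text; `TOY_CODEWORDS` and `linear_match` are hypothetical, and the six toy codewords merely have the RVLC shape and are not entries of Table B-23. The function illustrates the idea of comparing against each codeword in turn and is not the MoMuSys code.

```python
# Span of a direct-lookup table for codewords with exactly one leading 0,
# using the 15-bit index bounds quoted in the text.
low = 0b010_0000_0000_0000     # 0x2000
high = 0b011_1111_0111_1101    # 0x3F7D
span = high - low              # 0x1F7D = 8061 entries
used = 122                     # codewords with exactly one leading 0
print(hex(span), span, f"{used / span:.1%} of entries distinct")

# Brute-force alternative: compare against each codeword, shortest first.
TOY_CODEWORDS = ["11", "101", "000", "1001", "0100", "0010"]  # hypothetical

def linear_match(bits, codewords=TOY_CODEWORDS):
    """Return (index, comparisons made) for the codeword that prefixes bits."""
    for n, cw in enumerate(codewords, start=1):
        if bits.startswith(cw):
            return n - 1, n
    return None, len(codewords)

print(linear_match("1101"))     # (0, 1): short codeword found immediately
print(linear_match("0010110"))  # (5, 6): last codeword needs every comparison
```

The comparison count grows linearly with the position of the matching codeword, which is the worst-case behavior the text attributes to the case-statement approach.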
The present invention uses a codeword hashing index to access reversible VLC (RVLC) tables such as in MPEG and H.263.
This has the advantage of better performance and smaller memory requirements for RVLC decoding.