The increasing availability of mobile devices with video coding capabilities and of wireless video sensors poses a challenge for reliable video transmission. Indeed, due to the high error rate that may occur in the transmission channel, portions of a transmitted frame may be lost or corrupted, thereby drastically reducing the quality of the received image. The reliable reconstruction of missing blocks becomes particularly relevant in differential coding systems, like H.264/AVC, since reconstruction errors in one frame can propagate to successive frames of the video sequence.
Video coding standards, such as H.264/AVC, propose several solutions for mitigating the problem of packet losses for transmissions over unreliable channels. First, a corrupted frame is decoded and a list of the corrupted macroblocks in the frame is created. In inter-coded frames, the temporal correlation between subsequent frames is high. Therefore, some techniques use information from past and future reference frames to recover the pixels of the lost macroblocks in the current frame. The typical procedure consists of replacing each block of pixels within the corrupted region with a block, or a combination of blocks, from the reference frames, pointed to by an appropriately calculated motion vector.
The JVT concealment algorithm is included in the reference implementation of the H.264/AVC standard, which includes the reference JM decoder developed by the JVT committee (Joint Video Team of ISO/IEC MPEG and ITU-T VCEG). This algorithm subdivides each 16×16 lost macroblock into four 8×8 sub-blocks. For each of the missing sub-blocks, a candidate set of motion vectors is constructed, which includes the motion vectors computed from those of the received adjacent macroblocks, and the null motion vector, corresponding to no motion. The motion vector associated with each adjacent 8×8 block is simply copied or extrapolated from the motion information of the macroblock that contains it, depending on its coding mode. All the motion vectors in the candidate set are tested. The missing macroblock is replaced by the one in the reference frame that minimizes the boundary matching error (BME), defined as the sum of absolute differences (SAD) between the boundary pixels of the candidate macroblock in the reference frame and the boundary pixels of the blocks surrounding the lost one.
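The candidate test described above can be illustrated with a minimal Python sketch. It is not the JM decoder code: the function names, the representation of frames as lists of rows of luminance values, and the assumption that every block surrounding the lost macroblock was correctly received are all hypothetical choices made for illustration.

```python
def boundary_matching_error(ref, cur, x, y, mv, size=16):
    """SAD between the boundary pixels of the candidate block in `ref`
    and the pixels of `cur` that surround the lost block at (x, y).
    Returns None when the candidate block falls outside the frame."""
    dx, dy = mv
    height, width = len(cur), len(cur[0])
    cx, cy = x + dx, y + dy  # top-left corner of the candidate block
    if cx < 0 or cy < 0 or cx + size > width or cy + size > height:
        return None
    error = 0
    for i in range(size):
        if y > 0:  # row just above the lost block
            error += abs(ref[cy][cx + i] - cur[y - 1][x + i])
        if y + size < height:  # row just below the lost block
            error += abs(ref[cy + size - 1][cx + i] - cur[y + size][x + i])
        if x > 0:  # column just left of the lost block
            error += abs(ref[cy + i][cx] - cur[y + i][x - 1])
        if x + size < width:  # column just right of the lost block
            error += abs(ref[cy + i][cx + size - 1] - cur[y + i][x + size])
    return error


def conceal_macroblock(ref, cur, x, y, candidates, size=16):
    """Test the null motion vector and every candidate, then copy the
    block of `ref` with the smallest BME into `cur` (modified in place)."""
    best_mv, best_error = None, None
    for mv in [(0, 0)] + list(candidates):
        error = boundary_matching_error(ref, cur, x, y, mv, size)
        if error is None:
            continue
        if best_error is None or error < best_error:
            best_mv, best_error = mv, error
    dx, dy = best_mv  # (0, 0) is always valid for a block inside the frame
    for j in range(size):
        for i in range(size):
            cur[y + j][x + i] = ref[y + dy + j][x + dx + i]
    return best_mv
```

Because the metric only looks at a one-pixel border, several candidates can yield the same error on smooth content; in this sketch ties are resolved in favor of the first candidate tested.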
According to the above-described scheme, the best candidate sub-block for replacing the missing one is chosen on the basis of a measure of the luminance distortion across the boundary between the candidate macroblock and the correctly received sub-block(s) adjacent to the missing macroblock. This solution may provide acceptable results only if the metric is calculated along a relatively long border, and may require that the macroblock be subdivided into 8×8 or 16×16 pixel sub-blocks. Therefore, such a method may not be capable of reliably reconstructing small details and fine movements. Moreover, since the choice is made considering only the luminance distortion, this method does not take into account the motion characteristics of the area surrounding the lost macroblock.
An improvement of the concealment method described above is proposed in the article by Y. Xu and Y. Zhou, “H.264 video communication based refined error concealment schemes,” Consumer Electronics, IEEE Transactions on, vol. 50, pp. 1135-1141, November 2004, which is incorporated by reference. This document proposes a refined motion compensated temporal concealment that considers motion vectors of all the macroblocks surrounding the lost macroblock, as well as the average and median motion vectors. In this algorithm, each macroblock is subdivided into 8×8 blocks and the best motion vector is found by minimizing a metric defined as the weighted sum of the SAD between pixels q_i of the candidate macroblock and the pixels r_{i,h} and r_{i,v} of horizontally and vertically adjacent blocks, respectively. This approach is schematically described in FIG. 21.
This reference also selects the replacement sub-block by calculating the luminance distortion across an 8-pixel border. Moreover, although the method considers the sub-blocks in the reference frame pointed at by several motion vectors of macroblocks surrounding the lost macroblock, the motion information of the area including the lost macroblock is not used in the selection procedure of the best replacement candidate.
A further error concealing procedure is described in the article by M.-C. Chi, M.-J. Chen, J.-H. Liu, and C.-T. Hsu, “High performance error concealment algorithm by motion vector refinement for mpeg-4 video,” in IEEE International symposium on Circuits and Systems, vol. 3, pp. 2895-2898, May 2005, which is incorporated by reference. In this reference, the recovery procedure is done recursively, starting from candidate motion vectors calculated from the macroblocks located above and below the missing one, and refined in a second pass that takes into account the lateral blocks. More precisely, a set of candidate motion vectors is generated by combining collocation motion vectors in the reference frame and the difference of top/bottom motion vectors. The collocation motion vector of the lost macroblock in the previous frame is the base motion vector. For each candidate motion vector, a side match function is evaluated between the known valid pixels and the motion-compensated ones. The motion vector with minimum side matching distortion is selected as the concealed motion vector for the lost macroblock.
Although the above-described method chooses the candidate motion vectors by considering the information contained in correctly received motion vectors in the reference frame, this solution also provides acceptable results only if the metric is calculated along a relatively long border, and requires that the macroblock be subdivided into 8×8 or 16×16 pixel sub-blocks. Therefore, such a method may not be capable of reliably reconstructing small details and fine movements. Moreover, as in the previous methods, the choice of the best replacement candidate is made considering only the luminance distortion, thereby not taking into account the motion characteristics of the area surrounding the lost macroblock.
In the article by M.-J. Chen, W.-W. Liao, and M.-C. Chi, “Robust temporal error concealment for h.264 video decoder,” in International Conference on Consumer Electronics, pp. 383-384, January 2006, which is incorporated by reference, a method is proposed, where the motion vectors used for the recovery of the missing macroblock are extrapolated from the motion vectors of the surrounding blocks. In particular, the proposed technique considers 4×4 blocks in the corrupted frame, and the candidate motion vector is computed from a linear combination of the motion vectors of surrounding blocks. A similar technique is described in the reference “A temporal error concealment algorithm for h.264 using lagrange interpolation,” by J. Zheng and L.-P. Chau, published in IEEE International Symposium on Circuits and Systems, pp. 133-136, 2004, which is incorporated by reference. In this case, the candidate motion vector is computed using a polynomial interpolation.
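The idea of computing a candidate motion vector by polynomial interpolation can be illustrated with a short Python sketch. It is a generic Lagrange interpolation applied separately to each motion vector component, not the procedure of the cited article: the one-dimensional block positions, the function names, and the choice of neighbors are assumptions made for illustration.

```python
def lagrange_interpolate(points, t):
    """Evaluate at position t the Lagrange polynomial that passes
    through the (position, value) pairs in `points`."""
    total = 0.0
    for k, (tk, vk) in enumerate(points):
        weight = 1.0
        for m, (tm, _) in enumerate(points):
            if m != k:
                weight *= (t - tm) / (tk - tm)
        total += weight * vk
    return total


def interpolate_motion_vector(known, t):
    """`known` is a list of (position, (mvx, mvy)) pairs taken from
    correctly received blocks along one direction (hypothetical layout).
    Returns the motion vector predicted for the missing position t."""
    xs = [(pos, mv[0]) for pos, mv in known]
    ys = [(pos, mv[1]) for pos, mv in known]
    return (lagrange_interpolate(xs, t), lagrange_interpolate(ys, t))


# Blocks at positions 0, 1 and 3 were received; position 2 is missing.
known = [(0, (2, 0)), (1, (4, 1)), (3, (8, 3))]
predicted = interpolate_motion_vector(known, 2)
```

A linear combination of the surrounding motion vectors, as in the first cited article, would replace the Lagrange weights with fixed coefficients; in both cases the predictor is produced without any measure of how well it describes the actual motion.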
The methods described in the above-cited references perform a boundary matching on a 4×4 pixel sub-block. However, in this case too, the best candidate replacement sub-block is chosen among a limited number of sub-blocks in the reference frame, i.e., the sub-blocks pointed at by the interpolated motion vectors. Moreover, the choice of the best replacement candidate is again made considering only the luminance distortion, thereby not taking into account the motion characteristics of the area surrounding the lost macroblock. A further disadvantage of these techniques is that the predictor motion vectors are generated by interpolating the motion vectors surrounding the lost macroblock. The predictors are generated in a static manner, and the techniques give good results only if the motion is reliably described by the interpolated predictors. There may be, however, no possibility to estimate the quality of the generated predictors. In conclusion, the methods described above only produce satisfactory reconstructions in a limited number of video sequences, such as the conventional “Foreman” sequence.
An adaptive concealment scheme is proposed in the paper by Y. Xu and Y. Zhou, “Adaptive temporal error concealment scheme for H.264/AVC video decoder,” IEEE Transactions on Consumer Electronics, vol. 54, pp. 1846-1851, November 2008, which is incorporated by reference. Here, the recovery procedure is applied to 16×16 or 8×8 blocks, according to the coding mode of surrounding correctly received macroblocks. Moreover, a variable size full search window is opened in the reference frame. All the motion vectors in the search window are tested, and the block that minimizes the weighted outer boundary matching algorithm (OBMA) metric is selected to replace the missing block.
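A full search of this kind can be sketched in Python as follows. The sketch makes several assumptions that are not taken from the cited paper: the window is a fixed square of hypothetical radius `radius`, all boundary pixels carry the same weight, the lost block has a one-pixel margin of correctly received pixels on every side, and frames are lists of rows of luminance values.

```python
def outer_boundary_error(ref, cur, x, y, dx, dy, size):
    """SAD between the one-pixel ring around the candidate block in `ref`
    and the one-pixel ring around the lost block at (x, y) in `cur`
    (corner pixels are ignored; uniform weights are an assumption)."""
    error = 0
    for i in range(size):
        error += abs(ref[y + dy - 1][x + dx + i] - cur[y - 1][x + i])
        error += abs(ref[y + dy + size][x + dx + i] - cur[y + size][x + i])
        error += abs(ref[y + dy + i][x + dx - 1] - cur[y + i][x - 1])
        error += abs(ref[y + dy + i][x + dx + size] - cur[y + i][x + size])
    return error


def full_search(ref, cur, x, y, size=8, radius=4):
    """Test every displacement in a (2*radius+1)^2 window and return the
    motion vector (dx, dy) with the smallest outer boundary error."""
    height, width = len(ref), len(ref[0])
    best_mv, best_error = None, None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            cx, cy = x + dx, y + dy
            # The candidate and its outer ring must lie inside the frame.
            if cx < 1 or cy < 1 or cx + size >= width or cy + size >= height:
                continue
            error = outer_boundary_error(ref, cur, x, y, dx, dy, size)
            if best_error is None or error < best_error:
                best_mv, best_error = (dx, dy), error
    return best_mv
```

The number of candidates grows with the square of the window radius, which is why the size of the search window directly affects the computational load of this kind of scheme.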
This method chooses the replacement sub-block among all the sub-blocks pointed at by motion vectors in a search window in the reference frame. However, the macroblock is subdivided into 8×8 or 16×16 sub-blocks. Consequently, this method may not be capable of reliably reconstructing small details and fine movements. Moreover, since the choice is made considering only the luminance distortion, this method may not take into account the motion characteristics of the area surrounding the lost macroblock.
In conclusion, all the methods described above select the replacement sub-block based only on the boundary distortion between the area in the candidate sub-blocks in the reference frame and the area including the missing block in the current frame. Moreover, although some of the methods consider a fine subdivision of the macroblock in order to correctly reconstruct small details, these methods may fail to reliably reconstruct a lost block in the case of fast or non-homogeneous movements. In decoding systems, however, it may be desirable to reliably reconstruct lost sub-blocks, even in the case of fast or irregular movements, while controlling the computational load to enable error concealment in real-time decoding schemes.