The problem of error control and concealment in video communication becomes increasingly important because of the growing interest in video delivery over unreliable channels such as wireless networks and the Internet. Real-time video transmission is a challenging task, especially when typical error handling mechanisms such as retransmission cannot be used.
The H.264 standard [as discussed in: “Advanced Video Coding for Generic Audiovisual Services,” ISO/IEC 14496-10 and ITU-T Recommendation H.264, November 2007], with its network-friendly approach, has introduced new coding tools to deal with the challenging task of sending video from one device to another. Flexible Macroblock Ordering (FMO) [as discussed in: S. Wenger. “H.264/AVC over IP,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 13, no 7, p. 645-656, July 2003] was introduced to break the traditional raster scan ordering, allowing packets to hold non-consecutive macroblocks (MBs). As a result, packet loss has less impact, and error concealment mechanisms have more information (boundaries) available to produce better visual results.
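The benefit of non-consecutive MB ordering can be illustrated with a minimal sketch. It assumes two slice groups arranged in a checkerboard pattern; the function names and the 4x3 MB picture size are hypothetical and chosen only for illustration, and the mapping is not the standard's normative formula. When one slice group is lost, each interior MB of that group still has all four of its neighbours in the other group.

```python
def checkerboard_slice_groups(width_mbs, height_mbs):
    """Assign each macroblock to slice group 0 or 1 in a checkerboard pattern."""
    return [[(x + y) % 2 for x in range(width_mbs)] for y in range(height_mbs)]


def intact_neighbours(mb_map, x, y, lost_group):
    """Count the 4-connected neighbours of MB (x, y) that are not in the lost group."""
    height, width = len(mb_map), len(mb_map[0])
    count = 0
    for dx, dy in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        nx, ny = x + dx, y + dy
        if 0 <= nx < width and 0 <= ny < height and mb_map[ny][nx] != lost_group:
            count += 1
    return count


# Hypothetical picture of 4x3 macroblocks.
mb_map = checkerboard_slice_groups(4, 3)
# MB (1, 1) belongs to group 0; if group 0 is lost, count its surviving neighbours.
surviving = intact_neighbours(mb_map, 1, 1, lost_group=0)
```

With raster scan ordering, by contrast, a lost packet typically removes a run of horizontally adjacent MBs, so a lost MB may have no intact left or right neighbour.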
Arbitrary Slice Ordering (ASO) [as discussed in: T. Wiegand, G. J. Sullivan, G. Bjontegaard and A. Luthra. “Overview of the H.264/AVC video coding standard,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 13, no. 7, p. 560-576, July 2003] was introduced to break up the relationship between packets, making every slice/packet independently decodable. ASO also enhances the robustness of the data to packet loss. Data partitioning [as discussed in: S. Wenger. “H.264/AVC over IP,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 13, no 7, p. 645-656, July 2003], which is available with the Extended Profile, took things one step further by splitting the prediction information (MB types, motion vectors, etc.) from the residual information (luminance and chrominance values), based on the argument that the prediction information is more important than the residual information. Researchers have verified this assumption by applying Unequal Error Protection (UEP) schemes in which data partitions A, the partitions carrying all syntax elements belonging to Category 2, were better protected. Intra placement, although not a new feature, was enhanced to allow Intra MBs to use information from neighboring Inter MBs.
These tools all share the same two goals: 1) to enhance robustness to data loss and 2) to assist error concealment/resynchronization. They also share the same drawback: the added robustness comes at the cost of reduced compression efficiency, resulting in lower visual quality than a non-error-resilient scheme when there are no errors. Finding a careful compromise between error robustness and coding efficiency, as well as an adequate choice of forward error correction (FEC), is a difficult problem, mainly because channel conditions in wireless environments vary over time.
Error concealment, on the other hand, does not require additional bandwidth; that is, it does not sacrifice bandwidth for error protection. Using correctly received information as well as information from a previous picture, it estimates the values of the missing pixels to reconstruct pictures when errors occur. However, its performance is greatly reduced when lost regions have low spatiotemporal correlation with neighboring regions that have been correctly decoded.
Starting with the initial work of Sun [as discussed in: H. Sun and W. Kwok. “Concealment on Damaged Block Transform Coded Images Using Projections onto Convex Sets,” IEEE Transactions on Image Processing, vol. 4, no 4, p. 470-477, April 1995] and Kwok [as discussed in: W. Kwok and H. Sun. “Multi-directional interpolation for spatial error concealment,” IEEE Transactions on Consumer Electronics, vol. 39, no 3, p. 455-460, June 1993], researchers have published spatial, temporal and hybrid approaches to interpolate lost pixels. The common assumption throughout the error concealment literature is that transmission errors only arise in the form of packet loss, with corrupted packets always being discarded. However, video data falls into the class of applications that benefit from having damaged data delivered rather than discarded [as discussed in: L. A. Larzon, M. Degemark and S. Pink. “UDP lite for real time multimedia,” IEEE International Conference on Communications, June 1999]. Wenger conveniently illustrates the use of forbidden_zero_bit in a scenario where a smart node forwards a corrupted Network Abstraction Layer Unit (NALU) to its destination [as discussed in: S. Wenger. “H.264/AVC over IP,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 13, no 7, p. 645-656, July 2003]. Assuming that corrupted packets do reach the decoder, Weidmann [as discussed in: C. Weidmann, P. Kadlec, O. Nemethova and A. Al Moghrabi. “Combined sequential decoding and error concealment of H.264 video,” IEEE 6th Workshop on Multimedia Signal Processing, p. 299-302, September 2004] uses Joint Source-Channel Decoding (JSCD) to correct the Context-Adaptive Variable-Length Coding (CAVLC) prediction residual coefficients found in data partitions B and C, using the number of MBs extracted from the always intact partitions A to impose additional constraints on the solution.
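The forbidden_zero_bit mentioned above is the most significant bit of the one-byte NAL unit header, followed by the 2-bit nal_ref_idc and the 5-bit nal_unit_type. The following minimal sketch shows how a forwarding node might flag a NALU as corrupted and how a receiver might read the flag. The function names are hypothetical, and the sketch covers only the first header byte, not a complete bitstream parser.

```python
def parse_nal_header(first_byte):
    """Split the first NAL unit byte into its three header fields."""
    forbidden_zero_bit = (first_byte >> 7) & 0x1
    nal_ref_idc = (first_byte >> 5) & 0x3
    nal_unit_type = first_byte & 0x1F
    return forbidden_zero_bit, nal_ref_idc, nal_unit_type


def mark_corrupted(first_byte):
    """Set forbidden_zero_bit to signal that the NALU may contain bit errors."""
    return first_byte | 0x80


# 0x65 is a typical header byte: nal_ref_idc = 3, nal_unit_type = 5.
clean = parse_nal_header(0x65)
flagged = parse_nal_header(mark_corrupted(0x65))
```

A decoder that sees the flag set can then choose between discarding the NALU and attempting to decode or correct it.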
Wang and Yu [as discussed in: Y. Wang and S. Yu. “Joint Source-Channel Decoding for H.264 Coded Video Stream,” IEEE Transactions on Consumer Electronics, vol. 51, no 4, p. 1273-1276, November 2005] apply JSCD to correct motion vectors. Their experiment does not comply with the H.264 standard, however, because in their setup partitions B and C carry the horizontal and vertical motion vectors respectively. Sabeva [as discussed in: G. Sabeva, S. Ben Jamaa, M. Kieffer and P. Duhamel. “Robust Decoding of H.264 Encoded Video Transmitted over Wireless Channels,” IEEE 8th Workshop on Multimedia Signal Processing, p. 9-13, October 2006] applies JSCD to Context-Adaptive Binary Arithmetic Coding (CABAC) encoded bitstreams on the assumption that each packet carries an entire picture, so that the number of MBs in a packet is known a priori. As picture resolution and visual quality increase, the solution becomes increasingly complex computationally.
Lee [as discussed in: W. T. Lee, H. Chen, Y. Hwang and J. J. Chen. “Joint Source-Channel Decoder for H.264 Coded Video Employing Fuzzy Adaptive Method,” IEEE International Conference on Multimedia and Expo, p. 755-758, July 2007] proposes to use Fuzzy Logic to feed information back to the channel decoder, although very few details are given about the Fuzzy Logic engine.
Levine [as discussed in: D. Levine, W. E. Lynch and T. Le-Ngoc. “Iterative Joint Source-Channel Decoding of H.264 Compressed Video,” IEEE International Symposium on Circuits and Systems, p. 1517-1520, May 2007] and Nguyen [as discussed in: N. Q. Nguyen, W. E. Lynch and T. Le-Ngoc. “Iterative Joint Source-Channel Decoding for H.264 video transmission using virtual checking method at source decoder,” 23rd Canadian Conference on Electrical and Computer Engineering, p. 1-4, May 2010] both apply iterative JSCD to a CABAC coded stream. A Slice Candidate Generator produces a list of hypothetical slices by flipping one or more bits in the corrupted slices received. Each candidate is then studied at the semantic level. The bits that seem to have been correctly fixed are fed back into the channel decoder between iterations, until the likeliest bitstream is selected.
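The idea of generating hypothetical slices by flipping bits can be sketched as follows. This is only an illustration under stated assumptions: it enumerates every combination of up to max_flips bit positions exhaustively, whereas the cited works do not specify this strategy here. The names flip_bits and slice_candidates are hypothetical.

```python
from itertools import combinations


def flip_bits(data, positions):
    """Return a copy of data with the given bit positions inverted (bit 0 = MSB of byte 0)."""
    out = bytearray(data)
    for p in positions:
        out[p // 8] ^= 0x80 >> (p % 8)
    return bytes(out)


def slice_candidates(data, max_flips=1):
    """Yield every candidate slice obtained by flipping 1..max_flips bits of data."""
    n_bits = len(data) * 8
    for k in range(1, max_flips + 1):
        for positions in combinations(range(n_bits), k):
            yield flip_bits(data, positions)


# Hypothetical one-byte "slice": 8 single-flip and 28 double-flip candidates.
candidates = list(slice_candidates(b"\x00", max_flips=2))
```

The number of candidates grows combinatorially with the slice length and the number of flipped bits, which is why a practical generator would have to restrict the positions it considers.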
Farrugia [as discussed in: R. Farrugia and C. Debono. “Robust decoder-based error control strategy for recovery of H.264/AVC video content,” IET Communications, vol. 5, no 13, p. 1928-1938, September 2011] uses a list decoding approach, where the M likeliest feasible bitstreams are reconstructed and evaluated in the pixel domain. Using a value of M=5, the approach produces very good visual results. However, its high computational complexity makes it prohibitively costly for most applications. Weidmann applies JSCD to the CAVLC prediction residual coefficients found in data partitions B and C. Using the information from the intact partition A, a stack of hypothetical bitstreams is built and sorted in descending order using a metric in the pixel domain. However, this approach relies heavily on the assumption that data partitions A are immune to transmission errors. Without knowledge of the number of MBs present in data partitions B and C, the stack of hypothetical bitstreams cannot be populated.
Trudeau [as discussed in: L. Trudeau, S. Coulombe and S. Pigeon. “Pixel domain referenceless visual degradation detection and error concealment for mobile video,” 18th IEEE International Conference on Image Processing, p. 2229-2232, September 2011] proposes a two-step solution: decoding the corrupted stream without a list of candidates, and concealing the potentially lost MBs. The fit of the decoded MBs and that of the concealed MBs (the way they connect to the correctly received MBs in the pixel domain) are then compared, and the best fitting MBs are selected for display. Trudeau does not assume the use of any error resilience or data partitioning method.
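Comparing how well two versions of a block connect to their surroundings can be sketched with a simple boundary-matching measure. The sketch below assumes a sum of absolute differences across the block border on a single luminance plane stored as nested lists. This metric and the names boundary_mismatch and pick_best_fit are illustrative assumptions, not the measure used in the cited work.

```python
def boundary_mismatch(frame, block, top, left):
    """Sum of absolute differences between the block's edge pixels and the
    adjacent frame pixels just outside the block (square block assumed)."""
    size = len(block)
    height, width = len(frame), len(frame[0])
    total = 0
    for i in range(size):
        if top > 0:  # row above the block
            total += abs(block[0][i] - frame[top - 1][left + i])
        if top + size < height:  # row below the block
            total += abs(block[size - 1][i] - frame[top + size][left + i])
        if left > 0:  # column to the left
            total += abs(block[i][0] - frame[top + i][left - 1])
        if left + size < width:  # column to the right
            total += abs(block[i][size - 1] - frame[top + i][left + size])
    return total


def pick_best_fit(frame, decoded, concealed, top, left):
    """Return whichever version of the block has the smaller boundary mismatch."""
    d = boundary_mismatch(frame, decoded, top, left)
    c = boundary_mismatch(frame, concealed, top, left)
    return decoded if d <= c else concealed


# Hypothetical 4x4 flat frame with a 2x2 block at row 1, column 1.
frame = [[10] * 4 for _ in range(4)]
decoded = [[10, 10], [10, 10]]
concealed = [[50, 50], [50, 50]]
best = pick_best_fit(frame, decoded, concealed, top=1, left=1)
```

In this toy case the decoded block matches its flat surroundings exactly and is selected; in a damaged stream the decoded version could just as easily be the worse fit.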
Accordingly, there exists a need for an improved method and system to address the problem of video packets whose information is altered or lost during transmission across communications systems due to channel noise or unreliable networks, resulting in, for example, random bit errors or packet losses in a packet network that damage the compressed video stream and lead to visual distortion at the decoder. Additionally, there exists a need for an improved video correction method and system that obviates or mitigates at least some of the above-presented disadvantages of existing systems.