Video sequences can be represented as progressive or interlaced signals. Progressive sampling presents video material in a simple, orthogonal way but demands large amounts of bandwidth. Interlaced sampling was created to alleviate bandwidth requirements by sub-sampling each video frame into even lines and odd lines captured at different times, thereby reducing bandwidth by half. When there is no motion in the sequence, the vertical resolution of a video sequence sampled in the interlaced format is essentially equivalent to that of the progressive representation. However, when there is motion in the sequence, video frames may show visible artifacts due to interfield motion, because the even and odd lines are sampled at different times.
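The field structure described above can be illustrated with a minimal Python sketch. The names split_fields, weave, and make_frame, as well as the tiny 8x4 test picture with a one-pixel-wide vertical bar, are assumptions made purely for illustration and are not taken from any standard or reference software.

```python
def split_fields(frame):
    """Split a frame (a list of pixel rows) into its even-line and odd-line fields."""
    top = frame[0::2]      # even lines: 0, 2, 4, ...
    bottom = frame[1::2]   # odd lines: 1, 3, 5, ...
    return top, bottom


def weave(top, bottom):
    """Interleave two fields back into a single frame."""
    frame = []
    for even_row, odd_row in zip(top, bottom):
        frame.append(even_row)
        frame.append(odd_row)
    return frame


def make_frame(bar_x, width=8, height=4):
    """Hypothetical test picture: a bright vertical bar at column bar_x."""
    return [[255 if x == bar_x else 0 for x in range(width)]
            for _ in range(height)]


# No motion: both fields come from the same scene, so weaving restores the frame.
static = make_frame(2)
static_top, static_bottom = split_fields(static)
restored = weave(static_top, static_bottom)

# Motion: the bar moves from column 2 to column 4 between the two field times,
# so the woven frame alternates bar positions from line to line.
top_t0, _ = split_fields(make_frame(2))
_, bottom_t1 = split_fields(make_frame(4))
combed = weave(top_t0, bottom_t1)
bar_positions = [row.index(255) for row in combed]
```

In the static case the woven frame is identical to the original picture, whereas in the moving case the bar position alternates between columns 2 and 4 on successive lines, which is the interfield motion artifact referred to above.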
Interlaced coding is an important feature of many coding standards, such as the MPEG-2 standard and the H.264/AVC (MPEG-4 Part 10) standard (International Organization for Standardization ISO/IEC JTC 1/SC 29 WG/11, ISO/IEC 14496-10 Advanced Video Coding Standard 2005, H.264/AVC Video Coding Standard Document). While it is possible to code all interlaced material as separate fields (i.e., Field Coding), some material is more efficiently coded as progressive frames (i.e., Frame Coding). The global selection between field coding and frame coding is referred to as Adaptive Frame/Field coding (AFF). Better compression efficiency can be obtained by adaptively coding each individual macroblock as either progressive (frame) or interlaced (field). The latter approach is known as Macroblock Adaptive Frame/Field coding, or MBAFF.
Although the H.264/AVC Standard provides better interlaced coding mechanisms than previous international standards, the problem of properly selecting macroblocks for frame or field coding remains. Selecting material for frame coding when it should have been coded as interlaced (or vice versa) can degrade coding efficiency and, therefore, quality.
One approach to selecting between frame coding and field coding is (i) to code every frame both as progressive (frame coding) and as interlaced (field coding) and (ii) to code each macroblock both as progressive (frame coding) and as interlaced (field coding). A final selection is then made as to the best choice in terms of target rate and distortion. Such an approach is taken by the JM reference software developed by the ISO/MPEG Committee (International Organization for Standardization ISO/IEC JTC 1/SC 29 WG/11; ISO/IEC 14496-10 Advanced Video Coding Standard 2005; JM Software Model 10.6). While effective, this technique, which may be referred to as a brute force approach, requires large amounts of processing power because the material has to be coded multiple times in order to arrive at the optimal solution.
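The cost of the brute force approach can be seen in a minimal Python sketch. It assumes a Lagrangian cost J = D + lambda * R as one common way of weighing distortion against rate; the text above does not specify the cost function. The names rd_cost, brute_force_select, encode_frame, and encode_field are hypothetical, and the two encoders are stand-ins that look up made-up (distortion, bits) pairs rather than performing any real coding.

```python
def rd_cost(distortion, rate_bits, lam):
    """Assumed Lagrangian rate-distortion cost: J = D + lambda * R."""
    return distortion + lam * rate_bits


def brute_force_select(units, encode_frame, encode_field, lam):
    """Code every unit twice and keep the mode with the lower cost.

    encode_frame and encode_field are hypothetical callables that return a
    (distortion, rate_bits) pair for a unit (a picture or a macroblock).
    """
    decisions = []
    for unit in units:
        d_frame, r_frame = encode_frame(unit)   # first full coding pass
        d_field, r_field = encode_field(unit)   # second full coding pass
        j_frame = rd_cost(d_frame, r_frame, lam)
        j_field = rd_cost(d_field, r_field, lam)
        decisions.append("frame" if j_frame <= j_field else "field")
    return decisions


# Stand-in encoders: precomputed, made-up (distortion, bits) results per unit.
FRAME_RESULTS = {"mb0": (100.0, 50), "mb1": (200.0, 40)}
FIELD_RESULTS = {"mb0": (80.0, 90), "mb1": (120.0, 60)}

modes = brute_force_select(
    ["mb0", "mb1"],
    encode_frame=FRAME_RESULTS.__getitem__,
    encode_field=FIELD_RESULTS.__getitem__,
    lam=1.0,
)
```

Every unit is passed through both encoders before a decision is made, so the coding work is at least doubled relative to a scheme that decides the mode in advance.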
A second approach to selecting between frame coding and field coding is to analyze the input video at the frame/field level and, together with group of pictures (GOP) and rate control criteria, make decisions to code the material as entire frames or fields. Such an approach is described by X. Zhang, A. Vetro, H. Sun, Y. Shi: Adaptive Field/Frame Selection for High Compression Coding. Mitsubishi Electric Research Laboratory Report TR-2003-29, January 2003. However, the second technique uses relatively complex variance computations and relies on knowledge of GOP structures for better performance. Furthermore, the second technique does not address MBAFF coding.
In principle, the best way to code interlaced material is to adaptively code each macroblock as frame or field. The selection between frame coding and field coding for each macroblock can be simplified by deriving statistics from the motion vectors obtained by the motion estimation process. When these vectors are examined in a small area, taking into account spatial predictors, a decision can be made as to whether to code a macroblock in frame mode or field mode. Such an approach is described in Y. Qu, G. Li, Y. He: A Fast MBAFF Mode Prediction Strategy for H.264/AVC, ICSP Proceedings 2004, pp. 1195-1198 (Qu et al.).
The approach described by Qu et al. first determines whether the entire frame should be coded in progressive or interlaced mode; if interlaced mode is selected, macroblock-based decisions are then made. Variances are used as statistical measures for each macroblock, and are obtained for frame and field coding modes based on the mean of a large number of macroblocks. The approach described by Qu et al. has the disadvantage of relying on motion vectors obtained by a motion estimation process that is governed by rate distortion characteristics that may not fit the nature of interlaced video (i.e., prediction error minimization is not a good indicator of the interlaced nature of the content).
Yet another simplification of the brute force approach is to look at macroblock activity measures based on the sum of absolute differences (SAD) for each macroblock. Such an approach is described in M. Guerrero, R. Tsang, J. Chan: Fast Macroblock Adaptive Frame/Field Coding Selection in H.264, Stanford EE398 Class, Spring 2005 (Guerrero et al.). Together with motion vector analysis and macroblock neighbor considerations, the approach in Guerrero et al. can reduce the effort of classifying macroblocks for frame or field coding. However, the approach in Guerrero et al. uses only adjacent vertical pixels to derive the activity measure for the macroblock. It also relies on motion vectors that are derived in the normal motion estimation process and are therefore optimized to reduce prediction error without regard to actual interlaced characteristics. Moreover, when neighbors are considered, any incorrect coding decision can easily propagate through the rest of the picture.
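As a rough illustration of a SAD-based activity measure of the general kind discussed above, the following Python sketch compares vertical differences between adjacent lines (a frame-style measure) with differences between lines two apart, which belong to the same field (a field-style measure). This is not the procedure of Guerrero et al.; the names vertical_sad and classify_macroblock, the comparison rule, and the tie-break toward frame mode are all assumptions made for illustration.

```python
def vertical_sad(block, step):
    """Sum of absolute differences between rows that are `step` lines apart."""
    total = 0
    for y in range(len(block) - step):
        for upper, lower in zip(block[y], block[y + step]):
            total += abs(upper - lower)
    return total


def classify_macroblock(block):
    """Assumed rule: choose field mode when adjacent lines differ more than
    same-parity lines do, which suggests interfield motion."""
    frame_activity = vertical_sad(block, 1)   # adjacent lines (mixed fields)
    field_activity = vertical_sad(block, 2)   # lines of the same field
    return "field" if frame_activity > field_activity else "frame"


SIZE = 16  # macroblock dimension used for the hypothetical test blocks

# Smooth vertical gradient: adjacent lines are similar, no interlace artifacts.
gradient = [[10 * y] * SIZE for y in range(SIZE)]

# Combed block: even and odd lines differ strongly, as with interfield motion.
combed = [[0 if y % 2 == 0 else 100] * SIZE for y in range(SIZE)]

gradient_mode = classify_macroblock(gradient)
combed_mode = classify_macroblock(combed)
```

With these test blocks the gradient is classified for frame coding and the combed block for field coding. Because the measure looks only at vertical pixel differences inside one macroblock, it shares the limitation noted above of ignoring other indicators of interlaced content.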