Inter and intra coding methods can both be used to encode interframes in accordance with various video compression standards. Intra coding uses only spatial correlation while inter coding uses temporal correlation from previously coded frames. In general, inter coding is used for macroblocks that are well predicted from previous pictures, and intra coding is used for macroblocks that are not well predicted from previous pictures, or for macroblocks with low spatial activity.
Typically, an encoder makes an inter/intra coding decision for each macroblock based on coding efficiency and subjective quality considerations. In the JVT/H.264/MPEG AVC (“JVT”) standard, inter coding allows various block partitions and multiple reference pictures to be used for predicting a 16×16 macroblock.
The JVT encoder uses tree-structured hierarchical macroblock partitions. Inter-coded 16×16 pixel macroblocks may be broken into macroblock partitions, of sizes 16×8, 8×16, or 8×8. Macroblock partitions of 8×8 pixels are also known as sub-macroblocks. Sub-macroblocks may be further broken into sub-macroblock partitions, of sizes 8×4, 4×8, and 4×4. An encoder may select how to divide the macroblock into partitions and sub-macroblock partitions based on the characteristics of a particular macroblock, in order to maximize compression efficiency and subjective quality.
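The partition hierarchy described above can be sketched as follows. This is a minimal illustration only; the names `MACROBLOCK_PARTITIONS`, `SUB_MACROBLOCK_PARTITIONS`, `blocks_in`, and `motion_vector_count` are hypothetical and do not come from the standard.

```python
# Partition sizes as (width, height) in pixels, following the text above.
MACROBLOCK_PARTITIONS = [(16, 16), (16, 8), (8, 16), (8, 8)]
# (8, 8) here stands for a sub-macroblock that is left unsplit.
SUB_MACROBLOCK_PARTITIONS = [(8, 8), (8, 4), (4, 8), (4, 4)]


def blocks_in(parent, part):
    """Return how many part-sized blocks tile one parent-sized block."""
    parent_w, parent_h = parent
    part_w, part_h = part
    return (parent_w // part_w) * (parent_h // part_h)


def motion_vector_count(sub_partitions):
    """Count motion vectors (for one prediction list) of a macroblock split
    into four sub-macroblocks, given the partition chosen for each one."""
    return sum(blocks_in((8, 8), part) for part in sub_partitions)
```

For example, `motion_vector_count([(4, 4)] * 4)` gives the worst case of sixteen motion vectors per list, which is one reason small partitions are expensive both to search and to signal.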
Furthermore, JVT also supports INTRA, SKIP and DIRECT modes. Three intra types are allowed: INTRA4×4, INTRA16×16, and INTRA8×8, the last of which is available only in the Fidelity Range Extensions. INTRA4×4 and INTRA8×8 support 9 prediction modes: vertical; horizontal; DC; diagonal down/left; diagonal down/right; vertical-left; horizontal-down; vertical-right; and horizontal-up prediction. INTRA16×16 supports 4 prediction modes: vertical; horizontal; DC; and plane prediction.
Multiple reference pictures may be used for inter-prediction, with a reference picture index coded to indicate which of the multiple reference pictures is used. In P pictures (or P slices), only single directional prediction is used, and the allowable reference pictures are managed in list 0. In B pictures (or B slices), two lists of reference pictures are managed, list 0 and list 1. In B pictures (or B slices), single directional prediction using either list 0 or list 1 is allowed, or bi-prediction using both list 0 and list 1 is allowed. When bi-prediction is used, the list 0 and the list 1 predictors are averaged together to form a final predictor.
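The averaging of the list 0 and list 1 predictors can be illustrated with a short sketch. The function name `bi_predict` is hypothetical, and the round-half-up rule is an assumption, since the text above says only that the two predictors are averaged.

```python
def bi_predict(pred_l0, pred_l1):
    """Average two equally sized predictor blocks sample by sample.

    Blocks are lists of rows of integer samples. Rounding half up via
    (a + b + 1) >> 1 is an assumed rounding rule for this sketch.
    """
    return [
        [(a + b + 1) >> 1 for a, b in zip(row_l0, row_l1)]
        for row_l0, row_l1 in zip(pred_l0, pred_l1)
    ]
```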
Each macroblock partition may have an independent reference picture index, prediction type (list 0, list 1, bipred), and an independent motion vector. Each sub-macroblock partition may have independent motion vectors, but all sub-macroblock partitions in the same sub-macroblock use the same reference picture index and prediction type.
For inter-coded macroblocks, in addition to the macroblock partitions described above, P frames also support SKIP mode, while B frames support both SKIP mode and DIRECT mode. In SKIP mode, neither motion nor residual information is encoded. The motion information for a SKIP macroblock is the same as a motion vector predictor that is specified by the picture/slice type (P or B) and by other information such as sequence and slice level parameters, and that is related to other temporally or spatially adjacent macroblocks and to the macroblock's own position within the slice. In contrast, in DIRECT mode, no motion information is encoded, but the prediction residue is encoded. Both macroblocks and sub-macroblocks support DIRECT mode.
As for mode decision, inter pictures need to support both inter and intra modes. Intra modes include INTRA4×4 and INTRA16×16. For P pictures, inter modes include SKIP and 16×16, 16×8, 8×16 and sub-macroblock 8×8 partitions. 8×8 further supports 8×8, 8×4, 4×8 and 4×4 partitions. For B pictures, both list 0 and list 1 and DIRECT mode are considered for both macroblocks and sub-macroblocks.
In the prior art, a Rate-Distortion Optimization (RDO) framework is used for mode decision. For inter modes, motion estimation is separately considered from mode decision. Motion estimation is first performed for all block types of inter modes, then the mode decision is made by comparing the cost of each inter mode and intra mode. The mode with the minimal cost is selected as the best mode.
A conventional procedure to encode one macroblock s in a P- or B-picture (hereinafter the “conventional macroblock encoding procedure”) is summarized as follows.
In a first step of the conventional macroblock encoding procedure, given the last decoded pictures, the Lagrangian multipliers λMODE and λMOTION and the macroblock quantizer QP are decided.
In a second step of the conventional macroblock encoding procedure, motion estimation and reference picture selection are performed by minimizing

$$J(REF, m(REF) \mid \lambda_{MOTION}) = SA(T)D\big(s, c(REF, m(REF))\big) + \lambda_{MOTION} \cdot \big(R(m(REF) - p(REF)) + R(REF)\big)$$

for each reference picture and motion vector of a possible macroblock mode. In the preceding equation, m is the current motion vector being considered, REF denotes the reference picture, p is the motion vector used for the prediction during motion vector coding, R(m−p) represents the bits used for coding the motion vector, and R(REF) represents the bits used for coding the reference picture. SAD denotes the Sum of Absolute Differences between the original signal and the reference signal predicted by the motion vector.
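The motion cost of the second step can be sketched as below. All function names are hypothetical. The rate model in `mv_bits` is an assumption chosen for illustration (its value equals the length of a signed Exp-Golomb code word for each motion vector component); an actual encoder would use the bit counts of its own entropy coder.

```python
def sad(block_a, block_b):
    """Sum of Absolute Differences between two equally sized blocks."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    )


def mv_bits(mv, mv_pred):
    """Assumed rate model for R(m - p): bits grow with the size of each
    component of the motion vector difference."""
    return sum(2 * abs(m - p).bit_length() + 1 for m, p in zip(mv, mv_pred))


def motion_cost(orig, predicted, mv, mv_pred, ref_bits, lambda_motion):
    """J = SAD(s, c) + lambda_MOTION * (R(m - p) + R(REF))."""
    return sad(orig, predicted) + lambda_motion * (mv_bits(mv, mv_pred) + ref_bits)
```

In a search loop, `motion_cost` would be evaluated for every candidate motion vector and reference picture, and the candidate with the smallest value kept.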
In a third step of the conventional macroblock encoding procedure, the macroblock prediction mode is chosen by minimizing

$$J(s, c, MODE \mid QP, \lambda_{MODE}) = SSD(s, c, MODE \mid QP) + \lambda_{MODE} \cdot R(s, c, MODE \mid QP),$$

given QP and λMODE when varying MODE. SSD denotes the Sum of Squared Differences between the original signal and the reconstructed signal. R(s,c,MODE|QP) is the number of bits associated with choosing MODE, including the bits for the macroblock header, the motion and all DCT coefficients. MODE indicates a mode out of the set of potential macroblock modes:
$$\text{P-frame: } MODE \in \{\text{INTRA}4{\times}4,\ \text{INTRA}16{\times}16,\ \text{SKIP},\ 16{\times}16,\ 16{\times}8,\ 8{\times}16,\ 8{\times}8,\ 8{\times}4,\ 4{\times}8,\ 4{\times}4\},$$

$$\begin{aligned}
\text{B-frame: } MODE \in \{ & \text{INTRA}4{\times}4,\ \text{INTRA}16{\times}16,\ \text{DIRECT},\ \text{DIRECT\_}8{\times}8, \\
& \text{L0\_}16{\times}16,\ \text{L0\_}16{\times}8,\ \text{L0\_}8{\times}16,\ \text{L0\_}8{\times}8,\ \text{L0\_}8{\times}4,\ \text{L0\_}4{\times}8,\ \text{L0\_}4{\times}4, \\
& \text{L1\_}16{\times}16,\ \text{L1\_}16{\times}8,\ \text{L1\_}8{\times}16,\ \text{L1\_}8{\times}8,\ \text{L1\_}8{\times}4,\ \text{L1\_}4{\times}8,\ \text{L1\_}4{\times}4, \\
& \text{Bi\_}16{\times}16,\ \text{Bi\_}16{\times}8,\ \text{Bi\_}8{\times}16,\ \text{Bi\_}8{\times}8,\ \text{Bi\_}8{\times}4,\ \text{Bi\_}4{\times}8,\ \text{Bi\_}4{\times}4 \}.
\end{aligned}$$

The INTRA4×4 includes modes:

$$MODE \in \{\text{vertical},\ \text{horizontal},\ \text{DC},\ \text{diagonal-down/left},\ \text{diagonal-down/right},\ \text{vertical-left},\ \text{horizontal-down},\ \text{vertical-right},\ \text{horizontal-up}\}$$

and INTRA16×16 includes modes:

$$MODE \in \{\text{vertical},\ \text{horizontal},\ \text{DC},\ \text{plane}\}.$$
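The third step amounts to picking the candidate mode with the smallest Lagrangian cost. The sketch below assumes the SSD and bit count of each candidate have already been measured; the function names and the numbers in the usage note are hypothetical.

```python
def rd_cost(ssd, bits, lambda_mode):
    """J = SSD + lambda_MODE * R for one candidate mode."""
    return ssd + lambda_mode * bits


def choose_mode(candidates, lambda_mode):
    """Return the mode name with minimal cost.

    `candidates` maps a mode name to an (ssd, bits) pair that is assumed
    to have been obtained by actually coding the macroblock in that mode.
    """
    return min(
        candidates,
        key=lambda mode: rd_cost(candidates[mode][0], candidates[mode][1], lambda_mode),
    )
```

With the made-up measurements `{"SKIP": (900, 1), "16x16": (400, 40), "8x8": (250, 120)}`, a multiplier of 5 selects 16x16 while a multiplier of 0.5 selects 8x8, which shows how a larger λMODE shifts the decision toward modes that cost fewer bits.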
With respect to the conventional macroblock encoding procedure, a conventional fast mode selection was introduced that could considerably reduce the complexity of mode decision with little impact on quality. It relies on the observation that the mode decision error surface is likely to be monotonic, so that examining certain modes first may make it simpler to find the best mode. If mode decision for a given mode is not performed, then motion estimation for that mode is also not performed, the latter being the most costly part of encoding even if a fast motion estimation algorithm is used. More specifically, in this approach SKIP and 16×16 modes were examined first. According to their distortion relationship (i.e., J(SKIP)<J(16×16)) and the availability of residual, a further decision was made whether or not to terminate the search. If the search was not terminated, J(8×8) and J(4×4) were also computed. Based on the relationship of J(16×16), J(8×8), and J(4×4), additional decisions were made to determine which of the remaining block sizes should be tested. For example, if the distortion is monotonic (i.e., J(16×16)>J(8×8)>J(4×4) or J(16×16)<J(8×8)<J(4×4)), then it can easily be determined which additional partitions should be examined. In the first case, for example, only small partitions (8×4 and 4×8) are tested, while in the second case only 16×8 and 8×16 are examined. If the distortion is not monotonic, then all possible modes are tested.
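The monotonicity test described above can be sketched as follows. The function names are hypothetical, and the early termination rule in `can_terminate` is an assumption, because the text states only that the decision depends on J(SKIP)<J(16×16) and on the availability of residual.

```python
def can_terminate(j_skip, j_16x16, has_residual):
    """Assumed early-termination rule: stop when SKIP is cheaper than
    16x16 and 16x16 leaves no residual to code."""
    return j_skip < j_16x16 and not has_residual


def remaining_modes(j_16x16, j_8x8, j_4x4):
    """Select which remaining partitions to test from the three costs."""
    if j_16x16 > j_8x8 > j_4x4:
        # Cost falls as blocks shrink: test only the small partitions.
        return ["8x4", "4x8"]
    if j_16x16 < j_8x8 < j_4x4:
        # Cost rises as blocks shrink: test only the large partitions.
        return ["16x8", "8x16"]
    # Not monotonic: test everything that is left.
    return ["16x8", "8x16", "8x4", "4x8"]
```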
In a different conventional fast mode decision approach, additional conditions were introduced based on the distortion values (see FIG. 1 below) and the relationships between different modes (see FIG. 2 below), which allowed the search to terminate even faster without much impact on quality.
Turning to FIG. 1, a method for motion vector and mode decision based on distortion values is generally indicated using the reference numeral 100. The method 100 includes a start block 102 that passes control to a function block 104. The function block 104 checks SKIP mode and 16×16 mode, and passes control to a decision block 106. The decision block 106 determines whether or not the distortion in SKIP mode, J(SKIP), is less than the distortion in 16×16 mode, J(16×16), and whether or not 16×16 mode has any residue. If the distortion in SKIP mode is not less than the distortion in 16×16 mode and/or 16×16 mode has a residue, then control is passed to a function block 108. Otherwise, if the distortion in SKIP mode is less than the distortion in 16×16 mode and 16×16 mode has no residue, then control is passed to a decision block 126.
The function block 108 checks 8×8 mode for a current (i.e., currently evaluated) 8×8 sub-partition, and passes control to a decision block 110 and to a function block 114. The decision block 110 determines whether or not 8×8 mode has the same motion information as 16×16 mode for the current 8×8 sub-partition. If 8×8 mode does not have the same motion information as 16×16 mode for the current 8×8 sub-partition, then control is passed to a function block 112. Otherwise, if 8×8 mode has the same motion information as 16×16 mode for the current 8×8 sub-partition, then control is passed to a function block 114.
The function block 112 checks 16×8 and 8×16 sub-partitions, and passes control to function block 114.
The function block 114 checks 4×4 mode for a current 4×4 sub-partition, and passes control to a decision block 116 and to a function block 120. The decision block 116 determines whether or not 4×4 mode has the same motion information as 8×8 mode for the current 4×4 sub-partition. If 4×4 mode does not have the same motion information as 8×8 mode for the current 4×4 sub-partition, then control is passed to a function block 118. Otherwise, if 4×4 mode has the same motion information as 8×8 mode for the current 4×4 sub-partition, then control is passed to a function block 120.
The function block 118 checks 8×4 and 4×8 sub-partitions, and passes control to function block 120.
The function block 120 checks intra modes, and passes control to a function block 122. The function block 122 selects the best mode from among the evaluated modes, and passes control to an end block 124. The end block 124 ends the macroblock encoding.
The decision block 126 determines whether or not SKIP mode has the same motion information as 16×16 mode for a current (i.e., currently evaluated) 16×16 MB. If SKIP mode does not have the same motion information as 16×16 mode for the current 16×16 MB, then control is passed to the function block 108. Otherwise, if SKIP mode has the same motion information as 16×16 mode for the current 16×16 MB, then control is passed to the function block 120.
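The control flow of method 100 can be summarized in a short sketch that returns the modes checked, in order. The function and parameter names are hypothetical. As a simplification, each motion-information comparison is passed in as a single boolean, whereas in the flow above it is evaluated per sub-partition and only after the corresponding mode has been checked.

```python
def fig1_modes_checked(j_skip, j_16x16, residue_16x16,
                       skip_same_as_16x16, mv_8x8_same_as_16x16,
                       mv_4x4_same_as_8x8):
    """Return the ordered list of modes that method 100 would check."""
    checked = ["SKIP", "16x16"]                        # block 104
    if j_skip < j_16x16 and not residue_16x16:         # block 106
        if skip_same_as_16x16:                         # block 126
            return checked + ["INTRA"]                 # block 120
    checked.append("8x8")                              # block 108
    if not mv_8x8_same_as_16x16:                       # block 110
        checked += ["16x8", "8x16"]                    # block 112
    checked.append("4x4")                              # block 114
    if not mv_4x4_same_as_8x8:                         # block 116
        checked += ["8x4", "4x8"]                      # block 118
    checked.append("INTRA")                            # block 120
    return checked
```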
Turning to FIG. 2, a method for motion vector and mode decision based on relationships between different modes is generally indicated using the reference numeral 200. The method 200 includes a start block 202 that passes control to a function block 204. The function block 204 checks SKIP mode and 16×16 mode, and passes control to a decision block 206. The decision block 206 determines whether or not MC2>T1, where MC2=min(J(SKIP), J(16×16)), the minimum distortion between SKIP mode and 16×16 mode, and T1 is the first threshold. If MC2<=T1, then control is passed to a decision block 208. Otherwise, if MC2>T1, then control is passed to a function block 210 and a function block 212.
The decision block 208 determines whether or not MC2 is greater than T2 (a second threshold). If MC2 is not greater than T2, then control is passed to function block 210 and function block 212. Otherwise, if MC2 is greater than T2, then control is passed to a function block 218.
The function block 210 checks other inter modes, and passes control to a function block 212. The function block 212 checks other non-tested intra modes, and passes control to a function block 214. The function block 214 selects the best mode from among the evaluated modes, and passes control to an end block 216. The end block 216 ends the macroblock encoding.
The function block 218 checks the intra4×4 DC, and passes control to a decision block 220. The decision block 220 determines whether or not J(INTRA4×4 DC) is less than a*MC2+b, where a and b are constants. If J(INTRA4×4 DC) is not less than a*MC2+b, then control is passed to function block 210 and function block 212. Otherwise, if J(INTRA4×4 DC) is less than a*MC2+b, then control is passed to the function block 212.
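The threshold tests of method 200 can be sketched as below, following the branches exactly as stated above. The function name, parameter names, and the labels `OTHER_INTER` and `OTHER_INTRA` are hypothetical. The INTRA4×4 DC cost is passed in as a number for simplicity, although in the flow it would be computed only when block 218 is reached.

```python
def fig2_plan(j_skip, j_16x16, j_intra4x4_dc, t1, t2, a, b):
    """Return the ordered list of checks that method 200 would perform."""
    mc2 = min(j_skip, j_16x16)                     # block 206
    plan = ["SKIP", "16x16"]                       # block 204
    check_other_inter = True
    if t2 < mc2 <= t1:                             # blocks 206 and 208
        plan.append("INTRA4x4_DC")                 # block 218
        if j_intra4x4_dc < a * mc2 + b:            # block 220
            check_other_inter = False
    if check_other_inter:
        plan.append("OTHER_INTER")                 # block 210
    plan.append("OTHER_INTRA")                     # block 212
    return plan
```

Under this reading, the remaining inter modes are skipped only when MC2 lies between the two thresholds and the INTRA4×4 DC cost falls below a*MC2+b.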
In another different conventional fast mode decision approach, a picture was first analyzed using simple methods such as homogeneity analysis and stationarity detection. Homogeneity analysis can be performed by considering simple statistical measurements such as standard deviation or variance, skewness and kurtosis. Unfortunately, these metrics might not be appropriate for real-time implementations. The determination of which modes should be considered was also based in part on a yet different conventional approach using a fast intra decision and, in particular, relating to edge direction. A method relating to the approach that uses homogeneity analysis and stationarity detection can be seen in FIG. 3, where the 16×16, 16×8, 8×16, 8×8, 8×4, 4×8, and 4×4 modes are numbered sequentially as modes 1 through 7.
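The statistical measurements mentioned above can be computed as in the following sketch. The function names and the variance threshold are hypothetical, population (rather than sample) moments are an assumed convention, and returning zero skewness and kurtosis for a perfectly flat block is a choice made here to avoid division by zero.

```python
def block_statistics(samples):
    """Return variance, skewness and kurtosis of a flat list of samples."""
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((x - mean) ** 2 for x in samples) / n
    if m2 == 0:
        # Perfectly flat block: higher moments are undefined, report zeros.
        return {"variance": 0.0, "skewness": 0.0, "kurtosis": 0.0}
    m3 = sum((x - mean) ** 3 for x in samples) / n
    m4 = sum((x - mean) ** 4 for x in samples) / n
    return {
        "variance": m2,
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,
    }


def is_homogeneous(samples, variance_threshold):
    """Assumed test: a block is homogeneous if its variance is small."""
    return block_statistics(samples)["variance"] < variance_threshold
```

Each call makes several passes over the block, which hints at why such metrics may be a poor fit for real-time encoders.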
Turning to FIG. 3, a method for mode decision using homogeneity and stationarity is generally indicated using the reference numeral 300. The method 300 includes a start block 302 that passes control to a function block 304. The function block 304 performs edge detection, and passes control to a function block 306. The function block 306 performs fast intra mode decision, and passes control to a function block 308. The function block 308 sets mode 1 to mode 7 flags, and passes control to a decision block 310. The decision block 310 determines whether or not a subject (i.e., currently evaluated) 16×16 macroblock (MB) has zero motion. If the 16×16 MB does not have zero motion, then control is passed to a decision block 312. Otherwise, if the 16×16 MB does have zero motion, then control is passed to a function block 318.
The decision block 312 determines whether or not the 16×16 MB is homogeneous. If the 16×16 MB is not homogeneous, then control is passed to a decision block 314. Otherwise, if the 16×16 MB is homogeneous, then control is passed to a function block 328.
The decision block 314 determines whether or not each 8×8 sub-block of the 16×16 block is homogeneous. If each 8×8 sub-block is not homogeneous, then control is passed to a decision block 316. Otherwise, if each 8×8 sub-block is homogeneous, then control is passed to a function block 332.
The decision block 316 determines whether or not a subject 8×8 sub-block is the last sub-block in the 16×16 MB. If the 8×8 sub-block is not the last sub-block in the 16×16 MB, then control is returned to the decision block 314. Otherwise, if the 8×8 sub-block is the last sub-block in the 16×16 MB, then control is passed to a function block 324. The function block 324 performs motion estimation on different block sizes only for modes that have set flags, and passes control to an end block 326. The end block 326 ends the macroblock encoding.
The function block 318 computes the MB difference, and passes control to a decision block 320. The decision block 320 determines whether or not the MB difference is less than a pre-specified threshold. If the MB difference is not less than the pre-specified threshold, then control is passed to the decision block 312. Otherwise, if the MB difference is less than the pre-specified threshold, then control is passed to a function block 322.
The function block 322 clears all mode flags except mode 1, and passes control to the function block 324.
The function block 328 clears the mode 4, 5, 6, and 7 flags, and passes control to a function block 330. The function block 330 clears the mode 2 flag when intra vertical prediction is selected, clears the mode 3 flag when intra horizontal prediction is selected, and otherwise clears both the mode 2 and mode 3 flags, and then passes control to the function block 324.
The function block 332 clears the mode 5, 6, and 7 flags for the 8×8 sub-block, and passes control to the decision block 316.
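The flag handling of method 300 can be sketched as follows. The function name, parameter names, and the data layout are assumptions: flags for modes 1 through 4 are kept as one set per macroblock, and flags for modes 5 through 7 as one set per 8×8 sub-block. Edge detection and the fast intra decision (blocks 304 and 306) are represented only by the `intra_direction` input.

```python
def fig3_mode_flags(zero_motion, mb_difference, threshold,
                    mb_homogeneous, sub_homogeneous, intra_direction):
    """Return (macroblock flags, per-sub-block flags) left set for block 324.

    Modes: 1=16x16, 2=16x8, 3=8x16, 4=8x8, 5=8x4, 6=4x8, 7=4x4.
    `sub_homogeneous` holds one boolean per 8x8 sub-block.
    """
    mb_flags = {1, 2, 3, 4}                           # block 308
    sub_flags = [{5, 6, 7} for _ in sub_homogeneous]  # block 308
    no_sub_flags = [set() for _ in sub_homogeneous]

    if zero_motion and mb_difference < threshold:     # blocks 310, 318, 320
        return {1}, no_sub_flags                      # block 322

    if mb_homogeneous:                                # block 312
        mb_flags.discard(4)                           # block 328 (modes 4-7)
        if intra_direction == "vertical":             # block 330
            mb_flags.discard(2)
        elif intra_direction == "horizontal":
            mb_flags.discard(3)
        else:
            mb_flags -= {2, 3}
        return mb_flags, no_sub_flags

    for i, homogeneous in enumerate(sub_homogeneous): # blocks 314 and 316
        if homogeneous:
            sub_flags[i] = set()                      # block 332
    return mb_flags, sub_flags
```

Motion estimation (block 324) would then run only for the modes whose flags remain in the returned sets.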
Inter mode decision involves motion estimation, various block sizes and multiple reference picture selection. Intra mode decision involves various block types and multiple spatial prediction mode selection. Therefore, mode decision for interframes imposes a heavy burden on the encoder.
Accordingly, it would be desirable and highly advantageous to have a method and apparatus for performing a fast mode decision for interframes that lessens the burden on the encoder.