Self Derivation of Motion Estimation (SDME) is a process in video encoding and decoding in which motion vector information is derived at the decoder, rather than explicitly transmitted or otherwise conveyed from the encoder to the decoder. Because motion vector information is not transmitted or conveyed from the video encoder side to the video decoder side, a higher coding efficiency is achieved. In state-of-the-art coding schemes, SDME is performed only for the bi-predictive mode (or B prediction).
In greater detail, in the International Organization for Standardization/International Electrotechnical Commission (ISO/IEC) Moving Picture Experts Group-4 (MPEG-4) Part 10 Advanced Video Coding (AVC) Standard/International Telecommunication Union, Telecommunication Sector (ITU-T) H.264 Recommendation (hereinafter the “MPEG-4 AVC Standard”), a macroblock (MB) can be partitioned into various blocks for encoding, and a motion vector is assigned to each partitioned block. To save the bandwidth spent on motion vector information, a first prior art approach proposes deriving motion vectors at the video decoder side: the original B-Skip mode is replaced with a mode whose motion vector is derived by a new mirror-based motion search operation performed at both the encoder and the decoder. A second prior art approach further explores self derivation of motion estimation in order to design a new SDME coding mode that extends the block size to increase the prediction accuracy.
Mirror motion estimation has been explored for SDME to predict the motion vector between forward and backward reference pictures. Turning to FIG. 1, an example of how mirror motion estimation is performed for the scenario of two B pictures between two reference pictures L0 and L1 is indicated generally by the reference numeral 100. In the example, reference picture L0 is denoted by FW Ref and reference picture L1 is denoted by BW Ref. The two B pictures are denoted by B0 and B1. Consider B0 as the current encoding picture. A motion vector between B0 and FW Ref is denoted by MV0, and a motion vector between B0 and BW Ref is denoted by MV1. The current encoding picture, namely B0, includes a current or target block 110. The reference picture FW Ref (as well as reference picture BW Ref, although not explicitly shown therefor) includes a search window 120 and a reference (ref) block 125 within the search window 120. When encoding a target block in B0, the SDME can be generally described as follows:

1. Specify a search window in the forward reference picture.
2. Specify a search pattern in the forward search window. Either a full search or a simplified fast search pattern can be selected, and the same search pattern is applied on both the video encoder side and the video decoder side.
3. For motion vector MV0 in the forward search window, the mirror motion vector MV1 in the backward search window is derived as follows based on the temporal picture distance, where d0 is the distance between the current picture and the forward reference picture and d1 is the distance between the current picture and the backward reference picture:

$$\mathrm{MV1} = -\frac{d_1}{d_0}\,\mathrm{MV0}$$

4. Calculate the cost metric of the motion search (using the sum of absolute differences (SAD)) between the reference block (pointed to by MV0) in the forward reference picture and the reference block (pointed to by MV1) in the backward reference picture.
5. The SDME motion vector is selected as the MV0 candidate with the minimum SAD value, with all candidates in the search pattern evaluated in spiral order.
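The five steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method of either prior art approach: motion vectors are integer-pel, the mirror vector is rounded to the nearest integer, pictures are lists of rows indexed as picture[y][x], and the spiral order is approximated by visiting candidates ring by ring outward from the zero vector. The function names (sad, mirror_mv, sdme_mirror_search) are hypothetical.

```python
def sad(ref0, ref1, x0, y0, x1, y1, size):
    """Sum of absolute differences between two size x size blocks."""
    total = 0
    for j in range(size):
        for i in range(size):
            total += abs(ref0[y0 + j][x0 + i] - ref1[y1 + j][x1 + i])
    return total


def mirror_mv(mv0, d0, d1):
    """Step 3: MV1 = -(d1 / d0) * MV0, rounded to integer pel (assumption)."""
    return (round(-d1 * mv0[0] / d0), round(-d1 * mv0[1] / d0))


def in_bounds(picture, x, y, size):
    """True if the size x size block at (x, y) lies inside the picture."""
    return 0 <= x and x + size <= len(picture[0]) and 0 <= y and y + size <= len(picture)


def sdme_mirror_search(fw_ref, bw_ref, bx, by, size, search_range, d0, d1):
    """Return (best_sad, MV0, MV1) for the target block at (bx, by).

    Only the two reference pictures are read, so an encoder and a decoder
    running this same search arrive at the same motion vectors.
    """
    # Steps 1-2: full search over a square window in the forward reference.
    candidates = [(dx, dy)
                  for dy in range(-search_range, search_range + 1)
                  for dx in range(-search_range, search_range + 1)]
    # Simplified stand-in for spiral order: inner rings are visited first.
    candidates.sort(key=lambda mv: (max(abs(mv[0]), abs(mv[1])), mv[1], mv[0]))

    best = None
    for mv0 in candidates:
        mv1 = mirror_mv(mv0, d0, d1)
        x0, y0 = bx + mv0[0], by + mv0[1]
        x1, y1 = bx + mv1[0], by + mv1[1]
        if not (in_bounds(fw_ref, x0, y0, size) and in_bounds(bw_ref, x1, y1, size)):
            continue  # skip candidates whose blocks leave either picture
        # Step 4: cost between the forward and backward reference blocks.
        cost = sad(fw_ref, bw_ref, x0, y0, x1, y1, size)
        # Step 5: keep the first candidate (in visiting order) with minimum SAD.
        if best is None or cost < best[0]:
            best = (cost, mv0, mv1)
    return best
```

Note that the cost in step 4 never touches the target block itself, which is what allows the decoder to repeat the search without receiving any motion vector information.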
Using mirror ME, a pair of motion vectors MV0 and MV1 is derived. We denote the current target block as T. The forward prediction pixel in the forward reference picture R0, denoted as R0(MV0), is located by MV0 in the forward reference picture. The backward prediction pixel in the backward reference picture R1, denoted as R1(MV1), is located by MV1 in the backward reference picture. The bi-directional prediction of SDME can be either the simple average of R0(MV0) and R1(MV1), or the weighted average [R0(MV0)*d1+R1(MV1)*d0+(d0+d1)/2]/(d0+d1).
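The two prediction options can be illustrated with a short Python sketch. It assumes integer pixel values and integer division, so that the (d0+d1)/2 term acts as a rounding offset; the rounding offset used for the simple average is likewise an assumption, since the text does not specify one. The function names are hypothetical.

```python
def bi_predict_pixel(p0, p1, d0, d1, weighted=True):
    """Combine forward pixel p0 = R0(MV0) and backward pixel p1 = R1(MV1)."""
    if weighted:
        # [R0(MV0)*d1 + R1(MV1)*d0 + (d0+d1)/2] / (d0+d1):
        # the temporally closer reference receives the larger weight.
        return (p0 * d1 + p1 * d0 + (d0 + d1) // 2) // (d0 + d1)
    # Simple average; the +1 rounding offset is an assumption.
    return (p0 + p1 + 1) // 2


def bi_predict_block(block0, block1, d0, d1, weighted=True):
    """Apply the pixel rule to two equally sized blocks (lists of rows)."""
    return [[bi_predict_pixel(a, b, d0, d1, weighted) for a, b in zip(row0, row1)]
            for row0, row1 in zip(block0, block1)]
```

For example, with d0 = 1 and d1 = 2, a forward pixel of 100 and a backward pixel of 130 give (100*2 + 130*1 + 1) // 3 = 110 under the weighted rule, closer to the forward value because the forward reference is temporally nearer.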
Multiple block partitions can be available for SDME. The encoder and decoder should adopt the same partition pattern through the coding syntax used at both sides. Block partitions of 16×16, 16×8, 8×16, and 8×8 have been applied to the bi-prediction coding modes, and the 8×8 block partition is used only in the direct_8×8 coding mode. According to the second prior art approach, the SDME technique is applied to the following traditional coding modes, with a flag control bit signaling whether SDME or the traditional MPEG-4 AVC Standard method is applied to derive the motion vector:
B_Skip, B_Direct_16×16, B_Bi_16×16,
B_L0_Bi_16×8, B_L0_Bi_8×16, B_Bi_L0_16×8, B_Bi_L0_8×16,
B_L1_Bi_16×8, B_L1_Bi_8×16, B_Bi_L1_16×8, B_Bi_L1_8×16,
B_Bi_Bi_16×8, B_Bi_Bi_8×16.
B_Direct_8×8 (Use SDME directly for Direct_8×8. No flag bit is needed)
To improve motion vector accuracy, an extended block size that includes the neighboring reconstructed pixels of the current picture in the cost metric can be applied, as shown in FIG. 2. Turning to FIG. 2, an example of a current block with available reconstructed neighboring blocks is indicated generally by the reference numeral 200. The example 200 involves the current block (so designated in FIG. 2) and neighboring blocks A0, A1, A2, and A3.
However, all of the prior art approaches involving SDME only apply SDME to the prediction of bi-predictive pictures.