1. Field
The embodiments relate to motion estimation, and more particularly to a two-dimensional convolution engine also used for motion estimation.
2. Description of the Related Art
Motion estimation (ME) is typically the most computationally demanding part of video compression. Video post-processing, such as motion-compensated filtering and deinterlacing, also requires reliable ME. One of the most widely used algorithms for ME is block matching, in which rectangular windows, for example N×N blocks, are matched against a reference frame (or field). The matching criterion is usually the sum of absolute errors (SAE) for a particular displacement (m,n), defined as
$$\mathrm{SAE}(m,n) = \sum_{k1}^{N} \sum_{k2}^{N} \left| t(k1,k2) - w(k1-m,\,k2-n) \right|, \quad 0 \le (k1,k2) \le N$$
where t and w are the target and window (reference) frames respectively. Video encoders or processors typically have specialized accelerators that compute the SAE very quickly. If the search area is an LH×LV region, the engine finds the (m,n) displacement pair with minimum SAE within that region. Matching may also be performed with a mean-squared error criterion for the N×N block of pixels, defined as
$$\mathrm{MSE}(m,n) = \frac{1}{N^{2}} \sum_{k1}^{N} \sum_{k2}^{N} \left[ t(k1,k2) - w(k1-m,\,k2-n) \right]^{2}, \quad 0 \le (k1,k2) \le N$$
but this is more computationally complex due to the squaring operation.
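As an illustration of the block-matching search described above, the following is a minimal Python sketch of an exhaustive search under either criterion. It is not a description of any particular accelerator. The function names `block_error` and `full_search`, the parameters `top`, `left`, `max_m` and `max_n`, the treatment of the first index as the row, and the skipping of displacements that fall outside the reference frame are all assumptions made for this example.

```python
def block_error(target, ref, top, left, m, n, N, squared=False):
    """Matching error for one displacement (m, n).

    target, ref: frames stored as lists of rows (assumed layout).
    top, left:   position of the N x N block in the target frame (hypothetical names).
    Returns the SAE, or the MSE when squared=True.
    """
    total = 0
    for k1 in range(N):
        for k2 in range(N):
            d = target[top + k1][left + k2] - ref[top + k1 - m][left + k2 - n]
            total += d * d if squared else abs(d)
    return total / (N * N) if squared else total


def full_search(target, ref, top, left, N, max_m, max_n, squared=False):
    """Try every displacement with |m| <= max_m and |n| <= max_n.

    Returns (error, m, n) for the smallest error found, or None if no
    candidate block lies inside the reference frame.
    """
    rows, cols = len(ref), len(ref[0])
    best = None
    for m in range(-max_m, max_m + 1):
        for n in range(-max_n, max_n + 1):
            r, c = top - m, left - n  # top-left corner of the reference block
            if r < 0 or c < 0 or r + N > rows or c + N > cols:
                continue  # assumption: ignore candidates outside the frame
            err = block_error(target, ref, top, left, m, n, N, squared)
            if best is None or err < best[0]:
                best = (err, m, n)
    return best


# Synthetic 8 x 8 reference frame; the target is the reference moved
# down by 1 row and right by 2 columns (with wrap-around at the edges).
ref = [[3 * r * r + 5 * c + r * c for c in range(8)] for r in range(8)]
target = [[ref[(r - 1) % 8][(c - 2) % 8] for c in range(8)] for r in range(8)]

print(full_search(target, ref, top=3, left=3, N=2, max_m=2, max_n=2))  # (0, 1, 2)
```

In this sketch the only difference between the two criteria is the per-pixel operation: the SAE needs a subtraction and an absolute value, while the MSE needs a subtraction and a multiplication, which is the extra cost noted above.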
Other stages of encoding or post-processing also require two-dimensional (2-D) convolution for spatial filtering, for example for noise reduction or for band-limiting prior to decimation. These filters also typically require dedicated hardware optimized for high-performance operation. The 2-D filter or convolver usually includes a bank of multipliers with filter coefficients and a memory buffer for data.
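For reference, the 2-D convolution mentioned above can be sketched in a few lines of Python. This is a direct-form software illustration only, not a description of the dedicated hardware. The function name `convolve2d` and the choice to produce output only where the kernel fully overlaps the image are assumptions made for this example. The innermost multiply-accumulate loop corresponds loosely to the work a bank of multipliers performs in parallel, and the `image` rows play the role of the data buffer.

```python
def convolve2d(image, kernel):
    """Direct-form 2-D convolution of a list-of-rows image with a kernel.

    Output is computed only where the kernel fully overlaps the image
    (an assumed boundary policy), so the result is smaller than the input.
    """
    kr, kc = len(kernel), len(kernel[0])
    rows, cols = len(image), len(image[0])
    out = []
    for i in range(rows - kr + 1):
        out_row = []
        for j in range(cols - kc + 1):
            acc = 0
            for a in range(kr):
                for b in range(kc):
                    # Convolution flips the kernel in both dimensions.
                    coeff = kernel[kr - 1 - a][kc - 1 - b]
                    acc += coeff * image[i + a][j + b]
            out_row.append(acc)
        out.append(out_row)
    return out


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]

# A 2 x 2 box kernel sums each 2 x 2 neighbourhood (a crude low-pass filter).
print(convolve2d(image, [[1, 1], [1, 1]]))  # [[12, 16], [24, 28]]
```

Dividing the box-filter output by the number of taps would give a local average, which is the kind of smoothing used for noise reduction or for band-limiting before decimation.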