In many applications the transmission of uncompressed video is impractical due to limitations in the amount of bandwidth available and the expense related to use of the bandwidth. As a result, a more efficient way of transmitting video is to compress the video prior to transmission. Advantageously, because video material is highly redundant, it can be efficiently compressed using a conventional video coding standard, such as, for example, H.264/JVT/AVC/MPEG-4 part 10.
This compression standard achieves approximately 50% better compression performance as compared with the previous state-of-the-art compression standards, such as, for example, MPEG-4 or H.263. However, even with these gains, better compression is still desired. For example, a cable or satellite company with a certain amount of available bandwidth could offer customers more video channels if better compression techniques that provide high quality transmissions were available.
One of the approaches for improving video compression, first introduced in H.263, is the Reduced Resolution Update (RRU) technique. In this mode, some of the video frame's residual data is coded at a reduced resolution, thus substantially reducing the amount of data that is transmitted in each frame. However, disadvantageously, because the entire frame residual must be coded at either full or reduced resolution (i.e., affecting all inter and intra macroblocks), the decision to employ the reduced resolution mode is made on a per-frame basis. As such, although substantial bit savings may be achieved for some frames, no bit savings are achieved for others.
Another important technique used to reduce transmission bit rate is motion estimation. The conventional motion estimation used in H.264 is restricted to a simple translational model, which fails if the scene has complex motion. In many practical applications, however, the motion field may vary greatly between different scenes, such as a 3D scene or a dynamic scene. As such, it is extremely difficult to describe general scenes using only one or two motion models.
Using conventional variational approaches to estimate the optical flow between two frames results in an over-smoothed estimate in which the discontinuities and occlusion areas between different motion fields (or layers) are not distinguished, even with an anisotropic diffusion operator. As a result, it is difficult to generate a high-quality disposable frame using the estimated flow field. Because of the aperture problem of optical flow estimation, a larger region of integration is preferable for producing a stable motion estimate, but a larger region is also more likely to contain multiple motions, which may introduce serious error.
According to an exemplary conventional variational model for optical flow estimation, a standard brightness constancy assumption may be applied, wherein the image brightness of a pixel at $x=[x\ y]^T$ is independent of the motion vector $u=[u\ v]^T$, such that
$$I_1(x) = I_2(x+u)$$
See, e.g., Horn, B., et al. “Determining optical flow,” Artificial Intelligence, Volume 17, pages 185-203 (1981); and Lucas, B., et al. “An iterative image registration technique with an application to stereo vision,” International Joint Conference on Artificial Intelligence, pages 674-679 (1981).
Accordingly, the optical flow is estimated by minimizing the following data energy function:
$$E_d(u) = \int_\Omega \left(I_1(x) - I_2(x+u)\right)^2 \, dx,$$
where $\Omega$ is the spatial image domain. Using Taylor expansion, the above equation may be approximated by the first order terms as:
$$E_d(u) = \int_\Omega \left((\nabla I)^T u + I_t\right)^2 \, dx$$
where $\nabla I$ is the spatial image gradient, and $I_t$ is the temporal image gradient.
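The linearized data energy can be illustrated with a minimal NumPy sketch. It is not taken from the source: the function name `linearized_data_energy`, the use of `np.gradient` for the spatial gradient, and the plain frame difference used for $I_t$ are all assumptions made for illustration, and the integral over $\Omega$ is replaced by a sum over pixels.

```python
import numpy as np

def linearized_data_energy(I1, I2, u, v):
    """Discrete version of the linearized E_d: sum of (Ix*u + Iy*v + It)^2.

    I1, I2 are grayscale frames; u, v are the horizontal and vertical
    flow components, all arrays of the same shape.
    """
    # np.gradient returns the derivative along axis 0 (y) first, then axis 1 (x).
    Iy, Ix = np.gradient(I1.astype(float))
    # Temporal gradient approximated by a frame difference (an assumption).
    It = I2.astype(float) - I1.astype(float)
    residual = Ix * u + Iy * v + It
    return float(np.sum(residual ** 2))

# Synthetic pair: a horizontal intensity ramp that moves one pixel to the right,
# so that I1(x) = I2(x + 1) and the true flow is (u, v) = (1, 0).
ys, xs = np.mgrid[0:32, 0:32].astype(float)
I1 = xs
I2 = xs - 1.0

energy_true_flow = linearized_data_energy(I1, I2, np.ones_like(xs), np.zeros_like(xs))
energy_zero_flow = linearized_data_energy(I1, I2, np.zeros_like(xs), np.zeros_like(xs))
print(energy_true_flow, energy_zero_flow)
```

On this ramp the energy vanishes for the true flow and is strictly positive for the zero flow, which is the behavior the minimization relies on. Note that any vertical component $v$ would give the same energy here, because the ramp has no vertical gradient; this is the aperture problem in its simplest form.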
In order to address aperture problems and suppress noise during flow estimation, an edge-preserving anisotropic smoothness term is added to the energy function such that:
$$E(u) = E_d(u) + E_s(u) = \int_\Omega \left((\nabla I)^T u + I_t\right)^2 \, dx + \int_\Omega \nabla u^T D(\nabla I_1) \nabla u \, dx,$$
where the anisotropic diffusion tensor $D(\nabla I_1)$ is defined by
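$$D(\nabla I_1) = \frac{1}{|\nabla I_1|^2 + 2\nu^2}\left(\begin{bmatrix}\dfrac{\partial I_1}{\partial y} \\ -\dfrac{\partial I_1}{\partial x}\end{bmatrix}\begin{bmatrix}\dfrac{\partial I_1}{\partial y} \\ -\dfrac{\partial I_1}{\partial x}\end{bmatrix}^T + \nu^2 \mathbf{1}\right),$$
where $\mathbf{1}$ is the identity matrix, and where $\nu$ is a parameter that controls the degree of isotropy of the smoothness. To minimize the energy functional, the partial derivative $\partial E / \partial u$ may be computed and the flow field may be iteratively updated, according to the following expression: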
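$$\frac{\partial E}{\partial u} = \frac{\partial E_d}{\partial u} + \frac{\partial E_s}{\partial u} = \left(\nabla I^T u + I_t\right)\nabla I^T + \operatorname{div}\left(D(\nabla I_1)\nabla u\right)$$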
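To make the role of the diffusion tensor $D(\nabla I_1)$ in the smoothness term more concrete, the following NumPy sketch builds it per pixel. It assumes the usual symmetric form, in which the image gradient rotated by 90 degrees is multiplied by its own transpose; the function name `diffusion_tensor` and the default value of `nu` are hypothetical choices, not values from the source.

```python
import numpy as np

def diffusion_tensor(I1, nu=1.0):
    """Per-pixel 2x2 tensor D = (p p^T + nu^2 * Id) / (|grad I1|^2 + 2 nu^2),

    where p = [dI1/dy, -dI1/dx] is the image gradient rotated by 90 degrees.
    Returns an array of shape (H, W, 2, 2); index 0 is x, index 1 is y.
    """
    # np.gradient returns the derivative along axis 0 (y) first, then axis 1 (x).
    Iy, Ix = np.gradient(I1.astype(float))
    perp = np.stack([Iy, -Ix], axis=-1)                  # shape (H, W, 2)
    outer = perp[..., :, None] * perp[..., None, :]      # shape (H, W, 2, 2)
    denom = Ix ** 2 + Iy ** 2 + 2.0 * nu ** 2
    return (outer + nu ** 2 * np.eye(2)) / denom[..., None, None]

# A vertical step edge: intensity jumps from 0 to 10 between columns 3 and 4.
I1 = np.zeros((8, 8))
I1[:, 4:] = 10.0
D = diffusion_tensor(I1, nu=1.0)

print(D[0, 0])   # flat region: isotropic tensor
print(D[0, 3])   # on the edge: weak smoothing across it, strong smoothing along it
```

In flat regions the tensor reduces to half the identity, so the flow is smoothed equally in all directions. On the edge, the entry for the direction across the edge becomes small while the entry along the edge stays close to one, which is what makes the smoothness term edge-preserving.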
Therefore, the flow field, $u^\tau$, at the iteration step $\tau$, may be updated by the following expression:
$$\frac{\partial u}{\partial \tau} = u^\tau - u^{\tau-1} = -\frac{\partial E}{\partial u}$$
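The iterative update can be sketched in a few lines of NumPy. For brevity this sketch descends on the data term only and leaves out the smoothness term; the function name `data_term_descent`, the explicit step size `step`, and the iteration count are assumptions introduced for illustration (the expression above uses an implicit unit step).

```python
import numpy as np

def data_term_descent(I1, I2, n_iters=20, step=0.5):
    """Explicit descent on the linearized data energy, starting from zero flow.

    Each iteration applies u <- u - step * (Ix*u + Iy*v + It) * [Ix, Iy].
    The smoothness term is deliberately omitted in this sketch.
    """
    I1 = I1.astype(float)
    I2 = I2.astype(float)
    # np.gradient returns the derivative along axis 0 (y) first, then axis 1 (x).
    Iy, Ix = np.gradient(I1)
    It = I2 - I1
    u = np.zeros_like(I1)
    v = np.zeros_like(I1)
    for _ in range(n_iters):
        residual = Ix * u + Iy * v + It
        u = u - step * residual * Ix
        v = v - step * residual * Iy
    return u, v

# Horizontal ramp shifted one pixel to the right: the true flow is (1, 0).
ys, xs = np.mgrid[0:16, 0:16].astype(float)
u, v = data_term_descent(xs, xs - 1.0)
print(u.mean(), v.mean())
```

The step size has to be small relative to the squared gradient magnitude for the iteration to remain stable; with a unit gradient, as in this ramp, a step of 0.5 halves the residual at every iteration. Because the linearization is only valid for small displacements, the sketch uses a one-pixel shift.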
Instead of using a one-step update of $u^\tau$, the update may be separated into a two-step process, such that
$$u^{\tau'} - u^{\tau-1} = -\frac{\partial E_d}{\partial u} = -\left(\nabla I^T u^{\tau-1} + I_t\right)\nabla I^T, \quad \text{and} \quad u^\tau - u^{\tau'} = -\frac{\partial E_s}{\partial u} = -\operatorname{div}\left(D(\nabla I_1)\nabla u\right)$$
where the second step can be substituted by an oriented Gaussian convolution such that
$$u^\tau = u^{\tau'} * G(T, \Delta\tau),$$
where
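$$G(T, \Delta\tau) = \frac{1}{4\pi\Delta\tau}\exp\left(-\frac{x^T T^{-1} x}{4\Delta\tau}\right);$$
where $T$ is the structure tensor, such that $T = \lambda_\eta \eta\eta^T + \lambda_\xi \xi\xi^T$, $\lambda_\eta$ and $\lambda_\xi$ are the eigenvalues of the diffusion tensor $D(\nabla I_1)$, and $\eta$ and $\xi$ are the corresponding orthogonal eigenvectors.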
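As a rough illustration of the oriented Gaussian $G(T, \Delta\tau)$, the sketch below samples the kernel on a small grid for a single, fixed tensor $T$. The helper names `structure_tensor` and `oriented_gaussian_kernel`, the kernel radius, and the final renormalization are assumptions, not part of the source. In the model above $T$ varies from pixel to pixel, so a full implementation would need a spatially varying kernel.

```python
import numpy as np

def structure_tensor(lam_eta, lam_xi, theta):
    """T = lam_eta * eta eta^T + lam_xi * xi xi^T for orthonormal eta, xi.

    theta is the angle of eta measured from the x axis (a hypothetical
    parameterization chosen for this example).
    """
    eta = np.array([np.cos(theta), np.sin(theta)])
    xi = np.array([-np.sin(theta), np.cos(theta)])
    return lam_eta * np.outer(eta, eta) + lam_xi * np.outer(xi, xi)

def oriented_gaussian_kernel(T, dtau, radius=7):
    """Sample exp(-x^T T^{-1} x / (4 dtau)) / (4 pi dtau) on a square grid.

    T must be symmetric positive definite. The result is renormalized so
    that the discrete kernel sums to one.
    """
    ys, xs = np.mgrid[-radius:radius + 1, -radius:radius + 1].astype(float)
    Tinv = np.linalg.inv(T)
    quad = Tinv[0, 0] * xs ** 2 + 2.0 * Tinv[0, 1] * xs * ys + Tinv[1, 1] * ys ** 2
    kernel = np.exp(-quad / (4.0 * dtau)) / (4.0 * np.pi * dtau)
    return kernel / kernel.sum()

# Strong diffusion along x (eigenvalue 4), weak diffusion along y (eigenvalue 0.25).
T = structure_tensor(4.0, 0.25, theta=0.0)
K = oriented_gaussian_kernel(T, dtau=1.0, radius=7)
r = 7
print(K[r, r + 2], K[r + 2, r])   # weight two pixels away along x vs. along y
```

The kernel is elongated along the eigenvector with the larger eigenvalue, so convolving a flow component with it smooths mostly along that direction. The renormalization is a practical choice for a truncated, discretely sampled kernel: the analytic prefactor $1/(4\pi\Delta\tau)$ integrates to one only when $\det T = 1$.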
However, when motion displacement is more than one pixel, the minimization using the above described conventional variational framework may be trapped in a local minimum.
One significant problem with the conventional variational model is that it minimizes the squared intensity error, or data energy, for every pixel, regardless of whether the pixel is occluded. As a result, the warped image, $I_2(x+u)$, is incorrectly deformed to fill the occluded area of frame $I_1(x)$ even though no pixel in $I_2$ corresponds to the occluded pixel $x$ in the first frame. When there is a large occlusion between the two images, this minimization produces serious distortion and/or dragging. For example, if there is a large motion difference between two neighboring regions, the weak-textured regions are dragged to follow the movement of the high-gradient region boundaries.
As a further example, if the camera exhibits apparent zooming or panning, then a large number of pixels at the image boundary should be detected as occluded. However, according to the conventional model, the energy of those pixels is still minimized, thus causing significant distortion along the image boundary.
Furthermore, even though several different anisotropic smoothness terms may be introduced into the energy function in the conventional methods and algorithms, it is still difficult to obtain a highly discontinuous flow field because the occlusion process is unclear. (See, e.g., Alvarez, L., et al., “Symmetrical dense optical flow estimation with occlusion detection,” European Conference on Computer Vision (2002); Strecha, C., et al., “A probabilistic approach to large displacement optical flow and occlusion detection,” Workshop on Statistical Methods in Video Processing (2004); Perona, P., et al., “Scale-space and edge detection using anisotropic diffusion,” IEEE Trans. on Pattern Analysis and Machine Intelligence, Volume 12, pages 629-639 (1990); Black, M., et al., “Robust anisotropic diffusion,” IEEE Trans. on Image Processing, Volume 7, pages 421-432 (1998); and Tschumperle, D., et al., “Vector-valued image regularization with pde's: A common framework for different applications,” Computer Vision and Pattern Recognition (2003).)
Specifically, many conventional methods fail to address the occlusion problem of optical flow estimation. (See, e.g., Horn, B., et al., “Determining optical flow,” Artificial Intelligence, Volume 17, pages 185-203 (1981); Lucas, B., et al., “An iterative image registration technique with an application to stereo vision,” International Joint Conference on Artificial Intelligence, pages 674-679 (1981); Brox, T., et al., “High accuracy optical flow estimation based on a theory for warping,” European Conference on Computer Vision, pages 25-36 (2004); Deriche, R., et al., “Optical-flow estimation while preserving its discontinuities: a variational approach,” Asian Conference on Computer Vision, pages 290-295 (1995); Barron, J., et al., “Performance of optical flow techniques,” International Journal of Computer Vision, Volume 12, pages 43-77 (1994); McCane, B., et al., “On benchmarking optical flow,” Computer Vision and Image Understanding, Volume 84, pages 126-143 (2001); and Weickert, J., et al., “Variational optic flow computation with a spatio-temporal smoothness constraint,” Journal of Mathematical Imaging and Vision, Volume 14, pages 245-255 (2001).)
As such, there is a need in the art for an improved optical flow estimation method which includes an efficient anisotropic smoothness constraint such that it not only maintains piecewise spatial coherence but also maintains accurate flow discontinuities over the motion boundaries.