1. Field of the Invention
The present invention relates to object tracking systems and methods. More particularly, the present invention relates to object tracking systems and methods capable of identifying one or more objects from images and tracking the movement of the object(s).
2. Background
Object tracking generally refers to the technique of identifying one or more objects in an image or a series of images, including a video sequence, for various purposes. As an example, object tracking can apply to security, surveillance, and personnel, process, or production management applications. Typically, tracking methods can be divided into two main classes: bottom-up and top-down approaches. Under a bottom-up approach, an image is segmented into objects, which are then used for object tracking. In contrast, a top-down approach generates object hypotheses and tries to verify them using the image contents. Mean shift and particle filter are two common object tracking methods using the top-down approach.
In many applications, object representation may become an important part of an object tracking process. For example, a feature space, such as color histograms, edges, or contours, may be chosen to describe a target, which typically may come from the first image of a series of images or a video. A color histogram can represent a target for object tracking because it achieves robustness against non-rigidity, rotation, and partial occlusion. In some examples, an elliptical area surrounding the object to be tracked may be used as a tracking area. In some cases, to reduce computational complexity during real-time processing, m-bin histograms may be used. In one example, the color histogram distribution p(y) at location y inside an elliptic region may be determined by the following:
$$p(y) = \{p_u(y)\}_{u=1,\ldots,m}, \tag{1}$$

$$p_u(y) = C_h \sum_{i=1}^{n_h} k\!\left(\left\|\frac{y - x_i}{h}\right\|^2\right) \delta[b(x_i) - u], \tag{2}$$

where $n_h$ represents the number of pixels in the region, $b(x_i)$ is the index of the histogram bin associated with the color of pixel $x_i$, and $\delta$ denotes the Kronecker delta function. The parameter h is used to adapt the size of the region. The normalization factor

$$C_h = \left(\sum_{i=1}^{n_h} k\!\left(\left\|\frac{y - x_i}{h}\right\|^2\right)\right)^{-1}$$

ensures that

$$\sum_{u=1}^{m} p_u(y) = 1,$$

and the kernel profile is

$$k(r) = \begin{cases} 1 - r^2, & r < 1 \\ 0, & \text{otherwise.} \end{cases}$$

To increase the reliability of the color distribution, smaller weights may be assigned to the pixels that are farther from the ellipse center, as in Eq. (2).
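As a minimal sketch of Eq. (2), the following Python function accumulates a kernel-weighted m-bin histogram. The pixel record `(x, y, bin_index)`, the function names, and the sample data are assumptions made for illustration; the kernel profile is applied to the squared normalized distance exactly as the formulas above are written.

```python
from typing import List, Sequence, Tuple

# Hypothetical pixel record: (x, y, bin_index), with bin_index in 0..m-1.
Pixel = Tuple[float, float, int]


def kernel_profile(r: float) -> float:
    """Profile k(r) as stated in the text: 1 - r^2 for r < 1, otherwise 0."""
    return 1.0 - r * r if r < 1.0 else 0.0


def kernel_histogram(pixels: Sequence[Pixel], center: Tuple[float, float],
                     h: float, m: int) -> List[float]:
    """Kernel-weighted m-bin histogram p(y) of Eq. (2), normalized to sum to 1."""
    c0, c1 = center
    hist = [0.0] * m
    for x0, x1, bin_index in pixels:
        # Squared normalized distance ||(y - x_i) / h||^2.
        r = ((c0 - x0) ** 2 + (c1 - x1) ** 2) / (h * h)
        hist[bin_index] += kernel_profile(r)
    total = sum(hist)  # this sum is 1 / C_h
    if total == 0.0:
        raise ValueError("no pixel lies inside the kernel support")
    return [value / total for value in hist]


# Three pixels: one at the center, one halfway out, one outside the kernel.
region = [(0.0, 0.0, 0), (0.5, 0.0, 1), (2.0, 0.0, 1)]
print(kernel_histogram(region, center=(0.0, 0.0), h=1.0, m=2))
```

The pixel outside the kernel support contributes nothing, and the pixel at the center contributes the most, which is the down-weighting of distant pixels described above.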
A similarity function may define or identify the similarity between two targets. As an example, the Bhattacharyya distance is a similarity function used to measure the similarity between two color histogram probability distributions. It can be expressed as:

$$d(p, q) = \sqrt{1 - \rho[p, q]}, \qquad \rho[p, q] = \sum_{u=1}^{m} \sqrt{p_u\, q_u}, \tag{3}$$

where d(·) is the Bhattacharyya distance, ρ[·] is the Bhattacharyya coefficient, m is the number of bins, and $p_u$ and $q_u$ respectively represent the u-th bin histogram probabilities of a candidate target and an initial target model.
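Eq. (3) can be sketched in a few lines of Python. The function names are illustrative, and both histograms are assumed to be normalized and of equal length.

```python
import math
from typing import Sequence


def bhattacharyya_coefficient(p: Sequence[float], q: Sequence[float]) -> float:
    """rho[p, q]: sum over bins of sqrt(p_u * q_u)."""
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    return sum(math.sqrt(pu * qu) for pu, qu in zip(p, q))


def bhattacharyya_distance(p: Sequence[float], q: Sequence[float]) -> float:
    """d(p, q) = sqrt(1 - rho[p, q]); clamped at 0 to absorb rounding error."""
    return math.sqrt(max(0.0, 1.0 - bhattacharyya_coefficient(p, q)))


print(bhattacharyya_distance([0.5, 0.5], [0.5, 0.5]))
print(bhattacharyya_distance([1.0, 0.0], [0.0, 1.0]))
```

Identical histograms give a coefficient of 1 and hence a distance of 0, while histograms with no overlapping bins give a coefficient of 0 and a distance of 1.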
Mean shift is generally a recursive object tracking method. To locate an object in each frame, mean shift starts from the position of the tracking result in the previous frame and then follows a direction of increasing similarity function to identify the next recursion starting point. Recursion usually terminates when the gradient value approaches or becomes zero, with the point of termination as the tracking result, i.e. the new location of the object being tracked. The steps identified below illustrate an example of an iterative procedure of mean shift tracking method.
Given the target model $\{q_u\}_{u=1,\ldots,m}$ and its location $y_0$ in the previous frame:
1. Initialize the location of the target in the current frame with $y_0$.
2. Calculate the weights $w_i$ according to Eq. (6).
3. Find the next location $y_1$ of the target candidate according to Eq. (7).
4. If $\|y_1 - y_0\| < \varepsilon$, stop; otherwise set $y_0 = y_1$ and go to step 2.
Under such an approach, color histograms may be used to characterize a target, and a Bhattacharyya distance function may be used to measure the similarity between two distributions. A target candidate most similar to the initial target model should have the smallest distance value. Minimizing the Bhattacharyya distance $d = (1 - \rho(y))^{0.5}$ is equivalent to maximizing the Bhattacharyya coefficient $\rho(y)$. Using a Taylor expansion around the values $p_u(y_0)$, the linear approximation of the Bhattacharyya coefficient is obtained as:
$$\rho[p(y), q] \approx \frac{1}{2}\sum_{u=1}^{m}\sqrt{p_u(y_0)\, q_u} + \frac{1}{2}\sum_{u=1}^{m} p_u(y)\sqrt{\frac{q_u}{p_u(y_0)}}. \tag{4}$$

Substituting Eq. (2) into Eq. (4) may lead to the following equation:

$$\rho[p(y), q] \approx \frac{1}{2}\sum_{u=1}^{m}\sqrt{p_u(y_0)\, q_u} + \frac{C_h}{2}\sum_{i=1}^{n_h} w_i\, k\!\left(\left\|\frac{y - x_i}{h}\right\|^2\right), \tag{5}$$

$$\text{where } w_i = \sum_{u=1}^{m}\sqrt{\frac{q_u}{p_u(y_0)}}\;\delta[b(x_i) - u]. \tag{6}$$
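One way to see how Eq. (5) and Eq. (6) follow from Eq. (4) is to expand $p_u(y)$ in the second term of Eq. (4) using its definition in Eq. (2) and then swap the order of the two sums. The worked step below is a sketch of that algebra, writing $b(x_i)$ for the bin index of pixel $x_i$ as in Eq. (6):

```latex
\begin{aligned}
\frac{1}{2}\sum_{u=1}^{m} p_u(y)\sqrt{\frac{q_u}{p_u(y_0)}}
  &= \frac{1}{2}\sum_{u=1}^{m}
     \left[ C_h \sum_{i=1}^{n_h}
       k\!\left(\left\|\frac{y - x_i}{h}\right\|^2\right)
       \delta[b(x_i) - u] \right]
     \sqrt{\frac{q_u}{p_u(y_0)}} \\
  &= \frac{C_h}{2}\sum_{i=1}^{n_h}
     k\!\left(\left\|\frac{y - x_i}{h}\right\|^2\right)
     \left[ \sum_{u=1}^{m} \sqrt{\frac{q_u}{p_u(y_0)}}\,
       \delta[b(x_i) - u] \right] \\
  &= \frac{C_h}{2}\sum_{i=1}^{n_h} w_i\,
     k\!\left(\left\|\frac{y - x_i}{h}\right\|^2\right).
\end{aligned}
```

Because the Kronecker delta is nonzero only for the single bin that contains $x_i$, each weight reduces to $w_i = \sqrt{q_u / p_u(y_0)}$ evaluated at $u = b(x_i)$.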
To minimize the distance, the second term of Eq. (5) may be maximized, since the first term is independent of y. The kernel is recursively moved from the current location $y_0$ to the new location $y_1$ according to the relation:
$$y_1 = \frac{\displaystyle\sum_{i=1}^{n_h} x_i\, w_i\, g\!\left(\left\|\frac{y_0 - x_i}{h}\right\|^2\right)}{\displaystyle\sum_{i=1}^{n_h} w_i\, g\!\left(\left\|\frac{y_0 - x_i}{h}\right\|^2\right)}, \tag{7}$$
where $g(x) = -k'(x)$. These equations are described by D. Comaniciu et al. in “Kernel-based object tracking,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 25, no. 5, pp. 564-577, May 2003. Mean shift is a recursive method, and the number of recursions needed for each tracking process is usually small. However, the initial state of each process is based on the last tracking result. Under certain conditions, the approach may cause error propagation, especially when the previous tracking result is not correct or accurate.
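The iterative procedure described above can be sketched as the loop below. This is an illustrative toy and not the cited authors' implementation. The pixel record `(x, y, bin_index)`, the function names, and the synthetic image are hypothetical, and `g_flat` assumes a g that is constant inside the kernel support, a simplification that is not derived from the profile k given earlier.

```python
import math
from typing import Callable, List, Sequence, Tuple

Pixel = Tuple[float, float, int]  # hypothetical record: (x, y, bin index)
Point = Tuple[float, float]


def k_profile(r: float) -> float:
    """Profile k(r) from the text: 1 - r^2 for r < 1, otherwise 0."""
    return 1.0 - r * r if r < 1.0 else 0.0


def g_flat(r: float) -> float:
    """Assumed g: constant inside the kernel support (a simplification)."""
    return 1.0 if r < 1.0 else 0.0


def sq_dist(y: Point, x0: float, x1: float, h: float) -> float:
    """Squared normalized distance ||(y - x_i) / h||^2."""
    return ((y[0] - x0) ** 2 + (y[1] - x1) ** 2) / (h * h)


def histogram(pixels: Sequence[Pixel], y: Point, h: float, m: int) -> List[float]:
    """Candidate histogram p(y); all zeros if no pixel is inside the kernel."""
    hist = [0.0] * m
    for x0, x1, b in pixels:
        hist[b] += k_profile(sq_dist(y, x0, x1, h))
    total = sum(hist)
    return [v / total for v in hist] if total > 0.0 else hist


def mean_shift(pixels: Sequence[Pixel], q: Sequence[float], y0: Point, h: float,
               eps: float = 1e-3, max_iter: int = 50,
               g: Callable[[float], float] = g_flat) -> Point:
    """Repeat the weight and location updates until the shift is below eps."""
    for _ in range(max_iter):
        p = histogram(pixels, y0, h, len(q))
        num0 = num1 = den = 0.0
        for x0, x1, b in pixels:
            gi = g(sq_dist(y0, x0, x1, h))
            if gi == 0.0 or p[b] == 0.0:
                continue
            w = math.sqrt(q[b] / p[b]) * gi  # weight w_i times g(.)
            num0 += x0 * w
            num1 += x1 * w
            den += w
        if den == 0.0:
            return y0  # nothing to follow; keep the current location
        y1 = (num0 / den, num1 / den)
        if math.hypot(y1[0] - y0[0], y1[1] - y0[1]) < eps:
            return y1
        y0 = y1
    return y0


# Synthetic 21x21 image: a 5x5 target (bin 1) centered at (10, 10) on bin 0.
image = [(float(x), float(y), 1 if abs(x - 10) <= 2 and abs(y - 10) <= 2 else 0)
         for x in range(21) for y in range(21)]
print(mean_shift(image, q=[0.0, 1.0], y0=(8.0, 9.0), h=4.0))
```

Because each run starts from the supplied `y0`, a poor previous result can leave the kernel with no target pixels to follow, which is the error-propagation risk noted above.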
The particle filter technique represents a different approach. As an example, the technique may involve choosing new target candidates from the previous target candidates based on their weights in the preceding frame. Target candidates with high weights may be selected repeatedly, so that a candidate with a higher weight may be chosen more than once. Additionally, the new target candidates are updated with some feature vectors so that they become more similar to the initial target model, and they are given suitable weights according to their similarity to the initial target model. Finally, the tracking result usually includes the target candidates and their weights, which would be used in the next frame for choosing new target candidates.
Assume that $x_t$ represents the modeled object at time t and the vector $X_t = \{x_1, \ldots, x_t\}$ is the history of the modeled object. In the same way, $z_t$ is the set of image features at time t and the history set of image features is $Z_t = \{z_1, \ldots, z_t\}$. Observations $z_t$ are assumed to be independent, both mutually and with respect to the dynamical process. This may be expressed probabilistically as follows:
$$p(Z_{t-1}, x_t \mid X_{t-1}) = p(x_t \mid X_{t-1}) \prod_{i=1}^{t-1} p(z_i \mid x_i). \tag{8}$$

The conditional state density $p_t$ at time t may be:

$$p_t(x_t) \equiv p(x_t \mid Z_t). \tag{9}$$

Applying Bayes' rule to Eq. (9) may lead to the following equation:

$$p(x \mid z) = k\, p(z \mid x)\, p(x), \tag{10}$$

where k is a normalization constant that does not depend on x.
In one example, because the likelihood p(z|x) is sufficiently complex that the posterior p(x|z) cannot be evaluated simply in closed form, iterative sampling techniques may be used. A random variate x is generated from a distribution that approximates the posterior p(x|z). First, a sample set $\{s_1, \ldots, s_N\}$ is generated from the prior density p(x); then a sample $s_i$ is chosen from this set with probability $\pi_i$, where
$$\pi_i = \frac{p_z(s_i)}{\sum_{j=1}^{N} p_z(s_j)} \quad \text{and} \quad p_z(x) = p(z \mid x). \tag{11}$$
The sample $s_i$ chosen in this fashion has a distribution that approximates the posterior p(x|z) increasingly accurately as N increases. The steps identified below illustrate an example of an iterative procedure of a particle filter approach. A similar example is described by K. Nummiaro et al. in “An adaptive color-based particle filter,” Image and Vision Computing, vol. 21, pp. 99-110, 2003.
Given the sample set $S_{t-1}$ and the initial object model:
1. Select N samples from the set $S_{t-1}$ with weights $\pi_{t-1}$:
(a) Calculate the normalized cumulative probabilities $C_{t-1}$: let $C_{t-1}^{(0)} = 0$ and $C_{t-1}^{(n)} = C_{t-1}^{(n-1)} + \pi_{t-1}^{(n)}$.
(b) Generate a random number $r \in [0, 1]$.
(c) Find the smallest j for which $C_{t-1}^{(j)} > r$.
(d) Set $s_t^{(n)} = s_{t-1}^{(j)}$.
2. Update the target candidate states with some feature vectors.
3. Give a suitable weight to each new candidate according to the similarity between the initial target model and the candidate.
4. Estimate the mean state of the set $S_t$:

$$E[S_t] = \sum_{n=1}^{N} \pi_t^{(n)}\, s_t^{(n)}.$$
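Steps 1 and 4 of the procedure above can be sketched in Python as follows. Steps 2 and 3 depend on the motion and appearance models and are left out. The function names and the representation of a state as a tuple of floats are assumptions for illustration; `bisect_right` is used as one way to find the smallest index whose cumulative weight exceeds r.

```python
import bisect
import random
from itertools import accumulate
from typing import List, Sequence, Tuple

State = Tuple[float, ...]  # hypothetical state vector, e.g. (x, y)


def resample(samples: Sequence[State], weights: Sequence[float],
             rng: random.Random) -> List[State]:
    """Step 1: draw len(samples) states with probability proportional to weight."""
    cumulative = list(accumulate(weights))  # running sums of the weights
    total = cumulative[-1]
    if total <= 0.0:
        raise ValueError("weights must have a positive sum")
    chosen = []
    for _ in samples:
        r = rng.random() * total  # scaling r is equivalent to normalizing C
        # Smallest index j whose cumulative weight is strictly greater than r.
        j = min(bisect.bisect_right(cumulative, r), len(samples) - 1)
        chosen.append(samples[j])
    return chosen


def mean_state(samples: Sequence[State], weights: Sequence[float]) -> State:
    """Step 4: weighted mean of the sample set, with weights normalized here."""
    total = sum(weights)
    dim = len(samples[0])
    return tuple(sum(w * s[d] for s, w in zip(samples, weights)) / total
                 for d in range(dim))


rng = random.Random(0)
particles = [(0.0, 0.0), (4.0, 2.0), (8.0, 4.0)]
pi = [0.1, 0.3, 0.6]
print(resample(particles, pi, rng))
print(mean_state(particles, pi))
```

Samples with larger weights occupy a wider interval of the cumulative sequence, so they are more likely to be selected, possibly more than once.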
Compared with the mean shift technique, the tracking results of a particle filter technique are updated during the tracking process based on the target candidates instead of the last tracking result. In general, the particle filter technique may provide a more robust object tracking method when many target candidates are used. However, depending on the implementation, using many candidates may increase the computational complexity and require a tradeoff between efficiency and accuracy.
A hybrid tracker technique combining mean shift and particle filter was also proposed. The first step of this technique is to generate target candidates and re-sample them. The second step applies the mean shift technique independently to each target candidate until all target candidates are stabilized. The third step recalculates the weight of each target candidate using the Bhattacharyya distance. Finally, the average is calculated to obtain the tracking result. Because all target candidates are stabilized, the number of target candidates could be reduced without losing accuracy.
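The four steps of the hybrid technique can be outlined as below. This is only a structural sketch under stated assumptions: `refine` stands in for a mean shift routine, `similarity` stands in for a weight derived from the Bhattacharyya distance, both are hypothetical callables supplied by the caller, and the final average is assumed to be weighted.

```python
import random
from typing import Callable, List, Sequence, Tuple

State = Tuple[float, float]  # hypothetical candidate location (x, y)


def hybrid_step(particles: Sequence[State], weights: Sequence[float],
                refine: Callable[[State], State],
                similarity: Callable[[State], float],
                n_out: int, rng: random.Random
                ) -> Tuple[List[State], List[float], State]:
    """One frame of the hybrid tracker: resample, refine, reweight, average."""
    # Step 1: re-sample candidates in proportion to their previous weights.
    chosen = rng.choices(particles, weights=weights, k=n_out)
    # Step 2: move each candidate independently until it stabilizes.
    refined = [refine(p) for p in chosen]
    # Step 3: recompute and normalize the weight of each candidate.
    raw = [similarity(p) for p in refined]
    total = sum(raw)
    if total <= 0.0:
        raise ValueError("all candidates received zero weight")
    new_weights = [r / total for r in raw]
    # Step 4: the (assumed weighted) average is the tracking result.
    estimate = (sum(w * p[0] for p, w in zip(refined, new_weights)),
                sum(w * p[1] for p, w in zip(refined, new_weights)))
    return refined, new_weights, estimate


# Toy usage: every candidate is pulled to (5, 5) and scored equally.
rng = random.Random(1)
result = hybrid_step([(0.0, 0.0), (10.0, 10.0)], [0.5, 0.5],
                     refine=lambda p: (5.0, 5.0),
                     similarity=lambda p: 1.0,
                     n_out=4, rng=rng)
print(result[2])
```

Since the refinement step moves every candidate toward a nearby mode, fewer candidates may be needed than in a plain particle filter, which matches the reduction in candidate count noted above.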