Field of the Invention
The present invention relates to a technique for estimating the state of an observable event using time-series filtering, and particularly to, for example, a technique for tracking objects in a moving image using time-series filtering.
Description of the Background Art
Techniques for estimating the internal state of an observation target, changing from moment to moment, may use time-series filtering. With a state vector xt indicating the internal state of an object at time t and an observation vector yt indicating a feature observed at time t, time-series filtering enables an internal state xt of a directly unobservable object to be estimated by using the observation vector yt obtained through observation.
More specifically, time-series filtering is a technique for determining a conditional probability distribution p(xt|y1:t) of a state series x0:t={x0, x1, . . . , xt} using the state space models below, when an observation series (a set of observation vectors up to and including time t) y1:t={y1, y2, . . . , yt} is given.
System model: xt˜f(xt|xt−1)
Observation model: yt˜h(yt|xt)
With a system noise vt and an observation noise wt, the system model showing the internal state of an object and the observation model formed by observing the object can be expressed as follows:
the system model showing the internal state of an object: xt=f(xt−1, vt)
the observation model formed by observing the object: yt=h(xt, wt)
where f(xt−1, vt) is a state transition function indicating a change in the state between time t−1 and time t, and h(xt, wt) is an observation vector obtained in the state xt.
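As a minimal sketch of the two models above, the following Python code simulates a one-dimensional state series and its observations. The random-walk transition, the direct noisy measurement, the Gaussian noises, and the names simulate, sigma_v, and sigma_w are assumptions chosen for illustration only; the source does not specify particular forms for f and h.

```python
import random

def f(x_prev, v):
    """System model x_t = f(x_{t-1}, v_t); a random walk is assumed here."""
    return x_prev + v

def h(x, w):
    """Observation model y_t = h(x_t, w_t); a direct noisy measurement is assumed."""
    return x + w

def simulate(steps, x0=0.0, sigma_v=1.0, sigma_w=0.5, seed=0):
    """Generate a state series x_0..x_T and an observation series y_1..y_T."""
    rng = random.Random(seed)
    xs, ys = [x0], []
    for _ in range(steps):
        x = f(xs[-1], rng.gauss(0.0, sigma_v))   # system noise v_t
        y = h(x, rng.gauss(0.0, sigma_w))        # observation noise w_t
        xs.append(x)
        ys.append(y)
    return xs, ys

xs, ys = simulate(5)
```

In this sketch only ys would be available to a filter; xs plays the role of the directly unobservable internal state.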
In this case, the one-step-ahead prediction is written as the formula below.
\[ p(x_t \mid y_{1:t-1}) = \int p(x_{t-1} \mid y_{1:t-1})\, f(x_t \mid x_{t-1})\, dx_{t-1} \qquad \text{(Formula 1)} \]
Based on Bayes' theorem, the posterior probability distribution p(xt|y1:t) at time t is written as the formula below.
\[ p(x_t \mid y_{1:t}) = \frac{h(y_t \mid x_t)\, p(x_t \mid y_{1:t-1})}{p(y_t \mid y_{1:t-1})} \qquad \text{(Formula 2)} \]
In this formula, h(yt|xt) is a likelihood (a probability for obtaining an observation vector yt in the state xt), and p(xt|y1:t−1) is a predictive probability distribution.
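The prediction and update of Formula 1 and Formula 2 can be illustrated on a finite state space, where the integral reduces to a sum. The sketch below is illustrative only: the two-state example, the transition table, and the likelihood values are made-up numbers, and the function names predict and update are assumptions.

```python
def predict(posterior_prev, transition):
    """Formula 1 on a finite state space (the integral becomes a sum).

    transition[j][i] stands for f(x_t = i | x_{t-1} = j).
    """
    n = len(posterior_prev)
    return [sum(posterior_prev[j] * transition[j][i] for j in range(n))
            for i in range(n)]

def update(predictive, likelihood):
    """Formula 2: posterior = likelihood * predictive / evidence."""
    unnormalized = [l * p for l, p in zip(likelihood, predictive)]
    evidence = sum(unnormalized)   # corresponds to p(y_t | y_{1:t-1})
    return [u / evidence for u in unnormalized]

# Illustrative numbers only (not from the source).
posterior_prev = [0.5, 0.5]                 # p(x_{t-1} | y_{1:t-1})
transition = [[0.9, 0.1], [0.2, 0.8]]       # f(x_t | x_{t-1})
likelihood = [0.7, 0.1]                     # h(y_t | x_t) for the observed y_t

predictive = predict(posterior_prev, transition)
posterior = update(predictive, likelihood)
```

The denominator of Formula 2 acts only as a normalizing constant, which is why the sketch obtains it by summing the unnormalized products.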
One practical example of time-series filtering is particle filtering. Particle filters represent the probability distribution of the internal state of an observation target as a distribution of particles, and use the posterior probability distribution of the state at the current time step as the prior probability distribution of the state at the next time step. With particle filtering, predictive samples (a set of samples generated in accordance with the prior probability distribution) are used to estimate a template observation, and the likelihood is calculated by comparing this template observation with an actual image (an actual observation) obtained at the next time step.
Particle filtering estimates the posterior probability distribution of particles from the calculated likelihoods and the prior probability distribution.
Particle filtering uses the above processing performed repeatedly at each subsequent time step to successively estimate the dynamically changing state of an observation target (e.g., a tracking target).
Particle filtering involves the processing (1) to (4) below, in which M is the number of particles (M is a natural number) and 1≦i≦M (i is an integer).
(1) Generating Particles (One-step Ahead Prediction)
For each sample (each particle), the processing corresponding to the formula below is performed to generate a predictive sample at time t. More specifically, the probability distribution predicted in accordance with the system model (state transition function) is obtained from the posterior probability distribution at time t−1 (the probability distribution of the internal state of an observation target at time t−1). In more detail, each predictive sample is generated from the corresponding sample (particle) at time t−1 through transition in accordance with the system model f.
xat(i)˜f(xt|xt−1(i))
xat={xat(1),xat(2),xat(3), . . . ,xat(M)}
where xat is a predictive (estimated) vector of a state vector xt calculated by a state transition function f( ).
(2) Calculating Weights (Calculating Likelihoods)
For each predictive sample generated in processing (1), the processing corresponding to the formula below is performed to calculate a weight (likelihood). More specifically, the probability (likelihood) to obtain the observation vector yt is estimated in accordance with the observation model h.
wat(i)˜h(yt|xat(i))
wat={wat(1),wat(2),wat(3), . . . ,wat(M)}
where wat is a predictive (estimated) vector of a weight (likelihood) wt (a set of predictive likelihoods) calculated by a function h( ).
(3) Resampling
M particles are sampled, with replacement, from the predictive samples, each particle xat(i) being drawn with a probability proportional to its weight (likelihood) wat(i), so that the total number of particles is unchanged. The posterior probability distribution at time t (the probability distribution of the internal state of the observation target at time t) is obtained from the sampled M particles.
(4) The time t is incremented by one step, and the processing returns to (1). The posterior probability distribution obtained in processing (3) (the posterior probability distribution at time t) is used as the prior probability distribution at the next time step (time t+1).
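The processing (1) to (4) above can be sketched as follows for a one-dimensional state. This is a minimal sketch under stated assumptions, not a definitive implementation: the random-walk system model, the Gaussian likelihood (with its constant factor omitted, since it cancels in resampling), the initial particle spread, and the names predict, weigh, resample, and particle_filter are all chosen for illustration.

```python
import math
import random

def predict(particles, sigma_v, rng):
    """(1) Draw each predictive sample xa_t(i) from f(x_t | x_{t-1}(i))."""
    return [x + rng.gauss(0.0, sigma_v) for x in particles]

def weigh(pred, y, sigma_w):
    """(2) Weight each predictive sample by the likelihood h(y_t | xa_t(i))."""
    return [math.exp(-0.5 * ((y - x) / sigma_w) ** 2) for x in pred]

def resample(pred, weights, rng):
    """(3) Draw M particles with probability proportional to their weights."""
    if sum(weights) <= 0.0:
        # Fallback (an assumption): keep the predictive samples unchanged
        # if every weight underflowed to zero.
        return list(pred)
    return rng.choices(pred, weights=weights, k=len(pred))

def particle_filter(observations, M=500, sigma_v=1.0, sigma_w=0.5, seed=0):
    rng = random.Random(seed)
    particles = [rng.gauss(0.0, 5.0) for _ in range(M)]  # assumed initial prior
    estimates = []
    for y in observations:   # (4) one iteration per time step
        pred = predict(particles, sigma_v, rng)
        weights = weigh(pred, y, sigma_w)
        particles = resample(pred, weights, rng)  # posterior at t, prior at t+1
        estimates.append(sum(particles) / M)      # posterior mean as the estimate
    return estimates

estimates = particle_filter([1.0, 2.0, 3.0, 4.0])
```

The resampled particles serve both as the posterior at time t and as the starting point for prediction at time t+1, which mirrors processing (4).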
As described above, particle filtering allows the estimation of parameters indicating the state of the observation target, which changes from moment to moment, by repeatedly predicting the prior probability distribution of those parameters and calculating their posterior probability distribution. Such particle filtering may be used in tracking the position of an object in a moving image. In tracking the position of an object with particle filtering, the parameters indicating the state of a tracking target (an example of an observation target) may include parameters indicating the position of the object. Particle filtering includes comparing observations estimated from the parameters indicating the position of the object (predictive samples) with actual observations (e.g., an image captured by a camera) to calculate likelihoods, and resampling particles based on the calculated likelihoods to obtain the posterior probability distribution of the parameters indicating the state of the observation target (see, for example, Patent Literature 1: Japanese Unexamined Patent Publication No. 2012-234466).
Techniques using particle filtering have been developed to track multiple objects (targets). For example, Non-patent Literature 1 describes a technique for tracking multiple objects (targets) with high accuracy using particle filters (N. Ikoma, H. Hasegawa, and Y. Haraguchi, “Multi-target tracking in video by SMC-PHD filter with elimination of other targets and state dependent multi-modal likelihoods,” Information Fusion (FUSION), 2013 16th International Conference, July 2013, pp. 588-595).
To track multiple objects with high accuracy, the technique described in Non-patent Literature 1 assigns a distinguishable set of particles to each different object to be tracked. More specifically, for example, particles for tracking an object with an object number 1 are each labeled with an integer indicating the object number 1, whereas particles for tracking an object with an object number 2 are each labeled with an integer indicating the object number 2. In other words, with the technique described in Non-patent Literature 1, an object with an object number k (k is an integer) is tracked through the particle filtering process described above using particles labeled with an integer indicating the object number k. With the technique described in Non-patent Literature 1, each object being tracked is given an integer other than 0 as its object number.
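The labeling scheme described above may be pictured with the following sketch, in which each particle carries its object number and particles are grouped by that number before being processed. The Particle class, its fields, the helper names, and the coordinates are hypothetical and are not taken from Non-patent Literature 1.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class Particle:
    x: float
    y: float
    label: int   # object number k; tracked objects use integers other than 0

def group_by_label(particles):
    """Collect the particles belonging to each object number."""
    groups = defaultdict(list)
    for p in particles:
        groups[p.label].append(p)
    return dict(groups)

def mean_position(group):
    """Estimate an object position as the mean of its particles (an assumption)."""
    n = len(group)
    return (sum(p.x for p in group) / n, sum(p.y for p in group) / n)

# Illustrative coordinates only.
particles = [
    Particle(10.0, 12.0, 1), Particle(12.0, 14.0, 1),   # object number 1
    Particle(40.0, 8.0, 2), Particle(42.0, 10.0, 2),    # object number 2
]
groups = group_by_label(particles)
positions = {k: mean_position(g) for k, g in groups.items() if k != 0}
```

Each group would then be run through its own particle filtering process, so that the particles of one object do not mix with those of another.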
The technique described in Non-patent Literature 1 adds particles having an object number 0, and performs particle filtering using the newly added particles with the object number 0 to detect a new object. In more detail, with the technique described in Non-patent Literature 1, an image resulting from erasing an area occupied by the object being tracked (the object being tracked by using particles each having an integer other than 0 as its object number) is generated. The generated image then undergoes particle filtering using particles with the object number 0 to detect a new object. This processing will now be described with reference to FIGS. 19A and 19B.
FIGS. 19A and 19B are diagrams describing the processing of tracking and detecting, for example, red objects through particle filtering from an extracted-feature image, that is, an image representing an image feature quantity corresponding to redness extracted from a captured image. In more detail, FIG. 19A is a diagram describing the processing of tracking two objects (red objects) each having high redness by using particles. FIG. 19B is a diagram describing the processing for detecting a new object by using particles from an object-erased image, which is an image resulting from erasing image areas corresponding to the currently-tracked objects.
As shown in FIG. 19A, the technique described in Non-patent Literature 1 uses particles Prtc1(TG1) for tracking an object TG1 and particles Prtc1(TG2) for tracking an object TG2. In FIG. 19A, the particles are indicated by small dots. The set of particles within an area Re1 is labeled Prtc1(TG1) and is used to track the object TG1. Likewise, the set of particles within an area Re2 is labeled Prtc1(TG2) and is used to track the object TG2.
With the technique described in Non-patent Literature 1, the area Re1 for detecting the object TG1 is determined at time t based on the distribution of particles Prtc1(TG1) shown in FIG. 19A (the posterior probability distribution at time t). Likewise, the area Re2 for detecting the object TG2 is determined based on the distribution of particles Prtc1(TG2) shown in FIG. 19A (the posterior probability distribution at time t). The areas Re1 and Re2 are then eliminated from the image to generate an object-erased image. The generated object-erased image then undergoes particle filtering using particles Prtc1(TG0) with the object number 0 to detect a new object.
In FIG. 19B, the particles Prtc1(TG0) for new object detection are within an area R1_rest, which is an area remaining after the area Re1 is erased from the area corresponding to the object TG1 in FIG. 19A. The technique described in Non-patent Literature 1 erroneously determines that the area R1_rest corresponds to a new object. In other words, when an area occupied by a currently-tracked object is erased incompletely, the technique described in Non-patent Literature 1 erroneously determines that the remaining part of the area corresponding to the currently-tracked object corresponds to a new object as shown in FIG. 19B. In the example shown in FIG. 19B, the area R1_rest, which is a part of the area occupied by the object TG1, is erroneously determined to correspond to a new object (a newly detected object).
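The failure mode described above can be reproduced in a toy setting. In the sketch below, a one-dimensional row of redness values stands in for the extracted-feature image, the estimated area Re1 is deliberately narrower than the object TG1, and particles with the object number 0 are spread uniformly and weighted by redness. The pixel values, the threshold, and the functions erase and detect_new_object are assumptions made for illustration and do not reproduce the actual processing of Non-patent Literature 1.

```python
import random

def erase(feature, area):
    """Return a copy of the feature row with the closed interval `area` set to 0."""
    lo, hi = area
    return [0.0 if lo <= i <= hi else v for i, v in enumerate(feature)]

def detect_new_object(feature, M=200, seed=0, threshold=0.5):
    """Spread object-number-0 particles uniformly, weight by redness, resample.

    Returns the resampled pixel positions, or [] when no particle sees
    redness at or above `threshold` (a hypothetical detection criterion).
    """
    rng = random.Random(seed)
    particles = [rng.randrange(len(feature)) for _ in range(M)]
    weights = [feature[p] for p in particles]
    if max(weights) < threshold:
        return []
    return rng.choices(particles, weights=weights, k=M)

# TG1 occupies pixels 2..7 of the row (illustrative values).
feature = [0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0]
re1 = (2, 5)                        # estimated area Re1, narrower than TG1
erased = erase(feature, re1)        # pixels 6..7 (R1_rest) survive the erasure
detected = detect_new_object(erased)          # particles gather on R1_rest
complete = detect_new_object(erase(feature, (2, 7)))   # full erasure
```

With the incomplete erasure the object-number-0 particles concentrate on the leftover pixels, which corresponds to the erroneous detection of the area R1_rest as a new object; with the complete erasure nothing is detected.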
In view of the above problems, it is an object of the present invention to provide a state estimation apparatus, a program, and an integrated circuit that allow appropriate estimation of the internal state of an observation target by calculating likelihoods from observation data, and enable appropriate tracking of multiple objects in a moving image, as well as appropriate detection of a new object and adding of the detected new object as a tracking target.