The present invention relates to a method and a system for tracking and/or prediction of a nonlinear dynamic process (e.g., an incompletely modeled nonlinear dynamic process) in real time.
Many industries require a system to estimate the present and future state of a random dynamic signal, based upon corrupted, distorted, and possibly partial observations of the signal. For example, one may be interested in the real-time illumination of a stage performer based only on one-dimensional distance measurements produced by ultrasonic wave timings from perimeter speakers. Due to the inherent mechanical and physical time lags associated with robotic lighting, such applications require computer implementable tracking and prediction solutions to schedule lighting movements that are coordinated with where the performer will be some time in the future.
Furthermore, the one-dimensional acoustic observations are partial, are distorted by sampling, and are corrupted by the reflection of the ultrasonic waves off objects such as props on stage. While perfect determination of the signal's state is impossible under these circumstances, it is desirable to obtain probabilistic estimates of the current or future state, conditioned on the information that is available. More precisely, in one exemplary embodiment, the problem may be stated as follows.
The signal X to be tracked, possibly described by an Itô or Skorohod stochastic differential equation (SDE), is a Markov process defined on some probability space (Ω, F, P), and living within a bounded d-dimensional domain such as the rectangular domain D=[0,L0]×[0,L1]× . . . ×[0,Ld-1], where Li, 0≦i≦d−1 are the lengths of all dimensions. In the case of the signal's dynamics being modeled by a time-homogeneous Itô SDE, X evolves according todXt=A(Xt)dt+B(Xt)dνt,  (1)where Xt is the signal state at time t and νt is a standard d-dimensional Brownian motion. A and B might have been chosen in a manner to ensure that Xt stays in D. Otherwise, one can just track X until it leaves D or replace equation 1 with a Skorohod SDE that corresponds to 1 with boundary reflections. Let a(x)=B(x)B*(x) be the diffusion matrix with elements {ai,j(x)}. The diffusion operator (L, D(L)) for the stochastic equation defined above is
                    L        =                                            1              2                        ⁢                                          ∑                                  i                  ,                                      j                    =                    0                                                                    d                  -                  1                                            ⁢                                                                    a                                          i                      ,                      j                                                        ⁡                                      (                    x                    )                                                  ⁢                                                      ∂                    2                                                                              ∂                                              x                        i                                                              ⁢                                          ∂                                              x                        j                                                                                                                          +                                    ∑                              j                =                0                                            d                -                1                                      ⁢                                                            A                  j                                ⁡                                  (                  x                  )                                            ⁢                              ∂                                  ∂                                      x                    j                                                                                                          (        2        )            and the associated adjoint operator is given by
                              L          *                =                                            1              2                        ⁢                                          ∑                                  i                  ,                                      j                    =                    0                                                                    d                  -                  1                                            ⁢                                                                    a                                          i                      ,                      j                                                        ⁡                                      (                    x                    )                                                  ⁢                                                      ∂                    2                                                                              ∂                                              x                        i                                                              ⁢                                          ∂                                              x                        j                                                                                                                          +                                    ∑                              j                =                0                                            d                -                1                                      ⁢                                                            τ                  j                                ⁡                                  (                  x                  )                                            ⁢                              ∂                                  ∂                                      x                    j                                                                                +                      v            ⁡                          (              x              )                                                          (        3        )            where
            τ      j        ⁡          (      x      )        =                              ∑                      i            =            0                                d            -            1                          ⁢                              ∂                                          a                                  i                  ,                  j                                            ⁡                              (                x                )                                                          ∂                          x              i                                          -                                    A            j                    ⁡                      (            x            )                          ⁢                                  ⁢        and        ⁢                                  ⁢                  v          ⁡                      (            x            )                                =                            1          2                ⁢                              ∑                          i              ,                              j                =                0                                                    d              -              1                                ⁢                                                    ∂                2                            ⁢                                                a                                      i                    ,                    j                                                  ⁡                                  (                  x                  )                                                                                    ∂                                  x                  i                                            ⁢                              ∂                                  x                  j                                                                        -                        ∑                      j            =            0                                d            -            1                          ⁢                                            ∂                                                A                  j                                ⁡                                  (                  x                  )                                                                    ∂                              x                j                                              .                    
Let ε>0 and set tm:=mε for m=0,1,2, . . . . For some function H, the discrete sequence of observations Ytm occurring at times tm for signal Xtm can be described stochastically byYtm=H(Xtm,Wtm),  (4)where Wtm is a sequence of random variables that model the sensor noise input.
In the specific case for additive zero mean, σ2 variance Gaussian noise, the above general equation (4) can be interpreted byYtm=h(Xtm)+σΔWtm,  (5)where h is taken to be the sensor function and ΔWtm=Wtm−Wtm-1, may be the increment of a standard Brownian motion. We define Yt=σ{Ytm, tm≦t} to be the information observed up to time t.
The requirement is to construct a device which can efficiently approximate the conditional distribution of the state of the signal given all observations up to the current time, that is to estimateP(Xt∈A|Yt).  (6)Such a device is generally known as an optimal tracking distribution, whose mean is the least squares estimate of the signal state Xt given all the discrete observations up to time t. Also, the device should provide asymptotically optimal predictors to future signal states, computed through refining approximations toP(Xt+κ∈A|Yt),  (7)which is the conditional probability of the signal state Xt+κ at κ>0 time units into the future given the discrete observations up to time t.“Use of Particle System Methods”
In the specific finite-dimensional Itô equation (1) with A affine and B constant and linear observations, an efficient, recursive, and optimal tracker is already known: the Kalman filter. Recursive, in this context, means that the computation involved in providing the state estimate after, say two observations, directly utilizes the result of the state estimate after the first observation while processing the newly arrived second observation. Not only does the Kalman filter have very stringent requirements on the possible signal dynamics and on the form of the observation noise, there are practical computational problems such as matrix instability and brittleness. In more general environments, which include the class of problems with non-linear signal dynamics and observation models with non-additive Gaussian noise (equation 4), the Kalman filter and its derivatives (e.g., an extended Kalman filter and an interacting multiple model tracker) are suboptimal. An interacting multiple model tracker runs multiple Kalman filters in banks, each making different assumptions on the signal dynamics, then weighting the multiple outputs to provide a tracking estimate.
In response to the shortcomings of the Kalman filter, other techniques have been developed and have expanded the class of solvable filtering problems. These techniques provide readily implementable computer algorithms that are efficient and provide asymptotically optimal solutions. For example, the class of nonlinear particle system methods is a recent approach used to solving filtering problems.
Particle approximations have the desirable feature of replacing storage of conditional distributions for the filtering equations with particle locations. Whereas it is difficult, or impossible, to store the densities for higher dimensional problems in a linear interpolative manner, it is relatively easy to store particle locations. Then, the empirical measures of properly designed interacting particle systems provide asymptotic approximations to the conditional densities of the filtering equations. Monte Carlo-type methods are used to propagate the particles instead of using infinite dimensional equation solvers. Hence, a great reduction in computational complexity can occur. The various particle methods perform quite differently under the realistic condition of a finite number of particles.
The basic requirements for a particle method are the simulation of independent samples of the signal, called particles, with the same stochastic law as the signal, and the re-sampling of these particles to incorporate observation information. In the case where the signal solves a stochastic differential equation, the simulation of each particle's SDE is usually implemented with discrete time Euler or Miltsein approximations. The set of particles are then adjusted in some manner to conform to each observation by either assigning each particle a scalar weight value, by interchanging the state values associated with each particle in some manner, or by cautious branching techniques, as described in U.S. Pat. No. 5,933,352 issued Aug. 3, 1999 to Salut, entitled “Method and a system for non-linear optimal estimation of dynamic processes in real time,” and in the U.S. Patent Application Publication US2002/0198681 A1, published Dec. 26, 2002 to Kouritzin et al., entitled “Flexible efficient branching particle tracking algorithms.”
One method described in U.S. Pat. No. 5,933,352 adjusts all N particles ξtj,j=0,1,2, . . . ,N−1 to match all the observations up to time tm by calculating a scalar weight value Wm for each particle and then modifying each particle's cumulative weight by the production {tilde over (W)}m={tilde over (W)}m−m×Wm. This method has two sub-methods. In one sub-method, the influence of past observations on the particle weights is attenuated in time, and in the other sub-method the influence of past observation on the weights is strictly limited in time. The conditional distribution approximations for particle ξj with weight {tilde over (W)}m after the mth observation are
                                          P            ⁡                          (                                                                                          X                                              t                        m                                                              ∈                    A                                    ❘                                      Y                                          t                      0                                                                      ,                                  Y                                      t                    1                                                  ,                …                ⁢                                                                  ,                                  Y                                      t                    m                                                              )                                ≈                                    1                                                ∑                                      j                    =                    0                                                        N                    -                    1                                                  ⁢                                                                            W                      ~                                        m                                    ⁡                                      (                                          ξ                      j                                        )                                                                        ⁢                                          ∑                                  j                  =                  0                                                  N                  -                  1                                            ⁢                                                                                          W                      ~                                        m                                    ⁡                                      (                                          ξ                      j                                        )                                                  ⁢                                  1                                                            ξ                                              t                        m                                            j                                        ∈                    A                                                                                      ,                            (        8        )            where N is the total number of particles at time tm. The resulting weighted particle distribution can be made to converge to the optimal conditional distribution of the signal, given the observations, as the number of particles, N, is increased. The predicted distribution estimate for some future time p>tm, conditioned on all the observations up to time tm, is obtained by simulating the particles forward in time without any weight adjustments.
Another method described in U.S. Pat. No. 5,933,352 adjusts its particles ξtj,j=0,1,2, . . . ,N−1 after each observation by redistributing some or all of their state data. This random redistribution through interacting particles is conducted such that particles with a higher weight are more likely to have their state data overwrite the state data of other particles with lower weight. In this manner, most particles are preferentially collected around the areas of the state space which have the highest conditional probability of containing the signal. While U.S. Pat. No. 5,933,352 does not specifically state this, it can be assumed that the intention is for all particles that are involved in this redistribution to have their weights set to the average of the weights of all involved particles after the redistribution is complete, since this is the condition which will retain the asymptotic optimality of the system.
As U.S. Pat. No. 5,933,352 states, simple particle weighting without redistribution causes degeneration in the output conditional distribution that can be rectified only to some degree by attenuating or limiting the effect of older observations. These actions themselves entail an unavoidable loss of asymptotic optimality in increasing numbers of particles for the filter. If only some particles are redistributed, there are still some degenerative effects working against optimality, and the calculation of the average weight for a subset of the particles is cumbersome. However, a great amount of computation is involved in redistributing all of the particle states at each time step. This effort is especially pronounced if the particle data must travel between processing units in a distributed architecture. Moreover, redistributing all of the particles induces undue extra randomness that impairs path space capabilities and downgrades performance.
Another particle filter was described in U.S. Patent Application Publication US2002/0198681 A1 to Kouritzin et al. This exemplary filter described therein may be referred to as the MITACS-PINTS Branching (MIBR) Particle Filter. The MIBR filter provides a more controlled method for resampling, which improves both performance and computer efficiency. Instead of moving the particles after each observation, the MIBR method probabilistically adjusts its particles ξtj,j=0,1,2, . . . ,N−1 after each observation by selectively duplicating or removing the particles in accordance with a branching value criteria. Generically, most particles are neither duplicated nor removed but rather left alone after an observation. The branching value is similar to the particle weights in U.S. Pat. No. 5,933,352, but 1 is subtracted from Wm in order to utilize the following routine:                1. If Wm(ξj)≧0, then the MIBR copies particle ξj└Wm(ξj)┘ times. It will then add one more particle with probability Wm(ξj)−└Wm(ξj)┘.        2. If Wm(ξj)<0, then that particle is eliminated with probability |Wm(ξj)|.        3. Once this resampling has occurred, an efficient unbiased particle control routine is run to return the number of particles to the number of particles prior to resampling. After each resampling procedure, the weights of all particles are reset to a value of 1.This MIBR method is advantageous over the two methods described in U.S. Pat. No. 5,933,352 by ensuring that, on average, the right amount of resampling takes place.        
As with all the particle system methods mentioned above, the adjustments of particles due to newly arrived observations are treated individually, on a particle-by-particle basis.
A very different class of filters are characterized by signal space discretizations wherein particles are not independent copies of the signal, but rather represent a small mass of the conditional distribution. Furthermore, observation dependent treatments are not performed via particle-by-particle adjustments, but rather adjustments on a group of particles. Partial differential equations are commonly solved using this space discretization approach, such as finite element and multi-grid methods.
The discrete analog for a stochastic partial differential equation would be some Markov chain approximation. For filtering, there is a distinct choice: One can discretize the signal as described in Kushner, H. J., “A robust discrete state approximation to the optimal nonlinear filter for a diffusion,” Stochastics, 3, pp.75–83 (1979), as well as in Di Masi, G. B. and Runggaldier, W. J., “Continuous-time approximations for the nonlinear filtering problem,” Appl. Math. And Opt., 7, pp.233–245 (1981), or directly approximate the Duncan-Mortensen-Zakai (DMZ) and Fujisaki-Kallianpur-Kunita (FKK) equations. In the former case, robustness results can be used to ensure that the filtering solution with the approximate signal is not too different from the desired solution and any of the Kallianpur-Striebel formula, the Hidden Markov Model techniques, or the Gauge Transform methods can be used to calculate the approximate filter solution as described in Bhatt, A. G., Kallianpur, G., and Karandikar, R. L., “Robustness of the nonlinear filter,” Stochastic Process. Appl., 81, pp.247–254 (1999).