A core objective in the design of integrated navigation systems is the optimal fusion of noisy observations from global positioning system (GPS) receivers, inertial measurement units (IMUs), and other sensors. Current navigation systems use the extended Kalman filter (EKF) as the standard technique for integrating sensor data. In an inertial navigation system (INS), the EKF combines gyroscope and accelerometer data from an IMU with a kinematic or dynamic model of the vehicle. Other sensors, such as a barometric altimeter or magnetic compass, may also be integrated. Supplied GPS position and velocity measurements are then integrated with the INS using the same EKF, which estimates position, attitude, velocities, and sensor biases. In addition, the GPS receiver may employ its own EKF to solve for position, velocity, and timing from satellite pseudo-range, phase, and Doppler data. Alternatively, in a tightly coupled approach, a single EKF may combine raw satellite signals with the IMU and other sensor measurements to produce an optimal estimate. Unfortunately, the EKF is based on a suboptimal implementation of the recursive Bayesian estimation framework applied to Gaussian random variables: it achieves only first-order accuracy in the calculation of the posterior mean and covariance of the transformed random variables. This can seriously degrade the accuracy of the system or even cause it to diverge.
The basic framework for the EKF entails estimation of the state of a discrete-time nonlinear dynamic system
$$x_{k+1} = F(x_k, v_k), \qquad y_k = H(x_k, n_k),$$
where $x_k$ represents the unobserved state of the system and $y_k$ is the only observed signal. The process noise $v_k$ drives the dynamic system, and the observation noise is given by $n_k$. Additivity of the noise sources is not assumed. The system dynamic models $F$ and $H$ are assumed known. In state estimation, the EKF is the standard method of choice for recursive approximate maximum-likelihood estimation of the state $x_k$. Given the noisy observation $y_k$, the recursive estimate for $x_k$ can be expressed in the form
$$\hat{x}_k = (\text{optimal prediction of } x_k) + \mathcal{K}_k \left[\, y_k - (\text{optimal prediction of } y_k) \,\right].$$
This recursion provides the optimal minimum mean-squared error estimate for $x_k$, assuming the prior estimate $\hat{x}_{k-1}$ and the current observation $y_k$ are Gaussian random variables (GRVs). Linearity of the model need not be assumed. The optimal terms in this recursion are given by
$$\hat{x}_k^- = E\left[F(\hat{x}_{k-1}, v_{k-1})\right], \qquad \mathcal{K}_k = P_{x_k y_k} P_{\tilde{y}_k \tilde{y}_k}^{-1}, \qquad \hat{y}_k^- = E\left[H(\hat{x}_k^-, n_k)\right],$$
where the optimal prediction of $x_k$ is written as $\hat{x}_k^-$ and corresponds to the expectation of a nonlinear function of the random variables $\hat{x}_{k-1}$ and $v_{k-1}$ (with a similar interpretation for the optimal prediction $\hat{y}_k^-$). The optimal gain term $\mathcal{K}_k$ is expressed as a function of posterior covariance matrices (with $\tilde{y}_k = y_k - \hat{y}_k^-$). These terms also require taking expectations of a nonlinear function of the prior state estimates. The Kalman filter calculates these quantities exactly in the linear case and can be viewed as an efficient method for analytically propagating a GRV through linear system dynamics.
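As an illustration, the recursion above can be sketched for a scalar linear-Gaussian system, where the Kalman filter computes the optimal terms exactly. The model and the noise variances below are assumptions chosen for the example, not values from the text.

```python
# Minimal sketch of the recursion x_hat = prediction + K * (y - y_prediction)
# for a scalar linear-Gaussian system.  The model x_{k+1} = a*x_k + v_k,
# y_k = x_k + n_k and the variances Q, R are illustrative assumptions.
a, Q, R = 1.0, 0.01, 0.25

def kalman_step(x_hat, P, y):
    # Time update: optimal prediction of the state and its variance
    x_pred = a * x_hat
    P_pred = a * P * a + Q
    # Measurement update: scalar gain K = P_xy / P_yy
    K = P_pred / (P_pred + R)
    x_hat_new = x_pred + K * (y - x_pred)   # prediction + gain * innovation
    P_new = (1.0 - K) * P_pred
    return x_hat_new, P_new

x_hat, P = 0.0, 1.0
for y in [0.9, 1.1, 1.0]:                   # noisy observations of a state near 1.0
    x_hat, P = kalman_step(x_hat, P, y)
```

With each observation the estimate moves toward the measurements and the posterior variance shrinks, exactly as the recursion prescribes in the linear case.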
For nonlinear models, however, the EKF approximates the optimal terms as
$$\hat{x}_k^- \approx F(\hat{x}_{k-1}, \bar{v}), \qquad \mathcal{K}_k \approx \hat{P}_{x_k y_k} \hat{P}_{\tilde{y}_k \tilde{y}_k}^{-1}, \qquad \hat{y}_k^- \approx H(\hat{x}_k^-, \bar{n}),$$
where the predictions are approximated simply as functions of the prior mean value of the estimates (no expectation is taken). The noise means, denoted $\bar{n} = E[n]$ and $\bar{v} = E[v]$, are usually assumed to be zero. The covariances are determined by linearizing the dynamic equations ($x_{k+1} \approx A x_k + B v_k$, $y_k \approx C x_k + D n_k$) and then determining the posterior covariance matrices analytically for the linear system. In other words, in the EKF the state distribution is approximated by a GRV, which is then propagated analytically through a first-order linearization of the nonlinear system. As such, the EKF can be viewed as providing "first-order" approximations to the optimal terms. (While "second-order" versions of the EKF exist, their increased implementation and computational complexity tend to prohibit their use.) These approximations can introduce large errors in the true posterior mean and covariance of the transformed (Gaussian) random variable, which may lead to suboptimal performance and sometimes divergence of the filter.
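The first-order error described above can be made concrete with a one-dimensional sketch (an illustrative example, not from the text): propagating $x \sim N(\mu, \sigma^2)$ through $g(x) = x^2$, the linearized prediction of the mean is $g(\mu)$, while the exact posterior mean is $\mu^2 + \sigma^2$.

```python
# Illustrative 1-D example (assumed for this sketch): x ~ N(mu, sigma^2)
# pushed through g(x) = x^2.  A first-order linearization predicts
# E[g(x)] ~= g(mu); the exact Gaussian moment is E[x^2] = mu^2 + sigma^2.
mu, sigma = 0.0, 1.0

linearized_mean = mu**2               # EKF-style first-order approximation
true_mean = mu**2 + sigma**2          # exact posterior mean
error = true_mean - linearized_mean   # the linearization misses the entire variance
```

Here the linearized mean is 0 while the true mean is 1: the first-order approximation misses the contribution of the variance entirely, which is precisely the kind of error the sigma-point methods below avoid.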
Julier, S., Uhlmann, J., and Durrant-Whyte, H., "A New Approach for Filtering Nonlinear Systems," in Proceedings of the American Control Conference (1995), derived a novel, more accurate, and theoretically better grounded alternative to the EKF, called the unscented Kalman filter (UKF), for state estimation within the application domain of nonlinear control. The UKF addresses the approximation issues of the EKF. The state distribution is again represented by a GRV, but it is now specified using a minimal set of carefully chosen sample points, called sigma points. These sigma points completely capture the true mean and covariance of the GRV, and when propagated through the true nonlinear system, capture the posterior mean and covariance accurately to the third order (of a Taylor series expansion) for any nonlinearity. The UKF can be explained by considering the propagation of a random variable $x$ (of dimension $L$) through a nonlinear function $y = g(x)$, where $x$ has mean $\bar{x}$ and covariance $P_x$. To calculate the statistics of $y$, a matrix $\chi$ of $2L+1$ sigma vectors $\chi_i$ (with corresponding weights $W_i$) is formed according to
$$\chi_0 = \bar{x}, \qquad \chi_i = \bar{x} + \left(\sqrt{(L+\lambda)P_x}\right)_i, \; i = 1, \dots, L, \qquad \chi_i = \bar{x} - \left(\sqrt{(L+\lambda)P_x}\right)_{i-L}, \; i = L+1, \dots, 2L,$$
$$W_0^{(m)} = \frac{\lambda}{L+\lambda}, \qquad W_0^{(c)} = \frac{\lambda}{L+\lambda} + (1 - \alpha^2 + \beta), \qquad W_i^{(m)} = W_i^{(c)} = \frac{1}{2(L+\lambda)}, \; i = 1, \dots, 2L,$$
where $\lambda = \alpha^2(L+\kappa) - L$ is a scaling parameter. $\alpha$ determines the spread of the sigma points around $\bar{x}$ and is usually set to a small positive value (e.g., $10^{-4} \le \alpha \le 1$). $\kappa$ is a secondary scaling parameter that is usually set to 0, and $\beta$ is used to incorporate prior knowledge of the distribution of $x$ (for Gaussian distributions, $\beta = 2$ is optimal); the optimal value of $\beta$ is a function of the kurtosis of the prior random variable. $\left(\sqrt{(L+\lambda)P_x}\right)_i$ is the $i$th row of the matrix square root. These sigma vectors are propagated through the nonlinear function
$$\Upsilon_i = g(\chi_i), \qquad i = 0, \dots, 2L,$$
and the mean and covariance of $y$ are approximated using a weighted sample mean and covariance of the posterior sigma points
$$\bar{y} \approx \sum_{i=0}^{2L} W_i^{(m)} \Upsilon_i, \qquad P_y \approx \sum_{i=0}^{2L} W_i^{(c)} \left(\Upsilon_i - \bar{y}\right)\left(\Upsilon_i - \bar{y}\right)^T.$$
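A minimal implementation of this transformation can be sketched as follows. This is an illustrative version, not code from the source; the function name `unscented_transform` and the use of a Cholesky factor as the matrix square root are assumptions made for the example.

```python
import numpy as np

def unscented_transform(x_mean, P_x, g, alpha=1e-3, beta=2.0, kappa=0.0):
    """Sketch of the UT described in the text (sigma points and weights as above)."""
    L = x_mean.size
    lam = alpha**2 * (L + kappa) - L
    # One choice of matrix square root: Cholesky factor S with S @ S.T = (L+lam)*P_x
    S = np.linalg.cholesky((L + lam) * P_x)
    # 2L+1 sigma points: the mean, plus symmetric +/- displacements (columns of S)
    chi = np.vstack([x_mean,
                     x_mean + S.T,          # points i = 1, ..., L
                     x_mean - S.T])         # points i = L+1, ..., 2L
    Wm = np.full(2 * L + 1, 0.5 / (L + lam))
    Wc = Wm.copy()
    Wm[0] = lam / (L + lam)
    Wc[0] = lam / (L + lam) + (1.0 - alpha**2 + beta)
    # Propagate through the nonlinearity by function evaluation only
    Y = np.array([g(c) for c in chi])
    # Weighted posterior statistics
    y_mean = Wm @ Y
    d = Y - y_mean
    P_y = (Wc[:, None] * d).T @ d
    return y_mean, P_y
```

For a linear function $g(x) = Ax$ the transform is exact: the weighted sums reproduce $A\bar{x}$ and $A P_x A^T$, which is one way to sanity-check an implementation.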
This specific implementation of the general sigma-point approach for calculating posterior statistics is called the unscented transformation (UT) [Julier 1995]. The sigma-point approach differs substantially from general "sampling" methods (e.g., Monte-Carlo methods and grid-based filters), which require orders of magnitude more sample points in an attempt to propagate an accurate (possibly non-Gaussian) distribution of the state. This deceptively simple approach yields approximations that are accurate to the third order for Gaussian inputs for all nonlinearities. For non-Gaussian inputs, the approximations are accurate to at least the second order, with the accuracy of the third- and higher-order moments determined by the choice of α and β. A simple example is shown in FIG. 1 for a two-dimensional system: the left-hand plots show the true mean and covariance propagation using Monte-Carlo sampling; the middle plots show the results of a linearization approach, as would be done in the EKF; and the right-hand plots show the performance of the sigma-point approach (only 5 sigma points are required for L=2). The superior performance of the sigma-point approach is clear.
The sigma-point Kalman filter (SPKF) is a better alternative to the EKF for Gaussian approximate probabilistic inference in general nonlinear dynamic state-space models. The SPKF is a family of Kalman filters that use derivativeless deterministic sampling for Gaussian approximate nonlinear estimation. The sigma-point approach is the underlying unifying technique common to all SPKFs: a method of calculating the statistics of a random variable that undergoes a nonlinear transformation. These calculations form the core of the optimal Kalman time- and measurement-update equations, which are simply the original (optimal) recursive Bayesian estimation integral equations recast under a Gaussian assumption.
The sigma-point approach for approximating the statistics of a random variable that undergoes a nonlinear transformation is summarized by the following three steps:
1. A set of weighted sigma-points is deterministically calculated using the mean and square-root decomposition of the covariance matrix of the prior random variable. As a minimal requirement, the sigma-point set completely captures the first and second order moments of the prior random variable. Higher order moments can be captured, if so desired, at the cost of using more sigma-points.
2. The sigma-points are then propagated through the true nonlinear function using functional evaluations alone (no analytical derivatives are used) to generate a posterior sigma-point set.
3. The posterior statistics are calculated (approximated) using tractable functions of the propagated sigma-points and weights.
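The three steps above can be walked through for a scalar random variable (L = 1) without any matrix machinery. The choices α = 1, β = 2, κ = 0 (so that λ = 0), the nonlinearity g(x) = sin(x), and the prior N(0, 0.1²) are assumptions made for this sketch.

```python
import math

# Scalar (L = 1) walk-through of the three steps, with illustrative choices
# alpha = 1, beta = 2, kappa = 0 (hence lambda = 0) and g(x) = sin(x);
# the prior is x ~ N(mu, sigma^2).  All values are assumptions for the example.
mu, sigma = 0.0, 0.1
L_dim, lam = 1, 0.0                       # lambda = alpha^2*(L+kappa) - L = 0

# Step 1: deterministically choose 2L+1 = 3 weighted sigma points
s = math.sqrt((L_dim + lam) * sigma**2)   # square root of the scaled covariance
chi = [mu, mu + s, mu - s]
Wm = [lam / (L_dim + lam), 0.5 / (L_dim + lam), 0.5 / (L_dim + lam)]
Wc = [Wm[0] + (1 - 1**2 + 2), Wm[1], Wm[2]]   # beta = 2 correction on the first weight

# Step 2: propagate through the nonlinearity by function evaluation alone
Y = [math.sin(c) for c in chi]

# Step 3: posterior statistics from weighted sums of the propagated points
y_mean = sum(w * y for w, y in zip(Wm, Y))
P_y = sum(w * (y - y_mean)**2 for w, y in zip(Wc, Y))
```

For this example the sigma-point mean is exactly 0 (matching the true mean by symmetry), and the sigma-point variance sin²(0.1) ≈ 0.00997 is within about 7×10⁻⁵ of the exact value (1 − e^(−2σ²))/2 ≈ 0.00990, illustrating the accuracy of the derivativeless approximation.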
When the UT implementation of the sigma-point approach is used, the resulting filter is the UKF. The UKF is one specific implementation of SPKF.
The superior performance of the sigma-point approach makes it desirable to apply the SPKF to nonlinear estimation and sensor fusion in the operation of vehicle navigation systems.