Introduction
Kalman filters let you filter out noise and combine different measurements to compute an answer. They are a particularly powerful type of filter, and mathematically elegant. Without a matrix math package they are typically hard to compute, so the source code includes examples of simple filters and a general case built on a simple matrix package. Developed by Rudolf Kalman and others as an ideal way to estimate something by measuring something, the filter's deliberately loose formulation (estimate something by measuring something) is exactly where its power lies: there doesn't need to be a direct relationship between what is measured and what is estimated.
Background
Example 1
Suppose you need to measure a car's position. Your position measurement is accurate to +/- 10 meters at any one time, but your velocity measurement is accurate to 0.1 meters per second. If, further, the velocity noise is due to a different physical process than the position noise, the velocity and position measurements can be combined in a Kalman filter to compute the position. The result will be particularly accurate because the error biasing any one estimate due to position errors will be mitigated by the velocity, and vice versa.
Example 2
Suppose you have GPS ranges to satellites (called pseudoranges). The errors of any one set of measurements will be largely uncorrelated: when one is high, another may well be low. The measurements have separate clock-based errors and pass through different areas of the atmosphere. Computing position from the pseudoranges can be done with matrix least squares, which can be rearranged as an iterative non-linear solution that improves an estimate of the location until the error approaches zero. Good GPS systems usually take that formulation a bit further and rewrite it as a Kalman filter. A Kalman filter for navigation can also fold in Doppler (a different kind of noise), accumulated carrier, fractional carrier, accelerometers, etc.
Demystifying Kalman Filters
There are typically two sets of equations: one to update the "state" with a new measurement, and another to use the latest computed state to predict what the system is or will be in the future. For example, if you get measurements 10 times a second but need to make rocket course corrections 100 times a second, you update the filter at 10 Hz, and in between updates you predict the state forward to make the course corrections, as sketched below.
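As a rough sketch of that pattern (the rates, the Filter type, and the ReadSensor/ApplyCorrection calls here are all illustrative, not from the article's source code):

double ReadSensor();            // hypothetical: returns the latest measurement
void   ApplyCorrection(double); // hypothetical: issues a course correction

struct Filter                   // stand-in for the filter developed below
{
    double x = 0, v = 0;        // position and velocity state
    void   Update(double m, double dt);               // Kalman update, 10 Hz
    double Predict(double dt) { return x + dt * v; }  // cheap roll-forward
};

void ControlLoop(Filter &f)
{
    for (int tick = 0; ; ++tick)
    {
        if (tick % 10 == 0)                 // every 10th tick: 10 Hz update
            f.Update(ReadSensor(), 0.1);
        // 100 Hz: predict forward from the last update, no new measurement.
        ApplyCorrection(f.Predict((tick % 10) * 0.01));
    }
}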
The general idea is to blend each measurement into the current estimate using a gain based on how certain the last estimate was. If the estimates are stable, the gain is small and only a small fraction of the measurement is filtered in.
Predict:
X = F*X + H*U Rolls state (X) forward to new time.
P = F*P*F^T + Q Rolls the uncertainty forward in time.
Update:
Y = M - H*X         Called the innovation: measurement minus the state transformed by H.
S = H*P*H^T + R     Residual covariance: the covariance transformed by H, plus R.
K = P*H^T*S^-1      Kalman gain = variance / residual covariance.
X = X + K*Y         Update the state with the gained innovation.
P = (I - K*H)*P     Update the covariance to this time.
Where:
X = State.
F = Rolls X forward, typically by some time delta.
U = Adds in values per unit time dt.
P = Covariance: how each state element varies relative to the others.
Y = Residual (the delta between the measurement and the last state).
M = Measurement.
S = Residual covariance.
R = Minimal innovation covariance; keeps the filter from locking on to a solution.
K = Kalman gain.
Q = Minimal update covariance of P; keeps P from getting too small.
H = Rolls the actual state into the predicted measurement.
I = Identity matrix.
The key to a good Kalman filter design, in my experience, is noting how each measurement can be transformed into the state, and how the noises of the state and the measurements match up in terms of how they are (or are not) related. In the equations above, each letter is not a single number but an object (a matrix).
Keeping the order (matrix size) small, or at least sparse (lots of zeros in the matrix), is important. The key thing in the equations above is that S^-1 is the inverse of S (the equivalent of multiplying by 1/S). The inverse of S isn't always something you can just compute, so usually a pseudo-inverse is used. This can be hard to compute if there are values near zero (round-off). In short: either do a lot of reading and work and learn about the different approaches, or keep the matrix small and/or use a math package.
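To make the shapes concrete, here is a minimal sketch of the equations above with every matrix collapsed to 1x1 (a single state, F = 1, H = 1), so S^-1 becomes ordinary division and no math package is needed. The names and tuning values are illustrative:

// Minimal 1-state Kalman filter for a slowly varying value.
struct ScalarKalman
{
    double x = 0;      // state
    double p = 1;      // covariance (how uncertain the state is)
    double q = 1e-4;   // Q: minimal update covariance, keeps p from collapsing
    double r = 0.25;   // R: measurement noise covariance

    double Update(double m)
    {
        p += q;              // predict: P = F*P*F^T + Q with F = 1
        double y = m - x;    // innovation: Y = M - H*X with H = 1
        double s = p + r;    // residual covariance: S = H*P*H^T + R
        double k = p / s;    // Kalman gain: K = P*H^T*S^-1, just a division
        x += k * y;          // X = X + K*Y
        p *= (1 - k);        // P = (I - K*H)*P
        return x;
    }
};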
It is usually best, from an efficiency and ease-of-use standpoint, to break the problem into small matrix-based processes and chain them together, as sketched below. Each process in the chain will likely be easier to understand, describe, and test. Each will also have its own special cases for handling reset, coasting due to lack of data, bounds checking due to boundary conditions, etc.
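A sketch of that chaining idea, reusing the scalar filter above (the stage roles are illustrative): each stage stays small and testable, and each could carry its own reset or coast logic.

// Two small, independent stages chained together.
double FilterChain(double raw, ScalarKalman &deNoise, ScalarKalman &trend)
{
    double smoothed = deNoise.Update(raw);  // stage 1: knock down sensor noise
    return trend.Update(smoothed);          // stage 2: track the slower trend
}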
Using the code
Suppose you get a position measurement when you can (the time between measurements varies), and you know the position is noisy, but the average velocity of what you're measuring should change slowly and smoothly. This calls for a second-order, time-varying Kalman filter.
X = a vector: X[0] = position, X[1] = velocity.
P = a 2x2 matrix (4 numbers).
Q = minimal covariance (2x2).
R = a single value.
H = [1, 0]; we measure only position, so only the first state element maps to the measurement.
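The walkthrough below manipulates member arrays directly; a plausible set of declarations for them (the class name is hypothetical, and the article's actual source may differ) is:

class PositionVelocityFilter   // hypothetical name for the filter below
{
    double m_x[2];   // state: m_x[0] = position, m_x[1] = velocity
    double m_p[4];   // 2x2 covariance P, row-major: p00, p01, p10, p11
    double m_q[4];   // 2x2 minimal covariance Q, row-major
    double m_r;      // scalar measurement noise R
    // ... methods from the walkthrough below ...
};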
To estimate forward from this time by a delta (dt):
X = F*X + H*U
P = F*P*F^T + Q (computed below during the update).
F = 2x2 matrix: [1, dt], [0, 1].
U = 0. If we knew the position always increased at some baseline velocity, U would be a function of dt. Here, H*U = 0.
X[0] = F[0,0]*X[0] + F[0,1]*X[1] = 1*X[0] + dt*X[1] = X[0] + dt*X[1]
X[1] = F[1,0]*X[0] + F[1,1]*X[1] = 0*X[0] + 1*X[1] = X[1]
// Predict the position dt seconds ahead without changing the filter state.
double EstimatePrediction(double dt)
{
    return m_x[0] + (dt * m_x[1]);
}
To update:
Y = M - H*X         The innovation: measurement minus the state transformed by H.
S = H*P*H^T + R     Residual covariance: the covariance transformed by H, plus R.
K = P*H^T*S^-1      Kalman gain = variance / residual covariance.
X = X + K*Y         Update the state with the gained innovation.
P = (I - K*H)*P     Update the covariance to this time.
Predict:
X = F*X + H*U       Rolls the state (X) forward to the new time.
P = F*P*F^T + Q     Rolls the uncertainty forward in time.
The X = F*X + H*U step converts into code using the expansion above:
oldX = m_x[0];
m_x[0] += dt * m_x[1];  // From above; oldX is saved for the velocity residual below.
Converting P requires some math:

P = [1, dt] [p00, p01] [1,  0] + [q00, q01]
    [0,  1] [p10, p11] [dt, 1]   [q10, q11]

P = [p00 + dt*p10, p01 + dt*p11] [1,  0] + [q00, q01]
    [p10,          p11         ] [dt, 1]   [q10, q11]

P = [p00 + dt*p10 + dt*p01 + dt*dt*p11, p01 + dt*p11] + [q00, q01]
    [p10 + dt*p11,                      p11         ]   [q10, q11]
In code:
m_p[0] = m_p[0] + dt*(m_p[2] + m_p[1]) + dt*dt*m_p[3] + m_q[0];
m_p[1] = m_p[1] + dt*m_p[3] + m_q[1];
m_p[2] = m_p[2] + dt*m_p[3] + m_q[2];
m_p[3] = m_p[3] + m_q[3];
Y = M - H*X
y0 = m - m_x[0];
y1 = ((m - oldX) / dt) - m_x[1];  // implied velocity from the position change
S = H*P*H^T + R
Because H = [1, 0] this is easy, and s is a single value, not a matrix to invert:
s = m_p[0] + m_r;
K = P*H^T*S^-1
k = m_p[0] / s;
X = X + K*Y
m_x[0] += y0 * k;
m_x[1] += y1 * k;
P = (I - K*H)*P
Because H is [1, 0] and K is a scalar in this case:
for (int i = 0; i < 4; i++) m_p[i] = (1 - k) * m_p[i];
The following fills out the API:

double Variance() { return m_p[0]; }
double Value()    { return m_x[0]; }

void Reset(
    double qx,
    double qv,
    double r,
    double pd,
    double ix)
{
    m_q[0] = qx; m_q[3] = qv;  // process noise on the diagonal
    m_q[1] = m_q[2] = 0;
    m_r = r;
    m_p[0] = m_p[3] = pd;
    m_p[1] = m_p[2] = 0;
    m_x[0] = ix;
    m_x[1] = 0;
}
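Putting the pieces together, the whole measurement step might look like this: a sketch assembled from the fragments above (the method name Update is mine; the article's source may arrange it differently):

// Sketch: one combined predict + update step for the filter above.
void Update(double m, double dt)
{
    double oldX = m_x[0];

    // Predict: X = F*X (H*U = 0), P = F*P*F^T + Q.
    m_x[0] += dt * m_x[1];
    m_p[0] += dt * (m_p[2] + m_p[1]) + dt * dt * m_p[3] + m_q[0];
    m_p[1] += dt * m_p[3] + m_q[1];
    m_p[2] += dt * m_p[3] + m_q[2];
    m_p[3] += m_q[3];

    // Update: innovation, residual covariance, gain, state, covariance.
    double y0 = m - m_x[0];
    double y1 = ((m - oldX) / dt) - m_x[1];  // implied velocity residual
    double s  = m_p[0] + m_r;                // S is scalar because H = [1, 0]
    double k  = m_p[0] / s;
    m_x[0] += y0 * k;
    m_x[1] += y1 * k;
    for (int i = 0; i < 4; i++) m_p[i] = (1 - k) * m_p[i];
}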
Another example
Suppose you had two measurements of the same thing: say, position measured by GPS and velocity measured by an accelerometer. The measurement techniques do not vary the same way, because the sources of noise are unrelated (little noise cross-correlation), and the amount of noise is typical of a measurement system: it is Gaussian. The position noise is large, say 15 meters, but the velocity noise is low, say 0.01 m/s.
This is a good example of how a Kalman filter can use low-noise velocity information to fix up noisy position information. The filter learns the velocity over time, and essentially comes to trust the velocity information more than the position measurement when updating the position estimate.
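One common way to structure this (a sketch, not the article's code) is to fold each measurement in as its own sequential update, since the noises are uncorrelated; here H = [1, 0] for the position measurement and H = [0, 1] for the velocity measurement, with the R values taken from the noise figures above:

// Sketch: fold two uncorrelated measurements in as sequential updates.
// State: x[0] = position, x[1] = velocity; p is the 2x2 covariance, row-major.
void UpdatePosVel(double x[2], double p[4], double mPos, double mVel)
{
    const double rPos = 15.0 * 15.0;    // GPS position variance, (15 m)^2
    const double rVel = 0.01 * 0.01;    // velocity variance, (0.01 m/s)^2

    // Position measurement: H = [1, 0].
    double s  = p[0] + rPos;            // S = H*P*H^T + R
    double k0 = p[0] / s, k1 = p[2] / s;  // K = P*H^T*S^-1, a 2-vector
    double y  = mPos - x[0];
    x[0] += k0 * y;  x[1] += k1 * y;
    double a = p[0], b = p[1];          // P = (I - K*H)*P
    p[0] -= k0 * a;  p[1] -= k0 * b;
    p[2] -= k1 * a;  p[3] -= k1 * b;

    // Velocity measurement: H = [0, 1].
    s  = p[3] + rVel;
    k0 = p[1] / s;  k1 = p[3] / s;
    y  = mVel - x[1];
    x[0] += k0 * y;  x[1] += k1 * y;
    a = p[2];  b = p[3];
    p[0] -= k0 * a;  p[1] -= k0 * b;
    p[2] -= k1 * a;  p[3] -= k1 * b;
}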
If we introduced a jump in position, the position would jump and the filter would re-adapt. If the change evolved with the signal, as most tracking-signal problems do, you get scenarios such as the one graphed above: the velocity went from 0.4 to -0.4 in the middle of the test, and the abrupt change in velocity was smoothed based on the noise value used. If we reduced the Qv value from 0.01 to, say, 0.001, the filter would rely on the velocity too much.
Example of tracking a parabola with a 2nd order Kalman filter:
Note: the top right is the path; the top left is the error, showing the measurement error and the Kalman filter's error from the ideal. The Gain plot is the magnitude of the determinant of the Kalman gain matrix (which isn't a single value).
Conclusion
This is meant to be just an introduction, a jumping-off point for someone new to Kalman filtering.
On the job, I've developed Kalman filters for image object tracking, missile guidance, uplink telemetry timing, and GPS tracking and navigation. Some were small (2-4 states), some large (13), some very large (23). Some took days, others weeks. The most effective and reliable were always the small ones.
It usually isn't easy, especially if you have three or more states. By far the most common mistake is making a filter too big and never quite finding out what's wrong. Figuring out the cross-interactions is a science, and it's often good to make a filter deliberately a little more or less aggressive in how it smooths or how quickly it adapts; but doing that for 20 seemingly unrelated things means tuning 20*20 often non-linear interactions, by hand.
Kalman filters always work best when they are kept as basic as possible.