

 click to see videos 
 
Kalman filter trajectory estimation: The measurement  detection noise is set to a relatively high value, but the Kalman filter successfully predicts and corrects object trajectory.  Kalman + Camshift tracking: Camshift is used to detect the object and the Kalman filter is used to correct and estimate the object's trajectory. 
Contents
 Introduction
 Kalman Filter
 Implementation
 Conclusion
 References
Introduction
One of the primary computer vision tasks is object tracking. Object tracking is used in the vast majority of applications such as: video surveillance, car tracking (distance estimation), human detection and tracking, etc. The object trackers usually need some initialization steps such as the initial object location which can be provided manually or automatically by using an object detector such as a Viola and Jones detector or fast template matching. There are several major problems related to tracking:
 occlusion
 multiple objects tracking
 scale, illumination and change of appearance
 difficult and rapid motions
 ...
Even though object tracking has been a problem for many years, it is still not solved despite the fact that there are many object trackers, even the ones built for special purposes, and those which want to solve the problem in general.
Kalman filters, although they can be used for many other purposes, are often used for object tracking. They are especially convenient for objects which motion model is known, plus they incorporate some extra information in order to estimate the next object position more robustly. They can be used for general purpose single object tracking assuming some constraints. This article will explain the main idea behind Kalman filters and will focus on their practical usage for object tracking along with samples.
Kalman Filter  Main Idea
Let's assume that we have some kind of detector and that our detector is imperfect: it is prone to false positives, does not always detect objects, detects them imperfectly (does not provide exact position and scale) and its execution is costly. Let us also assume that we are tracking a single football player. After we detect the object, we want to leverage our information about the object in order to track it as robustly as we can. Our detector will give us just the location of the object so we have to use it as best as possible. In order to predict the next position of the player, we will need an object motion model (e.g. constant velocity motion, constant acceleration motion). Our detector is imperfect so there is noise in object locations, generally called measurement noise. Also, a selected motion model will not describe our player motion perfectly, so we also have noise regarding the model, called process noise. We want to estimate the next player position incorporating only the three parameters:
 Object motion model
 Measurement noise
 Process noise
So, we could efficiently redetect it and hopefully cope with object occlusion. These next sections will show you how to do it.

The cycle of a Kalman filter. Much of what the Kalman filter does can be reduced to propagating and updating Gaussians and updating their covariances. First the filter predicts the next state from the provided state transition (e.g. motion model), then if applicable, the noisy measurement information is incorporated in the correction phase. The cycle is repeated. 
Initial State
We are going to first introduce the initial state and what we are trying to accomplish. The following figures show the initial state and in the same time introduce most relevant terms.
 
Overview: Using only estimates and the current state, we want to predict the next state. The second step (correction) includes a noisy measurement in order to apply a state update.  Initial state type: The green line at the top represents an object we’d like to track, with the blue X’s marking the object's true position. We want to model motion by using a constant velocity model. Therefore, the state will include the object position and velocity in both directions. The detector gives us the noisy object's position, so the position is our measurement. 

Initial state value: In order to start the tracking process, we need to know the initial state x00 value. We also need to have the initial uncertainty which is expressed by Gaussian covariance matrix P00. The orange ellipse on the top represents the 2D Gaussian which describes the uncertainty in position. The initial matrix P00 is usually diagonal (assuming the components are not correlated), where the each component has its own uncertainty L  Gaussian sigma. 
Reading this section bear in mind:
 x(t)  state vector
 z(t)  measurement
 P(tt1)  process covariance matrix
Predict
Prediction is the first step which includes the next state (position and velocity) prediction as well as updating our uncertainty about the object state (increasing the uncertainty).
 
1a) State prediction: The first step is to predict the next state by using the motion model. The next state x(tt1) is obtained by multiplying the previous state by the state transition matrix  F. Q denotes process noise, which must follow normal distribution since we are working with Gaussians.  1b) Covariance prediction: The covariance update is done by multiplying the covariance matrix from the previous iteration by the state transition matrix F (motion model) and by adding the process noise Q which can be constant. The update covariance of our prediction is wider because we are less sure about our estimate. 

* State transition matrix: The motion model must be represented by matrix F, therefore it must be linear. If the model is not linear the model must be linearized in some working point, which is used in the Extended Kalman Filter. The used model models the constant 2D velocity motion model where the position is updated as: p(t) = p(t1) + v * p(t1) where p denotes position and v velocity; the velocity remains constant. Velocity is marked as derivative of position in time. If we had done the prediction step multiple times, the estimated positions would follow the constant velocity model (blue dots). The covariance matrix would get wider each time (blue ellipse) as our uncertainty about the object position grows. 
Reading this section, bear in mind:
 F  state transition matrix
 Q(t)  process noise covariance
Correct
After the noisy measurement has been obtained, the correction step begins. It incorporates a Kalman filter update which includes a state update and uncertainty update (decreasing the uncertainty).
 
* Measure: When (if) we receive a noisy measurement (in our case position obtained form a detector), the update process begins. The noisy measurement z(t) is modeled as a single Gaussian, where the noise is modeled as covariance matrix R(t) which is usually constant. The uncertainty of the measurement appears as golden ellipse.  2a) Measurement update: In order to calculate the predicted measurement needed for correction, we must select the measurement components from the state. A measurement has the same structure as a state or it just contains state parts; in our case, just the object position. Matrix H is the model selection matrix that when multiplied with state selects only elements that belong to a measurement. 
 
* Residual: In order to make correction, we must know the prediction error. Therefore the residual, also known as innovation, is calculated, denoted by y(t). The residual is calculated by differencing the predicted measurement and obtained measurement  see the green and golden ellipse. Residual covariance S is calculated in a similar way.  * Kalman gain: Kalman gain K specifies how much we believe the prediction vs. how much we believe in the measurement. It is a product of predicted process covariance matrix P the observation model H and inverse residual covariance S. Let us study two extreme cases:
 We are sure about measurements (we believe our detector)
If this is the case, the measurement noise matrix R is very small which results the K decreases and the measurements are weighted more heavily than prediction.  We are sure about prediction (we believe in our motion model, no so much in the detector)
If this is the case, the measurement noise matrix R is very large, which results in the K increases and the measurements are weighted much less than the prediction.

 
2b) Correct: The Kalman gain is now used to update the state x and covariance matrix P as shown on the figure. The result is the updated position which is denoted by the golden ellipse.  * Final step When the correction step is finished, the next step is, again, the prediction step. Those two steps are iteratively run in order to track an object as shown on the figure. 
Reading this section, you should know what is:
Now, when you know the basics, it is time for real samples, but first the brief introduction to the implementation.
Implementation
The provided implementation is generic: DiscreteKalmanFilter<TState, TMeasurement>
. The generic parameters are the state type (TState
) and the measurement type (TMeasurement
). This kind of implementation adds one abstraction layer which enables:
 easier code handling
 much easier debugging
 extensibility
The class diagram below shows the implementation outline:

Class diagram. The implementation of the Kalman filter consists of two classes  one abstract and concrete which implements discrete Kalman filter. The common motion models are also implemented. 
Assume we want to use constant velocity model and the measurement model is an object's location (just as in figures above). The usage sample for this sample is shown below.
Initialization
The most complex step is the initialization. We have to set: process and measurement noise (Q and R), process covariance matrix (P), state transition matrix (F) and model selection matrix (H). The builtin models greatly simplify this task. The initialization step is a bit longer due to flexibility. For example, the Extended Kalman Filter needs transition matrix which is changed in each step. Also, the process and measurement noise (Q and R) may not be constant so they could also be changed in each step.
using ModelState = Accord.Extensions.Statistics.Filters.ConstantVelocity2DModel;
KalmanFilter<ModelState, PointF> kalman = null;
var measurementDimension = 2;
var initialState = new ModelState { Position = new PointF(0,0), Velocity = new PointF(0.5f, 2f)};
var initialStateError = ModelState.GetProcessNoise(accelerationNoise: 10);
kalman = new DiscreteKalmanFilter<ModelState, PointF>(
initialState, initialStateError,
measurementDimension , 0 ,
x => ModelState.ToArray(x), x => ModelState.FromArray(x), x => new double[] { x.X, x.Y });
kalman.ProcessNoise = ModelState.GetProcessNoise(accelerationNoise: 10.0);
kalman.MeasurementNoise = Matrix.Diagonal<double>(kalman.MeasurementVectorDimension, 1.3);
kalman.MeasurementMatrix = ModelState.GetPositionMeasurementMatrix();
kalman.TransitionMatrix = ModelState.GetTransitionMatrix();
Predict
The prediction step consists of only one method, which if there are no measurements is executed repetitively without correction method.
kalman.Predict();
Correct
The correction method receives noisy measurement (obtained from e.g. detector).
PointF objectPosition = ....
kalman.Correct(objeectPosition);
Console.WriteLine(kalman.State.Position.X + " " + kalman.State.Position.Y);
Conclusion
The discrete Kalman Filter is described for the purpose of the object tracking problem along with its implementation in C#. If we have a linear motion model, and process and measurement noise are Gaussianlike, then the Kalman filter represents the optimal solution for the state update (in our case tracking problem). Those conditions are satisfied for a vast majority of applications.
The source and sample code are the part of Accord.NET Extensions Framework, a framework that brings many advanced algorithms primarily for image processing, object detection and tracking, all packed as fluent extensions and simple and intuitive generics, so do not forget to take a peek :).
References
History
 16^{th} January, 2015  First version released