|Feature tracking demo |
- Optical Flow
- NuGet packages: Vision
- Help: off-line help file - Vision.PyramidalKLTOpticalFlow (unblock the file after download!)
The vast majority of computer vision applications incorporate, besides per-image processing, some form of video analysis in which distinctive points are detected and tracked. There are several reasons why we need to track features:
- Object tracking applications
- Camera image stabilization
- 3D shape reconstruction
- Special movie effects
Those distinctive points are called feature points. They can be selected manually or found automatically; the automatic procedure is called feature point detection. Feature tracking is usually done on video sequences, which provide information on a per-frame basis, so an object's position shift between two consecutive frames is relatively small. That fact, along with some other assumptions, makes feature point tracking feasible. By tracking multiple features and drawing their shift vectors, a motion image called a sparse optical flow image is obtained. Doing the same on a per-pixel basis yields a dense optical flow image. Currently, in order to achieve real-time performance, dense optical flow algorithms usually run on GPUs. Samples of both are shown below.
|Sparse optical flow. ||Dense optical flow. |
Optical-flow algorithms can be grouped into several categories based on their underlying approach: gradient-based, correlation-based, frequency-based and feature-based. The procedure described here is the Kanade-Lucas-Tomasi (KLT) pyramidal feature tracker (sparse optical flow). KLT is a gradient-based procedure and, despite its age, it is still widely used due to its simplicity, reasonable accuracy and speed. The References section contains many interesting materials for those who want to know more.
Optical Flow Problem
The optical flow problem is the estimation of the motion vector for the selected point(s). This is shown in the image below. There are two key assumptions:
- Color or brightness constancy
- Small shift between frames
|Optical flow problem. The image demonstrates the motion estimation problem. |
The point motion can be expressed as follows:
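In the usual notation (a sketch of the standard derivation, with I denoting image brightness and (u, v) the unknown shift between frames t and t+1):

```latex
I(x, y, t) = I(x + u, y + v, t + 1)
```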
By approximating the shifted brightness with a first-order Taylor expansion, the above equation can be rewritten as follows:
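In symbols (standard notation; I_x, I_y and I_t denote the horizontal, vertical and temporal image derivatives):

```latex
I(x + u, y + v, t + 1) \approx I(x, y, t) + I_x u + I_y v + I_t
```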
By applying the brightness constancy assumption, the brightness difference between the starting point and the shifted point must be zero. More formally:
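That is, the standard single linear constraint in the two unknowns (I_x, I_y, I_t being the spatial and temporal derivatives):

```latex
I_x u + I_y v + I_t = 0
```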
There are two unknowns (u and v) but only one equation. Clearly, an additional assumption has to be made in order to solve the optical flow problem. This is where the KLT assumption comes in.
Lucas and Kanade assume that the motion field is smooth. In other words, points close to the tracked point move in the same direction. This assumption is reasonable for many objects. The region of constant motion is determined by the window size. If we take a 5x5 window, we get 25 equations - more than enough to determine the point's next position.
Other assumptions can be made as well, which is why many other algorithms exist. In order to solve this overdetermined system, we minimize the sum of squared errors - a least-squares problem.
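In matrix form (the standard least-squares formulation; p_1, ..., p_n denote the pixels of the window):

```latex
A = \begin{bmatrix} I_x(p_1) & I_y(p_1) \\ \vdots & \vdots \\ I_x(p_n) & I_y(p_n) \end{bmatrix},
\quad
b = -\begin{bmatrix} I_t(p_1) \\ \vdots \\ I_t(p_n) \end{bmatrix},
\quad
\min_{v} \| A v - b \|^2
\;\Rightarrow\;
v = (A^T A)^{-1} A^T b
```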
Notice that the AᵀA matrix must be invertible, so the system is solvable only if the eigenvalues of AᵀA are not too small - in other words, if the patch is a corner or a highly textured region. In the case of an edge, one eigenvalue is large but the other is too small, which is not sufficient either.
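This patch-quality criterion is easy to demonstrate. The sketch below (plain NumPy; `min_eigenvalue` is an illustrative helper, not part of the library) computes the smaller eigenvalue of AᵀA for an edge patch and a corner patch:

```python
import numpy as np

def min_eigenvalue(patch):
    """Smaller eigenvalue of the A^T A (structure) matrix of a patch."""
    Iy, Ix = np.gradient(patch.astype(float))      # vertical/horizontal derivatives
    A = np.column_stack([Ix.ravel(), Iy.ravel()])  # n x 2 gradient matrix
    return np.linalg.eigvalsh(A.T @ A).min()

ys, xs = np.mgrid[0:9, 0:9]
edge   = (xs > 4).astype(float)               # vertical edge: Iy is zero everywhere
corner = ((xs > 4) & (ys > 4)).astype(float)  # L-shaped corner: both gradients present

e_edge   = min_eigenvalue(edge)    # near zero: an edge patch is not reliably trackable
e_corner = min_eigenvalue(corner)  # clearly positive: trackable
```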
Therefore, in order to calculate the LK flow of a single patch, the following procedure is used:
- Repeat until convergence:
- Calculate the horizontal, vertical and time derivatives of the patch
- Calculate the shift by solving the least-squares system (the eigenvalues of the AᵀA matrix are validated in order to determine the patch quality)
- Warp the patch by the calculated shift
- Accumulate the total shift vector
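The loop above can be sketched in a few lines of NumPy (a language-neutral illustration of the iterative step, not the library's C# implementation; the function name and the integer-warp shortcut are my own simplifications):

```python
import numpy as np

def lk_patch_flow(prev, curr, x, y, win=3, iters=5, min_eig=1e-4):
    """Estimate the (u, v) shift of the patch centered at (x, y)."""
    Iy, Ix = np.gradient(prev)  # spatial derivatives of the previous frame
    sl = (slice(y - win, y + win + 1), slice(x - win, x + win + 1))
    A = np.column_stack([Ix[sl].ravel(), Iy[sl].ravel()])
    G = A.T @ A                 # the 2x2 A^T A matrix

    # patch-quality check: both eigenvalues must be large enough
    if np.linalg.eigvalsh(G).min() < min_eig:
        return None             # flat region or edge: untrackable

    v = np.zeros(2)
    for _ in range(iters):
        # warp by the current shift estimate (rounded to whole pixels for brevity;
        # the real algorithm interpolates sub-pixel positions)
        dx, dy = int(np.round(v[0])), int(np.round(v[1]))
        warped = (slice(y + dy - win, y + dy + win + 1),
                  slice(x + dx - win, x + dx + win + 1))
        It = (curr[warped] - prev[sl]).ravel()  # time derivative of the patch
        v += np.linalg.solve(G, -A.T @ It)      # accumulate the shift update
    return v
```

For example, a Gaussian blob shifted one pixel to the right should yield a flow of roughly (1, 0).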
In many cases the object shifts are more than "small". If the shift is large, the small-shift (Taylor) approximation is violated and the object cannot be tracked. In order to cope with this problem, the image is downscaled. The shift in the downscaled image is smaller, so the algorithm is able to track the object (assuming the other assumptions are not violated). To support even larger shifts, instead of a single downscaled image, a whole Gaussian pyramid of the input image is built.
|Pyramidal optical flow procedure outline. |
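The coarse-to-fine scheme itself is compact. Below is an illustrative NumPy sketch (not the library's code): `build_pyramid` uses a plain 2x2 average as a stand-in for the Gaussian filtering the real implementation applies, and `refine` stands for any single-level LK step:

```python
import numpy as np

def build_pyramid(img, levels=3):
    """Level 0 is the input; each further level halves the resolution."""
    pyr = [img]
    for _ in range(levels - 1):
        a = pyr[-1]
        h, w = (a.shape[0] // 2) * 2, (a.shape[1] // 2) * 2  # crop to even size
        a = a[:h, :w]
        pyr.append(0.25 * (a[0::2, 0::2] + a[1::2, 0::2] +
                           a[0::2, 1::2] + a[1::2, 1::2]))
    return pyr

def pyramidal_flow(prev, curr, point, refine, levels=3):
    """Coarse-to-fine: estimate the shift at the coarsest level first,
    then double it and refine it at each finer level."""
    prev_pyr, curr_pyr = build_pyramid(prev, levels), build_pyramid(curr, levels)
    v = np.zeros(2)
    for lvl in reversed(range(levels)):
        v *= 2.0                # up-scale the coarser level's estimate
        p = point / (2 ** lvl)  # point position at this level
        v += refine(prev_pyr[lvl], curr_pyr[lvl], p + v)
    return v
```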
The implementation has several advantages that are worth mentioning:
- Support for color images
The implementation can incorporate color information as well, which can increase tracking robustness.
- Detailed tracking status
The implementation provides a detailed tracking status for each tracked region (see KLTFeatureStatus for details).
- Real-time processing
The implementation is optimized for real-time video processing.
The implementation is divided into three classes. The main classes are KLTOpticalFlow and PyrLKOpticalFlow, which contain the methods for calculating the KLT optical flow. The status of each feature is described by the KLTFeatureStatus enum. The implementation supports color images as well - just change the generic color type. When processing video, the PyrLKStorage class is used to avoid duplicate calculation of the horizontal and vertical derivatives; it acts as a container for the previous frame's metadata. The main method, EstimateFlow, accepts either an image or a storage object as input. The class diagram is shown below.
|Class diagram. |
The simplest usage (without PyrLKStorage and omitting the optional parameters of the EstimateFlow function) is shown below.
Gray<float>[,] prevIm = ...;
Gray<float>[,] currIm = ...;
PointF[] oldPositions = ...;
PointF[] currFeatures;
KLTFeatureStatus[] featureStatus;
PyrLKOpticalFlow<Gray<float>>.EstimateFlow(prevIm, currIm, oldPositions,
                                           out currFeatures, out featureStatus);
for (int i = 0; i < currFeatures.Length; i++)
{
    // inspect featureStatus[i] and use currFeatures[i], e.g. draw the shift vector
}
This article presented the pyramidal Kanade-Lucas-Tomasi optical flow, which forms the basis for many computer-vision applications such as object tracking, augmented reality and image stabilization. The download contains the complete source as well as a feature tracking sample. The source and sample code are part of the Accord.NET Extensions Framework, a framework that brings many advanced algorithms, primarily for image processing, object detection and tracking, all packed as fluent extensions and simple, intuitive generics - so do not forget to take a peek :).
- 12th November, 2014 - First version released