An Introduction to OpenCV: Displaying and Manipulating Video and Motion Data

Markus Koppensteiner

4.57/5 (12 votes)

Mar 2, 2012

CPOL

10 min read

80440

5582

Explains how to use some OpenCV commands for video manipulation.

Introduction

The Open Source Computer Vision Library (OpenCV) offers freely available programming tools to handle visual input such as images, video files, or motion data captured by a camcorder. It contains numerous ready-to-use functions for a wide range of applications. Since the author of this piece of information is only at the beginning of discovering this world and due to the great number of features OpenCV provides, the contents of this article will be confined to a small selection of topics.

Although one can find instructions on how to use OpenCV and many code samples, to me it seemed that useful information is spread all over the web and thus hard to track down. It is the intention of this article to provide a more coherent picture and offer guidance for some OpenCV basics.

I am mostly interested in video processing and analyses of motion. For this reason the article discusses OpenCV functions for loading and playing video files, it gives some examples of how this data can be manipulated and it presents (possibly ‘immature’) sample code that shows how to set up such procedures. Although the OpenCV function calls introduced here are pure C code, I packed them into a single C++ class. This might not seem very elegant but I considered it convenient to have most of the OpenCV specific code at one place. In order to be able to implement a Windows GUI, I built a Win32 program around the OpenCV code using Visual C++ 2010 Express Edition.

The contents presented here and in the source code files are based on code snippets found on the web, on samples that come with the OpenCV set up, and partly on the book ‘Learning OpenCV’ by Gary Bradski and Adrian Kaehler.

In order to create an executable program, the OpenCV2.3 libraries (http://opencv.willowgarage.com) have to be installed and it is necessary to include the correct libraries and header files. This requires some instructions to be followed, which are described on top of the UsingOpenCV.cpp source code file. Detailed descriptions (including pictures) on how this needs to be done can be found on the web also (e.g., by searching for ‘Using OpenCV in VC++2010’).

The article is divided into three parts. In the first section I will explain how to load and play a video file. Section two and three will build on this but also add further information on pixel manipulation, frame subtraction, and optical flow.

Load and Play Video Files

OpenCV provides several functions for loading and playing video files. These have to be called in a relatively strict order (also be sure that corresponding video codecs are installed on your computer).

In order to capture frames (i.e., single images) from a video file, it is necessary to use the command line CvCapture* cvCreateFileCapture(const char* filename ); first. It provides a pointer to a CvCapture structure (returns NULL if something went wrong). In the sample code I wrote, this function is embedded in the class method bool Video_OP::Get_Video_from_File(char* file_name);.

Once there is a valid CvCapture object, it is possible to start grabbing frames using the function IplImage* cvQueryFrame(CvCapture capture), which returns a pointer to an IplImage array. How this needs to be implemented will be explained below on the basis of sample code.

In order to access the properties of a video file (e.g., length, width of a frame, etc.), use cvGetCaptureProperty(CvCapture* capture, int property_id);. Sample code on how to apply this can be found in the class methods int Video_OP::Get_Frame_Rate(); and int Video_OP::Get_Width();.

In order to set various properties of a video file (e.g., starting frame), use cvSetCaptureProperty(CvCapture* capture, int property_id, double value);. Sample code on how to change the starting frame can be found in the class method int Video_OP::Go_to_Frame(int frame).

Using the Code

The main steps to load and play a video file are:

Capture a video file with cvCreateFileCapture(); by invoking this->Get_Video_from_File(char* file_name);.
Invoke this->Play_Video(int from, int to); with from being the starting frame and to the stopping frame (find the insides of the method in the skeleton code below).
Determine frame rate of video, create window to depict images, and push video forward to a frame from which you want it to start.
Set up loop to process successive frames (or images) of a video file.
Grab frames by calling cvQueryFrame(CvCapture*);.
Depict images in a window by calling cvShowImage(char*,IplImage);.
Define the delay of presentation by invoking cvWaitKey(int);.

The example in the file (OpenCV_VideoMETH.cpp) has the same structure as the code below, but also has some lines added, which gives further information.

//
// code skeleton for playing a video
//
void Video_OP::Play_Video(int from, int to)
{
    this->my_on_off = true;
    int key =0;
    int frame_counter = from;

    // retrieves frames per second (fps); is used to define speed of presentation; see (A)         
    // or: (int) cvGetCaptureProperty( this->my_p_capture,CV_CAP_PROP_FPS );
    int fps = this->Get_Frame_Rate();
           
    // creates Window in which movie is displayed; see below (B)
    cvNamedWindow( "video" , CV_WINDOW_AUTOSIZE ); 
           
    // sets pointer to frame, where the video shall start from                                                   
    // or: cvSetCaptureProperty (this->my_p_capture, CV_CAP_PROP_POS_FRAMES, (double) from );
    this->Go_to_Frame(from); 
           
    // variable frame_counter is used to stop video after ‘last’ frame (= to) has been grabbed
    int frame_counter = from;

    // creates a loop, which is stopped after video reaches position 
    // of last frame (= to) or after my_on_off = false (see class method this->Stop_Video();)
    while(this->my_on_off == true && frame_counter <= to) { 
         
        // gets a frame; my_p_capture pointer is initialized in this->Get_Video_from_File() method
        this->my_grabbed_frame = cvQueryFrame(this->my_p_capture);
              
        // check if frame is available
        if( !this->my_grabbed_frame ) break;
            
        // displays grabbed image; see above (B) 
        cvShowImage( "video" ,my_grabbed_frame);
           
        frame_counter++;
                       
        // program waits until the time span of 1000/frame
        // rate milliseconds has elapsed; see above (A)
        key = cvWaitKey(1000 /fps);
    }
 
    //cleaning up  
    cvReleaseCapture( &my_p_capture ); 
    cvDestroyWindow( "video" );
}
 
...

Pixel Manipulation and Frame Subtraction

Explanations in this section provide information about a selection of OpenCV commands that can be used for image manipulation. I present a simple way of accessing single pixels, how to subtract successive frames, how to draw onto an image, and give an idea for which kind of applications this might be useful.

Using the Code

The basic structure of the code presented in the following builds on the sample shown in the first section. I just explain the most important parts that have been added. For details and a better understanding, also study the file OpenCV_VideoMETH.cpp.

The most important steps in the code sample below are:

Grab a frame using cvQueryFrame(cvCapture);, clone this frame, and turn it into a grey-scale image.
Set up loop to process successive frames.
Grab next frame and also turn it into a grey-scale image.
Subtract grey scale frame grabbed first from the grey scale frame grabbed afterwards (see cvAbsDiff(CvArr *src,CvArr* src2,CvArr *dst);) . The result yields a grey-scale image with different levels of grey (i.e., 0 is black and 255 is pure white) at the spots where a change in pixel color between frame(t) and the previous frame(t-1) has taken place. In other words, these changes give an estimate of the motion occurring.
Search the image arrays for pixel values above a certain threshold (here 100) by invoking cvGet2D(CvArr*,int,int);.
Calculate mean x-positions and mean y-positions of the pixels found and draw a circle at that spot.
Clone the grabbed frame, thereby storing its information for the next subtraction (i.e., frame(t) becomes the new frame(t-1)).

As already shown in the first section, the program runs through a video file stepwise (i.e., frame by frame) but this time it also subtracts each single frame from its predecessor. This is a simple technique to give a rough estimate of the amount of motion that has occurred between two frames. Also, it is possible to track the path along which a single object moves (as it is done here by determining the average position of the pixels above a certain threshold; see pictures above and on top of page). For some applications, this might suffice, but if it is necessary to track the path of several objects, only a more elaborated methodology will do the job. It will not be discussed here (admittedly, I am not yet familiar with it), but the OpenCV library even offers solutions for such complex problems (e.g., algorithms to detect blobs).

//
// code skeleton for subtracting successive frames
//
void Video_OP::Subtract_Successive_Frames(int from, int to) 
{
    //............omitted code
 
    //cvScalar contains arrays [0] to [3]; [1] to [3] correspond to the RGB model
    // in gray-scale images (1 channel) val[0] returns brightness (see below B)
    CvScalar color_channels;
 
    // captures the first frame
    this->my_grabbed_frame = cvQueryFrame(this->my_p_capture);
 
    //...........
    // clones first frame after it has turned into a grey-scale (i.e. 8 bit) picture
    cvCvtColor(this->my_grabbed_frame, first_frame_gray, CV_RGB2GRAY );
    cloned_frame =cvCloneImage(first_frame_gray);
 
    // creates a loop, which is stopped by setting this->my_on_off = false;           
    //see this->Stop_Video() method;
    while(this->my_on_off == true && frame_counter <= to) { 

        //...........  
                
        // captures the next frame 
        my_grabbed_frame = cvQueryFrame( my_p_capture);
                 
        //converts grabbed frame to grey frame
        cvCvtColor(my_grabbed_frame, running_frame_gray, CV_RGB2GRAY );
 
        // subtracts grey image (t) from grey image (t-1) and stores the result in subtracted frame
        cvAbsDiff(cloned_frame,running_frame_gray,subtracted_frame );
                
 
        // loops through image array,accesses pixels with cvGetAt or cvGet2D and calculates mean
        for ( double i = 0; i < subtracted_frame->width; i++){
            for ( double u = 0; u < subtracted_frame->height; u++)
            {
                // gets pixel color (gray-scale values ranging from 0 to 255 (=white)); 
                // see above B
                color_channels = cvGet2D(subtracted_frame,u,i);
                // 100 is threshold; if color value is higher than 100, 
                //pixel will be used for calculations
                if( color_channels.val[0] > 100){
                    pixel_count++;
                    // sums up x and y coordinates of pixels identified
                    sum_x += i;
                    sum_y += u;
                }
            }
        }

        // calculates mean coordinates by dividing counted pixels by the coordinates of each dimension
        mean_x = sum_x/pixel_count;
        mean_y = sum_y/pixel_count;
 
        // if calculations yield useful values then draw a circle at that spot                 
        if (mean_x < subtracted_frame->width && mean_y < subtracted_frame->height)
            // cvCircle(IplImage,CvPoint,int radius,CvColor,int thickness, int line_type, int shift);
            cvCircle(pixels_frame, cvPoint(mean_x,mean_y ),4,cvScalar(0,255,0,0), 2,8,0);
 
        //...........
        // displays results in different windows
        cvShowImage("subtraction",subtracted_frame);
        cvShowImage( "movie", my_grabbed_frame );
        cvShowImage("Pixels identified",pixels_frame);
 
        // clones current frame, so it can be the previous frame for the next step in the loop
        cloned_frame = cvCloneImage(running_frame_gray); 
         
        key = cvWaitKey( 1000 / fps );
    }
 
    //...........
}
 
...

Optical Flow

In the example above, I presented code that filters changes in pixel color between two frames. These changes can be used to assess the amount of motion and find out where something interesting is happening. The code in this section will be based on this and present an alternative (and more refined and complicated) methodology to detect and analyze motion.

Optical flow is a way to trace the path of a moving object by detecting the object’s position shifting from one image to another. This is mostly done by the analyses of pixel brightness patterns. There are several methodologies to calculate optical flow, but in this article I only provide code based on the Lucas-Kanade method. Since I am not an expert on the underlying mathematics and the insides of the applied functions, I do not elaborate on such details. I only give a very superficial description of how it works and focus on the OpenCV functions that have to be invoked to create a useable program. If you feel a need for more information, you may find it in an expert article or a book dealing with such topics.

The Lucas Kanade algorithm makes three basic assumptions. First, the brightness of a pixel (i.e., its color value) remains constant as it moves from one frame to the next. Second, an object (or an area of pixels) does not move very far from one frame to the next. Third, pixels ‘inhabitating’ the same small area belong together and move in a similar direction.

One problem of such a procedure is to find good points to track. A good point should be unique, so it is easy to find it in the next frame of a movie (i.e., corners mostly fulfill such requirements). The OpenCV function cvGoodFeaturesTrack() (based on the Shi-Tomasi algorithm) was built to solve this problem and the obtained results can be further refined by cvFindCornerSubPix();, which takes accuracy of feature detection to the sub-pixel level.

The functions needed for the example below have numerous parameters, which will not be explained in great detail. Please read the comments in OpenCV_Video_METH.cpp, study a book on OpenCV or an OpenCV documentation for more information.

Using the Code

The most important steps in the code sample below are:

Define a maximum of points (or corners) for the Shi Tomasi algorithm (will be overwritten with the number of corners found after cvGoodFeaturesToTrack(); is left). Also provide an array to store the points the function finds.
Define image pyramids for the Lucas-Kanade algorithm. Image pyramids are used to estimate optical flow iteratively by beginning at a more or less coarse level of analysis and refining it at each step.
Call cvGoodFeaturesToTrack();. The function needs the image to be analyzed in an 8 bit or 32 bit single channel format. The parameters eigImage and tempImage are necessary to temporarily store the algorithm’s results. Similarly, a CvPoint2D32f* array is used to store the results of corner detection. The parameter qualityLevel defines the minimal eigenvalue for a point to be considered as a corner. The parameter minDistance is the minimal distance (in number of pixels) for the function to treat two points individually. mask defines a region of interest and blockSize is the region around a pixel that is used for the function’s internal calculations.
Call cvFindCornerSubPix();. Pass the image array to be analyzed and the CvPoint2D32f* array, which has been filled with values found by the Shi Tomasi algorithm. The parameter win defines the size of the windows from which the calculations start. CvTermCriteria defines the criteria for termination of the algorithm.
Draw tracked features (i.e., points) onto the image.
Set up loop for processing frames of the video.
Call cvCalcOpticalFlowPyrLK();. The most important parameters of this function are: the previous frame, the current frame (both as 8 bit images in the example here), an identical pyramid image for the current and the previous frame, the CvPoint2D32f* array, which has been filled with the points identified as good features, and an additional CvPoint2D32f* array for the new positions of these points. Furthermore, winSize defines the window used for computing local coherent motion. The parameter level sets the depth of stacks used for the image pyramids (if set to 0, none are used), and CvTermCriteria defines the criteria for termination of the computations.
Draw a line between points of the former frame (stored in the CvPoint2D32f* array) and their new positions in the current frame.
Turn data of the current frame into ‘previous’ data for the next step of the loop.

//
// code skeleton for optical flow
//
void Video_OP::Calc_optical_flow_Lucas_Kanade(int from, int to)
{
    //............omitted code
 
    // Maximum number of points for Shi-Tomasi algorithm
    const int MAX_COUNT = 500;
    // CvPoint2D32f is an Array of 32 bit points to contain the tracked features 
    // after Shi Tomasi algorithm has been run; see below cvGoodFeaturesToTrack
    CvPoint2D32f* points[2] = {0,0}, *swap_points;
            
    char* status = 0;
 
    //............
        
            
    // Image pyramids are necessary for the Lucas-Kanade algorithm to work correctly 
    pyramid = cvCreateImage( cvGetSize(running_frame), IPL_DEPTH_8U, 1 );
    prev_pyramid = cvCreateImage( cvGetSize(running_frame), IPL_DEPTH_8U, 1 );
 
    // allocates memory for CvPoint2D32F Arrays
    points[0] = (CvPoint2D32f*)cvAlloc(MAX_COUNT*sizeof(points[0][0]));
    points[1] = (CvPoint2D32f*)cvAlloc(MAX_COUNT*sizeof(points[0][0]));
           
            
    //............ 
 
    // temporary image information and eigenvalues are stored in images below
    IplImage* eig = cvCreateImage( cvGetSize(grey), 32, 1 );
    IplImage* temp = cvCreateImage( cvGetSize(grey), 32, 1 );
    // quality level (minimal eigenvalue for values to be used) should not exceed 1; 
    // typical values are 0.10 or 0.01
    double quality = 0.01;
    // min_distance guarantees that no two returned points are within a certain area
    double min_distance = 10;
    count = MAX_COUNT;
 
    // finds features to track and stores them in array (here: points[0])
    // void cvGoodFeaturesToTrack(cvArr* image, cvArr* eigImage, cvArr* tempImage, 
    // CvPoint2D32f* corners, int* cornerCount, double qualityLevel, double minDistance,
    // cvArr* mask, int blockSize, int useHarris, double k)
    cvGoodFeaturesToTrack(prev_grey, eig, temp, points[0], &count,quality, min_distance, 0, 3, 0, 0 ); 
        
    // refines search for features obtained by routines like cvGoodFeaturesToTrack
    // void cvFindCornerSubPix(CvArr* image,CvPoint2D32f* corners, int count,      
    //CvSize win, CvSize zero_zone, CvTermCriteria  criteria)
    cvFindCornerSubPix( prev_grey, points[0], count, cvSize(win_size,win_size), cvSize(-1,-1),
                cvTermCriteria(CV_TERMCRIT_ITER|CV_TERMCRIT_EPS,20,0.03));
 
    cvReleaseImage( &eig );
    cvReleaseImage( &temp );
 
    // loop draws found features onto the first frame 
    for (int i = 0; i < count; i++)
    {
       CvPoint p1 = cvPoint( cvRound( points[0][i].x ), cvRound( points[0][i].y ) );
       cvCircle(image,p1,1,CV_RGB(255,0,0),1,8,0);
    }
                
    cvShowImage("first frame",image);
 
    // needed for cvCalcOpticalFlowPyrLK
    status = (char*)cvAlloc(MAX_COUNT);
 
    // loop, which is stopped by setting this->my_on_off = false 
    while(this->my_on_off == true && frame_counter <= to) { 
        //............    
 
        // calculates optical flow  
        //void cvCalcOpticalFlowPyrLK(const CvArr* imgA, const CvArr* imgB, CvArr* pyrA, 
        // CvArr* pyrB,CvPoint2D32f* featuresA,CvPoint2D32f* featuresB, int count, 
        // CvSize winSize, int level, char* status,
        // float* track_error,CvTermCriteria criteria, int flags);         
        cvCalcOpticalFlowPyrLK( prev_grey, grey, prev_pyramid, pyramid, points[0], points[1], count, 
                    cvSize(win_size,win_size), 5, status, 0,
                    cvTermCriteria(CV_TERMCRIT_ITER|CV_TERMCRIT_EPS,20,0.03), flags );
            
 
        // depicts flow by drawing lines between successive frames
        for( int i=0; i < count; i++ ){
            CvPoint p0 = cvPoint( cvRound( points[0][i].x ), cvRound( points[0][i].y ) );
            CvPoint p1 = cvPoint( cvRound( points[1][i].x ), cvRound( points[1][i].y ) );
 
            // draws circles onto prev_grey image and grey image
            cvCircle(prev_grey,p0,1,CV_RGB(0,255,0),1,8,0);
            cvCircle(grey,p1,1,CV_RGB(0,255,0),1,8,0);
 
            // connects found features of successive frames 
            cvLine( image, p0, p1, CV_RGB(255,0,0), 2 );
        }
                 
 
        //............            
               
        // grey image becomes new previous grey image for next step in loop
        prev_grey = cvCloneImage(grey);
        // similar operations as above, positions of pyramids and points-arrays are swapped
        CV_SWAP( prev_pyramid, pyramid, swap_temp );
        CV_SWAP( points[0], points[1], swap_points );
        
        //............     
 
        // delays loop
        key = cvWaitKey(40);
    }// end of while loop
 
    //............
}

Points of Interest

If you want to see how all this looks like in a real application, please download the UsingOpenCV.exe file, install the necessary DLLs (see UsingOpenCV.cpp), load the example video, and see what happens when applying the different commands. The example video is very simple. It produces little noise and thus ‘beautiful’ results, but it perfectly does what it has been created for. It gives a good impression of how the code works.

There is no guarantee that the examples are free of bugs and there are parts that surely can be ironed out. I also sometimes defined unnecessary variables, in order to provide more clarity (hopefully).