An introduction to OpenCV (Part II): Implementing mouse events, manipulating images, and creating video clips
Shows how to use OpenCV to write videos, how to implement mouse events, and presents some commands on image manipulation as well.
Introduction
The contents presented here build on a previous article (see Introduction to OpenCV: Playing and Manipulating Video Files), but I have tried to organize them so that they can be understood without reading part one of this tutorial. In fact, the current article may even be simpler than its predecessor and might have made a better part one. Still, I am optimistic that the interested reader will not be puzzled much by this and will consider it a minor problem.
As already indicated by the title, the article mainly concerns the Open Source Computer Vision Library (OpenCV), a software platform that provides a great number of high-level programming tools for loading, saving, depicting, and manipulating images and videos. It is, of course, impossible to describe every aspect of this library in a brief tutorial like this, so I will only discuss a small selection of topics. If you need a more extensive overview, are interested in the mathematical background, or are looking for more details on the topics touched on here, you might find the answers you are yearning for in a book on OpenCV like, for example, “Learning OpenCV” by Gary Bradski and Adrian Kaehler, from which I have part of my “wisdom”.
The OpenCV function calls are pure C code, but I packed them into two C++ classes (one for image operations, and one for video operations that inherits the methods of the image class). Some people might consider this mixture of programming conventions a faux pas; I, however, regard it as a good way to keep the code tidy and have most of it in one particular location. Because the OpenCV library offers only a limited set of possibilities for creating a graphical interface, I built a Windows GUI around the OpenCV code (by creating a Win32 program in Visual C++ 2010 Express Edition).
If you want to turn the presented code into an executable program, you will have to install the OpenCV libraries (get the latest version at, e.g., http://opencv.willowgarage.com) and include the correct lib files and headers in your program. I give a step-by-step description of all this at the top of the UsingOpenCV.cpp source code file. Graphical illustrations can be very helpful in finding one’s way through this and, luckily, further explanations, which also offer screenshots, can be found on the web (e.g., by searching for ‘Using OpenCV in VC++2010’).
In the first part of this tutorial I will give a basic description of how to handle events (in particular mouse events) in OpenCV, then there will be some words on a selection of OpenCV commands which can be used to manipulate images, and finally I will show how to turn visual input into video formats that can be read by most standard video players. I am not an expert in the mathematical details of the presented contents. What I can give is a more or less superficial overview of how the presented functions work and how they can be used in a program.
Processing mouse events with the help of OpenCV
If you press a key or a button on the computer mouse, or simply move the mouse pointer over a window, the message loop in a Win32 program processes such events and makes them accessible to the programmer. For those who are experienced in developing interactive software, handling such messages is daily business. Since OpenCV has its own commands to create windows, it also offers its own procedures to set up message loops for such windows. In most cases the OpenCV-specific code for event handling is easy to implement and should be preferred to processing messages coming from OpenCV windows via the standard message loop (something I have not tried out, but which surely can be done). The code below concerns mouse events, but for the sake of completeness I also briefly discuss the handling of keyboard input.
Keyboard events can be processed very easily. The command cvWaitKey(timespan); waits a certain time for a key to be pressed and returns this key as an integer value (ASCII code). Therefore, if you want to process keyboard events, set up a while loop, insert the command line key = cvWaitKey(timespan);, and then process keyboard input by checking the value of the variable key, for example, if(key == 'q'){ /* do something */ }. Please note that cvWaitKey(); also makes the program wait for a specified period of time (e.g., cvWaitKey(100) makes the program wait for 100 milliseconds). This is necessary, for example, to process images at a certain frame rate (find examples in the previous article and below, where I discuss the code for saving a video).
Mouse events require more attention, although they are not very difficult to implement either. They consist of two parts: First, you have to invoke cvSetMouseCallback(const char* window_name, CvMouseCallback my_Mouse_Handler, void* param); in order to register a callback. The first argument of this function is the name of the window to which the callback is attached (a window created with cvNamedWindow("window name", 0);). The second argument is the callback function itself, and the third argument is a void pointer to additional data, for instance, the image to which the callback is applied.
Afterwards, you have to set up a mouse handler function (the second argument in cvSetMouseCallback). This function, which I named my_Mouse_Handler(int events, int x, int y, int flags, void* param) in my program, takes five arguments.
The first and most important argument is an integer variable that can take one of the following values (ranging from 0 to 9): CV_EVENT_MOUSEMOVE (= mouse pointer moves over the specified window), CV_EVENT_LBUTTONDOWN (= left mouse button is pressed), CV_EVENT_RBUTTONDOWN (= right mouse button is pressed), CV_EVENT_MBUTTONDOWN (= middle mouse button is pressed), CV_EVENT_LBUTTONUP, CV_EVENT_RBUTTONUP, CV_EVENT_MBUTTONUP (= events that occur after the corresponding button has been released), and CV_EVENT_LBUTTONDBLCLK, CV_EVENT_RBUTTONDBLCLK, CV_EVENT_MBUTTONDBLCLK (= when the user double-clicks the corresponding button).
The second and the third argument of the callback function are the x (= horizontal) and the y (= vertical) position of the mouse-pointer with the upper left corner of a window being the reference point (0,0).
The fourth argument is useful if you want to access additional information during a mouse event. CV_EVENT_FLAG_LBUTTON, CV_EVENT_FLAG_RBUTTON, and CV_EVENT_FLAG_MBUTTON indicate whether the user is pressing one of the corresponding buttons. This might be needed if you want to know whether a button is held down while the mouse pointer moves (e.g., drag-and-drop operations). CV_EVENT_FLAG_CTRLKEY, CV_EVENT_FLAG_SHIFTKEY, and CV_EVENT_FLAG_ALTKEY indicate whether the Ctrl, Shift, or Alt key has been pressed during a mouse event.
The final argument is a void pointer for any additional information that is needed. In the code example below, I use this argument to obtain a pointer to the image the event handler operates on.
Using the code
As already mentioned above, I packed the OpenCV-specific code into two classes. The first class contains some methods for image operations and the mouse handler. The second one inherits the methods of the image class but also contains code for processing videos. Please note that, within a class, a callback function and its variables have to be defined as static.
The program I wrote works on videos. It provides access to the video data; then it loads the first frame of the video and presents it in a window of its own. Mouse operations for which the program implements a handler are done on the image shown in this window.
The most important steps are:
- Capture the video file with cvCreateFileCapture(); by invoking Get_Video_from_File(char* file_name);, which I defined in the Video_OP class. Please find the contents of this method in the following code sample. It should also give you a “feeling” for how to use some OpenCV commands (like cvNamedWindow();, for instance).
//
// code skeleton of Video_OP::Get_Video_from_File(char*) method
//
// method needs the file pathname as a char string
bool Video_OP::Get_Video_from_File(char* file_name)
{
// checks if filename is available
if(!file_name)
return false;
// OpenCV command for capturing video files; returns pointer to CvCapture structure
// (defined in the variables section of the class)
my_p_capture = cvCreateFileCapture(file_name);
// checks if capturing was successful; e.g. fails if required video codec
// is not installed on machine
if (!my_p_capture) return false;
// gets first frame (= IplImage*) of video for accessing video properties
this->my_grabbed_frame = cvQueryFrame(my_p_capture);
// gets width and height of frame loaded with cvQueryFrame()
// assigns these properties to variable of the type CvSize
this->captured_size.width = (int)cvGetCaptureProperty(my_p_capture,
CV_CAP_PROP_FRAME_WIDTH);
this->captured_size.height = (int)cvGetCaptureProperty(my_p_capture,
CV_CAP_PROP_FRAME_HEIGHT);
// creates window in which first frame (= my_grabbed_frame) will be displayed
// window is named "choose area"
cvNamedWindow("choose area", CV_WINDOW_AUTOSIZE);
// displays first frame in window "choose area"
cvShowImage("choose area", my_grabbed_frame);
// gets the video’s number of frames
this->my_total_frame = (int) cvGetCaptureProperty(my_p_capture,CV_CAP_PROP_FRAME_COUNT);
// sets up the mouse callback; method below invokes cvSetMouseCallback(char*,CvMouseCallback,void*);
this->Set_Mouse_Callback_for_Image(this->my_grabbed_frame);
return true;
}
- Set up the my_Mouse_Handler(); function of the program's Image_OP class. The following code sample does not describe all possible mouse events. It only presents the mouse events that are needed to draw a rectangle onto an image. Please note that the static variables for the static method my_Mouse_Handler(); have to be defined outside the class.
//
// code skeleton of the static method Image_OP::my_Mouse_Handler()
//
// see above for a description of the callback function’s parameters
void Image_OP::my_Mouse_Handler(int events, int x, int y, int flags, void* param)
{
IplImage *img_orig;
// Operations are mostly done on a cloned image, in order
// to restore original settings, if operations need to be repeated
IplImage *img_clone;
// param is used for getting access to the image, for which
// the mouse callback had been implemented
img_orig = (IplImage*) param;
int x_ROI =0, y_ROI =0 , wi_ROI =0, he_ROI =0;
switch(events)
{
// user presses left button somewhere within the image
// ( = first frame of video in this case)
case CV_EVENT_LBUTTONDOWN:
{
// saves mouse pointer coordinates (x,y) sent by the button-press message
// in static variable (CvPoint)
my_point = cvPoint(x, y);
}
break;
// event, when mouse moves over image
case CV_EVENT_MOUSEMOVE:
{
// mouse pointer moves over the specified window with the left button held;
// flags is a bit field, so test the bit rather than comparing for equality
if(flags & CV_EVENT_FLAG_LBUTTON)
{
// makes a copy of original image
img_clone = cvCloneImage(img_orig);
// draws green [see CV_RGB(red,green,blue)=> single values
// ranging from 0 -255] rectangle
// onto cloned image using point coordinates
// from CV_EVENT_LBUTTONDOWN as one corner
// and coordinates retrieved here
// as the other corner; here: 1 = thickness;
// 8 = line_type; 0 = shift;
cvRectangle(img_clone, my_point,cvPoint(x,y),
CV_RGB(0,255,0),1,8,0);
// shows cloned image with rectangle drawn on it
cvShowImage("choose area", img_clone);
}
}
break;
// user releases left button
case CV_EVENT_LBUTTONUP:
{
// clones original image again
img_clone = cvCloneImage(img_orig);
// checks position of starting point
// stored in my_point (see CV_EVENT_LBUTTONDOWN)
// in relation to end point, in order
// to avoid negative values
if(my_point.x > x)
{
x_ROI = x;
wi_ROI = my_point.x - x;
}
else
{
x_ROI = my_point.x;
wi_ROI = x - my_point.x;
}
if(my_point.y > y)
{
y_ROI = y;
he_ROI = my_point.y - y;
}
else
{
y_ROI = my_point.y;
he_ROI = y - my_point.y;
}
// stores coordinates of Region of Interest
// in static variable my_ROI
my_ROI.x = x_ROI;
my_ROI.y = y_ROI;
my_ROI.width = wi_ROI;
my_ROI.height = he_ROI;
// selects region of interest (= ROI) in cloned image;
// needed for cvNot operation
// cvRect function requires upper, left corner
// of ROI, its width and its height
cvSetImageROI(img_clone,cvRect(x_ROI,
y_ROI,wi_ROI, he_ROI));
// inverts color information of image, in order
// to make selected area clearly visible; in this case
// source (first argument) and destination are the same
cvNot(img_clone, img_clone);
// "turns off" region of interest
cvResetImageROI(img_clone);
// shows cloned image in window
cvShowImage("choose area", img_clone);
}
break;
} // end of switch
}
A selection of OpenCV functions for processing images
The second part of this tutorial mainly concerns some (mostly) simple OpenCV commands for processing images. When using sophisticated methodologies (like optical flow; see the first part of this tutorial) to detect or trace motion, it often yields better results to “smooth” the images (or processed frames) first, in order to iron out outliers produced by noise and camera artifacts.
OpenCV offers five different basic smoothing operations, which can be invoked by the command
cvSmooth(IplImage* source, IplImage* destination, int smooth_type,
int param1 = 3, int param2 = 0, double param3 = 0, double param4 =0);
. I think it is clear that the first two arguments represent the input and the output image. More interesting is the third parameter, which serves as a placeholder for one of five different values (which also determine the meaning of the parameters param1 to param4). In the following part I give an overview of the possible values for the third parameter. If you need more details, please consult a book (like “Learning OpenCV” by Gary Bradski and Adrian Kaehler) or an expert article on this.
The smooth_type CV_BLUR, for instance, calculates the mean color value of all pixels within an area around a central pixel (the area is specified by param1 and param2). CV_BLUR_NO_SCALE does the same as CV_BLUR, but there is no division to create an average. CV_MEDIAN performs a similar operation, with the only exception that it calculates the median value over the specified area. CV_GAUSSIAN is more complicated and performs smoothing based on the Gaussian function (= normal distribution). param1 and param2 again define the area to which the algorithm is applied. param3 is the sigma value of the Gaussian function (it will be calculated automatically if not specified), and if a value for param4 is given, there will be different sigma values in the horizontal (= param3 in this case) and vertical directions. CV_BILATERAL is similar to Gaussian smoothing, but it weights similar pixels more highly than dissimilar ones.
In the code
samples that are part of this article, only one of the above “smoothing”
functions is implemented. The method Blur(int square_size, IplImage*, IplImage*)
(see image on top of page) of the
Image_OP
class carries out
a simple blur based on the mean of a square area of pixels. How changing the size
of this area affects the “blur” can be demonstrated by compiling the source
code that comes with this tutorial. Just load a movie, select the option button
“Blur” and move the bar of the trackbar control.
Attention: If you intend to use other types of smoothing functions (like CV_GAUSSIAN
) problems might
occur, because such functions do not accept all values that are returned by the
trackbar.
Dilate and Erode
Another way of
removing noise from an image, but isolating or joining disparate regions as
well, is based on dilation and erosion. For both kinds of transformations OpenCV offers corresponding functions (cvDilate()
and
cvErode()
). These functions
have a kernel (a small square or circle with an anchor point in the center of
this area) running over an image. While this happens, the maximal (=dilation) or
minimal (=erosion) pixel value of the kernel is computed and the pixel of the
image under the anchor point is replaced by this maximum or minimum.
Because both
functions perform similar tasks, they take the same arguments. For this reason
I will discuss them for cvErode(IplImage* src, IplImage* dest,IplConvKernel* kernel = NULL, int
iterations = 1);
only. The first two arguments are the source- and the destination
image, the third argument is a pointer to an IplConvKernel
structure, and the
last argument is the number of iterations performed by the algorithm. Creating
your own kernel using the IplConvKernel
structure will
not be discussed here, for this reason the standard (3x3 square kernel) kernel
will be used.
Again, both
functions are implemented as methods of the Image_OP
class and linked
to the behavior of the main window’s trackbar
control. Just load a video and click on the option button Erode (or on the button Dilate).
Moving the bar of the trackbar will then change the
parameter iter
(= iterations) of the Image_OP::Erode()
or the
Image_OP::Dilate()
method. Depending
on which of the two options you have chosen, the images will show expanded
bright regions or expanded dark regions.
Drawing contours
In this section I
present some code that is able to extract the contours of images. In OpenCV, contours are represented as sequences of points that
form a curve. To compute these point sequences, OpenCV provides the function cvFindContours(IplImage*, CvMemStorage*, CvSeq**, int headerSize, CvContourRetrievalMode, CvChainApproxMethod)
.
The first argument should be an 8-bit single channel image that will be interpreted as a binary image (all nonzero pixels are 1). The second argument is a linked list of memory blocks that is used to handle dynamic memory allocation. The third argument represents a pointer to the linked list in which the found points (contours) are stored.
The next arguments
are optional and will not be discussed here in great detail, because they are
not used in the code sample. The fourth argument can be simply set to sizeof(CvContour)
. The fifth
argument encompasses four options: CV_RETR_EXTERNAL
= extracts extreme outer contours;
CV_RETR_LIST
= is the standard
option and extracts all contours; CV_RETR_CCOMP
= extracts contours and organizes
them in a two level hierarchy; CV_RETR_TREE
= produces hierarchy of nested contours. The sixth
argument determines how the contours are approximated (please look this up in a
book on OpenCV).
The single step
that needs to be carried out to display the contours of an image can be found in
the method Image_OP::Draw_Contours()
(see below). Similar to methods discussed before, one of
the method’s arguments (here: first argument defining the threshold) is linked
to the trackbar of the program’s main window.
Using the code
//
// code skeleton for drawing contours
//
void Image_OP::Draw_Contours(int threshold, IplImage* orig_img, IplImage* manipulated_img)
{
//... omitted code
// creates linked list of memory blocks
CvMemStorage* mem_storage = cvCreateMemStorage(0);
// defines pointer to a sequence of stored contours
CvSeq* contours =0;
// allocates memory for a gray_scale image
IplImage* gray_img = cvCreateImage(cvSize(orig_img->width, orig_img->height),
IPL_DEPTH_8U,1);
int found_contours =0;
//... omitted code
// creates window to display results of operations
cvNamedWindow("contours only");
// turns image into gray scale image
cvCvtColor(orig_img, gray_img, CV_RGB2GRAY);
// defines a threshold for operations;
// uses binary information (only 0 and 1 as pixel values);
// depending on the threshold type pixels will be
// set to 0, to the source value or to the max value
// here: CV_THRESH_BINARY => destination
// value = if source > threshold then MAX else 0
// Parameters => 1) source- and 2) destination image
// 3) threshold, 4) MAX value (255 in 8 bit grayscale) 5) threshold type
cvThreshold (gray_img, gray_img, threshold, 255, CV_THRESH_BINARY);
// returns number of found contours
// Parameters => 1) Image, that is used as space
// to perform calculations 2) => stores recorded contours
// 3) pointer to contours stored in memory
// 4) final parameters are optional
found_contours = cvFindContours(gray_img, mem_storage, &contours);
// sets all elements to NULL
cvZero(gray_img);
if(contours)
{
// draws contours: Parameters => Image to draw on,
// 2) pointer to sequence where contours were stored
// 3) color of contours, 4) color of contours marked
// as a hole; here: same color as other contours
// 5) specifies how many contours of different
// levels are drawn; rest is optional and not used here
cvDrawContours(gray_img, contours, cvScalarAll(255), cvScalarAll(255), 100);
}
//... omitted code
cvShowImage("contours only", gray_img);
// release memory
cvReleaseImage(&gray_img);
cvReleaseMemStorage(&mem_storage);
}
Saving motion data as video file
Contents of this section are strongly linked to the contents presented in my previous tutorial on OpenCV. The basic structure for the code sample
below can already be found there (see the Video_OP::Play_Video()
method).
For this reason I keep the introduction to this topic very short. I just want to say some words on the FourCC notation, which was developed to
identify data formats and is widely used to access AVI video codecs. The OpenCV
macro CV_FOURCC
provides this functionality and takes a four character code that denotes a particular codec
(e.g., CV_FOURCC('D','I','V','X')). A prerequisite for applying
CV_FOURCC
successfully is, of course, that the corresponding video codec is installed on the machine you are using.
Using the code
- Capture video file by invoking
this->Get_Video_from_File(char* file_name);
- Invoke
Video_OP::Write_Video(int from, int to, char* path);
(see code below).
- Create a video writer by invoking cvCreateVideoWriter(path, CV_FOURCC('M','J','P','G'), fps, size); (the frame rate and frame size complete the call)
- Set up loop to process successive frames (or images) of video file.
- Grab frames by calling
cvQueryFrame(CvCapture*);
- Add frames (=images) to video file by calling
cvWriteFrame(CvVideoWriter *,IplImage*);
- Define delay of presentation by using
cvWaitKey(int);
(here: for demonstration purposes only)
//
// code skeleton for writing a video
//
void Video_OP::Write_Video(int from, int to, char* path)
{
this->my_on_off = true;
int key =0;
int frame_counter = from;
// retrieves frames per second (fps);
// is used to define speed of presentation
// and frame rate for video writer; see (A) & (B)
// here: the same as the input video
int fps = this->Get_Frame_Rate();
// creates window in which movie is displayed;
cvNamedWindow( "write to avi", CV_WINDOW_AUTOSIZE );
// sets pointer to position, where the video shall start from
this->Go_to_Frame(from);
// the frame_counter initialized above stops the video after the
// "last" frame (= to) has been grabbed
// creates cvVideoWriter; parameters: (1) filepath of video
// (2) video codec name for AVI videofile format;
// codec must be installed on the machine
// (3) frames per second; (4) size of frames
// (5) optional(1 => color;0 => grayscale)
CvVideoWriter *video_writer = cvCreateVideoWriter(path,
CV_FOURCC('M','J','P','G'), fps, this->captured_size);
// creates a loop, which is stopped after video reaches position of last frame (= to)
// or after my_on_off == false (see class method this->Stop_Video();)
while(this->my_on_off == true && frame_counter <= to)
{
// gets a frame; my_p_capture pointer is initialized
// in this->Get_Video_from_File() method
this->my_grabbed_frame = cvQueryFrame(this->my_p_capture);
// check if frame is available
if( !this->my_grabbed_frame ) break;
// adds frame to video file
cvWriteFrame(video_writer,my_grabbed_frame);
// displays grabbed image
cvShowImage( "write to avi" ,my_grabbed_frame);
// keeps track of the frames already processed
frame_counter++;
// makes program wait until the time span of 1000/frame rate
// milliseconds has elapsed; see above (B)
key = cvWaitKey(1000 /fps);
if (key == 'q') // breaks loop when 'q' is pressed
break;
}
//release memory
cvReleaseCapture( &my_p_capture );
cvDestroyWindow( "write to avi");
cvReleaseVideoWriter(&video_writer);
}
...
Additional points of interest
Most of the methods and operations that have been introduced here can be used in combination. This means that image operations that will be performed on the first frame of a video file can be confined to the region that has been selected with the mouse. In addition, these manipulations will be applied to all frames of a video if you click on the button ‘GO’ of the program’s main window.
There are methods in the source code files that have not been discussed here. For example, the
Video_OP
class contains a method that turns
a movie into single images and a method that does quite the opposite, namely turning single images into a movie. If you try the latter, you will also find some code that demonstrates how to retrieve the files in a folder by invoking the Win32 API functions
FindFirstFile()
and FindNextFile()
.
OpenCV offers its own code to create a trackbar (or a slider) and to set up a message handler for it. I preferred to use the Win32 GUI trackbars instead, because it seemed more convenient to me. Still, you find some code in the source code files that shows how to use OpenCV’s own trackbar control. As a side issue the program and its source code files also demonstrate how buttons, sliders, textfields, and option buttons can be placed onto a window and used in a Win32 program.
There is no guarantee that the presented code is bug-free (and not all exceptions are handled), but I hope it is helpful for somebody who is looking for guidance on the topics discussed here.