
Real-Time Object Tracker in C++

Chesnokov Yuriy, 18 Dec 2007
This article describes an object tracking approach by estimating a time averaged background scene for tracking static and moving objects in real-time on down-scaled image data.

Screenshot - blobs.jpg

Introduction

This article is about tracking moving or static objects with a conventional web cam at real-time speed. A simple way of tracking is to compare a previously learned background image with the current frame of the same scene once objects start to appear. This scenario is applicable in static surroundings where you can learn the background image without any of the foreground objects you expect to track (e.g., indoors, landings, offices, shops, warehouses, etc.). If an object moves into the scene and remains static for a prolonged period of time, you may consider it part of the scene and superimpose its image onto the background to avoid further detections. The same applies if some background item is removed from the scene. The drawback of this approach is its sensitivity to changes in illumination or camera position. In either case, you will have to estimate the background again. Another solution is to use edge operators, which are less sensitive to illumination changes, but you will need to write additional code if you want to identify the object's body and not just its outline.

The conventional web cam resolution of 640x480 is far more than is needed. You may downscale it about three times to remove the noise usually present and significantly boost the processing speed without essential loss of tracking ability, unless you expect to track very small objects. With downscaling, the tracker runs at about 100fps on a 2GHz single core when no objects are present, and at 30 to 90fps when there are. The bigger the object, the more time is required to estimate all the pixels belonging to its blob.
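As a rough illustration of why downscaling both cuts the pixel count and smooths noise, here is a minimal sketch of block averaging on a grayscale image. It is not the library's ImageResize code, whose resampling method is not shown in this article; the function name downscale and the integer factor parameter are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch: shrink a grayscale image by an integer factor by
// averaging each factor x factor block of source pixels.
// Width and height are assumed to be multiples of factor.
std::vector<std::uint8_t> downscale(const std::vector<std::uint8_t>& src,
                                    std::size_t width, std::size_t height,
                                    std::size_t factor)
{
    assert(factor > 0 && width % factor == 0 && height % factor == 0);
    assert(src.size() == width * height);

    const std::size_t out_w = width / factor;
    const std::size_t out_h = height / factor;
    std::vector<std::uint8_t> dst(out_w * out_h);

    for (std::size_t oy = 0; oy < out_h; oy++) {
        for (std::size_t ox = 0; ox < out_w; ox++) {
            unsigned int sum = 0;
            for (std::size_t dy = 0; dy < factor; dy++)
                for (std::size_t dx = 0; dx < factor; dx++)
                    sum += src[(oy * factor + dy) * width + ox * factor + dx];
            // Averaging the block suppresses per-pixel camera noise.
            dst[oy * out_w + ox] =
                static_cast<std::uint8_t>(sum / (factor * factor));
        }
    }
    return dst;
}
```

With a factor of 8, a 640x480 frame shrinks to 80x60, so every later per-pixel step touches 4,800 pixels instead of 307,200.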

Background

The article and code are based on my previous submissions, except for the ImageBlobs class. If you have questions about the GUI or particular code fragments, have a look at those earlier articles, such as the Face Detection article and Video Preview and Frames Capture to Memory with SampleGrabber in Buffered Mode.

The GUI is targeted at 640x480 web cams. If you want to use it with a different resolution, have a look at the Face Detection article for the required changes. You cannot expect me to provide a complete application for every custom video device.

Using the code

Set up the camera and the frame capture rate as described in Video Preview and Frames Capture to Memory with SampleGrabber in Buffered Mode, and start video capture. Click the background radio box first to start the background estimation process. All the captured frames will be added together. Once you click the start tracking radio box, the mean background frame is estimated and saved to a JPEG file, background.jpg, and object tracking begins. I suggest you point your camera at some place where you can expect moving objects. Estimate the background for several seconds to cope with camera noise or with objects that casually pass through the scene for short periods of time. The mean background estimate will effectively remove them. Now, introduce the objects you want to track to the scene, or wait for them to appear there if they are alive.

The library

Using the mean estimate of the background frame is advisable, as it filters out noise and tiny movements.

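background[i] = 0;                                   // for every pixel i
frames_number = 0;
while (frames_number < N) {
    // current_frame is the newly captured frame
    background[i] = current_frame[i] + background[i]; // for every pixel i
    frames_number++;
}
background[i] = background[i] / frames_number;       // for every pixel i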
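The accumulation above can be sketched in C++ as follows. This is a minimal illustration and not the library's code: the class name BackgroundAccumulator and its methods are hypothetical, and frames are assumed to be flat 8-bit buffers of equal size.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of the article's library): accumulates
// 8-bit frames and returns their per-pixel mean.
class BackgroundAccumulator {
public:
    explicit BackgroundAccumulator(std::size_t pixels)
        : m_sum(pixels, 0), m_frames(0) {}

    // Add one frame; it must hold exactly as many bytes as were
    // requested in the constructor.
    void add_frame(const std::vector<std::uint8_t>& frame) {
        assert(frame.size() == m_sum.size());
        for (std::size_t i = 0; i < frame.size(); i++)
            m_sum[i] += frame[i];
        m_frames++;
    }

    // Per-pixel mean of all frames added so far, rounded to nearest.
    std::vector<std::uint8_t> mean() const {
        std::vector<std::uint8_t> out(m_sum.size(), 0);
        if (m_frames == 0)
            return out;
        for (std::size_t i = 0; i < m_sum.size(); i++)
            out[i] = static_cast<std::uint8_t>(
                (m_sum[i] + m_frames / 2) / m_frames);
        return out;
    }

private:
    std::vector<std::uint32_t> m_sum;  // 32-bit sums hold about 16 million frames
    std::uint32_t m_frames;
};
```

Keeping the sums in a wider integer type than the pixels avoids overflow during accumulation; the division happens only once, when the mean is requested.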

The new classes in the library are:

  • MotionDetector
  • ImageBlobs

The changes in MotionDetector allow you to set a background frame against which every new image frame is compared, rather than taking the difference between consecutive frames as in the Face Detection article. The background frame and the new image frame are both held in ImageResize objects.
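To make the difference between the two comparisons concrete, here is a minimal sketch of a per-pixel threshold test. It is an illustration only: the function foreground_mask, the interleaved 3-bytes-per-pixel layout, and the threshold parameter are assumptions for the example, not the library's interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical sketch: frame and reference are interleaved 3-channel
// buffers (3 bytes per pixel). Returns one mask byte per pixel: 1 where
// any channel differs from the reference by more than threshold.
std::vector<std::uint8_t> foreground_mask(const std::vector<std::uint8_t>& frame,
                                          const std::vector<std::uint8_t>& reference,
                                          int threshold)
{
    assert(frame.size() == reference.size() && frame.size() % 3 == 0);
    std::vector<std::uint8_t> mask(frame.size() / 3, 0);
    for (std::size_t p = 0; p < mask.size(); p++) {
        for (std::size_t c = 0; c < 3; c++) {
            int diff = std::abs(int(frame[3 * p + c]) - int(reference[3 * p + c]));
            if (diff > threshold) {
                mask[p] = 1;  // one channel over the threshold is enough
                break;
            }
        }
    }
    return mask;
}
```

Passing the learned background as the reference keeps a motionless object in the mask for as long as it stays in view. Passing the previous frame instead makes the same object vanish from the mask as soon as it stops moving.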

First, you need to initialize the MotionDetector object with the video frame width, height, and downscaling factor:

  • void MotionDetector::init(unsigned int image_width, unsigned int image_height, float zoom);

To downscale the image three times, use zoom = 0.125f. Each dimension is scaled to 1/8, i.e., halved three times, so a 640x480 frame becomes 80x60.

Invoke the set_background() function once you finish the background estimation process:

inline int MotionDetector::set_background(const unsigned char* pBGR)
{        
    return m_background.resize(pBGR);
}

Now, you may call the detect() function to estimate the pixels belonging to the foreground objects:

const vec2Dc* MotionDetector::detect(const unsigned char* pBGR)
{
    if (status() < 0)
            return 0;
    
    m_image.resize(pBGR);

    //RGB version
    char** r1 = m_image.getr();
    char** g1 = m_image.getg();
    char** b1 = m_image.getb();
    char** r2 = m_background.getr();
    char** g2 = m_background.getg();
    char** b2 = m_background.getb();
    for (unsigned int y = 0; y < m_motion_vector->height(); y++) {
            for (unsigned int x = 0; x < m_motion_vector->width(); x++) {
                    if (abs(r1[y][x] - r2[y][x]) > m_TH ||
                        abs(g1[y][x] - g2[y][x]) > m_TH ||
                        abs(b1[y][x] - b2[y][x]) > m_TH)
                            (*m_motion_vector)(y, x) = 1;
                    else
                            (*m_motion_vector)(y, x) = 0;
            }
    }

    //gray scale version
    /*m_difference_vector->sub(*m_image.gety(), *m_background.gety());
    for (unsigned int y = 0; y < m_motion_vector->height(); y++) {
            for (unsigned int x = 0; x < m_motion_vector->width(); x++) {
                    if (fabs((*m_difference_vector)(y, x)) > m_TH)
                            (*m_motion_vector)(y, x) = 1;
                    else
                            (*m_motion_vector)(y, x) = 0;
            }
    }*/

    m_tmp_motion_vector->dilate(*m_motion_vector, 3, 3);
    m_motion_vector->erode(*m_tmp_motion_vector, 5, 5);
    m_tmp_motion_vector->dilate(*m_motion_vector, 3, 3);

    return m_tmp_motion_vector;
}

You may compare either RGB values or gray image data. I found the latter not robust, because different colors can look alike once converted to gray values. The dilation and erosion operators fill gaps in the object's blobs that appear where the background and the object have similar pixel colors, and they remove the noise caused by tiny movements or very thin objects.
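The effect of the dilate, erode, dilate sequence can be tried on a small binary grid. The sketch below is an assumption-laden illustration: it uses plain square k x k windows and ignores pixels outside the image, which may differ from how the library's vec2Dc::dilate() and vec2Dc::erode() treat their arguments and borders. The names Grid, morph, dilate, and erode are hypothetical.

```cpp
#include <cassert>
#include <vector>

using Grid = std::vector<std::vector<int>>;

// Binary morphology with a k x k square window (k odd). Pixels outside
// the image are ignored. For dilation the output is 1 if any pixel in the
// window is set; for erosion it is 1 only if every in-image pixel is set.
Grid morph(const Grid& src, int k, bool dilation)
{
    assert(k > 0 && k % 2 == 1);
    const int h = static_cast<int>(src.size());
    const int w = h ? static_cast<int>(src[0].size()) : 0;
    const int r = k / 2;
    Grid dst(h, std::vector<int>(w, 0));
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            bool any = false, all = true;
            for (int dy = -r; dy <= r; dy++) {
                for (int dx = -r; dx <= r; dx++) {
                    const int yy = y + dy, xx = x + dx;
                    if (yy < 0 || yy >= h || xx < 0 || xx >= w)
                        continue;
                    if (src[yy][xx]) any = true; else all = false;
                }
            }
            dst[y][x] = dilation ? any : all;
        }
    }
    return dst;
}

Grid dilate(const Grid& src, int k) { return morph(src, k, true); }
Grid erode(const Grid& src, int k)  { return morph(src, k, false); }
```

Under these assumptions, dilate(erode(dilate(mask, 3), 5), 3) wipes out a single isolated pixel, while a 5x5 block with a one-pixel hole comes back as a solid 5x5 block.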

The returned vector is used for blobs extraction with the ImageBlobs object. You will need the following functions to use the class:

  • void ImageBlobs::init(unsigned int width, unsigned int height);
  • int ImageBlobs::find_blobs(const vec2Dc& image, unsigned int min_elements_per_blob = 0);
  • void ImageBlobs::find_bounding_boxes();
  • void ImageBlobs::delete_blobs();

First, you need to initialize the object with the downscaled image width and height (e.g., init(zoom * image_width, zoom * image_height)). Then, you may proceed to estimate the blobs with find_blobs() on the image returned from the MotionDetector::detect() function call. The function searches the image vector for non-zero elements that adjoin horizontally or vertically (diagonal neighbors do not count), forming a blob. Then, it marks every element with the number of the blob it belongs to.

For example, the 10x10 vector used as image vector:

1 1 1 0 0 0 0 0 0 0
1 1 1 1 0 0 0 0 0 0
1 1 1 1 0 0 0 0 0 0
1 1 1 1 0 0 0 0 0 0
0 1 0 0 0 1 1 1 1 0
0 0 0 0 1 1 1 1 1 0
1 1 0 1 1 1 1 1 1 1
1 1 0 1 1 1 1 1 1 1
1 0 1 1 1 1 1 1 1 1
0 0 0 1 1 1 1 1 1 1

find_blobs() will estimate three blobs from that image. You may get the vector containing the found blobs with the const vec2Dc* ImageBlobs::get_image() const function.

1 1 1 0 0 0 0 0 0 0
1 1 1 1 0 0 0 0 0 0
1 1 1 1 0 0 0 0 0 0
1 1 1 1 0 0 0 0 0 0
0 1 0 0 0 2 2 2 2 0
0 0 0 0 2 2 2 2 2 0
3 3 0 2 2 2 2 2 2 2
3 3 0 2 2 2 2 2 2 2
3 0 2 2 2 2 2 2 2 2
0 0 0 2 2 2 2 2 2 2

With min_elements_per_blob, you may discard small blobs: only blobs with more than min_elements_per_blob elements are kept. For example, the third blob has exactly 5 elements, so min_elements_per_blob = 5 will leave it undetected.
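The labelling of the 10x10 example can be reproduced with a short stand-alone sketch. Note that it swaps in a breadth-first flood fill instead of the neighbour lists that ImageBlobs builds, so it illustrates the result, not the library's method. The function label_blobs and the Grid alias are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <utility>
#include <vector>

using Grid = std::vector<std::vector<int>>;

// Hypothetical sketch of 4-connected blob labelling (breadth-first flood
// fill). On return, image holds 0 for background and discarded blobs, and
// k for the k-th kept blob. Returns the element count of every kept blob.
std::vector<std::size_t> label_blobs(Grid& image,
                                     std::size_t min_elements_per_blob = 0)
{
    const int h = static_cast<int>(image.size());
    const int w = h ? static_cast<int>(image[0].size()) : 0;
    Grid labels(h, std::vector<int>(w, 0));
    Grid visited(h, std::vector<int>(w, 0));
    std::vector<std::size_t> sizes;
    const int dy[4] = {-1, 0, 1, 0};  // up, right, down, left
    const int dx[4] = {0, 1, 0, -1};

    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            if (image[y][x] == 0 || visited[y][x])
                continue;
            // Collect every element reachable from this seed.
            std::vector<std::pair<int, int>> elements;
            std::queue<std::pair<int, int>> pending;
            pending.push({y, x});
            visited[y][x] = 1;
            while (!pending.empty()) {
                const std::pair<int, int> cur = pending.front();
                pending.pop();
                elements.push_back(cur);
                for (int d = 0; d < 4; d++) {
                    const int ny = cur.first + dy[d], nx = cur.second + dx[d];
                    if (ny < 0 || ny >= h || nx < 0 || nx >= w)
                        continue;
                    if (image[ny][nx] == 0 || visited[ny][nx])
                        continue;
                    visited[ny][nx] = 1;
                    pending.push({ny, nx});
                }
            }
            // Same rule as the article: keep only blobs with MORE than
            // min_elements_per_blob elements.
            if (elements.size() > min_elements_per_blob) {
                sizes.push_back(elements.size());
                for (const auto& e : elements)
                    labels[e.first][e.second] = static_cast<int>(sizes.size());
            }
        }
    }
    image = labels;
    return sizes;
}
```

Marking an element as visited when it is queued, not when it is dequeued, guarantees that each element is added to a blob only once.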

To access the elements of the found blobs, you may use the following functions:

  • inline unsigned int ImageBlobs::get_blobs_number() const;
  • inline const struct Blob* ImageBlobs::get_blob(unsigned int i) const;

The blob is returned in the Blob structure:

struct Blob {
    unsigned int elements_number;
    vector<struct Element> elements;
    unsigned int area;
    RECT bounding_box;      //[top,left; right,bottom)    
};

where elements_number is the number of elements in the blob contained in the elements array of Element structures.

struct Element {
    vector<struct Element> neighbs;
    struct Coord coord;        
};

The neighbs array contains the directly adjoining neighboring elements, and coord is the element's coordinate in the image vector.

struct Coord {
    int x;
    int y;
};

After you call find_blobs(), you may optionally invoke find_bounding_boxes() to estimate the bounding boxes of all the blobs found. Each box is stored in Blob::bounding_box, a Windows RECT structure. Before the next call to find_blobs(), you need to delete the found blobs from the ImageBlobs object with delete_blobs().
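A bounding box can be derived from a blob's element coordinates as sketched below. This is not the library's find_bounding_boxes() code. The Box struct is a hypothetical, portable stand-in for RECT, and the half-open convention (right and bottom are one past the last pixel) is my reading of the "[top,left; right,bottom)" comment in the Blob structure.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct Coord { int x; int y; };                 // as in the article
struct Box { int left, top, right, bottom; };   // hypothetical stand-in for RECT

// Bounding box of a non-empty element list. The box is half-open:
// left/top are inclusive, right/bottom are one past the last pixel.
Box bounding_box(const std::vector<Coord>& elements)
{
    assert(!elements.empty());
    Box b = {elements[0].x, elements[0].y,
             elements[0].x + 1, elements[0].y + 1};
    for (const Coord& c : elements) {
        b.left   = std::min(b.left, c.x);
        b.top    = std::min(b.top, c.y);
        b.right  = std::max(b.right, c.x + 1);
        b.bottom = std::max(b.bottom, c.y + 1);
    }
    return b;
}
```

With a half-open box, the width is simply right - left and the height is bottom - top, with no off-by-one correction.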

The find_blobs() function is shown below:

int ImageBlobs::find_blobs(const vec2Dc& image, 
                           unsigned int min_elements_per_blob)
{
        if (m_image == 0)
        //not initialized
            return -1;
        
        m_image->copy(image);        

        while (true) {
            struct Blob blob; 
            blob.elements_number = 0;
            blob.area = 0;

            unsigned int y, x;
            //find first non-zero entry//////////////////////////////////
            for (y = 0; y < m_image->height(); y++) {
                for (x = 0; x < m_image->width(); x++) {
                    if ((*m_image)(y, x) != 0) {
                        struct Element element;
                        element.coord.x = x;
                        element.coord.y = y;
                        blob.elements_number = 1;
                        blob.elements.push_back(element);
                        blob.area = 0;
                        memset(&blob.bounding_box, 0, sizeof(RECT));
                        break;
                    }
                }
                if (blob.elements_number > 0)
                        break;
            }

            if (blob.elements_number == 0) {
                mark_blobs_on_image();
                return get_blobs_number();
            }

            blob.elements.reserve(m_image->width() * m_image->height());
            //find blob//////////////////////////////////////////////////
            unsigned int index = 0;
            while (index < blob.elements_number) {
                unsigned int N = (unsigned int)blob.elements_number;
                for (unsigned int i = index; i < N; i++) {
                    add_up_neighbour(blob, i);
                    add_right_neighbour(blob, i);
                    add_down_neighbour(blob, i);
                    add_left_neighbour(blob, i);
                }
                index = N;
            }
            remove_blob_from_image(blob);

            if (blob.elements_number > min_elements_per_blob) {
                blob.area = (unsigned int)blob.elements_number;
                blob.elements.reserve(blob.elements_number);
                m_blobs.push_back(blob);
            }                
    }        
}

The add_*_neighbour() functions check for a non-zero image element directly above, below, left, or right of the ith Element in the current blob and, if it is not in the blob yet, add it to the Blob elements array:

inline unsigned int ImageBlobs::add_up_neighbour(struct Blob& blob, unsigned int i)
{
    const struct Element& element = blob.elements[i];
    if (element.coord.y - 1 < 0)
        return 0;
    else if ((*m_image)(element.coord.y - 1, element.coord.x) > 0) {
        struct Element new_element;
        new_element.coord.x = element.coord.x;
        new_element.coord.y = element.coord.y - 1;
        if (has_neighbour(element, new_element) == false) { 
            int index = is_element_present(blob, new_element);
            if (index >= 0) {
                blob.elements[index].neighbs.push_back(element);
                return 0;
            }
            new_element.neighbs.push_back(element);
            blob.elements_number++;
            blob.elements.push_back(new_element);
            return 1;
        }
        else
                return 0;
    }
    else 
        return 0;
}

has_neighbour() and is_element_present() determine whether the new element is already present in the blob:

inline int ImageBlobs::is_element_present(const struct Blob& blob, 
           const struct Element& new_element) const
{
    //int index = 0;
    for(int i = (int)blob.elements_number - 1; i >= 0; i--) {
        const struct Element& element = blob.elements[i];
        if (element.coord.x == new_element.coord.x &&
            element.coord.y == new_element.coord.y) {
                //wprintf(L" %d\n", blob.elements_number - 1 - i);
                return i;
        }
        //if (++index > 2)  //inspect at least 2 last elements
        //        break;
    }
    return -1;
}

inline bool ImageBlobs::has_neighbour(const struct Element& element, 
            const struct Element& new_element) const
{
    unsigned int N = (unsigned int)element.neighbs.size();
    if (N > 0) {
        for (unsigned int i = 0; i < N; i++) {
            if (element.neighbs[i].coord.x == new_element.coord.x &&
                element.neighbs[i].coord.y == new_element.coord.y) 
                return true;
        }
        return false;
    }
    else
        return false;
}

Results

Here, I have selected a static background and performed some object tracking experiments with the objects at hand. The background I used is depicted below. It was averaged over a period of several seconds.

Background

The next two objects (an LCD-TFT screen cleaner and mobile phone) are tracked at 66.67fps.

Cleaner and mobile. 66.67fps

The mobile phone power adaptor is tracked at 71.43fps. As you can see, the cord is not detected: it is quite thin, so the erosion and dilation operators remove it.

Power adaptor. 71.43fps

Now, some pens and keys: 90.91fps.

Pens and keys. 90.91fps

A more complex scenario: three different objects detected at 29.41fps.

3 objects. 29.41fps

Jimi Hendrix (the Are You Experienced album, very good sound) and 9v batteries at 55.56fps.

Hendrix with batteries. 55.56fps

The other background setup:

Background

Several 9v batteries and a roll on the carpet are detected at 90.91fps. As you can see, one battery is left undetected because its blob falls below the minimum number of elements limit.

Batteries and a roll. 90.91fps

The power adaptor, LCD-TFT cleaner, and 9v batteries are detected at 90.91fps. The tiny ones are left undetected again, but this time due to the erosion and dilation operators.

A lot of objects. 90.91fps

Next, a candle and some mobile phones at 83.33fps.

Candle with mobiles. 83.33fps

As you can see, the bigger the object, the more time is required to estimate the elements of its blob. Also, the tracker does not pick up the faint shadows cast by objects on the white back wall and on the table. More pronounced shadows, however, will be tracked as part of the object.

Points of interest

You may extend the algorithm to monitor the positions of the objects' bounding boxes. If a box stays static for a specified amount of time, you may add that region to the background scene image. Thus, the object becomes part of the scene (e.g., an object that was moved into the scene and then left there).
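One possible way to detect a stationary box is sketched below. Everything in it is hypothetical and is not part of the library: the StaticBoxTimer class, the Box stand-in for RECT, and the choice of an edge-by-edge pixel tolerance are assumptions made for the example.

```cpp
#include <cassert>
#include <cstdlib>

struct Box { int left, top, right, bottom; };  // hypothetical stand-in for RECT

// Hypothetical helper: counts for how many consecutive frames a bounding
// box has stayed put, i.e. every edge within tolerance pixels of the
// position where it was first seen.
class StaticBoxTimer {
public:
    StaticBoxTimer(int frames_required, int tolerance)
        : m_required(frames_required), m_tolerance(tolerance),
          m_count(0), m_ref{0, 0, 0, 0} {}

    // Feed the box seen in the current frame. Returns true once the box
    // has been static for frames_required consecutive frames.
    bool update(const Box& box) {
        if (m_count > 0 && is_close(box, m_ref)) {
            m_count++;
        } else {
            m_ref = box;   // first sighting or the box moved: restart
            m_count = 1;
        }
        return m_count >= m_required;
    }

private:
    bool is_close(const Box& a, const Box& b) const {
        return std::abs(a.left - b.left) <= m_tolerance &&
               std::abs(a.top - b.top) <= m_tolerance &&
               std::abs(a.right - b.right) <= m_tolerance &&
               std::abs(a.bottom - b.bottom) <= m_tolerance;
    }

    int m_required;
    int m_tolerance;
    int m_count;
    Box m_ref;
};
```

When update() returns true, the pixels inside the box could be copied from the current downscaled frame into the background frame, so that later comparisons no longer flag that region.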

License

This article, along with any associated source code and files, is licensed under The GNU General Public License (GPLv3).

About the Author

Chesnokov Yuriy
Engineer
Russian Federation

Last Updated 19 Dec 2007
Article Copyright 2007 by Chesnokov Yuriy