Tracking an object from a live video input

6 May 2005
Track an object based on its features, using the AVICap window class.

Sample screenshot

Introduction

As part of my research project, I had to implement a feature tracking device that runs entirely on a hardware board. Designing anything useful on a piece of hardware takes effort and time. To avoid tedious on-board calibration and to make sure the algorithms were properly designed, I wrote a Windows application that simulates the environment: it grabs frames from a web camera and tracks an object in them. Having benefited from open source projects myself, I am happy to spend some time contributing to a site such as The Code Project.

AVICap

In this demo application, I have chosen to demonstrate the use of the AVICap window class to track objects. AVICap is a window class that gives applications a convenient programming interface to video acquisition hardware, such as the web camera used in this demo.

To track objects from a live video input, we need access to the individual frames. To receive each frame before it is previewed, install a frame callback with the capSetCallbackOnFrame macro.

BOOL capSetCallbackOnFrame(HWND hwnd, FrameCallback fpProc);
  • HWND hwnd: Handle to the capture window.
  • FrameCallback fpProc: Pointer to the preview callback function. Specify NULL for this parameter to disable a previously installed callback function.
typedef LRESULT (CALLBACK *FrameCallback)(HWND hWnd, LPVIDEOHDR lpVideoHdr);

LPVIDEOHDR is a pointer to a VIDEOHDR structure, which is declared as follows:

typedef struct videohdr_tag {
    LPBYTE lpData;          /* Pointer to buffer. */
    DWORD  dwBufferLength;  /* Length of buffer. */
    DWORD  dwBytesUsed;     /* Bytes actually used. */
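    DWORD  dwTimeCaptured;  /* Time from start of stream, in milliseconds. */
    DWORD  dwUser;          /* For the client's use. */
    DWORD  dwFlags;         /* Flags. */
    DWORD  dwReserved[4];   /* Reserved for the driver. */
} VIDEOHDR, *PVIDEOHDR, *LPVIDEOHDR;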
#define VHDR_DONE       0x00000001
#define VHDR_PREPARED   0x00000002
#define VHDR_INQUEUE    0x00000004
#define VHDR_KEYFRAME   0x00000008

Once the frame callback procedure is associated with a capture window, we are all set to begin tracking.

Color Space

Before we start processing frames, it is important to understand the different representations for color spaces used in digitized video. There are many color spaces to choose from, and each of them has its own strengths and limitations. Choosing the right color space for a specific application simplifies computation significantly.

The feature we will look at in this demo application is brightness, and we will track objects based on their brightness. A natural approach is to work in a color space that has a brightness component. YUV is one color space that has exactly the component we are looking for. However, YUV is not necessarily among the input formats a web camera offers, so a conversion from the typical RGB24 input format to YUV is required.

The relationship between RGB and YUV can be expressed as the following set of linear equations. The offsets 0.063 and 0.500 suggest that all components are normalized to the range 0 to 1.

[ Y ]   [  0.257  0.504  0.098  0.063 ][ R ]
[ U ] = [ -0.148 -0.291  0.439  0.500 ][ G ]
[ V ]   [  0.439 -0.368 -0.072  0.500 ][ B ]
[ 1 ]   [  0.000  0.000  0.000  1.000 ][ 1 ]

This matrix is a change of basis in the linear algebra sense. In this case it corresponds to a rotation of the color cube such that one axis of the new basis, Y, lies along the line where R = G = B.
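
As a rough illustration, the matrix can be applied to one 8-bit pixel as sketched below. The names Yuv, toByte and rgbToYuv are made up for this example and do not appear in the demo project. The sketch also assumes that the offsets 0.063 and 0.500 correspond to 16 and 128 on the 0 to 255 scale.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical helper type for this sketch; not part of the demo project.
struct Yuv {
    std::uint8_t y;
    std::uint8_t u;
    std::uint8_t v;
};

// Clamp a value to 0..255 and round it to the nearest integer.
static std::uint8_t toByte(double value) {
    const double clamped = std::max(0.0, std::min(255.0, value));
    return static_cast<std::uint8_t>(std::lround(clamped));
}

// Apply the matrix rows to 8-bit R, G, B inputs.
// Assumption: the normalized offsets 0.063 and 0.500 become 16 and 128.
Yuv rgbToYuv(std::uint8_t r, std::uint8_t g, std::uint8_t b) {
    const double y =  0.257 * r + 0.504 * g + 0.098 * b + 16.0;
    const double u = -0.148 * r - 0.291 * g + 0.439 * b + 128.0;
    const double v =  0.439 * r - 0.368 * g - 0.072 * b + 128.0;
    Yuv result = { toByte(y), toByte(u), toByte(v) };
    return result;
}
```

With these coefficients Y stays between 16 and 235, so a brightness threshold chosen for this form differs from one chosen for a Y that spans the full 0 to 255 range.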

Feature Tracking

Now that we have direct access to the brightness of each pixel, a simple algorithm can be used to track a bright object. The one introduced here is called the "rectangle algorithm". It keeps track of four points in each frame: the topmost, leftmost, rightmost and bottommost points where the brightness exceeds a certain threshold value.

If you use the following code, make sure you set the input format of your web camera to RGB24. The code reads each pixel as three bytes in B, G, R order, and it computes Y with the weights 0.299, 0.587 and 0.114 rather than with the first row of the matrix above.

LRESULT CChildView::FrameCallbackProc(HWND hWnd, LPVIDEOHDR lpVideoHdr)
{
    ...
    ...

    for (int i=0; i<nHeight; ++i) {
        for (int j=0; j<nWidth; ++j) {
            /* Get the appropriate index into the buffer. */
            index = 3*(i*nWidth+j);
            /* Compute the Y (brightness) component. */
            Y = floor(0.299*lpData[index+2] + 0.587*lpData[index+1] + 
                                          0.114*lpData[index] + 0.5);
            /* If brightness exceeds threshold value. */
            if (Y > bThreshold) {
                /* A bright pixel was already found: update the extremes. */
                if (init) {
                    if (pLeft.x > j) {
                        pLeft.x = j;
                        pLeft.y = i;
                    }
                    if (pRight.x < j) {
                        pRight.x = j;
                        pRight.y = i;
                    }
                    /* Rows are scanned in order, so the latest hit is the bottom. */
                    pBottom.x = j;
                    pBottom.y = i;
                }
                /* First occurrence, initialize all four points. */
                else {
                    pTop.x = pBottom.x = pLeft.x = pRight.x = j;
                    pTop.y = pBottom.y = pLeft.y = pRight.y = i;
                    init = true;
                }
            }
        }
    }    
    
    ...
    ...

}

A rectangle can be constructed from these points, which tells us where the bright object is. The border of the rectangle is then simply replaced by a predefined color.

if (init) {
    /* Replace border pixels with predefined colour. */
    for (int i=pLeft.x; i<=pRight.x; ++i) {
        index = 3*((pTop.y)*nWidth + i);    /* Top */
        lpData[index]   = 0;   /* B */
        lpData[index+1] = 0;   /* G */
        lpData[index+2] = 255; /* R */
        index = 3*((pBottom.y)*nWidth + i); /* Bottom */
        lpData[index]   = 0;   /* B */
        lpData[index+1] = 0;   /* G */
        lpData[index+2] = 255; /* R */
    }
    for (int i=pTop.y; i<=pBottom.y; ++i) {
        index = 3*((i)*nWidth + pLeft.x);   /* Left */
        lpData[index]   = 0;   /* B */
        lpData[index+1] = 0;   /* G */
        lpData[index+2] = 255; /* R */
        index = 3*((i)*nWidth + pRight.x);  /* Right */
        lpData[index]   = 0;   /* B */
        lpData[index+1] = 0;   /* G */
        lpData[index+2] = 255; /* R */
    }
}
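
For experimenting away from the capture callback, the same scan can be written as a standalone function over a plain byte buffer. This is only a sketch: BrightBox, dibStride24, luma and findBrightBox are hypothetical names, not part of the demo project. It also assumes that the frame is an uncompressed 24-bit DIB whose rows are padded to a multiple of four bytes, which the listings above do not account for.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>

// Hypothetical result type for this sketch.
struct BrightBox {
    bool found;   // true once any pixel exceeded the threshold
    int left;     // all four edges are inclusive, in buffer coordinates
    int top;
    int right;
    int bottom;
};

// Bytes per row of a 24-bit DIB (assumed padded to a multiple of 4 bytes).
std::size_t dibStride24(int width) {
    return (static_cast<std::size_t>(width) * 3 + 3) & ~static_cast<std::size_t>(3);
}

// Brightness of one pixel stored in B, G, R order, with the weights used above.
std::uint8_t luma(const std::uint8_t* bgr) {
    return static_cast<std::uint8_t>(
        std::floor(0.299 * bgr[2] + 0.587 * bgr[1] + 0.114 * bgr[0] + 0.5));
}

// Scan the frame once and return the smallest rectangle holding every bright pixel.
BrightBox findBrightBox(const std::uint8_t* data, int width, int height,
                        std::uint8_t threshold) {
    BrightBox box = { false, 0, 0, 0, 0 };
    const std::size_t stride = dibStride24(width);
    for (int row = 0; row < height; ++row) {
        const std::uint8_t* line = data + static_cast<std::size_t>(row) * stride;
        for (int col = 0; col < width; ++col) {
            if (luma(line + 3 * col) <= threshold) {
                continue;
            }
            if (!box.found) {
                // First bright pixel: every edge starts here.
                box.found = true;
                box.left = box.right = col;
                box.top = box.bottom = row;
            } else {
                if (col < box.left)  box.left = col;
                if (col > box.right) box.right = col;
                box.bottom = row;  // rows are visited in increasing order
            }
        }
    }
    return box;
}
```

When the frame width times three is already a multiple of four, the padded row size equals 3*nWidth and the indexing matches the listings above. Note that a DIB is often stored bottom-up, in which case the "top" edge in buffer coordinates appears at the bottom of the displayed image.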

This algorithm obviously has a lot of weaknesses.

  • It only gives the position of the object as a whole on the screen.
  • It does not keep any information about the shape of the object.
  • It does not tell where the middle of the object is.
  • It cannot track multiple objects.

An Improved Algorithm

Sample screenshot 2

This algorithm tracks objects by identifying the segments that make up the object on the screen. Each segment is described by its head (the position of its first bright pixel) and its length. The object is constructed by grouping the segments together.

BYTE Y; int index;
/* -- Variables used by the new tracking algorithm. -- */
QSEG segment;
std::list<QSEG> object;

for (int i=0; i<nHeight; ++i) {
    segment.length = 0;

    for (int j=0; j<nWidth; ++j) {
        index = 3*(i*nWidth+j);
        Y = floor(0.299*lpData[index+2] + 0.587*lpData[index+1] +
          0.114*lpData[index] + 0.5);

        if (Y > bThreshold) {
            if (segment.length == 0) {
                segment.head.x = j;
                segment.head.y = i;
            }
            ++segment.length;
        }
    }

    if (segment.length) {
        object.push_back(segment);
    }
}

/* -- Draw the shape of the object with a predefined colour.  -- */
for (std::list<QSEG>::iterator i=object.begin(); i!=object.end(); ++i) {
    index = 3*((*i).head.y*nWidth + (*i).head.x);
    lpData[index]   = 255;
    lpData[index+1] = 0;
    lpData[index+2] = 255;

    /* Mark the last pixel of the segment; "- 1" keeps the index inside the row. */
    index = 3*((*i).head.y*nWidth + (*i).head.x + (*i).length - 1);
    lpData[index]   = 255;
    lpData[index+1] = 0;
    lpData[index+2] = 255;
}
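
The listing above records at most one segment per row, and its length counts every bright pixel in that row, whether or not the pixels are adjacent. One possible variation, sketched below, starts a new segment whenever a run of bright pixels is interrupted, so that two objects side by side yield separate segments. Segment is a hypothetical stand-in for QSEG, whose definition is not shown in this article, and the function assumes the brightness values have already been extracted into a buffer with one byte per pixel.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for QSEG; the real definition is not shown here.
struct Segment {
    int x;       // column of the first bright pixel in the run
    int y;       // row of the run
    int length;  // number of consecutive bright pixels
};

// Split every row of a brightness image into runs of pixels above the threshold.
// Assumption: `brightness` holds width*height bytes, one per pixel, row by row.
std::vector<Segment> findSegments(const std::uint8_t* brightness, int width,
                                  int height, std::uint8_t threshold) {
    std::vector<Segment> segments;
    for (int row = 0; row < height; ++row) {
        int runStart = -1;  // -1 means "currently not inside a run"
        // Visit one extra column so that a run touching the right edge is closed.
        for (int col = 0; col <= width; ++col) {
            const bool bright = col < width &&
                brightness[static_cast<std::size_t>(row) * width + col] > threshold;
            if (bright && runStart < 0) {
                runStart = col;                      // a new run begins
            } else if (!bright && runStart >= 0) {
                Segment s = { runStart, row, col - runStart };
                segments.push_back(s);               // the run just ended
                runStart = -1;
            }
        }
    }
    return segments;
}
```

Grouping the segments of adjacent rows into separate objects would need an additional step, for example merging runs that overlap horizontally, which this sketch leaves out.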

This new tracking algorithm has a few extra advantages.

  • It can track multiple objects.
  • It can track the shape of the objects.
  • The number of pixels that make up the object on the screen can be easily calculated. With this piece of information and proper distance calibration, the position of the object in 3 dimensions can be determined.
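
To illustrate the last point, the pixel count can be accumulated directly from the segment list, and the same pass also gives the center of the object in the image. This is a minimal sketch: Segment again stands in for QSEG, and ObjectStats and measure are names invented for this example. The distance calibration needed for a 3-dimensional position is not covered.

```cpp
#include <vector>

// Hypothetical stand-in for QSEG; the real definition is not shown here.
struct Segment {
    int x;       // column of the first pixel in the segment
    int y;       // row of the segment
    int length;  // number of pixels in the segment
};

// Hypothetical result type: pixel count and center of mass in image coordinates.
struct ObjectStats {
    long area;
    double centerX;
    double centerY;
};

// Sum up the segments that make up one object.
ObjectStats measure(const std::vector<Segment>& segments) {
    long area = 0;
    double sumX = 0.0;
    double sumY = 0.0;
    for (std::vector<Segment>::const_iterator s = segments.begin();
         s != segments.end(); ++s) {
        area += s->length;
        // The mean column of a run that starts at x is x + (length - 1) / 2.
        sumX += s->length * (s->x + (s->length - 1) / 2.0);
        sumY += static_cast<double>(s->length) * s->y;
    }
    ObjectStats stats = { area, 0.0, 0.0 };
    if (area > 0) {
        stats.centerX = sumX / area;
        stats.centerY = sumY / area;
    }
    return stats;
}
```

The center computed this way treats every segment as a contiguous run, so it is only meaningful when the segment lengths really describe adjacent pixels.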

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
