
Computer Vision Applications with C# - Part I

By Arif_Khan, 15 Apr 2009
 

article.png

Introduction

Computer Vision and programming go hand in hand. One needs programming to turn the theory into something that can be applied to real-world problems. Computer Vision is an exciting field where we try to make sense of images. These images could be static or could be retrieved from videos. Making sense could mean things like tracking an object, modeling the background, pattern recognition etc. This article is the first in a series that will use C# to educate users in Computer Vision. Being the first article, it introduces some basic concepts used in Computer Vision. I will refer back to these concepts in upcoming articles where I will implement a few state-of-the-art algorithms in Computer Vision, covering areas such as object tracking, background modeling, pattern recognition etc.

Image Understanding

An image is composed of many dots called Pixels (Picture Elements). The more pixels, the higher the resolution of the image. When an image is grabbed by the camera, it is often in RGB (Red Green Blue) format. RGB is one of many colour spaces used in Computer Vision. Other colour spaces include HSV, Lab, XYZ, YIQ etc. RGB is an additive colour space where we get different colours by mixing red, green, and blue values. In a 24-bit RGB image, each of the R, G, and B components takes 8 bits, so its value ranges from 0 to 255. A 24-bit RGB image can represent 2^24 different colours, i.e., about 16.7 million. OK, going back to the image, we humans see objects in there, whereas the computer sees pixels having RGB values ranging from 0 to 255. So, there is an obvious need to build some kind of intelligence into computers so they can make sense of images.

If you want to pursue a career in Computer Vision, you have to understand one thing: Statistics & Probability! Normally, statistics would be used in creating a model, and probability would be used in making sense of the model. So, moving forward, I will try to explain some fundamentals required to understand an image so Computer Vision techniques can be applied to it.

Image Attributes

The very first step in modeling an image is to pick an attribute to be modeled. It does not have to be a single attribute - you would normally use a combination of attributes to make your algorithm robust. Some of the primary attributes include edges, colour etc. Ideally, the chosen attributes would be unique to the image or object being modeled. In reality, that is rarely the case. For example, if using colour, many images would share the same colour distribution. So, we need to find an attribute or a combination of attributes that provides a greater degree of uniqueness.

Attribute Modeling with Histogram

Once an attribute is chosen, the next step is to model it. There are many models available in Computer Vision - each with its pros and cons. But in this article, I will concentrate on the histogram. I have selected the histogram because it is very popular in Computer Vision and because it forms the foundation for the upcoming articles. From elementary statistics, we know that a histogram is nothing more than a frequency distribution. Hence, a colour histogram records how frequently each colour (or range of colours) occurs in the image.

Using normalisation, we can add scale invariance to a histogram. What that means is that the same object at different scales will have (nearly) identical histograms, because each bin then holds a proportion of the pixels rather than a raw count. Normalisation is achieved by dividing the value of each bin by the total value of all the bins.
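The normalisation step described above can be sketched as follows. This is a minimal illustration, not the code from the article's download: the class and method names (HistogramNormaliser, Normalise) are hypothetical, and returning an all-zero histogram when nothing has been counted is an assumption made here to avoid dividing by zero.

```csharp
using System;

public static class HistogramNormaliser
{
    // Returns a new histogram whose bins sum to 1.
    // Hypothetical helper; the name and the zero-total behaviour are assumptions.
    public static float[] Normalise(float[] histogram)
    {
        // Total of all bins, i.e. the number of pixels counted.
        float total = 0f;
        foreach (float bin in histogram)
            total += bin;

        float[] result = new float[histogram.Length];
        if (total <= 0f)
            return result; // nothing counted: leave every bin at zero

        // Each bin becomes the proportion of pixels that fell into it.
        for (int i = 0; i < histogram.Length; i++)
            result[i] = histogram[i] / total;

        return result;
    }
}
```

Two histograms with counts { 2, 6, 0, 8 } and { 4, 12, 0, 16 } (the same colour proportions at twice the pixel count) normalise to the same values, which is the scale invariance mentioned above.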

To create a colour histogram, we first need to decide on the number of bins of the histogram. Generally speaking, the more bins you have, the more discriminatory power you get. But then, the flip side is that you need more computational resources. The second decision you need to make is how to implement this colour histogram. Remember that you normally would have three colour components, such as Red, Green, and Blue. A popular approach is to use either a 3D array or a single 1D array. Using a 3D array is straightforward, but using a single array for three components requires some thought. In the end, it is a matter of preference - I prefer the latter approach.

For a 16x16x16 bin histogram, each bin covers 256/16 = 16 colour values per component. So, we define a 3D array something like this:

// Declare a 3 dimensional histogram
float[,,] histogram = new float[16, 16, 16];

As an example, if the pixel's RGB value is 13, 232, and 211, then this means you are dealing with RGB bins 0, 14, and 13. These bin numbers are obtained by dividing each colour value by the number of colour values per bin (16, in our case) and discarding the remainder. You then have to increment histogram[0, 14, 13] by 1. If we do that for all the pixels in an image, we end up with the colour histogram of the image, which tells us about the colour distribution in the image.
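The worked example above can be sketched in code as follows. This is an illustration only: Histogram3DSketch, BinIndex and AddPixel are hypothetical names, and the integer division used here is a simplification that assumes colour values in the range 0 - 255 and a bin count between 1 and 256. It is not the GetBinIndex helper from the article's download.

```csharp
using System;

public static class Histogram3DSketch
{
    // Maps a colour value (0 - 255) to a bin index for one component.
    // Hypothetical helper; assumes 1 <= binCount <= 256.
    public static int BinIndex(int colourValue, int binCount)
    {
        int valuesPerBin = 256 / binCount;      // 16 when binCount is 16
        int idx = colourValue / valuesPerBin;   // integer division drops the remainder
        return Math.Min(idx, binCount - 1);     // guard for bin counts that do not divide 256
    }

    // Adds one pixel to a 3D histogram indexed as [red bin, green bin, blue bin].
    public static void AddPixel(float[,,] histogram, int red, int green, int blue)
    {
        int r = BinIndex(red, histogram.GetLength(0));
        int g = BinIndex(green, histogram.GetLength(1));
        int b = BinIndex(blue, histogram.GetLength(2));
        histogram[r, g, b] += 1f;
    }
}
```

Calling AddPixel(histogram, 13, 232, 211) on a 16x16x16 array increments histogram[0, 14, 13], matching the bins worked out in the text.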

For a 16x16x16 bin histogram, we declare a 1D array like this:

// Declare a 1 dimensional histogram
float[] histogram = new float[16 * 16 * 16];

In order to use a 1D array, we need to define an indexing scheme so we can add and retrieve the values of the bins. The indexing method is given in the code below:

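// Maps the three colour components of a pixel to one index in the 1D histogram.
// BGRA* is a pointer type, so this method must live in an unsafe context.
private int GetSingleBinIndex(int binCount1, int binCount2, int binCount3, BGRA* pixel)
{
    // Find the bin index of each colour component
    int i1 = GetBinIndex(binCount1, (float)pixel->red, 255);
    int i2 = GetBinIndex(binCount2, (float)pixel->green, 255);
    int i3 = GetBinIndex(binCount3, (float)pixel->blue, 255);

    // Combine the three bin indices into a single array index
    return i1 + i2 * binCount1 + i3 * binCount1 * binCount2;
}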

Again, if the pixel's RGB value is 13, 232, and 211, then this means you are dealing with RGB bins 0, 14, and 13. This points to an index of 0 + 14 x 16 + 13 x 16 x 16 = 3552 in a 1D array. To create a histogram, you will increment the value of this bin by 1.
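To check the arithmetic above without pointers, here is a minimal pointer-free sketch. GetBinIndex is not listed in the article body; the version below follows the one quoted from the sample project in the discussion at the end of this page. The class name SingleBinIndexSketch, the byte parameters, and the Unflatten method (the reverse mapping from a 1D index back to three bin indices) are additions made here for illustration.

```csharp
using System;

public static class SingleBinIndexSketch
{
    // Bin index for one colour component, as quoted in the discussion below.
    public static int GetBinIndex(int binCount, float colourValue, float maxValue)
    {
        int idx = (int)(colourValue * (float)binCount / maxValue);
        if (idx >= binCount)
            idx = binCount - 1; // colourValue == maxValue would otherwise overflow
        return idx;
    }

    // Pointer-free variant of the indexing scheme: takes the components directly.
    public static int GetSingleBinIndex(int binCount1, int binCount2, int binCount3,
                                        byte red, byte green, byte blue)
    {
        int i1 = GetBinIndex(binCount1, red, 255f);
        int i2 = GetBinIndex(binCount2, green, 255f);
        int i3 = GetBinIndex(binCount3, blue, 255f);
        return i1 + i2 * binCount1 + i3 * binCount1 * binCount2;
    }

    // Hypothetical reverse mapping: recovers the three bin indices from a 1D index.
    public static (int i1, int i2, int i3) Unflatten(int idx, int binCount1, int binCount2)
    {
        int i1 = idx % binCount1;
        int i2 = (idx / binCount1) % binCount2;
        int i3 = idx / (binCount1 * binCount2);
        return (i1, i2, i3);
    }
}
```

For the pixel (13, 232, 211) and 16 bins per component, GetSingleBinIndex returns 3552, and Unflatten(3552, 16, 16) gives back (0, 14, 13).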

Model Matching

Once we have represented an image attribute as a histogram, we often need to perform recognition. So, we can have a source histogram and a candidate histogram, and match the two histograms to see how closely the candidate object resembles the source object. There are many techniques available, such as the Bhattacharyya Coefficient, Earth Mover's Distance, Chi-Squared, Euclidean Distance etc. In this article, I will describe the Bhattacharyya Coefficient. You can implement your own matching technique, bearing in mind that each matching technique has its pros and cons.

The Bhattacharyya Coefficient works on normalised histograms with an identical number of bins. Given two histograms p and q, the Bhattacharyya Coefficient is given as:

bc.png
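The formula image is not reproduced in this text. For reference, the standard definition of the Bhattacharyya Coefficient for two normalised histograms p and q with N bins is written out below; the symbols N and i are notation chosen here.

```latex
BC(p, q) = \sum_{i=1}^{N} \sqrt{p_i \, q_i},
\qquad \text{where} \quad \sum_{i=1}^{N} p_i = \sum_{i=1}^{N} q_i = 1
```

When p and q are identical, every term reduces to p_i, so the sum equals 1.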

If you are like me and get discouraged by mathematical equations, then don't worry: I have a worked example for you! Consider the following two histograms; the calculation of the Bhattacharyya Coefficient for them is shown below:

h1.png

h2.png

bc_calc.png

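As we can see, it requires us to multiply the corresponding bins of the two histograms, take the square root of each product, and add up the results. Furthermore, we can see that for identical histograms, the coefficient will be 1. The value of the Bhattacharyya Coefficient ranges from 0 to 1, i.e., from least similar (no overlapping bins) to exact match.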
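The calculation can be sketched in a few lines of C#. This is a minimal illustration under the assumption that both histograms are already normalised and stored as float arrays of equal length; the class and method names (BhattacharyyaSketch, Coefficient) are hypothetical and not taken from the article's download.

```csharp
using System;

public static class BhattacharyyaSketch
{
    // Sums the square roots of the products of corresponding bins.
    // Assumes p and q are normalised (each sums to 1).
    public static double Coefficient(float[] p, float[] q)
    {
        if (p.Length != q.Length)
            throw new ArgumentException("Histograms must have the same number of bins.");

        double sum = 0.0;
        for (int i = 0; i < p.Length; i++)
            sum += Math.Sqrt((double)p[i] * q[i]);

        return sum;
    }
}
```

Because of floating-point rounding, comparing a histogram with itself may give a value very slightly off 1, so it is safer to compare the result against 1 with a small tolerance than with ==.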

Using the Code

  • Step 1: Select a histogram size - the default is 4x4x4.
  • Step 2: Select an image from the list and click the top "<<" button to see its histogram.
  • Step 3: Select an image from the list and click the bottom "<<" button to see its histogram.
  • Step 4: Click the "Find Bhattacharyya Coefficient" button to see the coefficient. For the same images, it will be 1.

Points of Interest

Native image processing in .NET is slow! Calling the GetPixel() and SetPixel() methods of a Bitmap object for every pixel is not the way to do image processing in .NET. We need to access the pixel data directly, through pointers in unsafe code. I have used the code by Eric Gunnerson.
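The idea of walking the raw pixel data can be illustrated without pointers. The sketch below is not the Eric Gunnerson class used by the sample; it is a swapped-in, array-based variant that assumes the pixel bytes have already been copied into a byte[] (for example with Bitmap.LockBits followed by Marshal.Copy from BitmapData.Scan0) in 32-bit format with bytes ordered blue, green, red, alpha, as is the case for PixelFormat.Format32bppArgb. The names PixelBufferSketch, BuildHistogram and Bin are hypothetical, and the integer bin formula is a simplification chosen here.

```csharp
using System;

public static class PixelBufferSketch
{
    // Builds a 1D colour histogram from a raw BGRA byte buffer.
    // stride is the number of bytes per image row (BitmapData.Stride when using LockBits);
    // it can be larger than width * 4 if rows are padded.
    public static float[] BuildHistogram(byte[] bgra, int width, int height,
                                         int stride, int binsPerChannel)
    {
        int bins = binsPerChannel;
        float[] hist = new float[bins * bins * bins];

        for (int y = 0; y < height; y++)
        {
            int offset = y * stride; // first byte of row y

            for (int x = 0; x < width; x++, offset += 4)
            {
                int b = Bin(bgra[offset], bins);     // blue
                int g = Bin(bgra[offset + 1], bins); // green
                int r = Bin(bgra[offset + 2], bins); // red (the alpha byte is skipped)

                // Same indexing scheme as the article: red + green * n + blue * n * n
                hist[r + g * bins + b * bins * bins] += 1f;
            }
        }

        return hist;
    }

    // Maps a byte (0 - 255) to a bin index in 0 .. binCount - 1.
    private static int Bin(byte value, int binCount)
    {
        return value * binCount / 256;
    }
}
```

Reading one row at a time and stepping by 4 bytes mirrors the pointer increment in the sample's inner loop, while the stride-based row offset skips any padding at the end of each row.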

History

  • Version 1.0.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Arif_Khan
Australia Australia
Member
I have been in the IT industry since April 1996. My main expertise is in Microsoft space.
 
Coming from engineering background, any application of programming to engineering and related fields easily excites me. I like to use OO and design patterns and find them very useful.
 
I have been an avid reader of CodeProject. I decided it was time to make a commitment to make my contribution to the community - so here I am.
 
My Website: http://www.puresolutions-online.com


Comments and Discussions

 
General | My vote of 5 | member Theophilus omoregbee | 11 Dec '12 - 8:13
nice
Question | Are u willing to take up a project? | member zoro.mezaa | 9 Sep '12 - 4:59
Please revert back with confirmation to freelancer.mezaa@gmail.com.
I need an open cv programmer for few of my projects.
Regards
Answer | Re: Are u willing to take up a project? | member Arif_Khan | 9 Sep '12 - 13:00
Thanks for asking, but I am busy with other projects. Besides, I am not an OpenCV developer as such.
Question | Need clarification | member Phan Dung | 12 Aug '11 - 18:42
From your example program:
for (int y = 0; y < size.Y; y++)
{
    pPixel = fastBitmap[0, y];
    for (int x = 0; x < size.X; x++)
    {
        //get the bin index for the current pixel colour
        idx = GetSingleBinIndex(numBinsCh1, numBinsCh2, numBinsCh3, pPixel);
        hist.Data[idx] += 1;
        total += 1;
 
        //increment the pointer
        pPixel++;
    }
}
I don't understand why pPixel = fastBitmap[0, y]? it should be pPixel = fastBitmap[x, y], right?
Answer | Re: Need clarification | member Phan Dung | 12 Aug '11 - 18:43
Hehe, sorry, got the last statement, pPixel++
General | A suggestion: | member ollydbg23 | 3 Aug '09 - 3:27
I found these sentences:
 
For a 16x16x16 bin histogram, we have 256/16 = 16 colour components per bin. So, we define a 3D array something like this:
 
I think people will get confused if they see the word components, even though I can understand its meaning. I think you should at least explain it a bit more.
 
In the paragraph before that, the "component" has another meaning:
 
the individual values of R, G, and B components range from 0 - 255
 
So, I suggest you can use "color value" instead.
General | Re: A suggestion: | member Arif_Khan | 3 Aug '09 - 14:01
Thanks! I will leave the article text as is because your comments should clarify any issues.
General | Java version | member robev33 | 14 May '09 - 9:47
I've tried writing a Java version of this(as I know it and not C#) but my histograms (or more specifically, their data) are nothing like yours.
I think it had to do with how we are both accessing the pixel information.
 
You are doing something i don't really understand (return (BGRA*)(pBase + y * width + x * sizeof(BGRA));)and in your getBinIndex method you are doing (int idx = (int)(colourValue * (float)binCount / maxValue);) I'm guessing to convert the colourValue to a number 0-255 then dividing it by the number of bins? something like that.
 
Don't know if you know Java, but I found an interesting way to get pixel information from this website: http://www.lac.inpe.br/~rafael.santos/JIPCookbook/1500-pixelaccess.jsp
 
This gives you values 0-255 for R G and B, I don't need to convert. This difference is creating different histograms, I assume mine are wrong as my Bhattacharyya Coefficients are erroneous (like 0.78 for two completely different pictures, where yours are 0.34).
 
Wonder if you can help =)
General | Re: Java version | member Arif_Khan | 14 May '09 - 16:29
Sorry buddy, Java is not my expertise but will give it a go. First of all to create a histogram, you decide upfront the number of bins a histogram will have. Suppose we decide for 16 bins and we have a gray image with each pixel having a value between 0 - 255. This means we have (Number of Possible Colours/Total Number of Bins) i.e. 256/16=16 colours per bin. So pixels with values 0 - 15 belong to first bin, 16 - 31 belong to the second bin and so on. That is what the GetBinIndex returns i.e. the index of the bin. The other bit of code you've asked about is getting a pointer to pixel at location "x" and "y" - something .Net specific.
 
Now the Java code you pointed me to is also getting the pixel value and passing it to the isWhite function to check if the pixel is white or not. What you need to do is similar to what I have done. Define a float array for the histogram with length equal to the product of the bin counts (binCount1, binCount2, binCount3) for each colour component in the pixel e.g. float[] hist = new float[4 * 4 * 4]. Read the pixel and pass the value of each colour component to the GetBinIndex function to return the bin indices idx1, idx2 and idx3. Now get a single bin index: idx = idx1 + idx2 * binCount1 + idx3 * binCount1 * binCount2. Increment the histogram array by 1 at idx: hist[idx] += 1. Don't forget to normalise the histogram once you have iterated through the whole image.
 
For exactly the same image, you should get a Bhattacharyya Coefficient of 1 - if you are not getting this then something is wrong with your code. Also don't forget to double check your code for calculating the Bhattacharyya Coefficient. Just note that the actual float variable might have some rounding error e.g. you may get 0.99999999 or 1.000002 but it is correct. I hope that helps, otherwise let me know!
General | Re: Java version [modified] | member robev33 | 15 May '09 - 2:37
Yup I'm doing all that and it is not working right (at least in my opinion) >_> However the Bhattacharyya Coefficient IS 1 when I have two of the same images. I'll show you screenshots of my problem:
http://img40.imageshack.us/img40/5210/50518007.jpg
 
(my coefficient is in the title bar)
 

I'm doing what you are describing so I don't know what the problem is and I guess you don't either lol. I'll keep looking =)
 

Here's how I get the individual indexes (since they are given as 0-255 and I don't need to convert them to that)
private int getBinIndex(int binCount, int colourValue) {
    int idx = colourValue / binCount;
    if (idx >= binCount)
        idx = binCount - 1;

    return idx;
}
 
Here's what you are doing (I am assuming because you are converting them to 0-255)
 
private int GetBinIndex(int binCount, float colourValue, float maxValue)
{
    int idx = (int)(colourValue * (float)binCount / maxValue);
    if (idx >= binCount)
        idx = binCount - 1;

    return idx;
}
 
Now just for fun I tried what you did too but that made it worse, so I don't think I need to do that conversion.
 
modified on Friday, May 15, 2009 9:10 AM

General | Good article | member Shakeel Mumtaz | 23 Apr '09 - 21:50
I found the article easy to understand for anyone with some basic mathematical background. I have also done a lot of work in this field and am hungry to learn more.
 
So please start the next article as well, I can't wait now. Can you give me your contact details, if possible?
 
Shakeel Mumtaz
Punjab University College Of Information Technology,Lahore,Pakistan

General | Re: Good article | member Arif_Khan | 24 Apr '09 - 14:47
I tried to make it easy to understand and will strive to do so in coming articles.
General | any referencing books or papers | member hsmcc | 21 Apr '09 - 2:56
Nice article.
 
I think that the theoretical aspects of Computer Vision are not that popular among mainstream programmers/developers. Could you provide some of the reference materials you used?
 
Thanks.
General | Re: any referencing books or papers | member Arif_Khan | 21 Apr '09 - 10:46
Thanks! You are right about the theory bit. I would take it a step further and say it is sometimes hard for students to understand the theory as well. The reference material I have used is my experience. I have done research in computer vision for 2 years and know how difficult it is to find easy-to-understand material. This is the whole point of writing this series of articles: to help programmers & students bridge the gap between theory and applications. To give you a heads up, I will be covering object tracking, background modelling & pattern recognition in coming articles.
General | Re: any referencing books or papers | member hsmcc | 22 Apr '09 - 2:39
waiting for them.
General | C# Code | member joutlaw | 15 Apr '09 - 17:03
Excellent article and I look forward to the rest of the series. One thing I would point out is that you are using pointers in your C# method declarations. Do you plan to update your code samples with the appropriate unsafe and fixed keywords to support your samples?
General | Re: C# Code | member Arif_Khan | 15 Apr '09 - 18:31
Thanks mate. The project property "Allow unsafe code" is set, so you should not have any compilation errors. The image processing code came from Eric Gunnerson as mentioned in the article. The whole UnsafeBitmap class is marked "unsafe". Once the class is created, the LockBits method is called to lock the whole bitmap in memory. Then I loop through the image to do my processing and at the end, UnLockBitmap is called to release the lock. After that the bitmap is marked for garbage collection. So there is no need to use the "fixed" keyword to pin the bitmap.
General | A small mistake | member Gtrwld | 15 Apr '09 - 1:01
You wrote, In an 8-bit RGB image, the individual values of R, G, and B components range from 0 - 255., it's actually a 24-bit RGB image.
General | Re: A small mistake | member Arif_Khan | 15 Apr '09 - 13:22
Thanks for picking it up - I have modified it.

Last Updated 15 Apr 2009
Article Copyright 2009 by Arif_Khan
Everything else Copyright © CodeProject, 1999-2013