13,002,169 members (80,406 online)
Add your own
alternative version

#### Stats

51.7K views
4.9K downloads
82 bookmarked
Posted 11 Aug 2009

# Computer Vision Applications with C# - Part IV

, 11 Aug 2009
 Rate this:
Please Sign up or sign in to vote.
KMeans Clustering

## Introduction

This article introduces Pattern Recognition in computer vision. I am going to discuss a clustering algorithm called KMeans Clustering. Just a reminder, that Pattern Recognition is largely based on statistical techniques and so one should not be surprised to find, let’s say data mining algorithms, used in computer vision. In fact, there is an implementation of KMeans in SQL Analysis Services.

So what is KMeans? It is an unsupervised learning technique. It is unsupervised because when it starts running, it decides when to stop based on some criteria. And it is a learning technique because it learns about data. The aim of KMeans is to discover clusters of data within data. So if you run KMeans on some data, then you would end with “K” clusters of data. These clusters are formed based on some common attribute of data. We have to specify “K” upfront and give the algorithm an estimate of the centroids of these “K” clusters. Just note that the ability to provide these two parameters is not always possible and this is a drawback of this algorithm – and in this case one needs to use a variation of KMeans instead of this basic version.

## Image Segmentation

So what does KMeans Clustering have to do with images? Well, it can find clusters of different colours for you resulting in image segmentation. The data in image is the pixel colours. We need to provide the algorithm the number of clusters. This is relatively easy as we can guess the number of distinct colours by looking at the image. This does not have to be exact but depending on the number of clusters, we may or may not get good segmentation. In the above image, you can see that I have used 3 clusters which means the image is segmented in 3 distinct colours. Okay now based on the number of clusters, the algorithm is going to group similar colours together till it cannot find any colours to move around the clusters. So let’s have a look at the algorithm next.

## KMeans Algorithm

Here is how KMeans Algorithm learns data clusters:

1. Set number of clusters.
2. Set or randomly set the centroids of these clusters.
3. For all data:
1. Find the distance between centroids of the clusters and the data.
2. Move the data to the cluster that has the least distance to it.
4. Calculate the new centroids of the clusters now that its data may have moved.
5. If the new centroids are different from old centroids, then go to step 3, otherwise stop.

And this is pretty much it! So let’s understand this: we know what steps 1 & 2 mean. For step 2, I get the top X colours from the image and use its values as centroid. In other words, if I said I want 2 clusters, then if top 2 colours (colours with most pixels) from the image are RGB:100,200,130 & 40,70,140, then these values become my starting 2 centroids.

Step 3a is where we would read each pixel in the image and find its distance from the 2 centroids. Now what is that distance? I have used Euclidean Distance but you could use any. But remember that type of distance used will affect the segmentation results. This is because using this distance, we measure how similar the current pixel colour is to the cluster centroid. To emphasis this point, I have used RGB and HSV colour models. Using the same number of clusters for RGB & HSV will produce different segmentation results.

Step 3b moves pixels to closest clusters. So if for instance, pixel 110,215,110 will be closest based on Euclidean Distance to first cluster centroid and pixel 50,80,160 will be closest to the second.

Step 4 is where we find new centroids for the clusters. In the current example, the new centroids are: 100+110/2,200+215/2,130+110/2 and 40+50/2,70+80/2,140+160/2. Yes the new centroids are found using average.

Step 5 is convergence check. If the new centroids are the same as the old centroids, then that means the data within the centroids have not moved and the clusters have stabilized!

## Using the Code

The code is pretty straight forward and should be easy to use:

• Step 1: Load an image – “Load Image” button.
• Step 2: Set number of clusters.
• Step 3: Select a colour model.
• Step 4: Click “Start KMeans” button to run the algorithm.

Change the number of clusters and colour models to see the effects on segmentation.

## Points of Interest

The values of the final colours in segmented image come from the cluster centroids and may not necessarily reflect the colours in the original image. All you will see are “K” different colours in the segmented image.

## History

• 11th August, 2009: Version 1.0

## License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

## About the Author

 Australia
I have been in the IT industry since April 1996. My main expertise is in Microsoft space.

Coming from engineering background, any application of programming to engineering and related fields easily excites me. I like to use OO and design patterns and find them very useful.

I have been an avid reader of CodeProject. I decided it was time to make a commitment to make my contribution to the community - so here I am.

My Website: http://www.puresolutions-online.com

## Comments and Discussions

 First Prev Next
 clustering into 2 colour, ground (terrain) and any other besides ground Member 1128785920-Jun-15 18:00 Member 11287859 20-Jun-15 18:00
 ask?? amin erfandy22-Sep-11 13:52 amin erfandy 22-Sep-11 13:52
 Clustering techniques Trevor Misfeldt7-Oct-09 8:10 Trevor Misfeldt 7-Oct-09 8:10
 I'm curious if you took at look at using non-negative matrix factorization (NMF) clustering? - Trevor -- CenterSpace Software www.centerspace.net
 Re: Clustering techniques Arif_Khan7-Oct-09 13:37 Arif_Khan 7-Oct-09 13:37
 Patent issues 3Drti20-Aug-09 9:22 3Drti 20-Aug-09 9:22
 Re: Patent issues Dmitry N. Medvedev20-Aug-09 11:17 Dmitry N. Medvedev 20-Aug-09 11:17
 Re: Patent issues [modified] Arif_Khan20-Aug-09 13:25 Arif_Khan 20-Aug-09 13:25
 Last Visit: 31-Dec-99 18:00     Last Update: 26-Jun-17 11:20 Refresh 1

General    News    Suggestion    Question    Bug    Answer    Joke    Praise    Rant    Admin

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.

Permalink | Advertise | Privacy | Terms of Use | Mobile
Web01 | 2.8.170624.1 | Last Updated 11 Aug 2009
Article Copyright 2009 by Arif_Khan
Everything else Copyright © CodeProject, 1999-2017
Layout: fixed | fluid