Implementing K-Means Image Segmentation Algorithm

Arthur V. Ratz

5.00/5 (21 votes)

Aug 27, 2017

CPOL

24 min read

48180

3278

How to implement k-means clustering algorithm to perform graphical raster image segmentation

Introduction

In this article, we’ll particularly discuss about the implementation of k-means clustering algorithm to perform raster image segmentation. We’ll demonstrate a raster image segmentation process by developing a code in C# that implements k-means clustering algorithm adaptation to perform an image segmentation.

What’s Image Segmentation

This article demonstrates the development of code in C# that implements one of the most basic variants of the classical k-means clustering algorithm that can be easily used to perform a simple graphical raster image segmentation. The image segmentation basically refers to the process of an image vectorized color quantization in which the color palette of an image is reduced to a certain finite quantity of colors. During the following process, we actually perform the partitioning of the entire image into multiple segments (i.e., super-pixels), making it easier to analyze the following image by using various image processing algorithms. While performing the image segmentation, we’re actually aiming to locate objects within the original image as well as to find their boundaries that define the areas of objects being located. The main idea of performing the following process is to find those areas of pixels that share the same color hue parameter value. Thus, those sets of pixels are grouped into multiple labeled segments. In this case, those multiple segments are constituents of the entire image.

Normally, the image segmentation can be efficiently used for many applications such as either the raster image compression or vectorization:

Image Data Compression: The image segmentation is one of the essential phases of many existing raster image lossy compression algorithms such as BPG, JPEG-2000, S3TC, PDF, DjVu, etc. In this case, the segmentation allows us to significantly increase the compress ratio as the result of performing the partitioning of the entire image into the number of clusters, containing pixels having the same color hue. By performing the segmentation, we actually replace the color hue of each pixel with the color of a centroid pixel selected within a single cluster, and, thus, increase the number of pixels of the same color. This normally results in obtaining an image consisting of much fewer colors compared to the original image being compressed. The lossy compression algorithm basically performs a lookup for the sets of pixels having the same color and finally encodes the less data, which, in turn, results in a better image data compression ratio, and, obviously reduced image file length.
Vectorization: From another respect, the image segmentation can also be successfully applied to the process of raster image vectorization. While performing an image vectorization, the segmentation can be successfully used to find the boundaries of particular fragments that constitute the entire image. That, in turn, makes it possible to easily locate the outlines of the entire image right from the very beginning of vectorization process. Further, those outlines are subsequently vectorized and represented by the number of vector linear equations. At the same time, the segmentation allows to eliminate those redundant and unnecessary image fragments, the data on which might not be assigned to specific data structures, that allows to either slipstream the vectorization process, or to reduce the potential size of the output vectorized image file, making it more lightweight.

Fig.1a and Fig.1b illustrate the results of image segmentation performed as a part of either the image data compression or vectorization respectively:

Specifically, the raster image segmentation can be performed by using a variety of algorithms such as either thresholding, gradient-based, histogram-based, or compression-based, etc. Among those algorithms, k-means clustering actually turns out to be the most obvious and reliable algorithm. Actually, k-means clustering algorithm is one of the most fundamental algorithms of AI machine learning and data mining as well as the regression analysis. In most cases, it allows us to achieve a high-quality result of image segmentation avoiding the actual distortion during the image processing.

As we’ve already discussed, this is typically done by partitioning the number of observations learned from the image data into the corresponding number of clusters with the nearest mean exhibited as a super-pixel, that serves as a color prototype of the entire cluster. In addition, let’s recall that k-means clustering belongs to the class of NP-hard algorithms.

However, there’s the number of existing metaheuristic search algorithms that are very quick to converge as the result of an optimum cluster image decomposition, that, in turn, allows us to efficiently achieve an entire image segmentation, eliminating the case in which the image is segmented partially.

Background

In this section of the following article, we’ll make a large focus on the variant of the k-means clustering algorithm that allows us to perform a raster image segmentation, rather than the other data partitioning tasks, such as either producing recommendations or sum of squared error (SSE) computation. The conventional k-means clustering algorithm was already thoroughly discussed in one of my previous articles published. That’s actually why, in this article, we’ll discuss particularly about the k-means clustering algorithm variation that basically dealt solely with raster image segmentation.

Image Segmentation Algorithm

Basically, the image segmentation algorithm being discussed is very simple and can be formulated as follows:

Create an initial cluster containing an original image and a set of centroid pixels randomly selected from the image. Append the initial cluster built to the array of clusters.
Retrieve the current cluster from the array and iterate through the set of those super-pixels (i.e., centroids).
For each super-pixel, compute the actual distance to each pixel in the current image. To do this, we’ll normally use the variant of Euclidian distance formula that allows us to find the distance between two 3D-vectors of colors (R; G; B) of the either super-pixel or the current pixel in the given image respectively.
Perform a linear search to find those pixels which value of distance to the current super-pixel (i.e., centroid) does not exceed a specific boundary.
Build a new cluster based on the new image containing all those pixels selected at the previous step and the current value of super-pixel. In this case, the super-pixel will serve as a centroid of a newly built cluster. Also, we need to substitute the color of each pixel with the centroid’s color.
Compute the value of the nearest mean of all super-pixels in the current cluster by using the center of mass formula to find the coordinates of a particular central point for the set of super-pixels of the current cluster.
Perform a check if the coordinates of that central point obtained in the previous step are not equal to the coordinates of the super-pixel in the newly built cluster (the central point has not been moved). If not, append the new cluster to the array of clusters.
Perform a check if the current cluster is the final cluster in the array of clusters. If so, go to step 9, otherwise return and proceed with step 2.
Iterate through the array of clusters and merge each particular cluster image into the entire image being segmented.
Save the segmented image to a file.

Initialization

The very first essential step of the k-means image segmentation algorithm is the initialization phase. During this phase, we basically create an initial cluster from the source image and the array of randomly selected pixels. In this case, the process of generating an initial source array of random pixels having particular coordinates is a special interest that we’re about to thoroughly discuss in this section.

Under normal conditions, we might generate a set of distinct pixels which random coordinates (X; Y) are randomly selected during each of W * H loop iterations, where W – the width and H – is the height of the source image respectively. For that purpose, we need to implement a loop and perform the number of iterations. During each iteration, we normally generate the values of X <- RAND[0; W] and Y <- [0;H] separately, and create a point with specific coordinates. Further, we’re performing a check if the following point being generated already exists in the array of pixels.

However, the using of method mentioned above normally leads to a so-called “poor” clustering and causes the segmentation process to be merely unpredicted. That’s actually why, to provide the desired quality of image segmentation, we need to modify the following initial values generation method by applying the specific condition to the process of random pixels selection.

Normally, we have at least two main factors that have a large impact on the process of generating of the initial dataset for the image segmentation algorithm. Apparently, the first factor is the actual distance between points randomly selected. The typically small value of distance causes the increase of probability to select two pixels having the same color. That’s actually why the value of distance parameter must be optimal.

Let’s recall beforehand that in this particular case the actual distance and the color offset are the parameters which values are experimentally taken. For example, the value of actual distance can vary from 100 to approximately the half of the image width or height (e.g. W / 2 or H / 2). As well, to achieve a high-quality image segmentation, we’re might set the value of color offset parameter, suppose, to 50, to make sure that we select pixels of absolutely different colors.

For that purpose, we generate a new random pixel coordinates and perform a check if none of the previously generated points have the distance to the current point less than the value of the specific parameter mentioned above. To do this, we iterate through the array of pixels coordinates previously generate and for each existing pixel, we’re performing a check if its distance to the newly generated point is not greater or equal to the value of specified distance parameters. For that purpose, we normally use a Euclidian distance formula. Similarly, at this point, we’re performing a check if the value of color of the randomly selected pixel exactly meets the specified criteria. If both conditions are met, we append the newly generated point to the initial array of pixels.

During the following phase, since we’ve obtained the initial set of pixels that will serve as centroids of a newly built initial cluster, we also need to associate an original source image with the initial array being generated. As we already might have noticed, in this case, the image is represented as the matrix of pixels, in which each element of this matrix is an actual vector of pixel’s color (R; G; B). The indices of either matrix row X and column Y are actually the coordinates of the following pixel.

K-Means Clustering Phase

To perform the actual image segmentation, we normally create an array containing all the clusters being generated during the initialization phase. Also, we implement a loop to iterate through the following array, and as you might already know from the algorithm’s step-by-step explanation, during each iteration of the following loop, we normally step through the array of super-pixels (i.e., centroids being previously selected for the current cluster). For each super-pixel, we’re performing a linear search to find the set of those image pixels having a particular color distance that corresponds to the boundary condition being applied. Further, we create a new image from the selected pixels for the newly built cluster. The color of each pixel is replaced with the color of centroid super-pixel. The new image in this case will contain the pixels of the same colors that outline a specific area of the original image. Finally, the newly built cluster is added to the array of clusters.

After that, we compute the coordinates of the central point by finding the center of mass for the array of super-pixels in the parent cluster and perform a check if the centroid super-pixel of the newly built cluster has been moved relative to the central point of the parent cluster. If so, we’re adding the newly built cluster to the array of clusters, otherwise just omit it based on the hypothesis that the newly built and the parent clusters are identical.

Normally, we repeat with the process discussed above for each particular cluster in the target array of clusters until there’s no more clusters to be processed.

Finally, to obtain the ready image being segmented, we need to merge images from all those clusters built into an entire image. Obviously, that, an image associated with each one of those clusters will contain an area of the original image having a particular color. The guidelines on how to accumulatively merge all those images are discussed later on in this article.

Computing Euclidian Distance

As we’ve already discussed, the main aspect of using k-mean clustering algorithm to perform an image segmentation, is the Euclidian distance formula. In this case, we basically dealt with two main variants of the following formula for either computing the actual distance between two pixels, or to determine the similarity of two 3D-vectors colors.

The very general representation of the Euclidian distance formula is as follows:

, where L - is the value of Euclidian distance, x, y - the two vectors between which we're about to compute the value of actual distance, n - the number of dimensions (i.e., components of these vectors) (n = 2).

Since, in this case, we basically deal with the either two or three dimensional vectors only, the simplified variant of the following formula is used.

To compute the actual distance between two pixels with specific coordinates, we'll use the following variation of the Euclidian distance formula:

, where L - is the value of Euclidian distance (x1;x2) and (y1;y2) - the values of (X;Y) coordinates in the Cartesian plane of two pixels x and y respectively.

Also, to obtain the magnitude of similarity value between two 3D-vectors of colors (R; G; B), we'll use the same 3D-variant of this formula:

, where S - the value of similarity magnitude between two vectors of colors which components are (r1;g1;b1) and (r2;g2;b2) respectively.

Optionally, to achieve more sophisticated results of image segmentation, to measure the distance between two pixels or similarity of colors, we can also use the Pearson correlation formula or the formula that allows to compute the value of angle between two vectors being selected.

Using Ready Clusters to Build a Segmented Image

The merge procedure is the final essential step in the segmented image creation process. As we've already discussed, as the result of performing k-means clustering, we normally obtain the array of clusters, in which each cluster contains a specific image of a particular segmented region. At this point, our goal is to assemble all those segmented regions into an entire output image.

To do this, we'll use a trivial linear algorithm that deals with the matrices of pixels, copying each valid pixel from the current matrix into the target matrix of pixel, previously allocated. In particular, this algorithm has the following formulation.

According to the following algorithm, we normally need to iterate through the array of clusters and for an image within each cluster, we must retrieve the coordinates of those pixels that belong to the current segmented region. For that purpose, we iterate through the matrix of pixels and for each pixel performing a check if the following pixel has the color that is not equal to (0xFFFFFF) - the "white" color. If so, we're assigning the color data of the current pixel to a corresponding pixel in the target matrix of pixels having the same coordinates location. We normally proceed with the following process until we've stepped through all clusters produced during the k-mean clustering phase. As the result of merging all those images for each particular cluster, we're obtaining a ready segmented image.

Using the Code

In this section, we'll discuss about on how to develop a code in C# that implements the algorithm briefly discussed above. Before we begin, let's focus on specifically the data structures used to store the data on multiple different objects such as either pixels, images or clusters. Obviously that, the KMCPoint<T> template class is the simplest data structure used to store the data on pixel coordinates and its color. Here's an implementation of the class in C#:

    class KMCPoint<t>
    {
        // KMCPoint constructor
        public KMCPoint(T X, T Y, Color Clr) { this.X = X; this.Y = Y; this.Clr = Clr; }
        // X coordinate property
        public T X { get { return m_X; } set { m_X = value; } }
        // Y coordinate property
        public T Y { get { return m_Y; } set { m_Y = value; } }
        // Colorref property
        public Color Clr { get { return m_Color; } set { m_Color = value; } }

        private T m_X; // X coord
        private T m_Y; // Y coord
        private Color m_Color; // Colorref (R;G;B)
    }

As you can see from the code above, the following class contains two field private variables of either m_X and m_Y that are used to hold the data on the either X- or Y - coordinates of a certain pixel. Also, the following class has another private variable m_Color that is actually an object of Color generic class used to store the data on the actual color of pixel in (R;G;B) format. Since all these variables are private, we also additionally implement three corresponding properties X, Y and Clr to access and modify the values of those private variables. Normally, the following class also contains a constructor that accepts three parameters as passed as an argument. The values of these parameters are assigned to the interval private variables via the corresponding public properties when the object of KMCPoint<T> class is constructed.

Before we begin the discussion of code, let's remind that in the following code being developed to instantiate and process bitmaps, we'll use a modified variant of generic Bitmap class that allows to manipulate pixels stored in the previously allocated array rather than using conventional Bitmap.GetPixel(...) or Bitmap.SetPixel(...) methods.

This normally allows to hopefully :) increase the performance of image segmentation process when dealt with images of typically large dimensions.

    class LockedBitmap
    {
        // LockedBitmap class constructors
        public LockedBitmap(string filename)
        {
            // If the m_Bitmap internal object is unitilialized
            if (m_Bitmap == null)
            {
                // Initialize the bitmap object and copy source bitmap into
                // the object being constructed
                m_Bitmap = new Bitmap(filename);
                // Initilaize the rectange area object having size equal to 
                // size of the bitmap
                m_rect = new Rectangle(new Point(0, 0), m_Bitmap.Size);
            }
        }
        public LockedBitmap(Int32 Width, Int32 Height)
        {
            // If the m_Bitmap internal object is unitilialized
            if (m_Bitmap == null)
            {
                // Initialize the bitmap object and copy source bitmap into
                // the object being constructed
                m_Bitmap = new Bitmap(Width, Height);
                // Initilaize the rectange area object having size equal to 
                // size of the bitmap
                m_rect = new Rectangle(new Point(0, 0), m_Bitmap.Size);
            }
        }
        public LockedBitmap(Bitmap bitmap)
        {
            // If the m_Bitmap internal object is unitilialized
            if (m_Bitmap == null)
            {
                // Initialize the bitmap object and copy source bitmap into
                // the object being constructed
                m_Bitmap = new Bitmap(bitmap);
                // Initialize the rectangle area object having size equal to 
                // size of the bitmap
                m_rect = new Rectangle(new Point(0, 0), m_Bitmap.Size);
            }
        }
        
        // Assignment operator overloading for LockedBitmap class object
        public static implicit operator LockedBitmap(Bitmap bitmap)
        {
            // Construct the LockedBitmap class object from the value of 
            // generic Bitmap object
            return new LockedBitmap(bitmap);
        }
        
        // Call this method to lock the bitmap pixels buffer
        public void LockBits()
        {
            // Lock bitmap's bits and obtain the BitmapData object
            m_BitmapInfo = m_Bitmap.LockBits(m_rect, System.Drawing.Imaging.
                ImageLockMode.ReadWrite, m_Bitmap.PixelFormat);

            // Obtain the value of pointer of the first line of pixels
            m_BitmapPtr = m_BitmapInfo.Scan0;
            // Allocating the array of pixels having size equal to 
            // BitmapInfo.Stride x Bitmap.Height
            m_Pixels = new byte[Math.Abs(m_BitmapInfo.Stride) * m_Bitmap.Height];
            // Use marshal copy routines to copy the entire buffer of pixels
            // to the previously allocated array m_Pixels
            System.Runtime.InteropServices.Marshal.Copy(m_BitmapPtr, m_Pixels,
                0, Math.Abs(m_BitmapInfo.Stride) * m_Bitmap.Height);
        }
        
        // Call this method to unlock and reflect changes to the pixels buffer
        public void UnlockBits()
        {
            // Obtain the value of pointer of the first line of pixels
            m_BitmapPtr = m_BitmapInfo.Scan0;
            // Use marshal copy routines to copy all elements 
            // in the array m_Pixels back to the pixels buffer
            System.Runtime.InteropServices.Marshal.Copy(m_Pixels, 0,
                m_BitmapPtr, Math.Abs(m_BitmapInfo.Stride) * m_Bitmap.Height);

            // Unlock the bits.
            m_Bitmap.UnlockBits(m_BitmapInfo);
        }
        
        // Call this method to retrieve the color reference of pixel 
        // with coordinates (Row;Col)
        public Color GetPixel(Int32 Row, Int32 Col)
        {
            // Obtain the value of color depth
            Int32 Channel = System.Drawing.Bitmap.GetPixelFormatSize
                            (m_BitmapInfo.PixelFormat);
            // Compute the value of index of pixel in the m_Pixels buffer
            Int32 Pixel = (Row + Col * m_Bitmap.Width) * (Channel / 8);

            // Declaring three variables to be assigned the values of colors (R;G;B)
            Int32 Red = 0, Green = 0, Blue = 0, Alpha = 0;

            // Check if the bitmap is 32-bit color depth
            if (Channel == 32)
            {
                // Assign the values of specific elements in the 
                // array of pixels to corresponding variables
                Blue = m_Pixels[Pixel];
                Green = m_Pixels[Pixel + 1];
                Red = m_Pixels[Pixel + 2];
                Alpha = m_Pixels[Pixel + 3];
            }

            // Check if the bitmap is 24-bit color depth
            else if (Channel == 24)
            {
                // Assign the values of specific elements in the 
                // array of pixels to corresponding variables
                Blue = m_Pixels[Pixel];
                Green = m_Pixels[Pixel + 1];
                Red = m_Pixels[Pixel + 2];
            }

            // Check if the bitmap is 16-bit color depth
            else if (Channel == 16)
            {
                // Assign the values of specific elements in the 
                // array of pixels to corresponding variables
                Blue = m_Pixels[Pixel];
                Green = m_Pixels[Pixel + 1];
            }

            // Check if the bitmap is 8-bit color depth
            else if (Channel == 8)
            {
                // Assign the value of specific element in the 
                // array of pixels to corresponding variables
                Blue = m_Pixels[Pixel];
            }

            // Construct the color reference object from the value of variables
            // assigned the specific values of colors
            return (Channel != 8) ? 
                Color.FromArgb(Red, Green, Blue) : Color.FromArgb(Blue, Blue, Blue);
        }
        
        // Call this method to set the color of pixel Clr with coordinates (Row;Col)
        public void SetPixel(Int32 Row, Int32 Col, Color Clr)
        {
            // Obtain the value of color depth
            Int32 Channel = 
                System.Drawing.Bitmap.GetPixelFormatSize(m_BitmapInfo.PixelFormat);
            // Compute the value of index of pixel in the m_Pixels buffer
            Int32 Pixel = (Row + Col * m_Bitmap.Width) * (Channel / 8);

            // Check if the bitmap is 32-bit color depth
            if (Channel == 32)
            {
                // If so, assign the value of each color (R;G;B;A) to the specific
                // elements in the array of pixels
                m_Pixels[Pixel] = Clr.B;
                m_Pixels[Pixel + 1] = Clr.G;
                m_Pixels[Pixel + 2] = Clr.R;
                m_Pixels[Pixel + 3] = Clr.A;
            }

            // Check if the bitmap is 24-bit color depth
            else if (Channel == 24)
            {
                // If so, assign the value of each color (R;G;B) to the specific
                // elements in the array of pixels
                m_Pixels[Pixel] = Clr.B;
                m_Pixels[Pixel + 1] = Clr.G;
                m_Pixels[Pixel + 2] = Clr.R;
            }

            // Check if the bitmap is 16-bit color depth
            else if (Channel == 16)
            {
                // If so, assign the value of each color (B;G) to the specific
                // elements in the array of pixels
                m_Pixels[Pixel] = Clr.B;
                m_Pixels[Pixel + 1] = Clr.G;
            }

            // Check if the bitmap is 8-bit color depth
            else if (Channel == 8)
            {
                // If so, assign the value of each color (B) to the specific
                // element in the array of pixels
                m_Pixels[Pixel] = Clr.B;
            }
        }

        // Use this property to retrieve the width and height of an image locked
        public Int32 Width { get { return m_Bitmap.Width; } }
        public Int32 Height { get { return m_Bitmap.Height; } }

        // Use this method to save the bitmap to file
        public void Save(string filename)
        {
            // Calling Save(filename) method to save the bitmap to a file
            m_Bitmap.Save(filename);
        }

        // Declaring Bitmap class object
        public Bitmap m_Bitmap = null;

        // Declaring Rectangle class Object
        private Rectangle m_rect;
        // Declaring point to bitmap pixels buffer
        private IntPtr m_BitmapPtr;
        // Declaring array of pixels;
        private byte[] m_Pixels = null;
        // Declaring BitmaData object
        private BitmapData m_BitmapInfo = null;
    }

When the LockedBitmap class object is initialized, the appropriate constructor is invoked. In this code, we use multiple constructor declarations to either load bitmap from a file, create an empty bitmap of specified width and height, copy one bitmap object to another one using shallowed copy mechanism. When one of those three constructors are invoked, we're performing a check if the interval generic Bitmap class object is not initialized so far, if so, we're initializing that object using new operator. Also, at this point, we're initializing generic Rectangle class object to manipulate a rectangular area of a bitmap having a specific size. After that, we need to call LockBits()/UnlockBits() methods prior to performing various operations on the bitmap's matrix of pixels by invoking GetPixel(...)/SetPixel(...) methods. By calling these two methods, we actually lock or unlock the internal memory buffer that hold the matrix of bitmap's pixels. To do this, we invoke generic Bitmap.LockBits(...) method that causes the internal buffer to be locked. After that, we're obtaining the pointer to the first line of pixels in the following internal buffer. Finally, we use the generic marshalling copy method to actually move the elements of the internal buffer to the array of pixels previously allocated. Similarly, we're performing an unlock by calling the UnlockBits() method, during the execution of which we're obtaining the pointer to the first line of the bitmap's internal buffer and invoking the same marshalling copy method to move the array of pixels back to the bitmap's internal buffer.

Those methods are also re-used in LockedBitmap class object. Actually, during the following methods execution, we normally compute the absolute value of index of a specific pixel with coordinates (Row; Col) in the interval array of pixels. Since we've obtained the actual value of index, we either retrieve or set the color reference value of a given pixel by calling those methods mentioned above.

Another class we're about to discuss is the KMCFrame which is used to store the data on each cluster created during k-means clustering procedure. Let's remind that a "cluster" is actually an object having such parameters as an image object containing currently segmented area of the original image, the array of centroid super-pixels, central super-pixel computed as a center of mass for all centroids:

    class KMCFrame
    {
        // KMCFrame Constructor
        public KMCFrame(LockedBitmap Frame, List<KMCPoint<int>> Centroids, 
                        KMCPoint<int> Center)
        {
            this.Frame = Frame;
            this.m_Centroids = Centroids; this.Center = Center;
        }

        // Bitmap Frame Property
        public LockedBitmap Frame
        {
            get { return m_Frame; }
            set { m_Frame = value;  }
        }
        // Centroids List Property
        public List<KMCPoint<int>> Centroids
        {
            get { return m_Centroids; }
            set { m_Centroids = value; }
        }
        // Central Super-Pixel Property
        public KMCPoint<int32> Center
        {
            get { return m_Center; }
            set { m_Center = value; }
        }
        // Bitmap Frame Object
        private LockedBitmap m_Frame = null;
        // Central Super-Pixel Point Object
        private KMCPoint<double><int32> m_Center;
        // Array of Super-Pixel Objects (i.e. Centroids)
        private List<KMCPoint<int>><kmcpoint<int> m_Centroids = null;
    }

Obviously that, the following class contains three private member variables such as m_Frame which is an object of class Bitmap that actually is used to store the data on the image associated with the following cluster, m_Centroids - an object of List<KMCPoint<int>> class used to store an array of centroid super-pixels for the current cluster, m_Center - the central super-pixel, computed as a center of mass for all existing in m_Centroids array super-pixels. Also, the three corresponding properties are declared in this class and used to access and modify those private fields. The constructor of this class normally accepts three parameters whose values are assigned to specific private variables accessed by their properties during the KMCFrame class instantiation.

Another data structure we need to implement is a KMCClusters class and derive it from IEnumerable<KMCFrame> template. This class is used as the array of clusters. Besides the encapsulation of the List<KMCFrame> class object, the following class also contains the number of public methods we're about to discuss right now.

In particular, the following class contains void Init(string Filename, Int32 Distance, Int32 Offset) method that will further be used to initialize the empty array of clusters. The following code in C# demonstrates the implementation of the following method:

        public void Init(string Filename, Int32 Distance, Int32 Offset)
        {
            // Declare a bitmap object to load and 
            // use the original image to be segmented
            LockedBitmap FrameBuffer = new LockedBitmap(Filename);

            // Initialize the array of super-pixels 
            // by creating List<KMCPoint<int>> class object
            List<KMCPoint<int>> Centroids = new List<KMCPoint<int>>();
            // Generate an initial array of super-pixels of the original source image
            // stored in the FrameBuffer bitmap object
            this.Generate(ref Centroids, FrameBuffer, Distance, Offset);

            // Compute the value of the centeral super-pixel coordinates and assign it
            // to the Mean local variable
            KMCPoint<int> Mean = this.GetMean(FrameBuffer, Centroids);

            // Append an initial cluster being initialized to the array of clusters
            m_Clusters.Add(new KMCFrame(FrameBuffer, Centroids, Mean));
        }

The following method accepts values of three parameters passed as an argument including the filename of original source image, the values of distance and color offset parameters. When the following method is invoked, first, it constructs a bitmap object FrameBuffer and pass the value of first argument Filename to its constructor. This object, when it's created, loads the original image from file and assigns the data on each pixel of the original image to the specific interval private data structures such as a matrix of pixels. After the original image is loaded, the array of centroid super-pixels is instantiated by constructing an object of List<KMCPoint<int>> template class and the specific method Generate(...) is invoked to randomly generate a set of super-pixels for the initial cluster being created.

After we've initialized the specific data structures to hold the data on the image and array of super-pixel of the initial cluster, now we perform a computation of the central super-pixel coordinates and assign it to one of the parameters of KMCFrame class object being created. After that, we invoke method m_Clusters.Add(new KMCFrame(FrameBuffer, Centroids, Mean)); to append the initial cluster data to the array of clusters.

The method Generate(...) is a special interest in this code since it implements the algorithm that allows to randomly create an array of super-pixels for the initial cluster:

        public void Generate(ref List<kmcpoint<int>> Centroids, 
               LockedBitmap ImageFrame, Int32 Distance, Int32 Offset)
        {
            // Compute the number of iterations performed by the main loop 
            // equal to image W * H
            // The following value is the maximum possible number of random 
            // super-pixel being generated
            Int32 Size = ImageFrame.Width * ImageFrame.Height;
            ImageFrame.LockBits();
            // Performing Size - iterations of the following loop 
            // to generate a specific amount of super-pixels
            for (Int32 IterCount = 0; IterCount < Size; IterCount++)
            {
                // Obtain a random value of X - coordinate of the current super-pixel
                Int32 Rand_X = rand.Next(0, ImageFrame.Width);
                // Obtain a random value of Y - coordinate of the current super-pixel
                Int32 Rand_Y = rand.Next(0, ImageFrame.Height);

                // Create and instantiate a point object by using the values of
                // Rand_X, Rand_Y and Colorref parameters. The value of colorref is
                // retrieved by using the GetPixel method for the current bitmap object
                KMCPoint<int> RandPoint = new KMCPoint<int>(Rand_X,
                              Rand_Y, ImageFrame.GetPixel(Rand_X, Rand_Y));

                // Performing a validity check if none of those super-pixel previously
                // selected don't exceed the distance and color offset boundary to the 
                // currently generated super-pixel with coordinates Rand_X and Rand_Y 
                // and specific color stored as a parameter value of Clr variable
                if (!this.IsValidColor(Centroids, RandPoint, Offset) &&
                    !this.IsValidDistance(Centroids, RandPoint, Distance))
                {
                     // If not, check if the super-pixel with the following 
                     // coordinates and color already exists in the array of centroid
                     // super-pixels being generated.
                     if (!Centroids.Contains(RandPoint))
                     {
                         // If not, append the object RandPoint to the array 
                         // of super-pixel objects
                         Centroids.Add(RandPoint);
                     }
                }
            }

            ImageFrame.UnlockBits();
        }

During the following method execution, we initially compute the total number of iterations that exactly correspond to the maximum possible amount of random super-pixels, and perform the series of iterations which number equals the total number iterations previously obtained. During each particular iteration, we generate two random values in the interval from [0; ImageFrame.Width] and [0; ImageFrame.Height] respectively, to obtain the coordinates of the current super-pixel. After that, we construct the KMCPoint class object with previously obtained values passed as the arguments to the following object's constructor. Also, since the coordinates of a new super-pixel were obtained, we additionally obtain the color data from the matrix of image pixels by invoking ImageFrame.GetPixel(Rand_X, Rand_Y) method.

Before appending the super-pixel object being constructed, we're performing a check if the super-pixel coordinates and color being generated exactly meets the conditions discussed above. For that purpose, we use two methods of either this.IsValidDistance(Centroids, RandPoint, Distance) or this.IsValidColor(Centroids, RandPoint, Offset) to perform the super-pixel validity verification. The following fragments of code in C# implement those two methods mentioned above:

        private bool IsValidDistance(List<kmcpoint<int>> Points, 
                     KMCPoint<int> Target, Int32 Distance)
        {
            Int32 Index = -1; bool Exists = false;
            // Iterate through the array of super-pixels until we've found 
            // the super-pixel which distance to the target super-pixel 
            // is less than or equals to the specified boundary
            while (++Index < Points.Count() && !Exists)
                // For each super-pixel from the array, we compute the value of 
                // distance and perform a check 
                // if the following value is less than or equals to 
                // the value of specific boundary parameter.
                Exists = ((Math.Abs(Target.X - Points.ElementAt(Index).X) <= Distance) ||
                          (Math.Abs(Target.Y - Points.ElementAt(Index).Y) <= Distance)) ? 
                           true : false;

            return Exists;
        }
        private bool IsValidColor(List<kmcpoint<int>> Points, KMCPoint<int> Target, 
                                  Int32 Offset)
        {
            Int32 Index = -1; bool Exists = false;
            // Iterate through the array of super-pixels until we've found 
            // the super-pixel which color offset to the target super-pixel 
            // is less than or equals to the specified boundary
            while (++Index < Points.Count() && !Exists)
                // For each super-pixel from the array, we compute the value of 
                // color offset and perform a check 
                // if the following value is less than or equals to 
                // the value of specific boundary parameter.
                Exists = (Math.Sqrt(Math.Pow(Math.Abs(Points[Index].Clr.R - 
                                    Target.Clr.R), 2) +
                                    Math.Pow(Math.Abs(Points[Index].Clr.G - 
                                    Target.Clr.G), 2) +
                                    Math.Pow(Math.Abs(Points[Index].Clr.B - 
                                    Target.Clr.B), 2))) <= Offset ? true : false;

            return Exists;
        }

By executing these methods, we normally iterate throught the array of super-pixels being generated and for each particular item, compute either distance or color offset to the new super-pixel which coordinates have been previously generated during Generate method execution. Also, we're performing a check if either distance or color offset to the new super-pixel does not exceed the specified boundary. If so, we're breaking the loop execution and returning the specific result value. Particularly, if the array of super-pixels contains at least one super-pixel which distance or color offset relative to the new super-pixel is less than or equal to the value of specific boundary parameter, this method returns false. It means that the new super-pixel is not a valid and cannot be added to the array of super-pixels. To compute the actual distance and color offset, we typically use the same Euclidian distance formula discussed in the background section of this article.

Additionally, in this class, we're implementing one more important method KMCPoint<int> GetMean(Bitmap FrameBuffer, List<KMCPoint<int>> Centroids) used to compute the central super-pixel coordinates:

        public KMCPoint<int> GetMean(LockedBitmap FrameBuffer, 
                                     List<KMCPoint<int>> Centroids)
        {
            // Declaring two variables to assign the value of the "mean" of 
            // the sets of coordinates (X;Y) of each super-pixel
            double Mean_X = 0, Mean_Y = 0;
            // Iterating through the array of super-pixels and for each
            // super-pixel retrieve its X and Y coordinates and divide it
            // by the overall amount of super-pixels. After that, sum up
            // each value with the values of Mean_X and Mean_Y variables
            for (Int32 Index = 0; Index < Centroids.Count(); Index++)
            {
                Mean_X += Centroids[Index].X / (double)Centroids.Count();
                Mean_Y += Centroids[Index].Y / (double)Centroids.Count();
            }

            // Convert the values of Mean_X and Mean_Y to Int32 datatype
            Int32 X = Convert.ToInt32(Mean_X);
            Int32 Y = Convert.ToInt32(Mean_Y);

            FrameBuffer.LockBits();
            Color Clr = FrameBuffer.GetPixel(X, Y);
            FrameBuffer.UnlockBits();

            // Constructing KMCPoint<int> object and return its value
            return new KMCPoint<int>(X, Y, Clr);
        }

During the following method execution, we normally iterate through the array of super-pixels previously generated and for each super-pixel, retrieve the either X- or Y - coordinate and divide it by the overall number of super-pixels in the array. Further, we accumulate each value obtained into the Mean_X and Mean_Y sum variable. As you've might noticed the "mean" value of either X - or Y - coordinate is computed separately. After that, those values of Mean_X and Mean_Y are converted to Int32 datatype. By using these values as well as the value of color of the super-pixel retrieved from the original image, we finally construct the KMCPoint<int> class object and pass those values to its constructors arguments.

The most fundamental class we're about discuss at this point is the ImageSegmentation class implementation. In the following class, we normally declare the method public void Compute(string InputFile, string OutputFile) which is the main of the image segmentation engine that performs the actual k-mean clustering. Also, besides this method, we'll implement two more simple methods that perform the Euclidian distance computation, which we'll discuss more about later.

Here's the implementation of the mentioned above Compute method:

        public void Compute(string InputFile, string OutputFile)
        {
            // Initialize the code execution timer
            var watch = System.Diagnostics.Stopwatch.StartNew();

            // Initialize the directory info reference object
            DirectoryInfo dir_info = new DirectoryInfo("Clusters");
            // Check if the directory with name "Clusters" is created.
            // If not, create the directory with name "Clusters"
            if (dir_info.Exists == false) dir_info.Create();

            // Initialize the array of clusters by generating an initial cluster
            // containing the original source image associated 
            // with the array of super-pixels
            m_Clusters.Init(InputFile, m_Distance, m_OffsetClr);

            // Initialize the bitmap object used to store the resulting segmented image
            LockedBitmap ResultBitmap = 
            new LockedBitmap(m_Clusters[0].Frame.Width, m_Clusters[0].Frame.Height);

            Int32 FrameIndex = 0;
            // Iterate throught the array of clusters until we've processed 
            // all clusters being generated
            for (Int32 Index = 0; Index < m_Clusters.Count(); Index++)
            {
                // For each particular cluster from the array, 
                // obtain the values of bitmap object and
                // the List<KMCPoint<int>> object which is the array of 
                // centroid super-pixels
                List<KMCPoint<int>> Centroids = m_Clusters[Index].Centroids.ToList();
                LockedBitmap FrameBuffer = 
                      new LockedBitmap(m_Clusters[Index].Frame.m_Bitmap);

                // Save the image containing the segmented area associated with 
                // the current cluster to the specific file, 
                // which name has the following format, 
                // for example "D:\Clusters\Cluster_N.jpg"
                FrameBuffer.Save("Clusters\\Cluster_" + FrameIndex + ".jpg");

                FrameBuffer.LockBits();

                // Iterating through the array of centroid super pixels 
                // and for each super-pixel, perform a linear search 
                // to find all those pixels in the current image whose distance
                // does not exceed the value of specific boundary parameter.
                for (Int32 Cnt = 0; Cnt < Centroids.Count(); Cnt++)
                {
                    // Obtain the value of Width and Height of the image 
                    // for the current cluster
                    Int32 Width = FrameBuffer.Width;
                    Int32 Height = FrameBuffer.Height;

                    // Create a bitmap object to store an image for 
                    // the newly built cluster
                    LockedBitmap TargetFrame = new LockedBitmap
                                      (FrameBuffer.Width, FrameBuffer.Height);

                    TargetFrame.LockBits();

                    // Iterate through each element of the matrix of pixels 
                    // for the current image
                    for (Int32 Row = 0; Row < FrameBuffer.Width; Row++)
                    {
                         for (Int32 Col = 0; Col < Height; Col++)
                         {
                              // For each pixel in this matrix, 
                              // compute the value of color offset of 
                              // the current centroid super-pixel
                              double OffsetClr = GetEuclClr(new KMCPoint<int>
                                     (Row, Col, FrameBuffer.GetPixel(Row, Col)),
                                      new KMCPoint<int>(Centroids[Cnt].X, 
                                      Centroids[Cnt].Y, Centroids[Cnt].Clr));

                              //Perform a check if the color offset value 
                              //does not exceed the value of boundary parameter
                              if (OffsetClr <= 50)
                              {
                                  // Copy the current pixel to the target image 
                                  // for the newly created cluster
                                  TargetFrame.SetPixel(Row, Col, Centroids[Cnt].Clr);
                              }

                              // Otherwise, set the color of the current pixel 
                              // to "white" (R;G;B) => (255;255;255)
                              // in the target bitmap for the newly built cluster
                              else TargetFrame.SetPixel
                                   (Row, Col, Color.FromArgb(255, 255, 255));
                         }
                    }

                    TargetFrame.UnlockBits();

                    // Create an array of centroid super-pixels and append 
                    // it the value of current centroid super-pixel retrieved
                    List<KMCPoint<int>> TargetCnts = new List<KMCPoint<int>>();
                    TargetCnts.Add(Centroids[0]);

                    // Compute the "mean" value for the newly created cluster
                    KMCPoint<int> Mean = m_Clusters.GetMean(TargetFrame, TargetCnts);

                    // Perform a check if the "mean" point coordinates 
                    // of the newly created cluster are
                    // not equal to the coordinates of the current 
                    // centroid super-pixel (e.g. the centroid
                    // super-pixel has been moved). If so, 
                    // append a newly built cluster to the array of clusters
                    if (Mean.X != m_Clusters[Index].Center.X && 
                        Mean.Y != m_Clusters[Index].Center.Y)
                         m_Clusters.Add(TargetFrame, TargetCnts, Mean);

                     FrameIndex++;
                }

                FrameBuffer.UnlockBits();
            }

            ResultBitmap.LockBits();

            // Iterate through the array of clusters previously obtained
            for (Int32 Index = 0; Index < m_Clusters.Count(); Index++)
            {
                 // For each cluster, retrieve a specific image 
                 // containing the segmented area
                 LockedBitmap FrameOut = new LockedBitmap
                                         (m_Clusters[Index].Frame.m_Bitmap);

                 FrameOut.LockBits();

                 FrameOut.Save("temp_" + Index + ".jpg");

                 // Obtain the dimensions of that image
                 int Width = FrameOut.Width, Height = FrameOut.Height;
                 // Iterate through the matrix of pixels 
                 // for the current image and for each
                 // pixel, perform a check if its color is not equal to 
                 // "white" (R;G;B) => (255;255;255).
                 // If not, copy the pixel data to the target matrix of pixels 
                 // for the resulting segmented image
                 for (Int32 Row = 0; Row < Width; Row++)
                 {
                      for (Int32 Col = 0; Col < Height; Col++)
                      {
                           if (FrameOut.GetPixel(Row, Col) != 
                                        Color.FromArgb(255, 255, 255))
                           {
                               ResultBitmap.SetPixel(Row, Col, 
                                     FrameOut.GetPixel(Row, Col));
                           }
                      }
                 }

                 FrameOut.UnlockBits();
            }

            ResultBitmap.UnlockBits();

            // Save the segmented image to file with name 
            // which is the value of OutputFile variable
            ResultBitmap.Save(OutputFile);

            watch.Stop(); // Stop the execution timer
            // Obtain the value of executing time in milliseconds
            var elapsedMs = watch.ElapsedMilliseconds;

            // Create timespan from the elapsed milliseconds value
            TimeSpan ts = TimeSpan.FromMilliseconds(elapsedMs);
            // Print the message "Done" and the formatted execution time
            Console.WriteLine("***Done***\n" + ts.ToString(@"hh\:mm\:ss"));
        }

While performing k-means clustering computations, we actually need to initialize the array of clusters by generating and appending an intial cluster to the following array prior to executing the iterations of the main loop. To do this, we invoke the m_Clusters.Init(...) method, thoroughly discussed above. After that, we enter the main process loop. During each iteration of the following main loop, we retrieve the data on the current image and array of centroid super-pixels assigned to the specific data structures for the current cluster object. After we've obtained the array of centroid super-pixels, we step through the elements of this array and for each super-pixel, we're aiming to find the subset of pixels in the current image which value of color offset does not exceed the specific boundary value (e.g., valid pixels). Since, we've obtained such subset, we build a new cluster based on performing a copy of those valid pixels into a new image and assigning the subset of new centroid super-pixels to the specific array for the newly built cluster. After that, we're performing a check if the "mean" super-pixel coordinates are not equal to the coordinates of the current centroid super-pixel in the parent cluster being taken. If not, we're constructing a new cluster object and append it to the array of clusters. At this point, we normally proceed with the following process for each particular centroid super-pixel of the current cluster retrieved from the array. Notice, that only the intial cluster might contain more than one centroid super-pixel, the other clusters are limited to contain just one super-pixel.

After we've completed the k-means clustering procedure, we need to actually "gather" the images for each cluster into a segmented resulting image. For that purpose, we're iterating through the array of clusters and retrieve a matrix of pixels of each particular image. Then, we iterate through the following matrix and perform a check if the color of the current pixel with specific coordinates is not equal to "white" 0xFFFFFF. If not, we're copying the following current pixel to the matrix of pixels for the resulting image.

The final aspect we're about to discuss in this section is the methods that perform a computation of the Euclidian distance between two pixels or two color hues:

        public double GetEuclD(KMCPoint<int> Point1, KMCPoint<int> Point2)
        {
            // Compute the Euclidian distance between two pixel in the 2D-space
            return Math.Sqrt(Math.Pow(Point1.X - Point2.X, 2) + 
                             Math.Pow(Point1.Y - Point2.Y, 2));
        }
        public double GetEuclClr(KMCPoint<int> Point1, KMCPoint<int> Point2)
        {
            // Compute the Euclidian distance between two colors in the 3D-space
            return Math.Sqrt(Math.Pow(Math.Abs(Point1.Clr.R - Point2.Clr.R), 2) +
                             Math.Pow(Math.Abs(Point1.Clr.G - Point2.Clr.G), 2) +
                             Math.Pow(Math.Abs(Point1.Clr.B - Point2.Clr.B), 2));
        }

The following methods are very simple and normally return the evaluated value of the Euclidian distance formula expression discussed in the background theory section of this article.

Finally, the code in C# listed below illustrates the programs main function along with the ImageSegmentation class object creation and initialization:

    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine("Image Segmentation Utility v.1.0a 
                               by Arthur V. Ratz, CPOL @ 2017\n");

            Console.Write("Input file name: ");
            string InpFile = Console.ReadLine();

            Console.Write("Output file name: ");
            string OutFile = Console.ReadLine();

            ImageSegmentation ImageSeg = new ImageSegmentation();
            ImageSeg.Compute(InpFile, OutFile);

            Console.ReadKey();
        }
    }

Points of Interest

The algorithm discussed in this article has many interesting variations based on using different techniques and approaches of machine learning and data mining including the using of Pearson correlation instead of Euclidian distance formula, implementing k-means++ initialization algorithm to improve the quality of clustering, using min-max selection approach to find all those pixels that have the closest distance to the specific super-pixel.

History

27^th August, 2017 - First revision of article published
29^th August, 2017 - Final revision of article published