Kohonen's Self Organizing Maps in C++ with Application in Computer Vision Area

Chesnokov Yuriy

4.88/5 (33 votes)

Nov 21, 2007

GPL3

18 min read

187580

7109

The article demonstrates the self organizing maps clustering approach for unsupervised AI classification tasks with application examples in computer vision area for faces clustering and recognition

Introduction
Background
Using the Code
The Library
SOM Application for Natural Images and Faces Clustering
Adaptive Image Segmentation

Introduction

The AI methods are classified into supervised and unsupervised categories. The first one is used for training the classifiers to distinguish the data when you know the classes present in your training sets and the latter is used when you are presented with unknown data and want to discover its structure. That is, you'd like to obtain the lower dimensional mapping 2D or 3D where you may easily find out how your initial high dimensional data groups together. For example, in the proverbial IRIS dataset, there are 3 categories of 4 dimensional objects and suppose you do not know in advance they are 3. If the dimensionality of the data was two or three you would have plotted them as a scatter plots and identified how many classes are there. But what if the dimensionality is 1000? You may consider self organizing maps (SOMs) as a kind of dimensionality reduction technique too. It groups similar samples from the unknown dataset on its grid together, that is, it adjusts its weight coefficients to be closer to the data samples you present it with. So it learns effectively the clusters present in the data. For IRIS set some SOM nodes would be similar to setosa, some to virgi, others to versi, and similar entries will be grouped together on the SOM's grid. If you compared the distances between adjacent SOM nodes weights, the similar ones would have closer distance between its weights and the nodes from different classes would have more distance. Having plotted the distances you will obtain the boundaries of your clusters and then just classify your unknown data samples to the discovered categories.

In this C++ lib, I introduced normalization of the data. You need to normalize it if some features have significantly increased numerical range (e.g. human height and weight. The first feature ranges about 1.50 to 2.20 meters and the second 50 - 150 kg. The latter prevails and your clustering results will be biased towards the weight). Apart from Euclidian distance I use sum of squared distances (SOSD), taxicab and angle between the vectors. I left unfinished Mahalanobis distance, but you may easily add the code. Just include inverse of covariance matrix for the training data you used. The data file formats that are supported in this code are described in my Backpropagation Artificial Neural Network in C++ article.

Another feature of the lib is that you may create SOM of any dimension. Typically it is 2D or 3D map, but with my code you may choose any number of dimensions for your map.

Background

You may learn about the SOM technique and the applications at the sites I used when I studied the topic: Kohonen's Self Organizing Feature Maps, Self-Organizing Nets, and Self Organizing Map AI for Pictures. Read them first before you move forward in my article.

Using the Code

The lib implementation is produced in console application for easier usage. The help line to console is as follows:

 1 t-train
 2 net.som
 3 class.txt
 4 epochs
  5 [test set]
 6 [slow] [slow=0]fast=1 training
 7 [dist] distance [0-euclidian],1-sosd,2-taxicab,3-angle
 8 [norm] normalize input data [0-none],1-minmax,2-zscore,3-softmax,4-energy

 1 r-run
 2 net.som
 3 class.txt

I've attached train.bat in the demo download so you may start clustering right now. For a simple example I used red, green and blue colors as 3 dimensional vectors present in rgbs.txt and rgbs1.txt files. The command line to train the SOM network is:

>som.exe t rgb.som rgbs.txt 500

Train the rgb.som network on rgbs.txt data for 500 epochs. The 6 to 8 parameters are optional and you may omit them. The 5th parameter is used only if you are going to use your network for classification of the 2 different classes. You may supply in it the test data to get performance accuracy immediately after training or as an empty file. In the latter case the random half from the training set will be used for testing. Otherwise omit the 5th parameter and use 6 - 8 in the same order immediately after the 4th.

>som.exe t rgb.som rgbs.txt 500  1 2 2

This is the same as the previous example but uses fast training mode (see below for description), taxicab as a distance metric and zscore data normalization.

After training 1 or 2 files will be saved depending on whether you cluster your data with unknown or known class membership. The distance map for adjacent SOM nodes will be saved to rgb_map file. If you do not know the class membership of your samples in advance, use the following entries without the class mark after the vector name:

name1
x1 x2 x3 ... xN
name2
x1 x2 x3 ... xN
...

Use the distance map file to discover the boundaries of the clusters. It will contain the same sized 2D vector as the SOM grid. The highest values correspond to the ridges separating different clusters. In that case only rgb_2.som file will be saved with zero marks for SOM node's class membership in the 2D grid. You've already defined your clusters from the distance map file. Now replace zeros in the SOM grid in rgb_2.som file for desired classes 1, 2, 3 ... N. To get more explanation on how to do it I will expound the next case if you know the class membership of your data and want to produce a multiclass classifier.

The rgbs.txt file contain 3 classes of red, green and blue colors (3D vectors). After training, 2 SOM files will be created rgb_1.som and rgb_2.som. The first one uses a voting scheme (see below for description) to attribute the SOM node's class membership from the training data and the latter uses a direct scheme for each SOM node to attribute it the class from the training set. If you open them in the text editor you will find a 2D map of 50 by 50 entries with numbers 1, 2 and 3 grouped together. This is the clustering of your data. Similar vectors are grouped together. Now plot rgb_map in Matlab with imshow(rgb_map, []) and use colormap('jet').

You see the 3 separated blue regions which are the 3 clusters found in your data. Your 3 classes in the 2D map of 50 by 50 entries (rgb_1.som and rgb_2.som) match those regions perfectly. Now if you had only rgb_2.som file with unknown classes (0 in the 50 by 50 2D map), you would have replaced the zeros as I described above with 3 different classes you found in rgb_map plotting and get your classifier.

The nodes of the SOM contain 3 dimensional weights vectors which are adjusted to resemble the training data. Having the advantage that we used RGB color data, we may just arrange 50x50x3 RGB color data and plot it.

Now you may see that the weights of each node are similar to the data from rgbs.txt file.

In the fast mode training I do not alter weights of the nodes which exceed the bounds of radius from the best matched unit (BMU) node, whereas in the previous example I modify the weights of all the nodes with exponentially decreased learning rule. The fast mode training however provides sharper boundaries:

And the 50x50x3 RGB color data is shown below:

Now you may choose random RGB color entries and try to cluster them. In the next picture, the 50x50x3 SOM map is shown obtained with clustering 2000 random RGB vectors:

Similar colors are clustered together.

The Library

Next I provide a more detailed description of the classes in the project.

The C++ SOM lib contains 2 classes:

Node
SOM

The Node is the unit which constitutes the SOM map. It is initialized upon construction with specific weights vector of length weights_number, its coordinates in the N dimensional SOM map coords vector with coords_number coordinates (x, y, z, ... ) and optional class membership class_:

Node::Node(const float *weights, int weights_number, 
    const float *coords, int coords_number, int class_ = 0);

For 2D SOM map each node has two coordinates (x,y) for 3D map (x,y,z) and so on, but you will not be able to visualize 4 and more dimensional SOMs unless you use them as classifiers only. The weights_number is equal to the dimensionality of the data you use to cluster or classify. For RGB color triplet the data vector is 3 dimensional.

To determine the distance from the node's weight vector to specific data vector from the dataset, the get_distance() function is defined:

inline float Node::get_distance(const float *vec, 
    enum DistanceMetric distance_metric, const float **cov = 0) const;

The distance_metric is one of the metrics used to compute the distance: enum DistanceMetric {EUCL, SOSD, TXCB, ANGL, MHLN}; Euclidian, sum of squared distances, taxicab, angle or mahalanobis. The latter is reserved for future code development (before training you will need to estimate inverse of covariance matrix of your train set which will be passed in the optional parameter cov).

inline float Node::get_distance
    (const float *vec, enum DistanceMetric distance_metric, const float **cov) const
{
        float distance = 0.0f;
        float n1 = 0.0f, n2 = 0.0f;

        switch (distance_metric) {
        default:
        case EUCL:   //euclidian
                if (m_weights_number >= 4) {
                        distance = mse(vec, m_weights, m_weights_number);
                } else {
                        for (int w = 0; w < m_weights_number; w++)
                                distance += (vec[w] - m_weights[w]) *
                            (vec[w] - m_weights[w]);
                }
                return sqrt(distance);

        case SOSD:   //sum of squared distances
                if (m_weights_number >= 4) {
                        distance = mse(vec, m_weights, m_weights_number);
                } else {
                        for (int w = 0; w < m_weights_number; w++)
                                distance += (vec[w] - m_weights[w]) *
                (vec[w] - m_weights[w]);
                }
                return distance;

        case TXCB:   //taxicab
                for (int w = 0; w < m_weights_number; w++)
                        distance += fabs(vec[w] - m_weights[w]);
                return distance;

        case ANGL:   //angle between vectors
                for (int w = 0; w < m_weights_number; w++) {
                        distance += vec[w] * m_weights[w];
                        n1 += vec[w] * vec[w];
                        n2 += m_weights[w] * m_weights[w];
                }
                return acos(distance / (sqrt(n1)*sqrt(n2)));

        //case MHLN:   //mahalanobis
                //distance = sqrt(m_weights * cov * vec)
                //return distance
        }
}

The mse() is just the SSE optimized function returning mean squared error between 2 vectors.

To adjust the node's weight vector to specific data vector from your train set, the train() function is used:

inline void Node::train(const float *vec, float learning_rule);

inline void Node::train(const float *vec, float learning_rule)
{
        for (int w = 0; w < m_weights_number; w++)
                m_weights[w] += learning_rule * (vec[w] - m_weights[w]);
}

learning_rule allows you to control the amount the weights are changed. It starts from about 0.9 and exponentially decreases during training process as you may have learned from the articles I suggested in the background section.

To create or load an SOM map from the file, there are 2 constructors available:

SOM::SOM(int dimensionality, int *dimensions, int weights_per_node, 
    enum Node::DistanceMetric distance_metric, float *add = 0, 
        float *mul = 0);

```
SOM::SOM(const wchar_t *fname);
```

With the first one you may create an N dimensional SOM map with number of dimensions in the grid as dimensionality, specifying the size of each dimension in dimensions. The weights_per_node is the dimensionality of your training data samples, distance_metric is self evident in its name and add, mul are optional normalization vectors. Each data vector fed to SOM is normalized as x(i) = (x(i) + add(i)) * mul(i). By default add is filled with zeros and mul is filled with ones, so no normalization occurs. The node's weights will be initialized with random numbers.

int dimensions[] = {50, 50};
//create 2 dimensional 50 by 50 SOM map for 3 dimensional data clustering
//(RGB or any other)
SOM rgb_som(2, dimensions, 3, Node::EUCL);

int dimensions[] = {10, 10, 10, 10}
//create 4 dimensional 10x10x10x10 SOM map for 10 dimensional data clustering
SOM rgb_som(4, dimensions, 10, Node::ANGL);

You may also load the SOM network from the file with the second constructor. You may use either a reduced file format, specifying only SOM dimensionality, dimension sizes and weights per node as in the rgb.som file in the demo zip file or a full file format with actual weights coefficients. The reduced file format allows to produce random weights values before clustering:

>rgb.som file, 2D SOM 50 by 50 with 3 weights per node
2
50 50
3

The full SOM file format is shown on the example of 2D SOM map 5 by 5 used to cluster rgbs.txt data:

2
5 5
3
Eucl

 2 2 2 3 3
 2 2 2 3 3
 1 1 1 3 3
 1 1 1 3 3
 1 1 1 1 3

None
0        1
0        1
0        1

0 0
0.00398814
0.987988
0.0100357

1 0
0.01
0.94
0.03

2 0
0.00902465
0.360081
0.618271

3 0
0.01
0.02
0.97

4 0
0.0133332
0.00336507
0.996667

0 1
0.0198603
0.956393
0.0118624

1 1
3.37824e-013
0.91
0.01

2 1
0.0197248
0.368034
0.581373

3 1
0.0191355
0.0162165
0.950495

4 1
0.0150343
0.00503427
0.975034

0 2
0.753054
0.228725
0.0101103

1 2
0.643217
0.28021
0.046389

2 2
0.222105
0.0906172
0.651532

3 2
0.03
0.01
0.92

4 2
0.03
0.01
0.94

0 3
0.975034
0.0150343
0.00503427

1 3
0.94
0.03
0.01

2 3
0.742754
0.0188995
0.198852

3 3
0.152863
0.0109908
0.787188

4 3
2.79205e-013
0.01
0.91

0 4
0.996667
0.0133332
0.00336507

1 4
0.97
0.01
0.02

2 4
0.915034
0.0151028
0.01

3 4
0.65243
0.014252
0.276088

4 4
0.0573598
0.010243
0.871152

Eucl means Euclidian distance for calculating BMU node. Other possible distances strings are defined in SOM::g_distance[5][5]. Next the 5 by 5 nodes class membership marks presented. In that case 1 is the red color vector, 2 - green and 3 - blue. In case you do not know in advance the class membership of your unknown data, there will be 0's instead, meaning unknown class membership node. None is the normalization of the data fed to SOM, with first column as the add vector and second as the mul vector, see above for normalization formula. The normalization strings are defined at SOM::g_normalization[5][5] as no normalization, minmax, zscore, sigmoid and energy. Next are the coordinates of the node in the SOM grid and its weight vector. So (0,0) is the node at x = 0 and y = 0 position in the SOM 2D grid and so on.

After the constructor calls int SOM::status(void) to determine if you successfully loaded the file, negative value indicates the error, zero - success and positive means random weights.

You may specify externally the distance metric, training mode and normalization parameters with those functions:

inline void SOM::set_distance_metric(enum Node::DistanceMetric distance_metric);

inline void SOM::set_train_mode(enum TrainMode train_mode);

void SOM::compute_normalization(PREC rec, enum Normalization norm);

The PREC is the data structure where you load your training data with read_class() function defined in the console main body, see stdafx.h for description.

After you created the SOM map you may start clustering the data.

void SOM::train(const vector<float *> *vectors, float R, float learning_rule);

The train() function runs one epoch of training session. It uses your data vectors stored in vectors std array and adjusts its weights within R radius from the BMU nodes using learning_rule. I use exponential decrease for R and learning_rule this way in my console code:

float R, R0 = som->R0();
float nrule, nrule0 = 0.9f;

int epochs = _wtoi(argv[4]);
float N = (float)epochs;
int x = 0;  //0...N-1

while (epochs) {
        //exponential shrinking
        R = R0 * exp(-10.0f * (x * x) / (N * N));          //radius shrinks over time
        //learning rule shrinks over time
        nrule = nrule0 * exp(-10.0f * (x * x) / (N * N));

        x++;

        som->train(&train_vectors, R, nrule);

        wprintf(L"  epoch: %d    R: %.2f nrule: %g \n", (epochs-- - 1), R, nrule);

        if (kbhit() && _getwch() == 'q') //quit program ?
                epochs = 0;
}

The SOM::R0() returns half the distance from the longest dimension size of the SOM map. For 20 by 50 SOM map the initial radius equals 50 / 2 = 25 within which the neighbouring nodes from the BMU node will be updated.

float SOM::R0() const
{
        float R = 0.0f;

        for (int i = 0; i < m_dimensionality; i++)
                if ((float)m_dimensions[i] > R)
                        R = (float)m_dimensions[i];

        return R / 2.0f;
}

The training function is shown below:

void SOM::train(const vector<float *> *vectors, float R, float learning_rule)
{
        for (int n = 0; n < (int)vectors->size(); n++) {

                const float *pdata = normalize(vectors->at(n));

                //get best node
                Node *bmu_node = get_bmu_node(pdata);
                const float *p1 = bmu_node->get_coords();

                if (R <= 1.0f)  //adjust BMU node only
                        bmu_node->train(pdata, learning_rule);
                else {
                        //adjust weights within R
                        for (int i = 0; i < get_nodes_number(); i++) {

                                const float *p2 = m_nodes[i]->get_coords();
                                float dist = 0.0f;

                                //dist = sqrt((x1-y1)^2 + (x2-y2)^2 + ...)
                                //distance to node in SOM grid
                                for (int p = 0; p < m_dimensionality; p++)
                                        dist += (p1[p] - p2[p]) * (p1[p] - p2[p]);
                                dist = sqrt(dist);

                                if (m_train_mode == FAST && dist > R)
                                        continue;

                                float y = exp(-(1.0f * dist * dist) / (R * R));
                                m_nodes[i]->train(pdata, learning_rule * y);
                        }
                }
        }
}

Now you may have a look at what is the SLOW and FAST training modes are. If the selected node's distance to the BMU node exceeds current R, then the selected node does not adjust its weights towards input train vector in FAST training mode. And when R falls below 1.0f, the fine tuning phase is implemented when only BMU nodes weights are adjusted.

The normalize() function just normalizes input data vectors with specified normalization type:

const float* SOM::normalize(const float *vec)
{
        switch (m_normalization) {
        default:
        case NONE:
                for (int x = 0; x < get_weights_per_node(); x++)
                        m_data[x] = vec[x];
                break;

        case MNMX:
                for (int x = 0; x < get_weights_per_node(); x++)
                        m_data[x] = (0.9f - 0.1f) *
                        (vec[x] + m_add[x]) * m_mul[x] + 0.1f;
                break;

        case ZSCR:
                for (int x = 0; x < get_weights_per_node(); x++)
                        m_data[x] = (vec[x] + m_add[x]) * m_mul[x];
                break;

        case SIGM:
                for (int x = 0; x < get_weights_per_node(); x++)
                        m_data[x] = 1.0f / (1.0f +
                        exp(-((vec[x] + m_add[x]) * m_mul[x])));
                break;

        case ENRG:
                float energy = 0.0f;
                for (int x = 0; x < get_weights_per_node(); x++)
                        energy += vec[x] * vec[x];
                energy = sqrt(energy);

                for (int x = 0; x < get_weights_per_node(); x++)
                        m_data[x] = vec[x] / energy;
                break;
        }
        return m_data;
}

The normalization parameters you compute beforehand with compute_normalization() function select one of the available normalization methods: enum Normalization {NONE, MNMX, ZSCR, SIGM, ENRG};

If you are just clustering unknown data vectors then you may proceed to save the SOM network and the distance map:

```
int save(const wchar_t *fname) const;
```

int save_2D_distance_map(const wchar_t *fname) const;

The distance map is currently supported for 2D SOM maps only, as for higher dimensional ones you'd rather be not able to visualize it. To create the distance map the MSE is computed for the weights of adjacent nodes. If the nodes belong to the same cluster, their weights will not differ too much. On the contrary, for the adjacent nodes on the clusters boundaries the MSE of their weights will have higher values:

int SOM::save_2D_distance_map(const wchar_t *fname) const
{
        int D = 2;
        float min_dist = 1.5f;

        if (get_dimensionality() != D)
                return -1;

        FILE *fp = _wfopen(fname, L"wt");
        if (fp != 0) {
                int n = 0;
                for (int i = 0; i < m_dimensions[0]; i++) {
                        for (int j = 0; j < m_dimensions[1]; j++) {
                                float dist = 0.0f;
                                int nodes_number = 0;
                                const Node *pnode = get_node(n++);
                                for (int m = 0; m < get_nodes_number(); m++) {
                                        const Node *node = get_node(m);
                                        if (node == pnode)
                                                continue;
                                        float tmp = 0.0;
                                        for (int x = 0; x < D; x++)
                                                tmp += pow(*(pnode->get_coords() + x) -
                                                       *(node->get_coords() + x), 2.0f);
                                        tmp = sqrt(tmp);
                                        if (tmp <= min_dist) {
                                                nodes_number++;
                                                dist += pnode->get_distance
                                                (node->m_weights, m_distance_metric);
                                        }
                                }
                                dist /= (float)nodes_number;
                                fwprintf(fp, L" %f", dist);
                        }
                        fwprintf(fp, L"\n");
                }
                fclose(fp);
                return 0;
        }
        else
                return -2;
}

Now if you want to cluster your data with known class membership and find out if the data groups together according to the membership, you need to assign the classes to the SOM's nodes after the clustering process finished. You may use either the voting scheme or the direct class assignment scheme.

```
void vote_nodes_from(PREC rec);
```
```
void assign_nodes_to(PREC rec);
```

With the voting scheme you present the vectors from your training set to the SOM. On each vector only one BMU node will be closest to it. You increase the vote count for that vector's class on that BMU node. After all training vectors have been presented to the SOM, by majority of the votes for a particular class on the single node you assign the class membership to that node. Suppose the 3 classes are present in your data set and some node 'fired' 3 times for 1 class, 20 times for 2nd and 5 times for 3rd. By majority vote, you assign it to the 2nd class. The degree you may trust that node during classification is 20 / (3 + 20 + 5) = 0.71 out of 1.0.

void SOM::vote_nodes_from(PREC rec)
{
        //rec->clsnum = [cls 1][cls 2] ... [cls N]   N entries
        //example: 0,1,2  3,1,2   1,4,10 missed classes

        //clear votes for classes of all nodes
        for (int n = 0; n < get_nodes_number(); n++)
                m_nodes[n]->clear_votes((int)rec->clsnum.size());

        while (true) { //until unclassified nodes m_class=0
                //add vote to a best node for a given class
                for (int y = 0; y < (int)rec->entries.size(); y++) {
                        if (rec->entries[y] == 0)
                                continue;

                        const float *pdata = normalize(rec->entries[y]->vec);
                        Node *pbmu_0node = get_bmu_0node(pdata);

                        //no more m_class=0 nodes
                        if (pbmu_0node == 0)
                                return;

                        //check class location in REC->clsnum[] array
                        int c = rec->entries[y]->cls;
                        for (int i = 0; i < (int)rec->clsnum.size(); i++) {
                                if (rec->clsnum[i] == c) {
                                        c = i;
                                        break;
                                }
                        }

                        pbmu_0node->add_vote(c);
                }

                //assign class from the max number of votes for a class
                for (int n = 0; n < get_nodes_number(); n++) {
                        if (m_nodes[n]->get_class() == 0)
                                m_nodes[n]->evaluate_class(&rec->clsnum[0],
                                (int)rec->clsnum.size());
                }
        }
}

With direct scheme you just browse in a single pass through every node of the SOM map and determine to which vector in the training set it is closer to. Assign the class of that vector to that node.

void SOM::assign_nodes_to(PREC rec)
{
        //run thru nodes and get best vector matching
        for (int n = 0; n < get_nodes_number(); n++) {
                m_nodes[n]->clear_votes();

                int index = 0;
                float mindist = FLT_MAX, dist;
                for (int i = 0; i < (int)rec->entries.size(); i++) {
                        if (rec->entries[i] == 0)
                                continue;
                        const float *pdata = normalize(rec->entries[i]->vec);
                        if ((dist = m_nodes[n]->get_distance
                               (pdata, m_distance_metric)) < mindist) {
                                mindist = dist;
                                index = i;
                        }
                }

                m_nodes[n]->set_class(rec->entries[index]->cls);
        }
}

Now the last function is how to proceed with classification of the new data with your trained and class assigned SOM:

const Node *SOM::classify(const float *vec);

The function returns a BMU node that 'fires' on the vector vec you present it with. You may find out what the weights of that node are, class membership, node coordinates in the SOM grid and the degree you may trust that node if you used the voting scheme:

```
inline int Node::get_class(void) const;
```

inline const float* Node::get_coords() const;

inline const float* Node::get_weights() const;

inline float Node::get_precision() const;

SOM Application for Natural Images and Faces Clustering

Now I'd like to show you how you may use it to discover patterns in your unknown data sets on real scenario for image data (but you may use any other kind of data you collected: audio, physiological (ECG, EEG, EMG), geological, other types of any dimensional data).

Suppose you are using face detection application, either from my Face Detection C++ library with Skin and Motion analysis article or any other approach. You run it at some thoroughfare, shops, airports, bus stations etc. and it produced for you some 1000, 10000 or more face samples from the people it has seen during the day or so. You will not able to manually browse them to identify how many different persons it has seen! But having some idea that at least 100 different persons may appear at that place you may arrange the SOM map consisted of at least those 100 number of nodes and cluster your 10000 samples in a few seconds onto the 2D map and immediately see all those different people.

In my face detection article, I used about 1800 face samples from several persons I collected myself. Now I may compose 1 by 20 SOM network of 20 nodes with 361 dimensional weights. I used 19x19 face rectangles and SOM nodes are 1 dimensional vectors. If you concatenate columns of 19x19 rectangle you will get 1D vector of 361 dimension. Now if you want to plot the SOM's node 361 dimensional weight vector, just take every 19 samples from it and place them back in column-wise order to 19x19 rectangle. You may proceed in that way with any dimensional data you want to cluster.

I used for the first time 1 by 20 SOM map so the found faces will be clustered on the line. Now plot the node's weights vectors as I described above. During training process as you may have learned, the weights are adjusted to be close in terms of Euclidian or any other distance to the samples from your data set.

To save the article space I plotted them in several rows, but if you take each row and concatenate it you will get 1 by 25 row of nodes. Now browse from the upper left corner towards the bottom right corner. You will see different face patterns SOM clustered during the training process.

Now to get back to the more familiar 2D SOM map composed of 5 by 5 nodes. If you have a look at the picture shown below you see how similar looking faces are placed together in clusters of a few nodes.

If we compose the SOM map consisted of more nodes, for example 10 by 10, we will be able to discover more different faces present in the dataset and plot the distance map too. But similar faces again will be placed in clusters together, so you may discover different persons easily. In the picture below the SOM consisted of 100 nodes is shown, less than 1800 face samples used for clustering.

The distance map representing the separating boundary of the clustered faces is show below for the 10 by 10 SOM map.

Though it is rather vague, you may use about 50 by 50 SOM map to obtain a more clear boundary as in RGB colors example. To save you that work I ran my computer on 50 by 50 nodes 2D SOM map to cluster the faces. It took about 15 minutes to cluster 1800 face samples.

You may conclude from the above picture that 3 completely different persons are present in the dataset. Now if you mark the 0 nodes belonging to that areas as 1st, 2nd and 3rd class, run that SOM network to classify your 1800 face samples you will find out the samples from those persons. However the 3 clusters also contain several minor clusters with boundaries denoted by the light blue color. But as the distance is not very large, compared to hot red boundary, the faces do not differ radically. The same distance map plotted in log scale shows you more clearly the rest of the clusters.

It is not possible to plot 50 by 50 weights 19x19 face vectors all at once and it needs some specialized plotting with magnification ability.

Now we may try to cluster face and non-face samples together from my training set and determine the classification accuracy too, as we know in advance which sample belongs to face or non-face class.

The next image shows the SOM clustering results for the face and non-face samples:

As the SOM map is relatively small, only 36 nodes, and it clustered about 10000 samples the classification results are rather poor. The clusters_1.som presents results for the voting scheme and clusters_2.som for the direct one for assigning the class membership for the nodes from the train set it clustered.

SOM clusters_1.som precision: 0.944137

classification results: clusters_1.som

 train set: 894 8927
   sensitivity: 44.63
   specificity: 98.67
   +predictive: 77.03
   -predictive: 94.68
      accuracy: 93.75

classification results: clusters_2.som

 train set: 894 8927
   sensitivity: 60.29
   specificity: 95.78
   +predictive: 58.84
   -predictive: 96.01
      accuracy: 92.55

The more important notion you may observe from the clusters obtained from natural images and depicted in the above figure, is that, the nodes weights vectors resemble the PCA projection basis I obtained in my face detection article. They could be used directly as a projection basis to reduce dimensionality, but it needs to consider how to convolve them properly with the data you want to project. In that way you may use the SOM as a tool to discover the linear projection basis like PCA, LDA, ICA and other methods.

Adaptive Image Segmentation

The SOMs could be applied to learn color clusters from the image. Taking some random image (e.g. a bird) and clustering the colors, the similar ones will be grouped together, and using the distance map we may identify the boundaries and segment the image to specific color regions.

I composed the SOM of 50 by 50 nodes and clustered the colors from the above picture:

Now the distance map is shown below:

As can be seen, the orange tints of the beak are grouped together and well separated by the hot red boundary on the distance map. White shades are also clustered in a single spot. The rest of the colors are also grouped by their similarity but the separating boundary is not very high as shown with light blue color on the distance map.