Contents
Introduction
The AI methods are classified into supervised and unsupervised categories. The first one is used for training the classifiers to distinguish the data when you know the classes present in your training sets and the latter is used when you are presented with unknown data and want to discover its structure. That is, you'd like to obtain the lower dimensional mapping 2D or 3D where you may easily find out how your initial high dimensional data groups together. For example, in the proverbial IRIS dataset, there are 3 categories of 4 dimensional objects and suppose you do not know in advance they are 3. If the dimensionality of the data was two or three you would have plotted them as a scatter plots and identified how many classes are there. But what if the dimensionality is 1000? You may consider self organizing maps (SOMs) as a kind of dimensionality reduction technique too. It groups similar samples from the unknown dataset on its grid together, that is, it adjusts its weight coefficients to be closer to the data samples you present it with. So it learns effectively the clusters present in the data. For IRIS set some SOM nodes would be similar to setosa, some to virgi, others to versi, and similar entries will be grouped together on the SOM's grid. If you compared the distances between adjacent SOM nodes weights, the similar ones would have closer distance between its weights and the nodes from different classes would have more distance. Having plotted the distances you will obtain the boundaries of your clusters and then just classify your unknown data samples to the discovered categories.
In this C++ lib, I introduced normalization of the data. You need to normalize it if some features have significantly increased numerical range (e.g. human height and weight. The first feature ranges about 1.50 to 2.20 meters and the second 50  150 kg. The latter prevails and your clustering results will be biased towards the weight). Apart from Euclidian distance I use sum of squared distances (SOSD), taxicab and angle between the vectors. I left unfinished Mahalanobis distance, but you may easily add the code. Just include inverse of covariance matrix for the training data you used. The data file formats that are supported in this code are described in my Backpropagation Artificial Neural Network in C++ article.
Another feature of the lib is that you may create SOM of any dimension. Typically it is 2D or 3D map, but with my code you may choose any number of dimensions for your map.
Background
You may learn about the SOM technique and the applications at the sites I used when I studied the topic: Kohonen's Self Organizing Feature Maps, SelfOrganizing Nets, and Self Organizing Map AI for Pictures. Read them first before you move forward in my article.
Using the Code
The lib implementation is produced in console application for easier usage. The help line to console is as follows:
1 ttrain
2 net.som
3 class.txt
4 epochs
5 [test set]
6 [slow] [slow=0]fast=1 training
7 [dist] distance [0euclidian],1sosd,2taxicab,3angle
8 [norm] normalize input data [0none],1minmax,2zscore,3softmax,4energy
1 rrun
2 net.som
3 class.txt
I've attached train.bat in the demo download so you may start clustering right now. For a simple example I used red, green and blue colors as 3 dimensional vectors present in rgbs.txt and rgbs1.txt files. The command line to train the SOM network is:
>som.exe t rgb.som rgbs.txt 500
Train the rgb.som network on rgbs.txt data for 500 epochs. The 6 to 8 parameters are optional and you may omit them. The 5th parameter is used only if you are going to use your network for classification of the 2 different classes. You may supply in it the test data to get performance accuracy immediately after training or as an empty file. In the latter case the random half from the training set will be used for testing. Otherwise omit the 5th parameter and use 6  8 in the same order immediately after the 4th.
>som.exe t rgb.som rgbs.txt 500 1 2 2
This is the same as the previous example but uses fast training mode (see below for description), taxicab as a distance metric and zscore data normalization.
After training 1 or 2 files will be saved depending on whether you cluster your data with unknown or known class membership. The distance map for adjacent SOM nodes will be saved to rgb_map file. If you do not know the class membership of your samples in advance, use the following entries without the class mark after the vector name:
name1
x1 x2 x3 ... xN
name2
x1 x2 x3 ... xN
...
Use the distance map file to discover the boundaries of the clusters. It will contain the same sized 2D vector as the SOM grid. The highest values correspond to the ridges separating different clusters. In that case only rgb_2.som file will be saved with zero marks for SOM node's class membership in the 2D grid. You've already defined your clusters from the distance map file. Now replace zeros in the SOM grid in rgb_2.som file for desired classes 1, 2, 3 ... N. To get more explanation on how to do it I will expound the next case if you know the class membership of your data and want to produce a multiclass classifier.
The rgbs.txt file contain 3 classes of red, green and blue colors (3D vectors). After training, 2 SOM files will be created rgb_1.som and rgb_2.som. The first one uses a voting scheme (see below for description) to attribute the SOM node's class membership from the training data and the latter uses a direct scheme for each SOM node to attribute it the class from the training set. If you open them in the text editor you will find a 2D map of 50 by 50 entries with numbers 1, 2 and 3 grouped together. This is the clustering of your data. Similar vectors are grouped together. Now plot rgb_map
in Matlab with imshow(rgb_map, [])
and use colormap('jet')
.
You see the 3 separated blue regions which are the 3 clusters found in your data. Your 3 classes in the 2D map of 50 by 50 entries (rgb_1.som and rgb_2.som) match those regions perfectly. Now if you had only rgb_2.som file with unknown classes (0 in the 50 by 50 2D map), you would have replaced the zeros as I described above with 3 different classes you found in rgb_map
plotting and get your classifier.
The nodes of the SOM contain 3 dimensional weights vectors which are adjusted to resemble the training data. Having the advantage that we used RGB color data, we may just arrange 50x50x3 RGB color data and plot it.
Now you may see that the weights of each node are similar to the data from rgbs.txt file.
In the fast mode training I do not alter weights of the nodes which exceed the bounds of radius from the best matched unit (BMU) node, whereas in the previous example I modify the weights of all the nodes with exponentially decreased learning rule. The fast mode training however provides sharper boundaries:
And the 50x50x3 RGB color data is shown below:
Now you may choose random RGB color entries and try to cluster them. In the next picture, the 50x50x3 SOM map is shown obtained with clustering 2000 random RGB vectors:
Similar colors are clustered together.
The Library
Next I provide a more detailed description of the classes in the project.
The C++ SOM lib contains 2 classes:
The Node
is the unit which constitutes the SOM map. It is initialized upon construction with specific weights
vector of length weights_number
, its coordinates in the N dimensional SOM map coords
vector with coords_number
coordinates (x, y, z, ... ) and optional class membership class_
:

Node::Node(const float *weights, int weights_number,
const float *coords, int coords_number, int class_ = 0);
For 2D SOM map each node has two coordinates (x
,y
) for 3D map (x
,y
,z
) and so on, but you will not be able to visualize 4 and more dimensional SOMs unless you use them as classifiers only. The weights_number
is equal to the dimensionality of the data you use to cluster or classify. For RGB color triplet the data vector is 3 dimensional.
To determine the distance from the node's weight vector to specific data vector from the dataset, the get_distance()
function is defined:
The distance_metric
is one of the metrics used to compute the distance: enum DistanceMetric {EUCL, SOSD, TXCB, ANGL, MHLN};
Euclidian, sum of squared distances, taxicab, angle or mahalanobis. The latter is reserved for future code development (before training you will need to estimate inverse of covariance matrix of your train set which will be passed in the optional parameter cov
).
inline float Node::get_distance
(const float *vec, enum DistanceMetric distance_metric, const float **cov) const
{
float distance = 0.0f;
float n1 = 0.0f, n2 = 0.0f;
switch (distance_metric) {
default:
case EUCL: if (m_weights_number >= 4) {
distance = mse(vec, m_weights, m_weights_number);
} else {
for (int w = 0; w < m_weights_number; w++)
distance += (vec[w]  m_weights[w]) *
(vec[w]  m_weights[w]);
}
return sqrt(distance);
case SOSD: if (m_weights_number >= 4) {
distance = mse(vec, m_weights, m_weights_number);
} else {
for (int w = 0; w < m_weights_number; w++)
distance += (vec[w]  m_weights[w]) *
(vec[w]  m_weights[w]);
}
return distance;
case TXCB: for (int w = 0; w < m_weights_number; w++)
distance += fabs(vec[w]  m_weights[w]);
return distance;
case ANGL: for (int w = 0; w < m_weights_number; w++) {
distance += vec[w] * m_weights[w];
n1 += vec[w] * vec[w];
n2 += m_weights[w] * m_weights[w];
}
return acos(distance / (sqrt(n1)*sqrt(n2)));
}
}
The mse()
is just the SSE optimized function returning mean squared error between 2 vectors.
To adjust the node's weight vector to specific data vector from your train set, the train()
function is used:
inline void Node::train(const float *vec, float learning_rule)
{
for (int w = 0; w < m_weights_number; w++)
m_weights[w] += learning_rule * (vec[w]  m_weights[w]);
}
learning_rule
allows you to control the amount the weights are changed. It starts from about 0.9 and exponentially decreases during training process as you may have learned from the articles I suggested in the background section.
To create or load an SOM map from the file, there are 2 constructors available:

SOM::SOM(int dimensionality, int *dimensions, int weights_per_node,
enum Node::DistanceMetric distance_metric, float *add = 0,
float *mul = 0);

SOM::SOM(const wchar_t *fname);
With the first one you may create an N dimensional SOM map with number of dimensions in the grid as dimensionality
, specifying the size of each dimension in dimensions
. The weights_per_node
is the dimensionality of your training data samples, distance_metric
is self evident in its name and add
, mul
are optional normalization vectors. Each data vector fed to SOM is normalized as x(i) = (x(i) + add(i)) * mul(i)
. By default add
is filled with zeros and mul
is filled with ones, so no normalization occurs. The node's weights will be initialized with random numbers.
int dimensions[] = {50, 50};
SOM rgb_som(2, dimensions, 3, Node::EUCL);
int dimensions[] = {10, 10, 10, 10}
SOM rgb_som(4, dimensions, 10, Node::ANGL);
You may also load the SOM network from the file with the second constructor. You may use either a reduced file format, specifying only SOM dimensionality, dimension sizes and weights per node as in the rgb.som file in the demo zip file or a full file format with actual weights coefficients. The reduced file format allows to produce random weights values before clustering:
>rgb.som file, 2D SOM 50 by 50 with 3 weights per node
2
50 50
3
The full SOM file format is shown on the example of 2D SOM map 5 by 5 used to cluster rgbs.txt data:
2
5 5
3
Eucl
2 2 2 3 3
2 2 2 3 3
1 1 1 3 3
1 1 1 3 3
1 1 1 1 3
None
0 1
0 1
0 1
0 0
0.00398814
0.987988
0.0100357
1 0
0.01
0.94
0.03
2 0
0.00902465
0.360081
0.618271
3 0
0.01
0.02
0.97
4 0
0.0133332
0.00336507
0.996667
0 1
0.0198603
0.956393
0.0118624
1 1
3.37824e013
0.91
0.01
2 1
0.0197248
0.368034
0.581373
3 1
0.0191355
0.0162165
0.950495
4 1
0.0150343
0.00503427
0.975034
0 2
0.753054
0.228725
0.0101103
1 2
0.643217
0.28021
0.046389
2 2
0.222105
0.0906172
0.651532
3 2
0.03
0.01
0.92
4 2
0.03
0.01
0.94
0 3
0.975034
0.0150343
0.00503427
1 3
0.94
0.03
0.01
2 3
0.742754
0.0188995
0.198852
3 3
0.152863
0.0109908
0.787188
4 3
2.79205e013
0.01
0.91
0 4
0.996667
0.0133332
0.00336507
1 4
0.97
0.01
0.02
2 4
0.915034
0.0151028
0.01
3 4
0.65243
0.014252
0.276088
4 4
0.0573598
0.010243
0.871152
Eucl
means Euclidian distance for calculating BMU node. Other possible distances strings are defined in SOM::g_distance[5][5]
. Next the 5 by 5 nodes class membership marks presented. In that case 1 is the red color vector, 2  green and 3  blue. In case you do not know in advance the class membership of your unknown data, there will be 0's instead, meaning unknown class membership node. None
is the normalization of the data fed to SOM, with first column as the add
vector and second as the mul
vector, see above for normalization formula. The normalization strings are defined at SOM::g_normalization[5][5]
as no normalization, minmax, zscore, sigmoid and energy. Next are the coordinates of the node in the SOM grid and its weight vector. So (0,0) is the node at x = 0 and y = 0 position in the SOM 2D grid and so on.
After the constructor calls int SOM::status(void)
to determine if you successfully loaded the file, negative value indicates the error, zero  success and positive means random weights.
You may specify externally the distance metric, training mode and normalization parameters with those functions:

inline void SOM::set_distance_metric(enum Node::DistanceMetric distance_metric);

inline void SOM::set_train_mode(enum TrainMode train_mode);

void SOM::compute_normalization(PREC rec, enum Normalization norm);
The PREC
is the data structure where you load your training data with read_class()
function defined in the console main body, see stdafx.h for description.
After you created the SOM map you may start clustering the data.
The train()
function runs one epoch of training session. It uses your data vectors stored in vectors
std array and adjusts its weights within R
radius from the BMU nodes using learning_rule
. I use exponential decrease for R
and learning_rule
this way in my console code:
float R, R0 = som>R0();
float nrule, nrule0 = 0.9f;
int epochs = _wtoi(argv[4]);
float N = (float)epochs;
int x = 0;
while (epochs) {
R = R0 * exp(10.0f * (x * x) / (N * N)); nrule = nrule0 * exp(10.0f * (x * x) / (N * N));
x++;
som>train(&train_vectors, R, nrule);
wprintf(L" epoch: %d R: %.2f nrule: %g \n", (epochs  1), R, nrule);
if (kbhit() && _getwch() == 'q') epochs = 0;
}
The SOM::R0()
returns half the distance from the longest dimension size of the SOM map. For 20 by 50 SOM map the initial radius equals 50 / 2 = 25 within which the neighbouring nodes from the BMU node will be updated.
float SOM::R0() const
{
float R = 0.0f;
for (int i = 0; i < m_dimensionality; i++)
if ((float)m_dimensions[i] > R)
R = (float)m_dimensions[i];
return R / 2.0f;
}
The training function is shown below:
void SOM::train(const vector<float *> *vectors, float R, float learning_rule)
{
for (int n = 0; n < (int)vectors>size(); n++) {
const float *pdata = normalize(vectors>at(n));
Node *bmu_node = get_bmu_node(pdata);
const float *p1 = bmu_node>get_coords();
if (R <= 1.0f) bmu_node>train(pdata, learning_rule);
else {
for (int i = 0; i < get_nodes_number(); i++) {
const float *p2 = m_nodes[i]>get_coords();
float dist = 0.0f;
for (int p = 0; p < m_dimensionality; p++)
dist += (p1[p]  p2[p]) * (p1[p]  p2[p]);
dist = sqrt(dist);
if (m_train_mode == FAST && dist > R)
continue;
float y = exp((1.0f * dist * dist) / (R * R));
m_nodes[i]>train(pdata, learning_rule * y);
}
}
}
}
Now you may have a look at what is the SLOW
and FAST
training modes are. If the selected node's distance to the BMU node exceeds current R
, then the selected node does not adjust its weights towards input train vector in FAST
training mode. And when R
falls below 1.0f, the fine tuning phase is implemented when only BMU nodes weights are adjusted.
The normalize()
function just normalizes input data vectors with specified normalization type:
const float* SOM::normalize(const float *vec)
{
switch (m_normalization) {
default:
case NONE:
for (int x = 0; x < get_weights_per_node(); x++)
m_data[x] = vec[x];
break;
case MNMX:
for (int x = 0; x < get_weights_per_node(); x++)
m_data[x] = (0.9f  0.1f) *
(vec[x] + m_add[x]) * m_mul[x] + 0.1f;
break;
case ZSCR:
for (int x = 0; x < get_weights_per_node(); x++)
m_data[x] = (vec[x] + m_add[x]) * m_mul[x];
break;
case SIGM:
for (int x = 0; x < get_weights_per_node(); x++)
m_data[x] = 1.0f / (1.0f +
exp(((vec[x] + m_add[x]) * m_mul[x])));
break;
case ENRG:
float energy = 0.0f;
for (int x = 0; x < get_weights_per_node(); x++)
energy += vec[x] * vec[x];
energy = sqrt(energy);
for (int x = 0; x < get_weights_per_node(); x++)
m_data[x] = vec[x] / energy;
break;
}
return m_data;
}
The normalization parameters you compute beforehand with compute_normalization()
function select one of the available normalization methods: enum Normalization {NONE, MNMX, ZSCR, SIGM, ENRG};
If you are just clustering unknown data vectors then you may proceed to save the SOM network and the distance map:
The distance map is currently supported for 2D SOM maps only, as for higher dimensional ones you'd rather be not able to visualize it. To create the distance map the MSE is computed for the weights of adjacent nodes. If the nodes belong to the same cluster, their weights will not differ too much. On the contrary, for the adjacent nodes on the clusters boundaries the MSE of their weights will have higher values:
int SOM::save_2D_distance_map(const wchar_t *fname) const
{
int D = 2;
float min_dist = 1.5f;
if (get_dimensionality() != D)
return 1;
FILE *fp = _wfopen(fname, L"wt");
if (fp != 0) {
int n = 0;
for (int i = 0; i < m_dimensions[0]; i++) {
for (int j = 0; j < m_dimensions[1]; j++) {
float dist = 0.0f;
int nodes_number = 0;
const Node *pnode = get_node(n++);
for (int m = 0; m < get_nodes_number(); m++) {
const Node *node = get_node(m);
if (node == pnode)
continue;
float tmp = 0.0;
for (int x = 0; x < D; x++)
tmp += pow(*(pnode>get_coords() + x) 
*(node>get_coords() + x), 2.0f);
tmp = sqrt(tmp);
if (tmp <= min_dist) {
nodes_number++;
dist += pnode>get_distance
(node>m_weights, m_distance_metric);
}
}
dist /= (float)nodes_number;
fwprintf(fp, L" %f", dist);
}
fwprintf(fp, L"\n");
}
fclose(fp);
return 0;
}
else
return 2;
}
Now if you want to cluster your data with known class membership and find out if the data groups together according to the membership, you need to assign the classes to the SOM's nodes after the clustering process finished. You may use either the voting scheme or the direct class assignment scheme.
With the voting scheme you present the vectors from your training set to the SOM. On each vector only one BMU node will be closest to it. You increase the vote count for that vector's class on that BMU node. After all training vectors have been presented to the SOM, by majority of the votes for a particular class on the single node you assign the class membership to that node. Suppose the 3 classes are present in your data set and some node 'fired' 3 times for 1 class, 20 times for 2nd and 5 times for 3rd. By majority vote, you assign it to the 2nd class. The degree you may trust that node during classification is 20 / (3 + 20 + 5) = 0.71 out of 1.0.
void SOM::vote_nodes_from(PREC rec)
{
for (int n = 0; n < get_nodes_number(); n++)
m_nodes[n]>clear_votes((int)rec>clsnum.size());
while (true) { for (int y = 0; y < (int)rec>entries.size(); y++) {
if (rec>entries[y] == 0)
continue;
const float *pdata = normalize(rec>entries[y]>vec);
Node *pbmu_0node = get_bmu_0node(pdata);
if (pbmu_0node == 0)
return;
int c = rec>entries[y]>cls;
for (int i = 0; i < (int)rec>clsnum.size(); i++) {
if (rec>clsnum[i] == c) {
c = i;
break;
}
}
pbmu_0node>add_vote(c);
}
for (int n = 0; n < get_nodes_number(); n++) {
if (m_nodes[n]>get_class() == 0)
m_nodes[n]>evaluate_class(&rec>clsnum[0],
(int)rec>clsnum.size());
}
}
}
With direct scheme you just browse in a single pass through every node of the SOM map and determine to which vector in the training set it is closer to. Assign the class of that vector to that node.
void SOM::assign_nodes_to(PREC rec)
{
for (int n = 0; n < get_nodes_number(); n++) {
m_nodes[n]>clear_votes();
int index = 0;
float mindist = FLT_MAX, dist;
for (int i = 0; i < (int)rec>entries.size(); i++) {
if (rec>entries[i] == 0)
continue;
const float *pdata = normalize(rec>entries[i]>vec);
if ((dist = m_nodes[n]>get_distance
(pdata, m_distance_metric)) < mindist) {
mindist = dist;
index = i;
}
}
m_nodes[n]>set_class(rec>entries[index]>cls);
}
}
Now the last function is how to proceed with classification of the new data with your trained and class assigned SOM:
The function returns a BMU node that 'fires' on the vector vec
you present it with. You may find out what the weights of that node are, class membership, node coordinates in the SOM grid and the degree you may trust that node if you used the voting scheme:

inline int Node::get_class(void) const;

inline const float* Node::get_coords() const;

inline const float* Node::get_weights() const;

inline float Node::get_precision() const;
SOM Application for Natural Images and Faces Clustering
Now I'd like to show you how you may use it to discover patterns in your unknown data sets on real scenario for image data (but you may use any other kind of data you collected: audio, physiological (ECG, EEG, EMG), geological, other types of any dimensional data).
Suppose you are using face detection application, either from my Face Detection C++ library with Skin and Motion analysis article or any other approach. You run it at some thoroughfare, shops, airports, bus stations etc. and it produced for you some 1000, 10000 or more face samples from the people it has seen during the day or so. You will not able to manually browse them to identify how many different persons it has seen! But having some idea that at least 100 different persons may appear at that place you may arrange the SOM map consisted of at least those 100 number of nodes and cluster your 10000 samples in a few seconds onto the 2D map and immediately see all those different people.
In my face detection article, I used about 1800 face samples from several persons I collected myself. Now I may compose 1 by 20 SOM network of 20 nodes with 361 dimensional weights. I used 19x19 face rectangles and SOM nodes are 1 dimensional vectors. If you concatenate columns of 19x19 rectangle you will get 1D vector of 361 dimension. Now if you want to plot the SOM's node 361 dimensional weight vector, just take every 19 samples from it and place them back in columnwise order to 19x19 rectangle. You may proceed in that way with any dimensional data you want to cluster.
I used for the first time 1 by 20 SOM map so the found faces will be clustered on the line. Now plot the node's weights vectors as I described above. During training process as you may have learned, the weights are adjusted to be close in terms of Euclidian or any other distance to the samples from your data set.
To save the article space I plotted them in several rows, but if you take each row and concatenate it you will get 1 by 25 row of nodes. Now browse from the upper left corner towards the bottom right corner. You will see different face patterns SOM clustered during the training process.
Now to get back to the more familiar 2D SOM map composed of 5 by 5 nodes. If you have a look at the picture shown below you see how similar looking faces are placed together in clusters of a few nodes.
If we compose the SOM map consisted of more nodes, for example 10 by 10, we will be able to discover more different faces present in the dataset and plot the distance map too. But similar faces again will be placed in clusters together, so you may discover different persons easily. In the picture below the SOM consisted of 100 nodes is shown, less than 1800 face samples used for clustering.
The distance map representing the separating boundary of the clustered faces is show below for the 10 by 10 SOM map.
Though it is rather vague, you may use about 50 by 50 SOM map to obtain a more clear boundary as in RGB colors example. To save you that work I ran my computer on 50 by 50 nodes 2D SOM map to cluster the faces. It took about 15 minutes to cluster 1800 face samples.
You may conclude from the above picture that 3 completely different persons are present in the dataset. Now if you mark the 0 nodes belonging to that areas as 1st, 2nd and 3rd class, run that SOM network to classify your 1800 face samples you will find out the samples from those persons. However the 3 clusters also contain several minor clusters with boundaries denoted by the light blue color. But as the distance is not very large, compared to hot red boundary, the faces do not differ radically. The same distance map plotted in log scale shows you more clearly the rest of the clusters.
It is not possible to plot 50 by 50 weights 19x19 face vectors all at once and it needs some specialized plotting with magnification ability.
Now we may try to cluster face and nonface samples together from my training set and determine the classification accuracy too, as we know in advance which sample belongs to face or nonface class.
The next image shows the SOM clustering results for the face and nonface samples:
As the SOM map is relatively small, only 36 nodes, and it clustered about 10000 samples the classification results are rather poor. The clusters_1.som presents results for the voting scheme and clusters_2.som for the direct one for assigning the class membership for the nodes from the train set it clustered.
SOM clusters_1.som precision: 0.944137
classification results: clusters_1.som
train set: 894 8927
sensitivity: 44.63
specificity: 98.67
+predictive: 77.03
predictive: 94.68
accuracy: 93.75
classification results: clusters_2.som
train set: 894 8927
sensitivity: 60.29
specificity: 95.78
+predictive: 58.84
predictive: 96.01
accuracy: 92.55
The more important notion you may observe from the clusters obtained from natural images and depicted in the above figure, is that, the nodes weights vectors resemble the PCA projection basis I obtained in my face detection article. They could be used directly as a projection basis to reduce dimensionality, but it needs to consider how to convolve them properly with the data you want to project. In that way you may use the SOM as a tool to discover the linear projection basis like PCA, LDA, ICA and other methods.
Adaptive Image Segmentation
The SOMs could be applied to learn color clusters from the image. Taking some random image (e.g. a bird) and clustering the colors, the similar ones will be grouped together, and using the distance map we may identify the boundaries and segment the image to specific color regions.
I composed the SOM of 50 by 50 nodes and clustered the colors from the above picture:
Now the distance map is shown below:
As can be seen, the orange tints of the beak are grouped together and well separated by the hot red boundary on the distance map. White shades are also clustered in a single spot. The rest of the colors are also grouped by their similarity but the separating boundary is not very high as shown with light blue color on the distance map.