15,945,119 members
Articles / Artificial Intelligence / Neural Networks
Article

# Artificial Neural Network C++ Class

Rate me:
23 Apr 2024CPOL7 min read 114.8K   6.8K   118   62
Artificial Neural Network C++ class with two use cases: Counter and Handwritten Digits recognition
This article provides a simple C++ class without any complications in the mathematical calculations of the backpropagation algorithm. Two use cases have been provided to facilitate code usage.

## Background

This article is not to explain the scientific side of (ANN) Artificial Neural Networks. It provides a simple C++ class without any complications in the mathematical calculations of the backpropagation algorithm. If you have good experience about ANN, you can skip to the next section, else, you can revise this very good resources about ANN. I have provided two use cases to facilitate code usage as much as possible.

## Introduction

Today, (ANN) Artificial neural networks has become dominant in many areas of life, whether an industry or at home. ANN enables machines to learn and to simulate human brain to recognize patterns, and make predictions as well as solve problems in every business sector. Smartphones and computers that we use on a daily basis are using ANN in some of its applications. For example, Finger Print and Face unlock services in smartphones and computers use ANN. Handwritten Signature Verification uses ANN. I have written a simple implementation for an Artificial Neural Network C++ class that handles backpropagation algorithm. The code depends on Eigen open-source templates to handle Matrices’ mathematics. I made code simple and fast as possible.

## NeuralNetwork Class

`NeuralNetwork` is a simple C++ class with the following structure:

The code uses `RowVectorXd` and `MatrixXd` from `Eigen` template library. The main functions `"train"` and `"test"` take input and desired output in `RowVector` format. Both of them call `"forward"` function which uses vector multiplication.

### forward

C++
```void NeuralNetwork::forward(RowVector& input) {
// set first layer input
mNeurons.front()->block(0, 0, 1, input.size()) = input;

// propagate forward (vector multiplication)
for (unsigned int i = 1; i < mArchitecture.size(); i++) {
// copy values ingoring last neuron as it is a bias
mNeurons[i]->block(0, 0, 1, mArchitecture[i]) =
(*mNeurons[i - 1] * *mWeights[i - 1]).block(0, 0, 1, mArchitecture[i]);
for (int col = 0; col < mArchitecture[i]; col++)
mNeurons[i]->coeffRef(col) = activation(mNeurons[i]->coeffRef(col));
}
}```

The function propagates with input through network layers to get output from the last layer. Each neuron in the hidden layer first computes a weighted sum of its inputs. Then it applies an activation function (segmoid) to this sum to derive its output. This function affects neurons values only. It doesn't affect connections weights or errors. This function does this sum with vector multiplication:

C++
`(*mNeurons[i - 1] * *mWeights[i - 1])`

Then, resultant values are passed through `activation` function.

C++
```double NeuralNetwork::activation(double x) {
if (mActivation == TANH)
return tanh(x);
if (mActivation == SIGMOID)
return 1.0 / (1.0 + exp(-x));
return 0;
}```

### backward

C++
```void NeuralNetwork::backward(RowVector& output) {
// calculate last layer errors
*mErrors.back() = output - *mNeurons.back();

// calculate hidden layers' errors (vector multiplication)
for (size_t i = mErrors.size() - 2; i > 0; i--)
*mErrors[i] = *mErrors[i + 1] * mWeights[i]->transpose();

// update weights
size_t size = mWeights.size();
for (size_t i = 0; i < size; i++)
for (int col = 0, cols = (int)mWeights[i]->cols(); col < cols; col++)
for (int row = 0; row < mWeights[i]->rows(); row++) {
mWeights[i]->coeffRef(row, col) +=
mLearningRate *
mErrors[i + 1]->coeffRef(col) *
activationDerivative(mNeurons[i + 1]->coeffRef(col)) *
mNeurons[i]->coeffRef(row);
}
}```

The function is key of the Backpropagation algorithm. It takes output of last layer and propagates backward through network layers, calculates each layer errors, and update connections weights depending on the rule:
`new weight = old weight + learingRate * next error * sigmoidDerivative(next neuron value)`

C++
```double NeuralNetwork::activationDerivative(double x) {
if (mActivation == TANH)
return 1 - tanh(x) * tanh(x);
if (mActivation == SIGMOID)
return x * (1.0 - x);
return 0;
}```

#### sigmoid derivative

Note: The curve of `sigmoidDerivative` has a big significance. As its input ranges from 0 to 1 (neuron value), there are the three possible cases:

1. neuron value near `0`, so weight value doesn't need support.

2. neuron value near `0.5`, so weight value needs a slight change.

3. neuron value near `1`, so weight value doesn't need support.

### train

C++
```void NeuralNetwork::train(RowVector& input, RowVector& output) {
forward(input);
backward(output);
}```

The function propagates input in forward direction, then propagates backward with the resultant output to adjust connections weight.

### test

C++
```void NeuralNetwork::test(RowVector& input, RowVector& output) {
forward(input);
// calculate last layer errors
*mErrors.back() = output - *mNeurons.back();
}```

The function propagates input in forward direction, then calculates error between resultant output and desired output.

### evaluate

There are various ways to evaluate the performance of neural network model, such as Confusion matrix, Accuracy, Precision, Recall, and F1 score. I have added “Confusion Matrix” calculation to the code through the evaluate function call after each testing call.

C++
```void NeuralNetwork::evaluate(RowVector& output) {
double desired = 0, actual = 0;
mConfusion->coeffRef(
vote(output, desired),
vote(*mNeurons.back(), actual)
)++;
}```

This function simply fill the right cell in the confusion matrix depending on the match between the actual and desired output.

After the hole testing the confusion matrix can be used to calculate Precision, Recall, and F1 score.

C++
```void NeuralNetwork::confusionMatrix(RowVector*& precision, RowVector*& recall) {
int rows = (int)mConfusion->rows();
int cols = (int)mConfusion->cols();

precision = new RowVector(cols);
for (int col = 0; col < cols; col++) {
double colSum = 0;
for (int row = 0; row < rows; row++)
colSum += mConfusion->coeffRef(row, col);
precision->coeffRef(col) = mConfusion->coeffRef(col, col) / colSum;
}

recall = new RowVector(rows);
for (int row = 0; row < rows; row++) {
double rowSum = 0;
for (int col = 0; col < cols; col++)
rowSum += mConfusion->coeffRef(row, col);
recall->coeffRef(row) = mConfusion->coeffRef(row, row) / rowSum;
}

...
}```

This calclation will be clear in the second Usecase Handwritten Digits Recognition

## Use Cases

### Simple Counter

Neural network takes an input in binary (3 bits) and generates an output equals to input + 1. Then output is taken back as an input to the network. If input number equals to 7 (111 in binary) output should be 0. Network is trained using backpropagation algorithm to adjust network's connections weights. Training process takes about 2 minutes to minimize error between desired output and actual network output.

 Input Output 0 0 0 0 0 1 0 0 1 0 1 0 0 1 0 0 1 1 0 1 1 1 0 0 1 0 0 1 0 1 1 0 1 1 1 0 1 1 0 1 1 1 1 1 1 0 0 0

Simply, construct `NeuralNetwork` class with the required architecture and `learningRate`.

C++
`NeuralNetwork net({ 3, 5, 3 }, 0.05, NeuralNetwork::Activation::TANH);`

3 neurons in input layer, 5 neurons in hidden layer, and 3 neurons in output layer.
0.05 learning rate.

The following figure describes full training process for the network with 50,000 trials.

#### Train Network

C++
```void train(NeuralNetwork& net) {
cout << "Training:" << endl;
RowVector input(3), output(3);
int stop = 0;
for (int i = 0; stop < 8 && i < 50000; i++) {
cout << i + 1 << endl;
for (int num = 0; stop < 8 && num < 8; num++) {
input.coeffRef(0) = (num >> 2) & 1;
input.coeffRef(1) = (num >> 1) & 1;
input.coeffRef(2) = num & 1;

output.coeffRef(0) = ((num + 1) >> 2) & 1;
output.coeffRef(1) = ((num + 1) >> 1) & 1;
output.coeffRef(2) = (num + 1) & 1;

net.train(input, output);
double mse = net.mse();
cout << "In [" << input << "] "
<< " Desired [" << output << "] "
<< " Out [" << net.mNeurons.back()->unaryExpr(ptr_fun(unary)) << "] "
<< " MSE [" << mse << "]" << endl;
stop = mse < 0.1 ? stop + 1 : 0;
}
}
}```

The function takes a network with an architecture { 3, 5, 3 } and does 50000x8 training call till it reaches acceptable error margin. After each training call, it displays input, output, and desired output.

1. In the first stages of training, the MSE (mean square error) is large, and output is so far from desired output.

2. After many rounds of training, the MSE decreased, and output came closer to desired output.

3. Finally, after 788 rounds, the MSE became less than 0.1 and the output was close to desired output.

#### Test Network

C++
```void test(NeuralNetwork& net) {
cout << "Testing:" << endl;

RowVector input(3), output(3);
for (int num = 0; num < 8; num++) {
input.coeffRef(0) = (num >> 2) & 1;
input.coeffRef(1) = (num >> 1) & 1;
input.coeffRef(2) = num & 1;

output.coeffRef(0) = ((num + 1) >> 2) & 1;
output.coeffRef(1) = ((num + 1) >> 1) & 1;
output.coeffRef(2) = (num + 1) & 1;

net.test(input, output);

double mse = net.mse();
cout << "In [" << input << "] "
<< " Desired [" << output << "] "
<< " Out [" << net.mNeurons.back()->unaryExpr(ptr_fun(unary)) << "] "
<< " MSE [" << mse << "]" << endl;
}
}```

This function tests some inputs with the pre-trained network. It prints resultant output and MSE.

#### Save Network

C++
```int main() {
NeuralNetwork net({ 3, 5, 3 }, 0.05);
RowVector input(3), output(3);

train(net, input, output);
test(net, input, output);
net.save("params.txt"); // Save architecture and weights

return 0;
}```

After training and testing network, we can save network structure in a file to be loaded later for network usage without retraining.

For our case, resultant file contains:

```learningRate: 0.05
architecture: 3,5,3
activation: 0
weights:
-1.34013   0.811848   0.314629    1.85447  -0.343212   0.151176
0.98971  -0.684254    1.20649   0.260128   -6.50245   -2.31706
0.702027   -3.15824   -0.80735    1.07841   -2.57619   -2.17761
0.13025    3.17894   0.594173   -3.18092 -0.0574412   -2.39394,
-2.67379  0.467493  0.403606
-1.22918   1.67581   1.60877
1.1605  -1.95284  0.942444
-1.92978 -0.704029  -1.12284
-1.34765   -2.8206   1.44205
-0.996246  -1.52939  0.205469```

The first line in weights section represents weights between first neuron in input layer and all neurons of next layer:

`-1.34013   0.811848   0.314629    1.85447  -0.343212   0.151176`

The second line in weights section represents weights between second neuron in input layer and all neurons of next layer:

`0.98971  -0.684254    1.20649   0.260128   -6.50245   -2.31706`

and, so on ...

### Handwritten Digits Recognition

Handwritten recognition is one of the most successful application for Artificial Neural Network. It is the "`Hello world`" application for Neural Network study. In the previous use case, I use a shallow neural network, which has three layers of neurons that process inputs and generate outputs. Shallow neural networks can handle equally complex problems. But, in Handwritten Recognition, we need more accuracy and nonlinearity. Therefore, I have to use Deep Neural Network (DNN). DNN has two or more hidden layers of neurons that process inputs.

#### Network Architecture

Using a network architecture {784, 64, 16, 10} (input - two hidden layers - output), I have achieved a success of 93.16%.

#### Activity Diagram

The following figure illustrates activity diagram of the whole process.

#### Used Libraries

This project uses:

• MNIST dataset for network training and testing. You have to download MNIST dataset files and put them in project execution path.
• libpng library for PNG files reading. You can download libpng16 (lib - h) files and put it in project build path.
• zlib library used internally by libpng16 to decompress images.

MNIST dataset contains 60,000 training images of handwritten digits from zero to nine and 10,000 images for testing. So, the MNIST dataset has 10 different classes. The handwritten digits images are represented as a 28×28 matrix where each cell contains grayscale pixel value (0 to 1).

#### Training and Testing

During training and testing, digit is read from its PNG file and converted from 28x28 image to a 784 double value of gray scale. This vector represented the input to the input layer of the neural network.

C++
```void readPng(const char* filepath, RowVector*& data) {
pngwriter image;
int width = image.getwidth(); // 28
int height = image.getheight(); // 28
data = new RowVector(width * height); // 784

for (int y = 0; y < height; y++)
for (int x = 0; x < width; x++)
data->coeffRef(0, y * width + x) = image.dread(x, y);
}```

The following figure describes full training and testing processes for the network with 60,000 images (50,000 training - 10,000 testing).

1. In the first stages of training, error is large and the output is so far from the desired output.

2. After many rounds of training, the MSE decreased and the output came closer to the desired output.

3. After testing 10000 images:

4. Display Training and Testing Cost and error percentage:

#### Save Network

C++
```int main() {
.......
if (!testOnly)
net.save("params.txt");

return 0;
}  ```

After training and testing network, we can save network structure in a file to be loaded later for network usage without retraining. If you are going to retrain, you have to delete the file "params.txt" from build path.
For our case resultant file contains:

```learningRate: 0.05
architecture: 784,64,16,10
activation: 1
weights:
-0.997497    -0.307718   -0.0558184     0.124485    -0.188635     0.557909     0.242286
-0.898618    -0.942442     0.355693     0.284951     0.100192     0.724357    -0.998474
0.763909    -0.127537     0.893246    -0.956969    -0.492111    -0.775506    -0.603442
-0.907712    -0.987793   -0.0556963    -0.510117     0.450484     0.644276     0.951292
0.105869     -0.76458     0.586596     0.480819     0.253029    -0.672964    -0.418134
0.117222     0.121494     0.439985    -0.459639    -0.514145     0.458296     0.639027
-0.926817    -0.581164     0.774529    -0.392315    -0.985656     0.405133   -0.0527665
-0.0163884  -0.00704978     0.138768      -0.2219    -0.927671    -0.880856     0.977355
-0.927854     0.253273    -0.154149    -0.877621     0.797845     0.388653    0.0682699
0.3361    -0.108066
0.127171    -0.962889      0.39848    -0.457381     0.470931    -0.574816    -0.820429
-0.851558    -0.925108     0.224769     0.575488     0.975402    -0.688955      0.78692
0.0274972    -0.218848    -0.790765     0.708121     0.144139    -0.574694     0.749809
0.781732     0.362285    -0.662099    -0.903134     0.375225     0.581286    -0.679678
0.0863369     0.295511    -0.418195     0.241249    -0.720573    -0.794733    0.0434278
-0.81109     0.895749     0.652699     0.970824     0.643422   -0.0625935     0.776421
-0.656117      0.23075     -0.18247    -0.250649    -0.197546     0.621632     0.804376
-0.976745     0.178747     0.137059    -0.404828    -0.564013    -0.309915    -0.376385
-0.66924     0.245216      -0.3961     0.160741     0.364788     0.150121    -0.811396
-0.837397    -0.901669
....```

#### Evaluation

After testing the network we can calculate evaluation items Precision, Recall, and F1 score from the Confusion Matrix.

Precision is the ratio between correct recognition (true positive) to predicted digit.

C++
`Precision = (0.95+0.97+0.95+0.95+0.92+0.93+0.96+0.95+0.94+0.89)/10 = 94%`

Recall is the ratio between correct recognition (true positive) to actual digit.

C++
`Recall = (0.98+0.98+0.93+0.93+0.92+0.94+0.93+0.94+0.92+0.92)/10 = 94%`
C++
```void evaluate(NeuralNetwork& net) {
RowVector* precision, * recall;
net.confusionMatrix(precision, recall);

double precisionVal = precision->sum() / precision->cols();
double recallVal = recall->sum() / recall->cols();
double f1score = 2 * precisionVal * recallVal / (precisionVal + recallVal);

cout << "Confusion matrix:" << endl;
cout << *net.mConfusion << endl;
cout << "Precision: " << (int)(precisionVal * 100) << '%' << endl;
cout << *precision << endl;
cout << "Recall: " << (int)(recallVal * 100) << '%' << endl;
cout << *recall << endl;
cout << "F1 score: " << (int)(f1score * 100) << '%' << endl;
delete precision;
delete recall;
}```

The resultant values are like that:

```Confusion matrix:
98.6735   0.102041          0   0.102041          0   0.204082   0.306122   0.306122   0.306122          0
5.659e-313    98.2379   0.264317   0.176211          0          0   0.264317   0.176211   0.792952  0.0881057
1.06589   0.290698    93.5078   0.387597    1.45349   0.290698   0.484496    1.16279   0.968992   0.387597
0    0.29703    1.18812    93.5644  0.0990099    1.48515    0.29703   0.990099   0.891089    1.18812
0.101833          0   0.407332   0.101833    92.9735          0   0.916497   0.203666          0    5.29532
0.44843   0.112108   0.112108    2.01794   0.336323    94.2825    0.44843   0.560538    1.00897   0.672646
1.46138   0.313152   0.417537          0    1.04384    2.71399    93.6326   0.104384   0.313152          0
0    1.07004    1.16732   0.194553   0.583658   0.194553          0    94.5525  0.0972763    2.14008
0.616016   0.513347   0.616016    1.12936    1.12936   0.718686   0.821355   0.821355    92.4025    1.23203
0.99108   0.396432          0   0.891972    3.07235   0.396432   0.099108   0.693756    0.49554    92.9633
Precision: 94%
0.95459 0.972949 0.958292 0.951662 0.922222 0.934444 0.961415 0.951076 0.948367 0.895893
Recall: 94%
0.986735 0.982379 0.935078 0.935644 0.929735 0.942825 0.936326 0.945525 0.924025 0.929633
F1 score: 94%```

We can visualize the confusion matrix in the following table:

This table shows how often the model classified each digit correctly in blue, and which digits were most often confused for that label in gray.

## History

• 24th January, 2021: Initial post
• 7th March, 2021: Evaluate model with Confusion Matrix

Written By
Software Developer (Senior)
Egypt

 First PrevNext
 Tensorflow library in C++ Rohit from Bhopal23-Apr-24 21:09 Rohit from Bhopal 23-Apr-24 21:09
 In DigitalNeuralNetwork code, Call to net.test(input, output); in test(net) causes net.mse() value to change. dbaechtel1-Dec-23 8:40 dbaechtel 1-Dec-23 8:40
 NeuraNetwork::test should NOT modify the neuron weights or bias. dbaechtel26-Nov-23 6:19 dbaechtel 26-Nov-23 6:19
 Re: NeuraNetwork::test should NOT modify the neuron weights or bias. Hatem Mostafa26-Nov-23 6:36 Hatem Mostafa 26-Nov-23 6:36
 Re: NeuraNetwork::test should NOT modify the neuron weights or bias. dbaechtel26-Nov-23 12:11 dbaechtel 26-Nov-23 12:11
 Re: NeuraNetwork::test should NOT modify the neuron weights or bias. Hatem Mostafa26-Nov-23 22:13 Hatem Mostafa 26-Nov-23 22:13
 Re: NeuraNetwork::test should NOT modify the neuron weights or bias. dbaechtel27-Nov-23 2:59 dbaechtel 27-Nov-23 2:59
 Re: NeuraNetwork::test should NOT modify the neuron weights or bias. Hatem Mostafa27-Nov-23 6:24 Hatem Mostafa 27-Nov-23 6:24
 unary function is Undefined dbaechtel25-Nov-23 15:29 dbaechtel 25-Nov-23 15:29
 Re: unary function is Undefined Hatem Mostafa26-Nov-23 0:20 Hatem Mostafa 26-Nov-23 0:20
 Re: unary function is Undefined dbaechtel26-Nov-23 11:59 dbaechtel 26-Nov-23 11:59
 Newbie question UrbanBlues25-Nov-23 0:31 UrbanBlues 25-Nov-23 0:31
 Re: Newbie question Hatem Mostafa25-Nov-23 6:43 Hatem Mostafa 25-Nov-23 6:43
 Re: Newbie question UrbanBlues26-Nov-23 19:54 UrbanBlues 26-Nov-23 19:54
 My vote of 5 Ștefan-Mihai MOGA13-Nov-23 16:06 Ștefan-Mihai MOGA 13-Nov-23 16:06
 Bravo Chriistophe13-Oct-23 5:04 Chriistophe 13-Oct-23 5:04
 Re: Bravo Chriistophe13-Oct-23 5:15 Chriistophe 13-Oct-23 5:15
 Arabic ACC TC5-Mar-23 21:30 ACC TC 5-Mar-23 21:30
 Re: Arabic Hatem Mostafa5-Mar-23 21:39 Hatem Mostafa 5-Mar-23 21:39
 ReLu method Member 148349349-Jan-23 18:50 Member 14834934 9-Jan-23 18:50
 perfect mohamed saiid22-Dec-22 10:07 mohamed saiid 22-Dec-22 10:07
 My vote of 5 ZeusT4-Apr-22 7:30 ZeusT 4-Apr-22 7:30
 how well does it recognize alphabetical letters? Southmountain3-Apr-22 6:07 Southmountain 3-Apr-22 6:07
 Re: how well does it recognize alphabetical letters? Chrishka12-Apr-22 11:39 Chrishka 12-Apr-22 11:39
 Re: how well does it recognize alphabetical letters? Hatem Mostafa16-Apr-22 12:16 Hatem Mostafa 16-Apr-22 12:16
 Last Visit: 31-Dec-99 18:00     Last Update: 23-Jul-24 3:49 Refresh 123 Next ᐅ