Download CNNWB Sources (21.3 MB)
Download Setup (7.5 MB)
Introduction
This article is about a framework in C# 4.0 that allows you to create, train, and test convolutional neural networks against the MNIST dataset of handwritten digits and the CIFAR-10 dataset of 10 different classes of natural objects. I initially based my work on an article by Mike O'Neill on The Code Project and gradually added new features that I found interesting in research papers on the internet. Dr. Yann LeCun's paper, Gradient-Based Learning Applied to Document Recognition, is a great read to get a better understanding of the principles of convolutional neural networks and the reason why they are so successful in the area of machine vision.
The Code
The main goal of this project was to build a more flexible and extendable managed version of Mike O'Neill's excellent C++ project. I've included and used the splendid WPF TaskDialog Wrapper from Sean A. Hanley, the Extended WPF Toolkit, and, for unzipping the CIFAR-10 dataset, the open-source SharpDevelop SharpZipLib module. Visual Studio 2012/2013 and Windows 7 are the minimum requirements. I made maximal use of the parallel functionality offered in C# 4.0: with a simple slider next to the View combobox, the user can choose at any time how many logical cores are used in the parallel-optimized code parts.
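As an illustration only (this is a simplified stand-in for the project's parallel loops, not code from the source), capping the number of cores used by a parallel loop in .NET looks like this:

using System;
using System.Threading.Tasks;

class ParallelismSketch
{
    static void Main()
    {
        // Hypothetical value coming from the slider next to the View combobox.
        int coresSelectedByUser = 4;

        // MaxDegreeOfParallelism caps how many logical cores the loop may use.
        var options = new ParallelOptions { MaxDegreeOfParallelism = coresSelectedByUser };

        double[] activations = new double[1000];

        // Each iteration is independent, so the work is spread across the allowed cores.
        Parallel.For(0, activations.Length, options, i =>
        {
            activations[i] = Math.Tanh(i * 0.001);
        });

        Console.WriteLine("Computed {0} activations on at most {1} cores.",
            activations.Length, coresSelectedByUser);
    }
}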
Using the Code
Here is the example code to construct a LeNet-5 network (see the InitializeDefaultNeuralNetwork() function in MainViewWindows.xaml.cs):
// Create the network: MNIST data, mean square error loss,
// SGD with Levenberg-Marquardt as the training strategy.
NeuralNetwork cnn = new NeuralNetwork(
    DataProvider, "LeNet-5", 10, 0.8D, LossFunctions.MeanSquareError,
    DataProviderSets.MNIST, TrainingStrategy.SGDLevenbergMarquardt, 0.02D);

// Input: a single 32x32 plane (the 28x28 MNIST digits centered in a 32x32 field).
cnn.AddLayer(LayerTypes.Input, 1, 32, 32);
// C1: 6 feature maps of 28x28, 5x5 receptive fields.
cnn.AddLayer(LayerTypes.Convolutional, ActivationFunctions.Tanh, 6, 28, 28, 5, 5);
// S2: 6 maps of 14x14, 2x2 average pooling.
cnn.AddLayer(LayerTypes.AveragePooling, ActivationFunctions.Tanh, 6, 14, 14, 2, 2);
// Connection table between S2's 6 maps and C3's 16 maps (true = connected).
bool[] maps = new bool[6 * 16]
{
    true,  false, false, false, true,  true,  true,  false, false, true,  true,  true,  true,  false, true,  true,
    true,  true,  false, false, false, true,  true,  true,  false, false, true,  true,  true,  true,  false, true,
    true,  true,  true,  false, false, false, true,  true,  true,  false, false, true,  false, true,  true,  true,
    false, true,  true,  true,  false, false, true,  true,  true,  true,  false, false, true,  false, true,  true,
    false, false, true,  true,  true,  false, false, true,  true,  true,  true,  false, true,  true,  false, true,
    false, false, false, true,  true,  true,  false, false, true,  true,  true,  true,  false, true,  true,  true
};
// C3: 16 feature maps of 10x10, 5x5 receptive fields, partially connected via maps.
cnn.AddLayer(LayerTypes.Convolutional, ActivationFunctions.Tanh, 16, 10, 10, 5, 5, new Mappings(maps));
// S4: 16 maps of 5x5, 2x2 average pooling.
cnn.AddLayer(LayerTypes.AveragePooling, ActivationFunctions.Tanh, 16, 5, 5, 2, 2);
// C5: 120 maps of 1x1 (the 5x5 kernels cover the whole 5x5 input).
cnn.AddLayer(LayerTypes.Convolutional, ActivationFunctions.Tanh, 120, 1, 1, 5, 5);
// Output: 10 fully connected units, one per digit class.
cnn.AddLayer(LayerTypes.FullyConnected, ActivationFunctions.Tanh, 10);
// Randomly initialize all weights before training.
cnn.InitializeWeights();
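The maps array above is the classic LeNet-5 connection table: each true entry connects one of S2's 6 maps to one of C3's 16 maps, which breaks the symmetry between feature maps and keeps the number of connections reasonable. For a fully connected mapping you could generate the array instead of writing it out by hand; here is a minimal sketch (the FullMappings helper is mine, not part of the framework):

// Hypothetical helper: builds a mapping table in which every input
// map is connected to every output map.
static bool[] FullMappings(int inputMaps, int outputMaps)
{
    bool[] mappings = new bool[inputMaps * outputMaps];
    for (int i = 0; i < mappings.Length; i++)
        mappings[i] = true;   // true = connected
    return mappings;
}

// Usage: a fully connected C3 layer instead of the partial table above.
// cnn.AddLayer(LayerTypes.Convolutional, ActivationFunctions.Tanh,
//     16, 10, 10, 5, 5, new Mappings(FullMappings(6, 16)));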
Design View
In Design View you can see how your network is defined and get a good picture of the current distribution of the weight values in each layer.
Training View
In Training View you obviously train the network. The 'Play' button opens the 'Select Training Parameters' dialog where you define the training parameters. The 'Training Scheme Editor' button lets you build training schemes to experiment with. The training can be paused or aborted at any time; while it is paused, you can save the current weights. The 'Star' button forgets (randomly resets) all the learned weight values in every layer.
Testing View
In Testing View you get a better picture of the testing (or training) samples that are not recognized correctly.
Calculate View
In Calculate View you can test a single testing or training sample with the desired properties and get a graphical view of all the output values in every layer.
Final Words
I would love to see a GPU integration for offloading the highly parallel task of learning the neural network. I made an attempt to use a simple MVVM structure in this WPF application. In the Model folder you will find the NeuralNetwork
and DataProvider
class which provide all the neural network code and deals with loading and providing the necessary training and testing samples. Also a NeuralNetworkDataSet
class is used to load and save neural network definitions. The View folder contains four different PageViews
and a global PageView
which acts as the container for all the different views (Design
, Training
, Testing
and Calculate
). Hope there's someone out there who can actually use the code and improve on it. Extend it with an unsupervised learning stage, (encoder/decoder construction), implement better loss-functions, more training strategies (conjugate gradient, l-bgfs, ...), more datasets, better activation fuctions, ...
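As one concrete starting point for such extensions, here is a minimal, self-contained sketch of the softmax/cross-entropy combination that the framework already supports in its output layer (the helper names are mine, not taken from the project):

using System;
using System.Linq;

static class LossSketch
{
    // Softmax over the raw outputs, numerically stabilized by
    // subtracting the maximum before exponentiating.
    static double[] Softmax(double[] raw)
    {
        double max = raw.Max();
        double[] exp = raw.Select(v => Math.Exp(v - max)).ToArray();
        double sum = exp.Sum();
        return exp.Select(v => v / sum).ToArray();
    }

    // Cross-entropy loss against a one-hot target reduces to the
    // negative log probability of the correct class.
    static double CrossEntropy(double[] probabilities, int targetClass)
    {
        return -Math.Log(probabilities[targetClass]);
    }

    static void Main()
    {
        double[] rawOutputs = { 1.2, 0.3, -0.8 };
        double[] p = Softmax(rawOutputs);
        Console.WriteLine("Loss for target class 0: {0:F4}", CrossEntropy(p, 0));
    }
}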
History
- 1.0.3.7: (07-08-14) (updated download of Setup & Sources on 07-21-14)
- BugFix: slow speed resolved in Testing View
- Added the SGDLevenbergMarquardtModA training strategy, which can be used with a softmax output layer
- Possibility to save the weights while training: just click Pause and then Save/Save as...
- Various smaller fixes and optimizations
- Choice between four Training Strategies:
- SGDLevenbergMarquardt
- SGDLevenbergMarquardtMiniBatch
- SGD
- SGDMiniBatch
- BugFix: Derivative of the ReLU activation function
- Added SoftSign activation function
- Overall 20% better training performance than the previous version
- Faster binary save of the network weights
- Various smaller fixes and optimizations
- StochasticPooling and L2Pooling layers now correctly implemented
- Native C++ implementation + managed C++/CLI wrapper (removed for now)
- Support for Stochastic Pooling layers
- Much faster binary load and save of a cnn
- ReLU activation function now working as expected
- Bugfix: Download datasets now working as expected
- Bugfix: Softmax function corrected
- Bugfix: DropOut function corrected
- Bugfix: Local layer and Convolutional layer now works properly
- Bugfix: Cross Entropy loss now works better (in combination with a SoftMax activation function in a final fully connected layer)
- Added LeCun's standard LeNet-5 training scheme
- Now the last min/max display preference is saved
- Added some extra predefined training parameters and schemes
- Bugfix: Average error not showing correctly after switching between neural networks with a different objective function
- Bugfix: Sample not always showing correctly in Calculate View
- Bugfix: The end result in Testing View was not displaying the correct values
- Supports dropout
- Supports the Local layer type (Same as the convolution layer but with non-shared kernel weights)
- Supports padding in the Local and Convolution layers (gives ability to create deep networks)
- Supports overlapping receptive fields in the pooling layers
- Supports weightless pooling layers
- Supports all the commonly used activation functions
- Supports Cross Entropy objective function in combination with a SoftMax activation function in the output layer
- Ability to specify a density percentage to generate mappings between layers far more easily
- Improved DataProvider class with a much reduced memory footprint and common logical functionality shared across all datasets (easier to add datasets)
- Much improved UI speed and general functionality