This article is about a C# 4.0 framework that lets you create, train, and test convolutional neural networks against the MNIST dataset of handwritten digits and the CIFAR-10 dataset of 10 different natural objects. I initially based it on an article by Mike O'Neill on The Code Project and gradually added new features that I found interesting in research papers available on the internet. Dr. Yann LeCun's paper "Gradient-Based Learning Applied to Document Recognition" is a great way to get a better understanding of the principles of convolutional neural networks and of why they are so successful in the area of machine vision.
The main goal of this project was to build a more flexible and extendable managed version of Mike O'Neill's excellent C++ project. I've included and used the excellent WPF TaskDialog Wrapper from Sean A. Hanley, the Extended WPF Toolkit and, for unzipping the CIFAR-10 dataset, the open-source SharpDevelop SharpZipLib module. Visual Studio 2010 and Windows Vista SP2 are the minimum requirements (or just the OS if you only use the setup). I made maximal use of the parallel functionality offered in C# 4.0: at any time, the user can choose how many logical cores the parallelized parts of the code use by moving the slider next to the View combobox.
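To illustrate how a user-selected core count can be passed to a parallel loop, here is a minimal sketch using `Parallel.For` with `ParallelOptions.MaxDegreeOfParallelism` from .NET 4.0. The class, method, and parameter names (`ParallelSketch`, `ActivateLayer`, `maxCores`) are hypothetical and are not taken from the project; the project's own loops may be organized differently.

```csharp
using System;
using System.Threading.Tasks;

public static class ParallelSketch
{
    // Applies tanh to every weighted sum, using at most maxCores worker threads.
    // maxCores stands in for the slider value (hypothetical name).
    public static double[] ActivateLayer(double[] weightedSums, int maxCores)
    {
        var options = new ParallelOptions
        {
            // Clamp to at least 1 so a bad slider value cannot throw.
            MaxDegreeOfParallelism = Math.Max(1, maxCores)
        };

        double[] outputs = new double[weightedSums.Length];

        // Each iteration writes to its own slot, so no locking is needed.
        Parallel.For(0, weightedSums.Length, options, i =>
        {
            outputs[i] = Math.Tanh(weightedSums[i]);
        });

        return outputs;
    }
}
```

Because every neuron's output is independent of the others within a layer, this kind of loop scales with the number of cores without any shared state.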
Using the Code
Here is the example code to construct a LeNet-5 network in my code (see the InitializeDefaultNeuralNetwork() function in MainViewWindows.xaml.cs):
NeuralNetworks cnn = new NeuralNetworks("LeNet-5", 10, 0.8D, LossFunctions.MeanSquareError, DataProviderSets.MNIST, 0.02D);
// Input: one 32x32 map.
cnn.AddLayer(new Layers(cnn, LayerTypes.Input, 1, 32, 32));
// C1: 6 maps of 28x28 with 5x5 kernels, followed by S2: 2x2 average pooling down to 14x14.
cnn.AddLayer(new Layers(cnn, LayerTypes.Convolutional, ActivationFunctions.Tanh, 6, 28, 28, 5, 5));
cnn.AddLayer(new Layers(cnn, LayerTypes.AveragePooling, ActivationFunctions.Tanh, 6, 14, 14, 2, 2));
// Connection table: rows are the 6 maps of S2, columns are the 16 maps of C3.
bool[] maps = new bool[6 * 16]
{
    true, false,false,false,true, true, true, false,false,true, true, true, true, false,true, true,
    true, true, false,false,false,true, true, true, false,false,true, true, true, true, false,true,
    true, true, true, false,false,false,true, true, true, false,false,true, false,true, true, true,
    false,true, true, true, false,false,true, true, true, true, false,false,true, false,true, true,
    false,false,true, true, true, false,false,true, true, true, true, false,true, true, false,true,
    false,false,false,true, true, true, false,false,true, true, true, true, false,true, true, true
};
// C3: 16 maps of 10x10, partially connected to S2 through the table above.
cnn.AddLayer(new Layers(cnn, LayerTypes.Convolutional, ActivationFunctions.Tanh, 16, 10, 10, 5, 5, new Mappings(2, maps)));
// S4: 2x2 average pooling down to 5x5, then C5: 120 maps of 1x1.
cnn.AddLayer(new Layers(cnn, LayerTypes.AveragePooling, ActivationFunctions.Tanh, 16, 5, 5, 2, 2));
cnn.AddLayer(new Layers(cnn, LayerTypes.Convolutional, ActivationFunctions.Tanh, 120, 1, 1, 5, 5));
// Output: 10 fully connected neurons, one per digit class.
cnn.AddLayer(new Layers(cnn, LayerTypes.FullyConnected, ActivationFunctions.Tanh, 10), true);
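The map sizes in the listing above (32, 28, 14, 10, 5, 1) follow from simple arithmetic. The sketch below reproduces them under the usual LeNet-5 assumptions: convolutions with stride 1 and no padding, and non-overlapping pooling windows. The helper names (`LeNetSizes`, `ConvOutput`, `PoolOutput`, `Trace`) are hypothetical and not part of the framework.

```csharp
using System;

public static class LeNetSizes
{
    // Convolution with stride 1 and no padding: output = input - kernel + 1.
    public static int ConvOutput(int input, int kernel)
    {
        return input - kernel + 1;
    }

    // Non-overlapping pooling: output = input / window.
    public static int PoolOutput(int input, int window)
    {
        return input / window;
    }

    // Walks a 32x32 input through the five convolution/pooling layers.
    public static int[] Trace()
    {
        int c1 = ConvOutput(32, 5);  // 28
        int s2 = PoolOutput(c1, 2);  // 14
        int c3 = ConvOutput(s2, 5);  // 10
        int s4 = PoolOutput(c3, 2);  // 5
        int c5 = ConvOutput(s4, 5);  // 1
        return new[] { c1, s2, c3, s4, c5 };
    }
}
```

Checking the sizes this way before adding a layer is a quick guard against a width or height that does not match the kernel size you pass in.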
In Design View you can see how your network is defined and get a good picture of the current distribution of weight values in all the layers concerned.
In Training View you train the network. The 'Play' button gives you the 'Select Training Parameters' dialog where you define all the training parameters. The 'Training Scheme Editor' button gives you the possibility to make your own training schemes to experiment with. At any time, the training can easily be paused or aborted. The 'Star' button will forget (reset) all the learned weight values.
In Testing View you get a better picture of the testing samples (or, optionally, the training samples) that are not recognized correctly.
In Calculate View you can test a single testing or training sample with the desired properties and get a graphical view of all the output values in every layer.
I would love to see a DirectCompute 5.0 integration for offloading the highly parallel task of training the neural network to a DirectX 11 compliant GPU if one is available. But I've never programmed with DirectX or any other shader-based language before, so if there's anyone out there with some more experience in this area, any help is very welcome.
I made an attempt to use a simple MVVM structure in this WPF application. In the Model folder, you can find the files for the neural network class and also a DataProvider class, which deals with loading and providing the necessary MNIST and CIFAR-10 training and testing samples. There is also a NeuralNetworkDataSet class that is used by the project to load and save neural network definitions, weights, or both (full) from or to a file on disk. Then there is the View folder, which contains the four different PageViews in the project and a global PageView that acts as a container for the different views (Design, Training, Testing and Calculate). In the ViewModel folder, you will find a PageViewModelBase class from which the four corresponding ViewModels are derived. All other coded functionality is found in the MainViewWindows.xaml.cs class.
I hope there's someone out there who can actually use this code and improve on it: extend it with an unsupervised learning stage (an encoder/decoder construction, for example), implement better loss functions, implement stochastic conjugate gradient descent, extend it to more test databases, make use of more advanced squashing functions, etc.
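For readers unfamiliar with MVVM, the sketch below shows the general shape of a view-model base class with one derived view model. It is only an illustration of the pattern: the class and property names (`PageViewModelSketch`, `TrainingViewModelSketch`, `CurrentEpoch`) are hypothetical, and the project's actual PageViewModelBase may look quite different.

```csharp
using System.ComponentModel;

// Hypothetical base class: raises change notifications that WPF bindings listen to.
public abstract class PageViewModelSketch : INotifyPropertyChanged
{
    public event PropertyChangedEventHandler PropertyChanged;

    protected void OnPropertyChanged(string propertyName)
    {
        // Copy the delegate first so a concurrent unsubscribe cannot null it.
        PropertyChangedEventHandler handler = PropertyChanged;
        if (handler != null)
        {
            handler(this, new PropertyChangedEventArgs(propertyName));
        }
    }
}

// Hypothetical derived view model for one of the four views.
public class TrainingViewModelSketch : PageViewModelSketch
{
    private int currentEpoch;

    public int CurrentEpoch
    {
        get { return currentEpoch; }
        set
        {
            if (currentEpoch == value) return;  // skip redundant notifications
            currentEpoch = value;
            OnPropertyChanged("CurrentEpoch");
        }
    }
}
```

Keeping the notification plumbing in one base class means each view model only declares its own properties, which is the usual reason for this kind of hierarchy.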
- Bugfix: Download datasets now working as expected. (thanks stevic & co)
- Bugfix: Softmax function corrected.
- Bugfix: DropOut function corrected.
- Bugfix: Local layer and Convolutional layer now work properly.
- Bugfix: Cross Entropy loss now works better (in combination with a SoftMax activation function in a final fully connected layer).
- Added LeCun's standard LeNet-5 training scheme.
- Now the last min/max display preference is saved.
- Added some extra predefined training parameters and schemes.
- Bugfix: Average error not showing correctly after switching between neural networks with a different objective function.
- Bugfix: Sample not always showing correctly in calculate view.
- Bugfix: The end result in testing view is not displaying the correct values.
- Supports dropout.
- Supports the Local layer type. (Same as the convolutional layer but with non-shared kernel weights)
- Supports padding in the Local and Convolutional layers. (makes it possible to create deep networks)
- Supports overlapping receptive fields in the pooling layers.
- Supports weightless pooling layers.
- Supports all the commonly used activation functions.
- Supports Cross Entropy objective function in combination with a SoftMax activation function in the output layer.
- Ability to specify a density percentage to generate mappings between layers far more easily.
- DataProvider class with a much reduced memory footprint and common logical functionality shared across all datasets (easier to add datasets).
- Much improved UI speed and general functionality.
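
Several entries above mention the Cross Entropy objective combined with a SoftMax output layer. As a rough illustration of why the two are usually paired, here is a minimal, self-contained sketch. It is not the project's implementation, and the names (`SoftmaxSketch`, `Softmax`, `CrossEntropy`, `Gradient`) are hypothetical.

```csharp
using System;
using System.Linq;

public static class SoftmaxSketch
{
    // Turns raw output values into probabilities that sum to 1.
    public static double[] Softmax(double[] logits)
    {
        double max = logits.Max();  // subtract the maximum for numerical stability
        double[] exps = logits.Select(z => Math.Exp(z - max)).ToArray();
        double sum = exps.Sum();
        return exps.Select(e => e / sum).ToArray();
    }

    // Cross-entropy loss for a single sample with a one-hot target.
    public static double CrossEntropy(double[] probabilities, int targetClass)
    {
        // Clamp to avoid log(0) when the target probability underflows.
        return -Math.Log(Math.Max(probabilities[targetClass], 1e-12));
    }

    // Gradient of the loss with respect to the raw outputs: p - onehot(target).
    public static double[] Gradient(double[] probabilities, int targetClass)
    {
        double[] gradient = (double[])probabilities.Clone();
        gradient[targetClass] -= 1.0;
        return gradient;
    }
}
```

With this pairing, the gradient at the output layer reduces to the predicted probabilities minus the one-hot target, which keeps the backward pass simple.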