
Convolutional Neural Network Workbench

By Filip D'haene, 9 Dec 2011
 
This is an old version of the currently published article.

Introduction

This article presents a Microsoft C# 4.0 WPF implementation of a framework for creating, training, and testing convolutional neural networks against the MNIST dataset of handwritten digits or the CIFAR-10 dataset of 10 different natural objects. There is a magnificent article by Mike O'Neill on The Code Project about the same subject. Without his great article and C++ demo code, this project wouldn't exist. I also relied heavily on Dr. Yann LeCun's paper, Gradient-Based Learning Applied to Document Recognition, to understand more about the principles of convolutional neural networks and why they are so successful in machine vision.

Mike O'Neill uses Patrice Simard's implementation, where the subsampling step is integrated into the structure of the convolutional layer itself. Dr. Yann LeCun's LeNet-5 uses a separate subsampling step and also uses layers that are not fully connected. The framework presented here supports all of these layer types and adds a Max-Pooling layer that you can use instead of plain Average-Pooling.

The default squashing function is tanh(), and the value to train for is set to 0.8 because, for the non-linearity used, it lies near the region where the second derivative peaks, so there is less saturation. The input images are all normalised to the range -1 to 1, and the input layer is a fixed 32x32 window.
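As an illustration of the input preparation described above, here is a minimal sketch. It is not taken from the workbench source: the class name InputPreparation, the centring of the image in the window, and the choice of -1 as the padding value are all assumptions made for this example.

```csharp
using System;

// Hypothetical helper, not part of the workbench source.
public static class InputPreparation
{
    // Maps a row-major grayscale image (bytes 0..255) onto a square window
    // of doubles in [-1, 1]. Centring and the -1 padding are assumptions.
    public static double[] ToNetworkInput(byte[] pixels, int width = 28, int height = 28, int window = 32)
    {
        if (pixels == null) throw new ArgumentNullException("pixels");
        if (pixels.Length != width * height) throw new ArgumentException("Unexpected pixel count.", "pixels");
        if (width > window || height > window) throw new ArgumentException("Image is larger than the window.");

        double[] input = new double[window * window];
        for (int i = 0; i < input.Length; i++)
            input[i] = -1.0;                       // pad with the lowest value

        int offsetX = (window - width) / 2;
        int offsetY = (window - height) / 2;
        for (int y = 0; y < height; y++)
            for (int x = 0; x < width; x++)
                input[(y + offsetY) * window + (x + offsetX)] =
                    pixels[y * width + x] / 255.0 * 2.0 - 1.0;   // 0..255 -> -1..1

        return input;
    }
}
```

With the default arguments, a 28x28 MNIST image ends up centred in the 32x32 window with a two-pixel border on every side.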

The Code

The main goal of this project was to build an enhanced and extended version of Mike O'Neill's excellent C++ project, this time written in C# 4.0 and using WPF with a simple MVVM pattern for the GUI instead of Windows Forms. I've included the WPF TaskDialog Wrapper from Sean A. Hanley instead of the Windows API Code Pack because it is more compact and fits my needs perfectly. The WPF ColorPicker component is a copy from Ury Yamshy's article. Visual Studio 2010 and Windows Vista SP2 are therefore the minimum requirements to use my application. I also made maximal use of the parallel functionality offered in C# 4.0: at any time, the user can choose how many logical cores the parallel-optimised code uses by moving the slider next to the View combobox.

Using the code

Here is the example code to construct a LeNet-5 network in my code (see the InitializeDefaultNeuralNetwork() function in MainViewWindows.xaml.cs):

NeuralNetworks network = new NeuralNetworks("LeNet-5", 0.8D, LossFunctions.MeanSquareError, 0.02D);
// Input: a single 32x32 map.
network.Layers.Add(new Layers(network, LayerTypes.Input, 1, 32, 32));
// C1: 6 feature maps of 28x28, 5x5 kernels.
network.Layers.Add(new Layers(network, LayerTypes.Convolutional, ActivationFunctions.Tanh, 6, 28, 28, 5, 5));
// S2: 6 maps of 14x14, 2x2 average pooling.
network.Layers.Add(new Layers(network, LayerTypes.Subsampling, ActivationFunctions.AveragePoolingTanh, 6, 14, 14, 2, 2));

// S2-to-C3 connection table: 6 rows (S2 maps) by 16 columns (C3 maps); true = connected.
List<bool> mapCombinations = new List<bool>(16 * 6)
{
 true, false,false,false,true, true, true, false,false,true, true, true, true, false,true, true,
 true, true, false,false,false,true, true, true, false,false,true, true, true, true, false,true,
 true, true, true, false,false,false,true, true, true, false,false,true, false,true, true, true,
 false,true, true, true, false,false,true, true, true, true, false,false,true, false,true, true,
 false,false,true, true, true, false,false,true, true, true, true, false,true, true, false,true,
 false,false,false,true, true, true, false,false,true, true, true, true, false,true, true, true
};

network.Layers.Add(new Layers(network, LayerTypes.Convolutional, ActivationFunctions.Tanh, 16, 10, 10, 5, 5, new Mappings(network, 2, mapCombinations)));
network.Layers.Add(new Layers(network, LayerTypes.Subsampling, ActivationFunctions.AveragePoolingTanh, 16, 5, 5, 2, 2));
network.Layers.Add(new Layers(network, LayerTypes.Convolutional, ActivationFunctions.Tanh, 120, 1, 1, 5, 5));
network.Layers.Add(new Layers(network, LayerTypes.FullyConnected, ActivationFunctions.Tanh, 10));
network.InitWeights();
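The map sizes passed to the constructors above follow from simple arithmetic: a 5x5 convolution without padding shrinks each side by 4, and 2x2 pooling halves it. The sketch below only checks that arithmetic; the class LayerSizes and its method names are hypothetical and do not exist in the workbench.

```csharp
// Hypothetical helper for checking layer dimensions; not part of the workbench.
public static class LayerSizes
{
    // Side length after a convolution with stride 1 and no padding.
    public static int Convolve(int inputSize, int kernelSize)
    {
        return inputSize - kernelSize + 1;
    }

    // Side length after non-overlapping pooling.
    public static int Pool(int inputSize, int poolSize)
    {
        return inputSize / poolSize;
    }
}
```

Chaining the calls reproduces the LeNet-5 sizes used above: Convolve(32, 5) gives 28, Pool(28, 2) gives 14, Convolve(14, 5) gives 10, Pool(10, 2) gives 5, and Convolve(5, 5) gives 1.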

This is Design view where you can see how the network is defined and see the weights of the convolutional layers. When you hover with the mouse over a single weight, a tooltip shows the corresponding weight value.

Training View

This is Training view where you train the network. The 'Play' button gives you the 'Select Training Parameters' dialog where you can define the basic training parameters. The 'Training Schema Editor' button gives you the possibility to fully define your own training schemas and to save and load them as you want. At any time, the training can be easily aborted by pressing the 'Stop' button.

TrainingSchemaEditor.PNG

In Testing view, you can test your network and get a graphical confusion matrix that represents all the misses.

Calculate View

In Calculate view, we can test a single digit with the desired properties and fire it through the network and get a graphical view of all the outputs in every layer.

Final Words

I would love to see a DirectCompute 5.0 integration that offloads the highly parallel task of training the neural network to a DirectX 11 compliant GPU if one is available. I've never programmed with DirectX or any other shader-based language before, so if anyone out there has more experience in this area, any help is very welcome.

I made an attempt to use a simple MVVM structure in this WPF application. In the Model folder, you can find the files for the neural network class and a DataProvider class that loads and provides the necessary MNIST training and testing samples. There is also a NeuralNetworkDataSet class that the project uses to load and save neural network definitions, weights, or both (full) from or to a file on disk. The View folder contains the four different PageViews in the project (Design, Training, Testing, and Calculate) and a global PageView that acts as a container for them. In the ViewModel folder, you will find a PageViewModelBase class from which the corresponding four ViewModels are derived. Everything else is in the MainViewWindows.xaml.cs class.

I hope someone out there can actually use this code and improve on it: extend it with an unsupervised learning stage (an encoder/decoder construction, for example), implement a better loss function (negative log likelihood instead of MSE), extend it to more test databases than just the handwritten MNIST digits, make use of more advanced squashing functions, and so on.

Releases

1.0.0.1:

- Now you can see all the weight and bias values in every layer.

- Renamed some items so that they make more sense (KernelTypes.Sigmoid => ActivationFunctions.Tanh)

- As a last layer you can use LeCun's RBF layer with fixed weights.

- Now it is possible to use ActivationFunctions.AbsTanh to have a rectified convolutional layer.

1.0.0.0: Initial release

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Filip D'haene
Software Developer
Belgium
No Biography provided


Comments and Discussions


Question: Using my own dataset to train and test (jdpsen, 11-Jan-13 2:42)
First of all, thank you so much, Sir, for your excellent work. I am doing my M.Tech thesis on convolutional neural nets. I have a hypothesis to test that requires a transformed version of the MNIST dataset. Can I use this platform to train and test with my own dataset?
Answer: Re: Using my own dataset to train and test (Filip D'haene, 11-Jan-13 4:01)
I don't see any problem in using my code for your purpose. It's open source, after all.
I hope you get a good result with your thesis! :)
Question: Using other data sets (Member 9174503, 9-Jan-13 14:25)
I was wondering if I could use data sets other than the ones you have mentioned here. Would it require extensive changes to the source code that only you could make, or is it something I would be able to do myself?
 
Thanks
Question: After simulation weight extraction (Member 9174503, 10-Dec-12 9:10)
Hi
 
I constructed my network and trained it, now I would like to look at the weights and biases. I need this information for further manipulation e.g. to see distribution of weights and biases etc. Can you please let me know how I can extract such info from the saved -gz file at the end. I know that I can look at them in the GUI but is there a way to have them in a file?
 
thanks in advance
Answer: Re: after simulation weight extraction (Filip D'haene, 11-Dec-12 2:09)
You can always save your CNN weights and biases to a file (Save as...).
The file normally ends with the .weights-xml-gz extension. Rename
it to a .weights-xml.gz extension and unzip it to get a plain-text
XML file that you can edit with any text or XML editor.
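The unzip step can also be done in code. The following is a minimal sketch that assumes the saved file is an ordinary gzip stream, as the renaming advice above suggests; the class WeightFileTools and its method name are hypothetical and are not part of the workbench.

```csharp
using System.IO;
using System.IO.Compression;

// Hypothetical helper, not part of the workbench source.
public static class WeightFileTools
{
    // Decompresses a gzip-compressed weights file into a plain XML file.
    public static void DecompressToXml(string gzipPath, string xmlPath)
    {
        using (FileStream source = File.OpenRead(gzipPath))
        using (GZipStream gzip = new GZipStream(source, CompressionMode.Decompress))
        using (FileStream target = File.Create(xmlPath))
        {
            gzip.CopyTo(target);   // Stream.CopyTo is available from .NET 4.0
        }
    }
}
```

The resulting XML file can then be loaded into any tool to inspect the distribution of weights and biases.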
Question: How could I enlarge some layers (crazyhj, 9-Nov-12 0:36)
Hi,
I want to enlarge C5 from 120 to 360 maps and the output from 10 to 36.
I just changed the 'mapCount' parameter in the Layers constructor.

Since that change, the error rate has increased to 90% no matter how much I train it.
What is the correct way to enlarge the layers?

By the way, I'm using the LeNet-5-RBF code in InitializeDefaultNeuralNetwork().
Answer: Re: how could I enlarge some layers [modified] (Filip D'haene, 10-Nov-12 14:12)
These are the steps you must take to expand a CNN with a final RBF layer:

For each neuron in the final RBF layer there is a black-and-white bitmap in the project under the Resources\Images directory. Each bitmap is exactly 7x12 pixels, and they are named "Image0.bmp" ... "Image9.bmp". You will have to add 26 extra bitmaps as resources in that directory ("Image10.bmp" ... "Image35.bmp"). The next step is to change the GetRbfWeightPatterns() function in the Model\NeuralNetworks.cs file: expand the number of neurons in the loop to 36. That's it.

modified 10-Nov-12 20:23pm.

Question: Range of weights (Member 9174503, 22-Oct-12 11:12)
Hi,

I was wondering what the range of the weights is in your simulation. In other words, what are the maximum and minimum values of the weights during training?
 
Thanks
Answer: Re: range of weights (Filip D'haene, 24-Oct-12 1:51)
You can stop the training at any time and look in Design View
to see the min/max weight values in every layer.
 

filip
General: Re: range of weights (Member 9174503, 24-Oct-12 8:34)
Yes, I actually tried that, but I was wondering whether the code bounds the range within which the weights can change. This is important to me because I am trying to introduce some variations in the training.
General: Re: range of weights (Filip D'haene, 25-Oct-12 12:25)
It's not possible to view the weights graphically while you are training. There's no direct
binding because it would make training very slow, if that's what you mean. If your question is
by how much a weight can change, you should read the paper by Dr. Yann LeCun.
General: Re: range of weights (Member 9174503, 26-Oct-12 7:52)
I guess my question was vague, and I apologize for that. What I meant was: what is the range of the weights through the entire simulation? For example, if you have defined your weights to stay between wmin and wmax, and a training update would push them beyond these values, the algorithm would clip them to remain in this range. Assuming this is how you implemented your algorithm, I would like to know what wmin and wmax are.
Thanks
General: Re: range of weights (Filip D'haene, 26-Oct-12 9:54)
I don't clip the weight values to any specific minimum or maximum.
It may not be such a bad idea, and it is easy to implement. Let me know if it helps the training
performance and accuracy of the CNN.
 
filip
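The clipping discussed in this thread could look roughly like the sketch below. It is only an illustration of the idea: the workbench does not clip weights, and the class WeightClipping, the bounds, and the plain double[] representation of the weights are assumptions made for this example.

```csharp
using System;

// Hypothetical helper; the workbench itself applies no clipping.
public static class WeightClipping
{
    // Restricts a single value to the interval [min, max].
    public static double Clip(double value, double min, double max)
    {
        return Math.Max(min, Math.Min(max, value));
    }

    // Clips every weight in place, e.g. after each backpropagation update.
    public static void ClipAll(double[] weights, double min, double max)
    {
        if (weights == null) throw new ArgumentNullException("weights");
        if (min > max) throw new ArgumentException("min must not exceed max.");

        for (int i = 0; i < weights.Length; i++)
            weights[i] = Clip(weights[i], min, max);
    }
}
```

Whether such bounds help or hurt accuracy would have to be measured; nothing in the thread reports a result either way.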
Question: Help with math (Chesnokov Yuriy, 30-Aug-12 22:56)
I contacted you some time ago by email asking for help with my own implementation of a CNN, where I had problems with saturation and incorrect training.
Would you be able to help me at last? I have not received any replies from you.
Чесноков

Answer: Re: help with math [modified] (Filip D'haene, 6-Sep-12 12:03)
Yuriy,

I've checked the last mail you sent me. The formula for calculating the neuron error
of the CNN layer before the last fully connected layer is correct. This is equation no. 5 in Mike's article.
 
filip

modified 7-Sep-12 13:39pm.

General: Re: help with math (Chesnokov Yuriy, 10-Sep-12 20:44)
Thank you very much. I've emailed you my code and some feature map examples.
Please have a look to spot any possible errors.
Чесноков

Question: Problem in Download source code (big_eagle_mk, 24-Aug-12 19:16)
I tried to download your source code, but when I click on the link it redirects to the same page without starting the download. Can you check it, please, or send me another download link? :confused:
Answer: Re: Problem in Download source code (Filip D'haene, 24-Aug-12 23:18)
Sometimes the download link does not work on The Code Project for unknown reasons.
I've tried it now and all works fine.
 
filip
Question: Speed (Member 9174503, 11-Jul-12 8:16)
Great work!
I just have a question about the total running time. I have tried running the project once with distortions and once without; in both cases the running time is nowhere near what you show for LeNet-5 in the picture above, where after about 50 minutes you were at epoch 19. Can you please let me know what I am missing here?
Thanks
Answer: Re: Speed [modified] (Filip D'haene, 11-Jul-12 8:51)
Two things: are you running it in Release mode (not in Debug mode), and what kind of
hardware are you using? I'm using an Intel Core i7 920 C0 overclocked to 3.8 GHz under Windows 7 x64.
 
filip

modified 11-Jul-12 15:32pm.

General: Re: Speed (Member 9174503, 11-Jul-12 10:00)
Thanks for your quick reply!
I am using Release mode and running it on two machines:
First machine: Intel Core i7 930 2.8 GHz under Windows 7 x64
Second machine: Intel Core i5 2450M 2.5 GHz under Windows 7 x64
Answer: Re: Speed (Filip D'haene, 11-Jul-12 10:17)
Well, time to overclock that i7 930 to a minimum of 3.6 GHz or even more. I'm using a cheap Scythe Mugen 2 rev.B with two fans at full speed to keep the temperature under control when training the CNN. :)
General: Re: Speed (Member 9174503, 11-Jul-12 10:28)
Thanks for the info! :thumbsup:
General: My vote of 5 (Gus Granados, 25-Jun-12 16:34)
Great work mate
General: Re: My vote of 5 (Filip D'haene, 27-Jun-12 12:29)
Thanks! :)
Question: CIFAR10 results (Ove, 26-May-12 11:26)
Hello and congratulations on a great article.
I saw that in your files, you have included the weights for a trained LeNet-5 network that has 50 errors, which is very good.
I was wondering if you have the weights for a network that was trained with the CIFAR10 dataset.
I want to ask what is the best result that you have reached with the CIFAR10 dataset? And what network architecture did you use to get to that result?
Answer: Re: CIFAR10 results (Filip D'haene, 27-May-12 2:03)
Thanks!
 
The best result I achieved was 68.22 error % with MyNet-30.
The weights are not included because the zipped upload
limit on The Code Project is 10 MB and the file is around 17 MB. You can find the design of MyNet-30 in the MainViewWindow.xaml.cs file at line number 198; it is in the InitializeDefaultNeuralNetwork() function and is commented out. The input layer of course has 3 input maps, one per color channel (RGB), instead of the single map of LeNet-5.
The maps are symmetrically connected in the subsampling steps, and the colors are kept apart until the last convolutional layer. I also doubled the size of the standard CIFAR-10 dataset by horizontally flipping each image. I must admit I didn't experiment a lot with the CIFAR-10 dataset. I'm pretty sure that with a more appropriate CNN structure you can achieve better results.
 
Filip
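The horizontal flipping mentioned above can be sketched as follows. This is an illustration rather than the workbench's own code: the class Augmentation is hypothetical, and the sketch assumes each color channel is stored as a separate row-major plane, so the method would be called once per channel.

```csharp
using System;

// Hypothetical helper, not part of the workbench source.
public static class Augmentation
{
    // Returns a mirrored copy of one row-major image plane (one color channel).
    public static double[] FlipHorizontally(double[] plane, int width, int height)
    {
        if (plane == null) throw new ArgumentNullException("plane");
        if (plane.Length != width * height) throw new ArgumentException("Unexpected plane size.", "plane");

        double[] flipped = new double[plane.Length];
        for (int y = 0; y < height; y++)
            for (int x = 0; x < width; x++)
                flipped[y * width + (width - 1 - x)] = plane[y * width + x];   // mirror within the row

        return flipped;
    }
}
```

Adding the flipped copy of every training image next to the original doubles the size of the training set, as described in the answer.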
General: Re: CIFAR10 results (Ove, 27-May-12 3:07)
Thanks for the answer and the details of the network. I saw there were many networks in InitializeDefaultNeuralNetwork(), but I didn't know which one gave the best results.
 
Is it possible for you to upload the network weights on some other site (wetransfer.com, transferbigfiles.com) and post the download link here?
Or maybe even send it by e-mail?
General: Re: CIFAR10 results (Filip D'haene, 27-May-12 5:46)
You can always reproduce the same or better result with some training yourself (say 24 epochs).
But if you give me an e-mail address I can always try sending it to you.
 
Filip
Question: Program realization (Дарья Прокурат, 12-May-12 22:44)
1. In the Connection struct you store the indices of the neuron and the weight. Why not keep references instead?
2. The LayerTypes enum: why not use inheritance? Is the enum faster, or was there no particular reason to choose it over inheritance?
Answer: Re: program realization (Filip D'haene, 13-May-12 1:44)
1) I used a struct instead of a class for the connections because it doubled the speed of my CNNs. (In my early versions Connection was a full class, but it was much slower.)

2) You're right. That's probably a better (and faster) approach than using enums. :)
 
Filip
General: Re: program realization (Дарья Прокурат, 13-May-12 2:57)
1. No, I'm not asking about struct vs. class. I want to know why you don't do something like this:
public struct Connection
{
    public Neuron ToNeuron { get; private set; }
    public Weight WithWeight { get; private set;}
 
    public Connection(Neuron toNeuron, Weight toWeight):this()
    {
        ToNeuron = toNeuron;
        WithWeight = toWeight;
    }
    public Connection(Weight toWeight): this()
    {
        WithWeight = toWeight;
        ToNeuron = new Neuron { Output = 1 };
    }
}
Is there any reason to store the indices instead?
public int ToNeuronIndex;
public int ToWeightIndex;

Answer: Re: program realization [modified] (Filip D'haene, 13-May-12 4:06)
Well, what you're suggesting is another valid way of doing the same thing. You would no longer need the neuron and weight indices. But you have to consider that storing a pair of integers takes less space. I'm not so sure it would speed things up. :doh:

modified 13-May-12 12:38pm.

Question: ActivationFunctions (Дарья Прокурат, 7-May-12 10:49)
Can you give a link or write out the math formulas for how the output is calculated with the different ActivationFunctions?
For example, this is the part for ActivationFunctions.MaxPoolingTanh in Layer.Calculate():
double bias = 0D;
double weight = 1D;
List<double> previousOutputs = new List<double>(4);
foreach (Connection connection in neuron.Connections)
{
    if (connection.ToNeuronIndex == int.MaxValue)
        bias = Weights[connection.ToWeightIndex].Value;
    else
    {
        weight = Weights[connection.ToWeightIndex].Value;
        previousOutputs.Add(PreviousLayer.Neurons[connection.ToNeuronIndex].Output);
    }
}
neuron.Output = Sigmoid((previousOutputs.Max() * weight) + bias);
 
Inside the foreach we set weight many times but use it only outside the loop, so it will always hold the last value. Is that a mistake? If not, which math formula does this implement? Why do we take the maximum output value, multiply it by the last weight, and ignore all the other weights?
And why do we pass 4 in "new List<double>(4)"?
Answer: Re: ActivationFunctions (Filip D'haene, 7-May-12 13:07)
Well, there's no mistake in the code, because the weight is shared and has exactly the same value every time, unless connection.ToNeuronIndex == int.MaxValue (the bias connection). The 4 in new List<double>(4) is arbitrarily chosen and doesn't limit the capacity of the List; it will work with fewer or more than 4 outputs. For links to more activation functions, you'd best use Google and see what's out there. On Dr. LeCun's site you can find a plethora of good papers on machine learning.

Hope this somehow helps, :)
Filip
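The calculation quoted in the question boils down to output = squash(max(inputs) * sharedWeight + bias). The sketch below restates it in isolation and also shows that the constructor argument of List<double> is only an initial capacity. It assumes that the Sigmoid() call in the quoted code is a plain tanh, and the class PoolingSketch and its members are hypothetical names introduced for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical restatement of the quoted max-pooling step; assumes Sigmoid() is tanh.
public static class PoolingSketch
{
    // One max-pooling neuron: a single shared weight and a single bias.
    public static double MaxPoolingTanh(IEnumerable<double> previousOutputs, double sharedWeight, double bias)
    {
        return Math.Tanh(previousOutputs.Max() * sharedWeight + bias);
    }

    // The constructor argument is an initial capacity, not a size limit.
    public static int CapacityDemo()
    {
        List<double> outputs = new List<double>(4);
        for (int i = 0; i < 6; i++)
            outputs.Add(i * 0.1);      // the list grows past 4 without error
        return outputs.Count;
    }
}
```

Because every connection of the neuron points at the same weight, reading it once or reading it on every loop iteration gives the same value, which is why the quoted loop is correct even though it overwrites weight repeatedly.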
General: Re: ActivationFunctions (Дарья Прокурат, 10-May-12 5:34)
For LayerTypes.Subsampling with ActivationFunctions.AveragePoolingTanh:
foreach (Connection connection in neuron.Connections)
    if (connection.ToNeuronIndex == int.MaxValue)
        dSum += Weights[connection.ToWeightIndex].Value;
    else
        dSum += Weights[connection.ToWeightIndex].Value *
                PreviousLayer.Neurons[connection.ToNeuronIndex].Output * 
                SubsamplingScalingFactor;
neuron.Output = Sigmoid(dSum);
 
Why do we use SubsamplingScalingFactor? Why not let the weights themselves absorb it, like this:
Weights[connection.ToWeightIndex].Value *= SubsamplingScalingFactor

General: Re: ActivationFunctions (Filip D'haene, 10-May-12 5:45)
I think you're missing the point of the Calculate step. We want to calculate the new output
values with the current weight values. We are not going to change our weight values here;
we only change them in the backpropagation step.
General: Re: ActivationFunctions (Дарья Прокурат, 10-May-12 6:24)
Yes, I know. But SubsamplingScalingFactor is constant for the layer.
So why can't we multiply the weights by SubsamplingScalingFactor once, during layer initialization?
General: Re: ActivationFunctions (Filip D'haene, 10-May-12 6:50)
I'm really sorry, but I don't quite understand your question. What do you mean by
layer initialisation? Do you mean setting the initial weight values of the CNN (the SetInitalWeights function)? If so, it's up to you if you want other initial random weights for the subsampling layers.
General: Re: ActivationFunctions (Дарья Прокурат, 10-May-12 7:41)
Yes, I mean setting the initial weight values of the CNN (the SetInitalWeights function).
If, at the end of SetInitalWeights, we add for this layer
Weights[connection.ToWeightIndex].Value *= SubsamplingScalingFactor
and delete
* SubsamplingScalingFactor
everywhere else, will it work like your variant? Or am I missing something important about SubsamplingScalingFactor?
Answer: Re: ActivationFunctions (Filip D'haene, 10-May-12 8:01)
No, that's not going to work. I can only recommend
reading Dr. LeCun's excellent paper.
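One possible way to see why the two variants are not interchangeable is sketched below. It assumes plain gradient descent with learning rate \eta, which may differ from the update rule the workbench actually uses. Here w is the shared weight, s the constant SubsamplingScalingFactor, b the bias, x_i the pooled inputs, w' = s w the folded weight, and \delta the error derivative with respect to the neuron's pre-activation; all of these symbols are introduced for this sketch only.

```latex
\begin{aligned}
\text{original:}\quad & y = \tanh\Big(s\,w \sum_i x_i + b\Big), &
\frac{\partial E}{\partial w} &= \delta\, s \sum_i x_i, &
\Delta(s\,w) &= -\eta\, s^{2}\, \delta \sum_i x_i \\
\text{folded:}\quad & y = \tanh\Big(w' \sum_i x_i + b\Big), &
\frac{\partial E}{\partial w'} &= \delta \sum_i x_i, &
\Delta w' &= -\eta\, \delta \sum_i x_i
\end{aligned}
```

Under these assumptions, both variants give the same output when w' = s w, but a single update moves the effective weight by amounts that differ by a factor of s^2, so training would only proceed identically if the learning rate for that layer were rescaled as well.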
Question: Crashes while downloading training images (Sperneder Patrick, 5-May-12 8:33)
Hello Filip!
I hope you can help me. I get an AggregateException while the application is starting up.
The exception occurs right before the training images are downloaded.
Answer: Re: crashes while downloading training images (Filip D'haene, 5-May-12 9:01)
You must delete the CNNWB folder in My Documents and then restart the latest version of
the program. Maybe your internet connection was disconnected or something similar.

Hope this somehow helps, :)
Filip
General: Re: crashes while downloading training images (Sperneder Patrick, 8-May-12 3:25)
Hello!
Deleting the folder and running the app again did not solve the problem.
My internet connection is definitely stable. :(
Somehow the thrown exception sounds to me as if there is a problem with multithreading? :confused:
Answer: Re: crashes while downloading training images (Filip D'haene, 8-May-12 4:13)
Can you tell me which operating system you're using (including the service pack and whether it is 32-bit or 64-bit) and whether
you're using the latest version of the program? Also, are you running it from the source code with Visual Studio 2010 or just using the setup version?

(I remember that when I changed to .NET 4.5 in the Windows 8 Preview, I got errors when downloading the needed files.)
General: Re: crashes while downloading training images (Sperneder Patrick, 10-May-12 7:30)
Hello Filip,
Sorry for the delay. I'm using Windows 7 Home Premium SP1, 64-bit. Running the app with the debugger attached (VS 2010 Ultimate)
and running it from the setup both lead to the same error.
The installed .NET version is 4.0 with the latest service packs.
Weird... X|
Regards, Patrick
Answer: Re: crashes while downloading training images [modified] (Filip D'haene, 10-May-12 8:16)
That's very weird indeed, because I'm running exactly the same operating system and version of Visual Studio. Two things I'm curious about: which type of processor do you have, and are you sure you're using the .NET Framework 4 Client Profile in the Application Properties of the CNNWB project?
 
Filip

modified 10-May-12 14:43pm.

General: Re: crashes while downloading training images (Sperneder Patrick, 10-May-12 9:39)
:-D This seems to be an evil .NET trap...

Processor: Intel Pentium CPU B960 @ 2.20 GHz (dual core)
And yes, I am using the .NET 4.0 Client Profile.
If you send me your e-mail address, I can send you a stack trace and show you exactly where the app shoots herself in the knee... ;-P
mine is patrick.sperneder@gmx.at
 
regards
P.
Answer: Re: crashes while downloading training images (Filip D'haene, 14-May-12 18:49)
The latest version should resolve your download problems.
 
greetings,
Filip
Question: Download links do not work... please check (fawzi_masri, 10-Apr-12 10:42)
The source code links are not functional. Can you please check them?
 

thanks,


Last Updated 9 Dec 2011
Article Copyright 2010 by Filip D'haene
Everything else Copyright © CodeProject, 1999-2013