Posted
28 Dec 2010

Comments and Discussions



Good afternoon, dear respected Filip D'haene!
Could you please say whether the SGDLevenbergMarquardt strategy implemented here is the Stochastic Diagonal Levenberg-Marquardt method, which uses the diagonal of the Hessian matrix of second-order partial derivatives? I ask because the only stochastic version of the Levenberg-Marquardt algorithm I can find on the internet is the Stochastic Diagonal Levenberg-Marquardt described in "Gradient-Based Learning Applied to Document Recognition", page 41, appendix C. How could I view the contents of the diagonal Hessian matrix after each epoch? Also, would it be possible to avoid calculating second-order derivatives by approximating the diagonal Hessian matrix with the J^T J formula from the classic LM method (Jacobian transpose multiplied by Jacobian)? If so, how would that affect the quality of recognition? I would be very grateful for a reply. Thank you!
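For reference, the stochastic diagonal Levenberg-Marquardt rule from the paper cited in this question can be summarized as below. This is a sketch of the method as that appendix describes it, not a statement about what the Workbench code does. The notation is chosen here: \eta is the global learning rate, \mu a small safety constant, h_{kk} a running estimate of the second derivative for weight w_k, f the activation function, a_i the weighted sum into unit i, and x_j the input carried by connection (i, j).

```latex
\epsilon_k = \frac{\eta}{\mu + h_{kk}}, \qquad h_{kk} \approx \frac{\partial^2 E}{\partial w_k^2}

% Gauss-Newton style approximation, back-propagated layer by layer:
\frac{\partial^2 E}{\partial a_i^2} \approx f'(a_i)^2 \sum_k w_{ki}^2 \, \frac{\partial^2 E}{\partial a_k^2},
\qquad
\frac{\partial^2 E}{\partial w_{ij}^2} \approx \frac{\partial^2 E}{\partial a_i^2} \, x_j^2
```

Because the approximation drops the term containing f''(a_i), only squared first derivatives appear. That is the same idea as the J^T J approximation of classic LM, restricted to the diagonal.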





Good evening, dear respected Mr. Filip D'haene!
If possible, may I ask a few questions about the Convolutional Neural Network Workbench? Could you please say in which lines of code I could add a modification to view the contents of all feature maps and kernels in each layer of LeNet5? Would it be possible to run the Workbench in forward-propagation mode, so that I can submit one MNIST image and see the output of the last fully connected layer with 10 neurons? Could you also say where I should add code to see all generated weights? Thank you in advance for your reply!
Sincerely





Hey,
Thank you for your application. I ran it on my laptop for 41 hours and reached an efficiency of only 58%. What tweaks to the design should one make to increase its efficiency?
Also, could you please help us change the dataset? We intend to use a simpler dataset with the background already eliminated, but with about 300 classes of (nonliving) objects.
Could you please get us started with this?
Thank You





Lovely program. And thank you so much for sharing it with us.
I would like to change the database to include some of my own classes, and to work on improving the efficiency of the program.
Do you have any documentation, or some links through which I can understand the nitty-gritty of the code?
We are trying to integrate the code with a robotic arm, so efficiency is of paramount importance. So is the change of database.
Could you please help us with this?
Sincere request





Hi,
It is me again.
What confuses me: in the CNN I used before
NeuralNetwork network = new NeuralNetwork(DataProvider, "CNNCIFAR10Z2", 10, 1D, LossFunctions.CrossEntropy, DataProviderSets.CIFAR10, TrainingStrategy.SGDLevenbergMarquardt);
network.AddLayer(LayerTypes.Input, 3, 32, 32);
bool[] maps = new bool[3 * 64]
{.............................................};
network.AddLayer(LayerTypes.Convolutional, ActivationFunctions.ReLU, 64, 28, 28, 5, 5, 1, 1, 0, 0, new Mappings(maps));
network.AddLayer(LayerTypes.StochasticPooling, ActivationFunctions.Ident, 64, 14, 14, 3, 3, 2, 2);
network.AddLayer(LayerTypes.LocalResponseNormalizationCM, ActivationFunctions.None, 64, 14, 14, 3, 3);
network.AddLayer(LayerTypes.Convolutional, ActivationFunctions.ReLU, 64, 10, 10, 5, 5, 1, 1, 0, 0, new Mappings(64, 64, 66, 1));
network.AddLayer(LayerTypes.LocalResponseNormalizationCM, ActivationFunctions.None, 64, 10, 10, 3, 3);
network.AddLayer(LayerTypes.StochasticPooling, ActivationFunctions.Ident, 64, 5, 5, 3, 3, 2, 2);
network.AddLayer(LayerTypes.Local, ActivationFunctions.ReLU, 64, 1, 1, 5, 5, 1, 1, 0, 0, new Mappings(64, 64, 66, 2));
network.AddLayer(LayerTypes.Local, ActivationFunctions.Logistic, 384, 1, 1, 5, 5, 1, 1, 0, 0, 50);
network.AddLayer(LayerTypes.FullyConnected, ActivationFunctions.SoftMax, 10);
network.InitializeWeights();
...................................................................................
each neuron in the local layer has several separate bias connections.
For example, in layer 7, neuron #0 has biases with weight indexes #26, #52, etc. (map #0 is connected to previous maps #1, #2, etc.). Generally, if the map size is 1 x 1, the number of biases per neuron equals the number of previous maps it is connected to. Is this by design?
The layer in question is layer #7: network.AddLayer(LayerTypes.Local, ActivationFunctions.ReLU, 64, 1, 1, 5, 5, 1, 1, 0, 0, new Mappings(64, 64, 66, 2));





Hi,
There's a mistake in the network definition above. More precisely in the definition of layer #8. You can't have a layer with a receptive field of 5x5 in that position because the size of a map in layer #7 is exactly 1x1. Try changing the receptive field size in layer #8 to 1x1 instead of 5x5.





Yes, I know. On my computer the receptive field is 1 x 1. But this exacerbates the problem I am reporting: each neuron in layer 8 has 64 biases, all with different weight indexes.





You're absolutely right!!! That's a very nasty big bug! Thanks for debugging the code!
Try changing the code for the Local layer type to this:
WeightCount = (totalMappings * MapSize * ReceptiveFieldSize) + NeuronCount;
...
if (!IsFullyMapped)
{
    int mapping = 0;
    int[] mappingCount = new int[MapCount * PreviousLayer.MapCount];
    for (int curMap = 0; curMap < MapCount; curMap++)
        for (int prevMap = 0; prevMap < PreviousLayer.MapCount; prevMap++)
        {
            mappingCount[prevMap + (curMap * PreviousLayer.MapCount)] = mapping;
            if (Mappings.IsMapped(curMap, prevMap, MapCount))
                mapping++;
        }
    Parallel.For(0, MapCount, curMap =>
    {
        for (int prevMap = 0; prevMap < PreviousLayer.MapCount; prevMap++)
        {
            int positionPrevMap = prevMap * maskSize;
            if (Mappings.IsMapped(curMap, prevMap, MapCount))
            {
                int iNumWeight = (mappingCount[prevMap + (curMap * PreviousLayer.MapCount)] * ReceptiveFieldSize * MapSize) + NeuronCount;
                for (int y = 0; y < MapHeight; y++)
                    for (int x = 0; x < MapWidth; x++)
                    {
                        int position = x + (y * MapWidth) + (curMap * MapSize);
                        AddBias(ref Connections[position], curMap);
                        int pIndex;
                        for (int row = 0; row < ReceptiveFieldHeight; row++)
                            for (int column = 0; column < ReceptiveFieldWidth; column++)
                            {
                                pIndex = x + (y * maskWidth) + kernelTemplate[column + (row * ReceptiveFieldWidth)] + positionPrevMap;
                                if (maskMatrix[pIndex] != -1)
                                    AddConnection(ref Connections[position], maskMatrix[pIndex], iNumWeight++);
                            }
                    }
            }
        }
    });
}
else
{
    if (totalMappings > MapCount)
    {
        Parallel.For(0, MapCount, curMap =>
        {
            for (int prevMap = 0; prevMap < PreviousLayer.MapCount; prevMap++)
            {
                int positionPrevMap = prevMap * maskSize;
                int mapping = prevMap + (curMap * PreviousLayer.MapCount);
                int iNumWeight = (mapping * ReceptiveFieldSize * MapSize) + NeuronCount;
                for (int y = 0; y < MapHeight; y++)
                    for (int x = 0; x < MapWidth; x++)
                    {
                        int position = x + (y * MapWidth) + (curMap * MapSize);
                        AddBias(ref Connections[position], curMap);
                        int pIndex;
                        for (int row = 0; row < ReceptiveFieldHeight; row++)
                            for (int column = 0; column < ReceptiveFieldWidth; column++)
                            {
                                pIndex = x + (y * maskWidth) + kernelTemplate[column + (row * ReceptiveFieldWidth)] + positionPrevMap;
                                if (maskMatrix[pIndex] != -1)
                                    AddConnection(ref Connections[position], maskMatrix[pIndex], iNumWeight++);
                            }
                    }
            }
        });
    }
    else
    {
        Parallel.For(0, MapCount, curMap =>
        {
            int iNumWeight = (curMap * ReceptiveFieldSize * MapSize) + NeuronCount;
            for (int y = 0; y < MapHeight; y++)
                for (int x = 0; x < MapWidth; x++)
                {
                    int position = x + (y * MapWidth) + (curMap * MapSize);
                    AddBias(ref Connections[position], curMap);
                    int pIndex;
                    for (int row = 0; row < ReceptiveFieldHeight; row++)
                        for (int column = 0; column < ReceptiveFieldWidth; column++)
                        {
                            pIndex = x + (y * maskWidth) + kernelTemplate[column + (row * ReceptiveFieldWidth)];
                            if (maskMatrix[pIndex] != -1)
                                AddConnection(ref Connections[position], maskMatrix[pIndex], iNumWeight++);
                        }
                }
        });
    }
}





Sorry, made a mistake. Should be:
if (!IsFullyMapped)
{
    int mapping = 0;
    int[] mappingCount = new int[MapCount * PreviousLayer.MapCount];
    for (int curMap = 0; curMap < MapCount; curMap++)
        for (int prevMap = 0; prevMap < PreviousLayer.MapCount; prevMap++)
        {
            mappingCount[prevMap + (curMap * PreviousLayer.MapCount)] = mapping;
            if (Mappings.IsMapped(curMap, prevMap, MapCount))
                mapping++;
        }
    Parallel.For(0, MapCount, curMap =>
    {
        for (int prevMap = 0; prevMap < PreviousLayer.MapCount; prevMap++)
        {
            int positionPrevMap = prevMap * maskSize;
            if (Mappings.IsMapped(curMap, prevMap, MapCount))
            {
                int iNumWeight = (mappingCount[prevMap + (curMap * PreviousLayer.MapCount)] * ReceptiveFieldSize * MapSize) + NeuronCount;
                for (int y = 0; y < MapHeight; y++)
                    for (int x = 0; x < MapWidth; x++)
                    {
                        int position = x + (y * MapWidth) + (curMap * MapSize);
                        AddBias(ref Connections[position], position);
                        int pIndex;
                        for (int row = 0; row < ReceptiveFieldHeight; row++)
                            for (int column = 0; column < ReceptiveFieldWidth; column++)
                            {
                                pIndex = x + (y * maskWidth) + kernelTemplate[column + (row * ReceptiveFieldWidth)] + positionPrevMap;
                                if (maskMatrix[pIndex] != -1)
                                    AddConnection(ref Connections[position], maskMatrix[pIndex], iNumWeight++);
                            }
                    }
            }
        }
    });
}
else
{
    if (totalMappings > MapCount)
    {
        Parallel.For(0, MapCount, curMap =>
        {
            for (int prevMap = 0; prevMap < PreviousLayer.MapCount; prevMap++)
            {
                int positionPrevMap = prevMap * maskSize;
                int mapping = prevMap + (curMap * PreviousLayer.MapCount);
                int iNumWeight = (mapping * ReceptiveFieldSize * MapSize) + NeuronCount;
                for (int y = 0; y < MapHeight; y++)
                    for (int x = 0; x < MapWidth; x++)
                    {
                        int position = x + (y * MapWidth) + (curMap * MapSize);
                        AddBias(ref Connections[position], position);
                        int pIndex;
                        for (int row = 0; row < ReceptiveFieldHeight; row++)
                            for (int column = 0; column < ReceptiveFieldWidth; column++)
                            {
                                pIndex = x + (y * maskWidth) + kernelTemplate[column + (row * ReceptiveFieldWidth)] + positionPrevMap;
                                if (maskMatrix[pIndex] != -1)
                                    AddConnection(ref Connections[position], maskMatrix[pIndex], iNumWeight++);
                            }
                    }
            }
        });
    }
    else
    {
        Parallel.For(0, MapCount, curMap =>
        {
            int iNumWeight = (curMap * ReceptiveFieldSize * MapSize) + NeuronCount;
            for (int y = 0; y < MapHeight; y++)
                for (int x = 0; x < MapWidth; x++)
                {
                    int position = x + (y * MapWidth) + (curMap * MapSize);
                    AddBias(ref Connections[position], position);
                    int pIndex;
                    for (int row = 0; row < ReceptiveFieldHeight; row++)
                        for (int column = 0; column < ReceptiveFieldWidth; column++)
                        {
                            pIndex = x + (y * maskWidth) + kernelTemplate[column + (row * ReceptiveFieldWidth)];
                            if (maskMatrix[pIndex] != -1)
                                AddConnection(ref Connections[position], maskMatrix[pIndex], iNumWeight++);
                        }
                }
        });
    }
}





Thanks for the reply.
The net result is right, but we have multiple assignments of the same bias.
For example, for a 1 x 1 map, we assign bias #0 once for every previous map connected to neuron (map) #0. Not a big deal for 64 neurons, but I have seen an article with many thousands of neurons in a layer.
Since each and every neuron has a bias, and you placed the biases at the beginning of the weight array, it might be simpler to assign the biases to connections outside of the prevMap loop.





Hi,
I understand your reasoning, but I don't see a proper way to implement it as you describe without altering all the fprop, bprop & bbprop steps. I'm currently not using this codebase myself anymore. I now have a much faster C++ implementation that I'm still tinkering with. Thanks anyway for debugging the code!





Yes, C++ is faster.
Interested...





The fix that you suggested does indeed link the biases to the right weights. But it introduces a new entry Connections[][i] for each connected previous map.
For example, layer 7 (Local) consists of 64 maps of size 1 x 1. The maps are connected to the previous layer's 64 maps of size 5 x 5. The first map is not connected to the first previous map, but is connected to previous maps #2 and #3. The function AddBias(Connections[position], position) is called on position #0 for each connected map. On each call it resizes the Connections[][] array and adds the new connection to the end of the array.
As a result, we have
Connections[0][0] with Neuron ID MAX_INT and Weight ID 0
Connections[0][26] with Neuron ID MAX_INT and Weight ID 0,
etc.
So for biases we still have many connections to the same weight (bias).
Does this compromise the forward and backprop calculations? It seems that the forward calculation adds the bias to the layer's neuron multiple times.





Maybe something like this will address the issue:
Parallel.For(0, MapCount, curMap =>
{
    for (int prevMap = 0; prevMap < PreviousLayer.MapCount; prevMap++)
    {
        int positionPrevMap = prevMap * maskSize;
        if (Mappings.IsMapped(curMap, prevMap, MapCount))
        {
            int iNumWeight = (mappingCount[prevMap + (curMap * PreviousLayer.MapCount)] * ReceptiveFieldSize * MapSize) + NeuronCount;
            for (int y = 0; y < MapHeight; y++)
                for (int x = 0; x < MapWidth; x++)
                {
                    int position = x + (y * MapWidth) + (curMap * MapSize);
                    int pIndex;
                    for (int row = 0; row < ReceptiveFieldHeight; row++)
                        for (int column = 0; column < ReceptiveFieldWidth; column++)
                        {
                            pIndex = x + (y * maskWidth) + kernelTemplate[column + (row * ReceptiveFieldWidth)] + positionPrevMap;
                            if (maskMatrix[pIndex] != -1)
                                AddConnection(ref Connections[position], maskMatrix[pIndex], iNumWeight++);
                        }
                }
        }
    }
});
for (int i = 0; i < NeuronCount; i++)
    AddBias(ref Connections[i], i);





Yes, I already did that. I placed the for loop before the Parallel.For. No difference.





Don't you notice a speed improvement in training time?





I did not look at training time because, first, C# is not as quick as C++, and second, IMHO a lot of corrections could be made to speed up the existing C# program. What I am doing is learning from the great knowledge of the ANN field that you have embedded in the program. I appreciate it very much.
Years ago I compared C# and C++ versions of the same small and simple ANN program and got about a 70% gain for C++. I am not sure that the comparison was correct. It was before MS Concurrency





Thanks! The speed of C# will be much better in the next generation of the .NET Framework with .NET Native.





Not sure if .NET Native will really improve this kind of thing. Being managed requires runtime bounds checks, which, given all the array indexing, is probably the culprit here. Some of those can be optimized away by a compiler, but most remain. You can work around that in C# using "unsafe" constructs.





Hi,
I have tried your Workbench on one of the networks you suggested:
NeuralNetwork network = new NeuralNetwork(DataProvider, "CNNCIFAR10Z2", 10, 1D, LossFunctions.CrossEntropy, DataProviderSets.CIFAR10, TrainingStrategy.SGDLevenbergMarquardt);
When adding a local layer
network.AddLayer(LayerTypes.Local, ActivationFunctions.Logistic, 384, 1, 1, 5, 5, 1, 1, 0, 0, 50);
the application crashes.
The reason is reading beyond boundaries.
The maskMatrix for this layer is:
maskMatrix = new int[maskSize * PreviousLayer.MapCount];
where maskSize = 1 and PreviousLayer.MapCount = 64
So maskMatrix has 64 entries.
But when setting connections for a local layer we have:
for (int y = 0; y < MapHeight; y++)
    for (int x = 0; x < MapWidth; x++)
    {
        int position = x + (y * MapWidth) + (curMap * MapSize);
        AddBias(ref Connections[position], iNumWeight++);
        int pIndex;
        for (int row = 0; row < ReceptiveFieldHeight; row++)
            for (int column = 0; column < ReceptiveFieldWidth; column++)
            {
                pIndex = x + (y * maskWidth) + kernelTemplate[column + (row*receptiveFieldWidth)] + positionPrevMap;
                if (maskMatrix[pIndex] != -1)
                    AddConnection(ref Connections[position], maskMatrix[pIndex], iNumWeight++);
            }
    }
Because maskWidth = 1, the receptive field is 5x5, and the max of positionPrevMap is 63, the max of pIndex is 71. This is well outside the bounds of maskMatrix, which has 64 entries.
Any help?
By the way, what is the Local layer?





Hi,
Can you please give me the definition of the whole network you would like to construct?
A locally connected layer is like a convolutional layer but without the weight sharing.
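The effect of dropping weight sharing shows up in the number of weights. The sketch below is illustrative only: `WeightCounts` is a hypothetical helper, not Workbench code. The Local formula mirrors the WeightCount expression quoted in another reply in this thread; the Convolutional formula (one shared kernel per mapping plus one bias per map) is an assumption about a typical convolutional layer.

```csharp
// Hypothetical helper, not part of the Workbench source.
static class WeightCounts
{
    // Local layer: every neuron owns its kernel weights and its own bias.
    // Mirrors (totalMappings * MapSize * ReceptiveFieldSize) + NeuronCount.
    public static int Local(int totalMappings, int mapCount, int mapWidth, int mapHeight,
                            int receptiveFieldWidth, int receptiveFieldHeight)
    {
        int mapSize = mapWidth * mapHeight;
        int neuronCount = mapCount * mapSize;
        return (totalMappings * mapSize * receptiveFieldWidth * receptiveFieldHeight) + neuronCount;
    }

    // Convolutional layer (assumed): one shared kernel per mapping, one bias per map.
    public static int Convolutional(int totalMappings, int mapCount,
                                    int receptiveFieldWidth, int receptiveFieldHeight)
    {
        return (totalMappings * receptiveFieldWidth * receptiveFieldHeight) + mapCount;
    }
}
```

For 6 maps of 28 x 28 fed by a single input map through a 5 x 5 receptive field, `WeightCounts.Convolutional(6, 6, 5, 5)` gives 156, while `WeightCounts.Local(6, 6, 28, 28, 5, 5)` gives 122304.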





Thank you for the reply.
The network is from NeuralNetwork InitializeDefaultNeuralNetwork(). I just uncommented the definition:
NeuralNetwork network = new NeuralNetwork(DataProvider, "CNNCIFAR10Z2", 10, 1D, LossFunctions.CrossEntropy, DataProviderSets.CIFAR10, TrainingStrategy.SGDLevenbergMarquardt);
network.AddLayer(LayerTypes.Input, 3, 32, 32);
bool[] maps = new bool[3 * 64]
{.............................................};
network.AddLayer(LayerTypes.Convolutional, ActivationFunctions.ReLU, 64, 28, 28, 5, 5, 1, 1, 0, 0, new Mappings(maps));
network.AddLayer(LayerTypes.StochasticPooling, ActivationFunctions.Ident, 64, 14, 14, 3, 3, 2, 2);
network.AddLayer(LayerTypes.LocalResponseNormalizationCM, ActivationFunctions.None, 64, 14, 14, 3, 3);
network.AddLayer(LayerTypes.Convolutional, ActivationFunctions.ReLU, 64, 10, 10, 5, 5, 1, 1, 0, 0, new Mappings(64, 64, 66, 1));
network.AddLayer(LayerTypes.LocalResponseNormalizationCM, ActivationFunctions.None, 64, 10, 10, 3, 3);
network.AddLayer(LayerTypes.StochasticPooling, ActivationFunctions.Ident, 64, 5, 5, 3, 3, 2, 2);
network.AddLayer(LayerTypes.Local, ActivationFunctions.ReLU, 64, 1, 1, 5, 5, 1, 1, 0, 0, new Mappings(64, 64, 66, 2));
network.AddLayer(LayerTypes.Local, ActivationFunctions.Logistic, 384, 1, 1, 5, 5, 1, 1, 0, 0, 50);
network.AddLayer(LayerTypes.FullyConnected, ActivationFunctions.SoftMax, 10);
network.InitializeWeights();
...................................................................................
The exception is thrown when the layer before the last layer is instantiated.
This is not exactly a bug; it is a violation of an implicit constraint.
Obviously, a receptive field should fit into a map of the previous layer. But here the previous map is 1 x 1 neurons and the receptive field is 5 x 5 neurons. Because the mask has the dimensions of the previous layer's map, and maskMatrix consists of one mask per map of the previous layer, we go outside the maskMatrix bounds when we instantiate connections to the last of the previous layer's maps.
Changing the layer to network.AddLayer(LayerTypes.Local, ActivationFunctions.Logistic, 384, 1, 1, 1, 1, 1, 1, 0, 0, 50) solves the problem, but then this layer becomes just a fully connected layer.
If we want to connect each map to 25 (5 x 5) previous maps, we have to use a mapping.
A similar configuration is in another (commented out) network in NeuralNetwork InitializeDefaultNeuralNetwork().
I think it would not hurt to add some validation (an exception) for this constraint to AddLayer(). If C# has something like C++'s static_assert, a compile-time check (metafunction) would be an excellent solution.
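A runtime check along these lines could enforce the constraint described above. This is only a sketch: `LayerValidation` and `EnsureReceptiveFieldFits` are hypothetical names, not part of the Workbench, and treating padding as extending the previous map on both sides is an assumption. C# has no direct counterpart to C++'s static_assert, so the check throws at run time instead.

```csharp
using System;

// Hypothetical helper, not part of the Workbench source.
static class LayerValidation
{
    // Throws if the receptive field cannot fit inside one (padded) map of the previous layer.
    public static void EnsureReceptiveFieldFits(
        int prevMapWidth, int prevMapHeight,
        int receptiveFieldWidth, int receptiveFieldHeight,
        int padX = 0, int padY = 0)
    {
        // Assumption: padding is applied symmetrically on both sides of the map.
        int availableWidth = prevMapWidth + (2 * padX);
        int availableHeight = prevMapHeight + (2 * padY);
        if (receptiveFieldWidth > availableWidth || receptiveFieldHeight > availableHeight)
            throw new ArgumentException(
                $"Receptive field {receptiveFieldWidth}x{receptiveFieldHeight} does not fit " +
                $"into a previous map of {availableWidth}x{availableHeight}.");
    }
}
```

Called with a 1 x 1 previous map and a 5 x 5 receptive field, as in the layer that crashed, it would throw an ArgumentException instead of reading past the end of maskMatrix.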





More and better input validation is always good practice. I didn't put enough time into it; I shall give it more effort in a next release or project.





I've added another DataProviderSet, modified mainly from CIFAR10, to do regression, specifically to locate 8 key points. I changed the number of output neurons to 16. However, I am not quite clear what the code below is for.
for (int i = 0; i < ClassCount; i++)
    D2ErrX[i] = 1D;
In my case all the neuron outputs represent the relative locations of the points. Would you please clarify it for me?





This value must be the second derivative of the cost function.
For MSE (0.5 * sum((actual - target)^2)) this derivative is 1; for cross-entropy I'm not sure. Don't use TrainToValue; this is plain wrong. It only matters when you're using Levenberg-Marquardt based learning strategies.
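The value 1 for MSE follows from differentiating twice with respect to an output. In the notation assumed here, y_i is the actual output and t_i the target:

```latex
E = \tfrac{1}{2} \sum_i (y_i - t_i)^2
\quad\Rightarrow\quad
\frac{\partial E}{\partial y_i} = y_i - t_i,
\qquad
\frac{\partial^2 E}{\partial y_i^2} = 1
```

As a side note, for a softmax output p_i with cross-entropy E = -\sum_i t_i \log p_i and targets summing to 1, the derivatives with respect to the pre-softmax input z_i are \partial E / \partial z_i = p_i - t_i and \partial^2 E / \partial z_i^2 = p_i (1 - p_i). Whether D2ErrX is meant to hold the derivative with respect to the output or the pre-activation is not stated in this thread.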





Hi, can you check the use of iNumWeight in Parallel.For in NeuralNetwork.cs?
It looks like the counter needs an interlock.
Good stuff. Thanks for sharing.
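Whether an interlock is needed depends on where the counter is declared. In the snippets posted elsewhere in this thread, iNumWeight is declared inside the Parallel.For body, so each iteration has its own copy; whether that matches the NeuralNetwork.cs version this comment refers to is not known here. The sketch below contrasts the two cases using a hypothetical `CounterDemo` class.

```csharp
using System.Threading;
using System.Threading.Tasks;

// Hypothetical demo, not part of the Workbench source.
static class CounterDemo
{
    // Counter declared inside the lambda: private to each iteration, no interlock needed.
    public static int[] PerIterationCounters(int mapCount, int weightsPerMap)
    {
        var lastIndex = new int[mapCount];
        Parallel.For(0, mapCount, curMap =>
        {
            int iNumWeight = curMap * weightsPerMap; // local to this iteration
            for (int k = 0; k < weightsPerMap; k++)
                iNumWeight++;
            lastIndex[curMap] = iNumWeight; // each iteration writes only its own slot
        });
        return lastIndex;
    }

    // Counter shared by all iterations: every update must be atomic.
    public static int SharedCounter(int mapCount, int weightsPerMap)
    {
        int total = 0;
        Parallel.For(0, mapCount, curMap =>
        {
            for (int k = 0; k < weightsPerMap; k++)
                Interlocked.Increment(ref total);
        });
        return total;
    }
}
```

With a plain `total++` in the second method the result could come out lower than mapCount * weightsPerMap, because increments from different threads can overwrite each other.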





Hello,
My question: when one of the ten neurons of the last layer is positive, it is the recognized character. But if we want to know the probability, or a normalized value between 0 and 100, for this recognition, how can we calculate it?
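One common way to turn raw output scores into values between 0 and 100 is a softmax. The sketch below assumes the outputs are unnormalized scores; `OutputScaling` and `SoftmaxPercent` are hypothetical names, not Workbench API. If the network already ends in a SoftMax layer, as in the CIFAR10 definitions posted in this thread, the outputs already sum to 1 and only need multiplying by 100.

```csharp
using System;

// Hypothetical helper, not part of the Workbench source.
static class OutputScaling
{
    // Returns percentages that sum to 100 and preserve the ranking of the inputs.
    public static double[] SoftmaxPercent(double[] outputs)
    {
        // Subtract the maximum before exponentiating for numerical stability.
        double max = double.NegativeInfinity;
        foreach (double v in outputs)
            max = Math.Max(max, v);

        var result = new double[outputs.Length];
        double sum = 0.0;
        for (int i = 0; i < outputs.Length; i++)
        {
            result[i] = Math.Exp(outputs[i] - max);
            sum += result[i];
        }
        for (int i = 0; i < outputs.Length; i++)
            result[i] = 100.0 * result[i] / sum;
        return result;
    }
}
```

These values are a normalized confidence score rather than a calibrated probability, unless the network was trained with a softmax output and a cross-entropy loss.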





The second layer of the LeCun network is 6 maps of 14x14 neurons each. The receptive field for each neuron is 2x2. This gives 5880 connections for non-overlapping fields. The same number is in LeCun's article. Your program sets 11262 connections. Neuron #0 has 5 connections (bias + 2x2), which is right, but neuron #1 has 7 connections, and so on.
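The figure of 5880 is the number of neurons times the connections per neuron (four inputs from the 2x2 field plus one bias):

```latex
6 \times (14 \times 14) \times (2 \times 2 + 1) = 1176 \times 5 = 5880
```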





Hi,
I've just checked it and neuron #1 has only 5 connections in the program.
Just do MessageBox.Show(network.Layers[2].Connections[1].Count().ToString()); after network.InitializeWeights(); to get the number of connections in neuron #1 you're talking about. I don't see any error.





1. Regarding d(tanh(x))/dx: OK, but should be noted in a comment, I suppose.
2. I ran LeCun again. At breakpoint at line 3275 for the Layer 2 of LeCun I have
 Connections {CNNWB.Model.Connection[1176][]}
 [0] {CNNWB.Model.Connection[5]}
 [1] {CNNWB.Model.Connection[7]}
...................................
+ [14] {CNNWB.Model.Connection[7]}
+ [15] {CNNWB.Model.Connection[10]}
...................................................................
etc.
This is a snapshot from the QuickWatch window.
It continues with 10 connections through neuron #195; neuron #196 has 5 connections again, #197 has 7 connections, and so on.
This is under VS 2013 Ultimate Update 1, .NET 4.5, Debug Win32 configuration, OS Win7 Pro.





Regarding the averaging layer #2 in the LeCun network: you have used the default values for strideX and strideY; they must be strideX = strideY = 2.





In the code sample
cnn.AddLayer(LayerTypes.Convolutional, ActivationFunctions.Tanh, 16, 10, 10, 5, 5, new Mappings(maps));
the default parameters strideX, strideY, padX, padY must be passed explicitly to the layer because the Mappings parameter has no default value.
Also, for the activation function Tanh you defined DTanh(x) = 1 - x^2. Actually the derivative of tanh is d(tanh(x))/dx = 1 - (tanh(x))^2. Is your definition correct?
By the way, the Convolutional Neural Network by Dr. LeCun is different. What is in your code sample is closer to Maurice, Mesman, Bart, Corporaal, Henk of Eindhoven University of Technology, the Netherlands, "SPEED SIGN DETECTION AND RECOGNITION BY CONVOLUTIONAL NEURAL NETWORKS." Not a big deal.





Hi,
You're right about the code sample. I will fix this in a next update.
Regarding the derivative of the tanh activation function: the derivative of tanh is indeed d(tanh(x))/dx = 1 - (tanh(x))^2, but you must consider that you've already calculated tanh(x) in the forward propagation step. This is why we substitute 1 - x^2 in the backpropagation step: there, x holds the value of tanh(x) obtained in the forward propagation step.
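The substitution described in this reply can be sketched as follows. `TanhActivation` and its method names are hypothetical, chosen for illustration, and are not the Workbench's own definitions.

```csharp
using System;

// Hypothetical illustration, not part of the Workbench source.
static class TanhActivation
{
    // Forward step: y = tanh(x). The layer keeps y as the neuron output.
    public static double Forward(double x) => Math.Tanh(x);

    // Backward step: takes the stored output y, not the original input x.
    // Since y = tanh(x), 1 - y*y equals 1 - tanh(x)^2, the true derivative at x.
    public static double DerivativeFromOutput(double y) => 1.0 - (y * y);
}
```

Passing the raw input x to `DerivativeFromOutput` would give the wrong value, so the argument name makes the expected quantity explicit.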





Congrats. Your project is very interesting.
But with my poor knowledge of neural networks, it is difficult for me, and I'm sure for others, to modify the code and the datasets to run our own experiments with a handwriting recognition system.
Please give us a few tips or a tutorial on how to create an example of a complete alphabet OCR.
How must we create the network and layers?
What are the best training parameters for 38 classes ('abcde...')?
How can we create a dataset that fits your data provider?
I think the answer would be very interesting for all of us.
Thank you very much, and again congrats on your work!





I completely agree. Could you please help us?
Thanks





I set up my own dataset for training and testing, but it doesn't work correctly in the workbench.
I've created an image dataset formatted like CIFAR10 but with a 4-byte label, because it has more than 9 categories, and then changed the dataset parsing method to handle the new format. It passed the pack and unpack test.
I have set up a new NeuralNetwork instance, disabled the others, and added some new layers to it. My images are 65*53 rather than 32*32.
I know that the parameters passed to the AddLayer method must be calculated, but I have no idea how, because I cannot find any documentation or manual in the solution files. I was confused when setting them.
Another problem is the size of the bool array maps: what is it, and what size is the correct one?
After these changes the app does not work because of 'index out of range' exceptions.
I know this is because the AddLayer method was not passed correct parameters, since each layer depends on the one above it, but I do not know how to set them correctly.
Can I get some help with the bool array maps and some directions on setting the parameters for each added layer?
NeuralNetwork network = new NeuralNetwork(DataProvider, "TencentCaptcha CNN", 10, 1D, LossFunctions.CrossEntropy, DataProviderSets.TencentCaptcha, TrainingStrategy.SGDLevenbergMarquardt);
network.AddLayer(LayerTypes.Input, 3, 65, 53);
bool[] maps = new bool[3 * 64]
{
true, true, true, true, true, true, true, true, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, true, true, true, true, true, true, true, true, true, true, true, true, true, true, true, true, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, true, true, true, true, true, true, true, true,
false, false, false, false, false, false, false, false, true, true, true, true, true, true, true, true, false, false, false, false, false, false, false, false, true, true, true, true, true, true, true, true, false, false, false, false, false, false, false, false, true, true, true, true, true, true, true, true, false, false, false, false, false, false, false, false, true, true, true, true, true, true, true, true,
false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, true, true, true, true, true, true, true, true, true, true, true, true, true, true, true, true,false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, true, true, true, true, true, true, true, true, true, true, true, true, true, true, true, true
};
network.AddLayer(LayerTypes.Convolutional, ActivationFunctions.ReLU, 64, 66, 14, 10, 10, 1, 1, 0, 0, new Mappings(maps));
network.AddLayer(LayerTypes.StochasticPooling, ActivationFunctions.Ident, 64, 32, 33, 5, 5, 2, 2);
network.AddLayer(LayerTypes.Convolutional, ActivationFunctions.ReLU, 64, 28, 30, 5, 4, 1, 1, 0, 0, new Mappings(64, 64, 66, 1));
network.AddLayer(LayerTypes.StochasticPooling, ActivationFunctions.Ident, 64, 14, 15, 3, 3, 2, 2);
network.AddLayer(LayerTypes.Local, ActivationFunctions.Logistic, 384, 1, 1, 5, 5, 1, 1, 0, 0, 50);
network.AddLayer(LayerTypes.FullyConnected, ActivationFunctions.SoftMax, 2704);
network.InitializeWeights();
Update:
I found the logic and fixed the bugs. It works with no exceptions now, but I'm not sure if the settings are correct.
NeuralNetwork network = new NeuralNetwork(DataProvider, "TencentCaptcha CNN", 10, 1D, LossFunctions.CrossEntropy, DataProviderSets.TencentCaptcha, TrainingStrategy.SGDLevenbergMarquardt);
network.AddLayer(LayerTypes.Input, 3, 65, 53);
bool[] maps = new bool[3 * 64]
{
true, true, true, true, true, true, true, true, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, true, true, true, true, true, true, true, true, true, true, true, true, true, true, true, true, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, true, true, true, true, true, true, true, true,
false, false, false, false, false, false, false, false, true, true, true, true, true, true, true, true, false, false, false, false, false, false, false, false, true, true, true, true, true, true, true, true, false, false, false, false, false, false, false, false, true, true, true, true, true, true, true, true, false, false, false, false, false, false, false, false, true, true, true, true, true, true, true, true,
false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, true, true, true, true, true, true, true, true, true, true, true, true, true, true, true, true,false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, true, true, true, true, true, true, true, true, true, true, true, true, true, true, true, true
};
network.AddLayer(LayerTypes.Convolutional, ActivationFunctions.ReLU, 64, 56, 44, 10, 10, 1, 1, 0, 0, new Mappings(maps));
network.AddLayer(LayerTypes.StochasticPooling, ActivationFunctions.Ident, 64, 28, 22, 5, 5, 2, 2);
network.AddLayer(LayerTypes.Convolutional, ActivationFunctions.ReLU, 64, 24, 18, 5, 5, 1, 1, 0, 0, new Mappings(64, 64, 66, 1));
network.AddLayer(LayerTypes.StochasticPooling, ActivationFunctions.Ident, 64, 12, 8, 3, 3, 2, 2);
network.AddLayer(LayerTypes.Convolutional, ActivationFunctions.ReLU, 64, 8, 4, 4, 4, 1, 1, 0, 0, new Mappings(64, 64, 66, 1));
network.AddLayer(LayerTypes.StochasticPooling, ActivationFunctions.Ident, 64, 4, 2, 2, 2, 2, 2);
network.AddLayer(LayerTypes.Local, ActivationFunctions.Logistic, 384, 1, 1, 1, 1, 1, 1, 0, 0, 50);
network.AddLayer(LayerTypes.FullyConnected, ActivationFunctions.SoftMax, 2704);
network.InitializeWeights();
modified 31-Jul-14 23:44pm.





Hi, your updated version still has some errors. Your input layer has the size 65x53 (x3). The first convolutional layer is correct: 65-10+1=56 and 53-10+1=44. The next pooling layer is also correct, but I doubt a receptive field of 5x5 will give good results in a pooling layer; try a receptive field of 3x3 or 2x2. The pooling layer stride of 2x2 is correct. The second convolutional layer is correct but the second pooling layer is not: 24/2=12 and 18/2=9 (not 8). Try this:
network.AddLayer(LayerTypes.Input, 3, 65, 53);
bool[] maps = new bool[3 * 64] { true, ..., false };
network.AddLayer(LayerTypes.Convolutional, ActivationFunctions.ReLU, 64, 56, 44, 10, 10, 1, 1, 0, 0, new Mappings(maps));
network.AddLayer(LayerTypes.StochasticPooling, ActivationFunctions.Ident, 64, 28, 22, 3, 3, 2, 2);
network.AddLayer(LayerTypes.Convolutional, ActivationFunctions.ReLU, 64, 24, 18, 5, 5, 1, 1, 0, 0, new Mappings(64, 64, 66, 1));
network.AddLayer(LayerTypes.StochasticPooling, ActivationFunctions.Ident, 64, 12, 9, 3, 3, 2, 2);
network.AddLayer(LayerTypes.Convolutional, ActivationFunctions.ReLU, 64, 8, 6, 5, 4, 1, 1, 0, 0, new Mappings(64, 64, 66, 2));
network.AddLayer(LayerTypes.StochasticPooling, ActivationFunctions.Ident, 64, 4, 3, 2, 2, 2, 2);
network.AddLayer(LayerTypes.Local, ActivationFunctions.Logistic, 384, 1, 1, 4, 3, 1, 1, 0, 0, 50);
network.AddLayer(LayerTypes.FullyConnected, ActivationFunctions.SoftMax, 2704);
network.InitializeWeights();
You're using the SGDLevenbergMarquardt training strategy in conjunction with a cross-entropy loss and a softmax output layer. Normally (but not always!) this will not minimize the loss function properly. You should try the SGDLevenbergMarquardtModA strategy or plain SGD if you don't get the desired results.
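The size arithmetic used in this reply can be written down as two small helpers. This is a sketch: `LayerMath`, `ConvOutputSize` and `PoolOutputSize` are hypothetical names, and the pooling rule (input size divided by the stride, independent of the pooling receptive field) is an assumption inferred from the numbers quoted in the reply rather than taken from the Workbench source.

```csharp
// Hypothetical helper, not part of the Workbench source.
static class LayerMath
{
    // Convolutional or local layer: (input + 2*pad - receptiveField) / stride + 1.
    // With stride 1 and no padding this reduces to input - receptiveField + 1.
    public static int ConvOutputSize(int input, int receptiveField, int stride = 1, int pad = 0)
    {
        return ((input + (2 * pad) - receptiveField) / stride) + 1;
    }

    // Pooling layer (assumed rule): the map shrinks by the stride factor.
    public static int PoolOutputSize(int input, int stride)
    {
        return input / stride;
    }
}
```

For the 65x53 input above, `ConvOutputSize(65, 10)` gives 56 and `ConvOutputSize(53, 10)` gives 44; pooling with stride 2 gives 28x22, the next 5x5 convolution gives 24x18, and the following pooling gives 12x9.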
modified 1-Aug-14 16:56pm.





I have downloaded and run setup.exe. I got an icon on the desktop, and all the folders and CNNWB.exe in Program Files (x86). But when I tried to start the exe file, nothing happened. I tried both clicking on the icon and starting it from the directory.
What did I do wrong?
Thank you.





Hi, I've uploaded a corrected version of the setup program and also the full source code.
Thanks for reporting the bug!





Yes, now it works. Thanks.
But when I start the app, it immediately begins loading all the CIFAR and MNIST datasets without asking me. Why? It is a framework, and I might be a guy with my own datasets.
Also, it would not hurt if you provided the XML schema you used in your framework. I might hate the LeCun network and use or invent something else.
Anyway, IMHO your article is good but too laconic for CP.





Hi Filip,
Have the code and binaries been removed? The git repository seems to have been emptied out except for the SLN file, and the zips attached to the article are corrupted.
Thanks!





The GitHub repositories are corrupted for now. You should use the download links at the top of the page to get the latest binaries and sources.
Thanks,
Filip





Thanks Filip. The zip attachments to the article are available now.





The git repositories are okay now.





Very nice. Something I could definitely use in the future.





Thanks!






Thank you Filip for this great project! It is my first experience with neural networks and I found your app, so I'm happy =)
I'm sorry for this stupid question, but why does one epoch on the CIFAR10 dataset last 100000 sample indexes? As far as I know, that dataset contains 50k pictures; maybe I don't understand something in the CNN configuration for this task?
modified 13-Jun-14 9:41am.





That's not a stupid question at all. I doubled the CIFAR10 dataset by horizontally flipping each image. This doubles the training time but gives the network more training samples and greater accuracy. If you want, you can change this behaviour very easily in the DataProvider class.
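Horizontal flipping of this kind can be sketched as below. `Augmentation.FlipHorizontal` is a hypothetical helper, not the DataProvider's actual code, and it assumes a planar, row-major pixel layout (all of channel 0, then channel 1, and so on), which is how the CIFAR10 binary files store their images.

```csharp
// Hypothetical helper, not part of the Workbench source.
static class Augmentation
{
    // Returns a mirrored copy of the image; the input array is left untouched.
    public static byte[] FlipHorizontal(byte[] pixels, int width, int height, int channels)
    {
        var flipped = new byte[pixels.Length];
        int planeSize = width * height;
        for (int c = 0; c < channels; c++)
            for (int y = 0; y < height; y++)
                for (int x = 0; x < width; x++)
                {
                    int source = x + (y * width) + (c * planeSize);
                    int target = (width - 1 - x) + (y * width) + (c * planeSize);
                    flipped[target] = pixels[source];
                }
        return flipped;
    }
}
```

Adding one flipped copy per image turns the 50000 training images into 100000 samples per epoch, which matches the count asked about in the question.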





Thanks for your answer! And I need your support again =) I can't understand why the speed of testing (in Testing mode) decreases (if it starts at 130 images per second, after a minute it is 30), and why after a few minutes of work the program slows down (the interface doesn't react to my actions, but the program is still calculating)?








