Image Recognition with Neural Networks

By Murat Firat, 30 Oct 2007
 
Screenshot - screen211.png

Introduction

Artificial Neural Networks (ANNs) are a relatively recent tool modeled on biological neural networks. Their strength is the ability to solve problems that are very hard to solve with traditional, explicitly programmed algorithms. This article briefly explains ANNs and their applications, and describes how to implement a simple ANN for image recognition.

Background

I will try to make the idea clear to readers who are simply interested in the topic.

About Artificial Neural Networks (ANNs)

Artificial Neural Networks (ANNs) take a different approach to problem solving than traditional computing methods. Conventional programs follow an algorithmic approach: if the specific steps the computer needs to follow are not known, the computer cannot solve the problem. Traditional methods can therefore only solve problems that we already understand and know how to solve. ANNs are, in some ways, more powerful because they can learn to solve problems for which we cannot spell out an exact procedure. That is why their use has been spreading across a wide range of areas, including virus detection, robot control, intrusion detection systems, and pattern (image, fingerprint, noise, ...) recognition.

ANNs can adapt, learn, generalize, cluster, and organize data. There are many ANN structures, including Perceptron, Adaline, Madaline, Kohonen, BackPropagation, and many others. The BackPropagation ANN is probably the most commonly used, as it is simple to implement and effective. This article deals with BackPropagation ANNs.

BackPropagation ANNs contain one or more layers, each of which is linked to the next layer. The first layer is the "input layer", which receives the initial input (e.g. the pixels of a letter). The last layer is the "output layer", which usually holds the input's identifier (e.g. the name of the input letter). The layers between the input and output layers are called "hidden layers"; they propagate the previous layer's outputs forward to the next layer and propagate the following layer's error back to the previous layer. These are the main operations in training a BackPropagation ANN, which proceeds in a few steps.

A typical BackPropagation ANN is depicted below. The black nodes (on the extreme left) are the initial inputs. Training such a network involves two phases. In the first phase, the inputs are propagated forward to compute the output of each output node. Each of these outputs is then subtracted from its desired output, giving one error per output node. In the second phase, each of these output errors is passed backward and the weights are adjusted. The two phases are repeated until the sum of the squared output errors reaches an acceptable value.

Screenshot - fig1_nnet_thinner.png
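The stopping criterion described above can be made concrete with a small sketch. The article does not show how `NeuralNet.GetError()` is computed, so the names below (`ErrorSketch`, `SumSquaredError`) are hypothetical, and this is only one plausible way to measure the sum of squared output errors for a single pattern:

```csharp
using System;

static class ErrorSketch
{
    // Sum of squared differences between the desired and the actual
    // outputs of the output layer for one training pattern.
    // Hypothetical helper; not taken from the article's source code.
    public static double SumSquaredError(double[] targets, double[] outputs)
    {
        if (targets.Length != outputs.Length)
            throw new ArgumentException("targets and outputs must have the same length");

        double sum = 0.0;
        for (int i = 0; i < targets.Length; i++)
        {
            double diff = targets[i] - outputs[i];
            sum += diff * diff;
        }
        return sum;
    }
}
```

For targets {1, 0} and outputs {0.5, 0.5}, the function returns 0.25 + 0.25 = 0.5. Adding this value up over all training patterns gives a per-iteration total that can be compared against a maximum-error threshold.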

Implementation

The network layers in the figure above are implemented as arrays of structs. The nodes of the layers are implemented as follows:

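[Serializable]
struct PreInput
{
    public double Value;        // raw input value (e.g. one pixel)
    public double[] Weights;    // weights to each input-layer node
};

[Serializable]
struct Input
{
    public double InputSum;     // weighted sum of the pre-input values
    public double Output;       // F(InputSum)
    public double Error;        // error propagated back from the hidden layer
    public double[] Weights;    // weights to each hidden-layer node
};

[Serializable]
struct Hidden
{
    public double InputSum;     // weighted sum of the input layer's outputs
    public double Output;       // F(InputSum)
    public double Error;        // error propagated back from the output layer
    public double[] Weights;    // weights to each output-layer node
};

[Serializable]
struct Output<T> where T : IComparable<T>
{
    public double InputSum;     // weighted sum of the hidden layer's outputs
    public double output;       // F(InputSum)
    public double Error;        // error computed from Target and output
    public double Target;       // desired output (1.0 or 0.0)
    public T Value;             // identifier this node stands for (e.g. a letter)
};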

The layers in the figure are implemented as follows (for a three layer network):

private PreInput[] PreInputLayer;
private Input[] InputLayer;
private Hidden[] HiddenLayer;
private Output<string>[] OutputLayer;

Training the network can be summarized as follows:

  • Apply input to the network.
  • Calculate the output.
  • Compare the resulting output with the desired output for the given input. The difference is called the error.
  • Modify the weights of all neurons using the error.
  • Repeat the process until the error reaches an acceptable value (e.g. error < 1%), which means that the NN was trained successfully, or until a maximum number of iterations is reached, which means that the training was not successful.

It is represented as shown below:

void TrainNetwork(TrainingSet, MaxError)
{
     do
     {
          CurrentError = 0;
          foreach (Pattern in TrainingSet)
          {
               ForwardPropagate(Pattern); // calculate output
               BackPropagate();           // compute errors, update weights
               CurrentError += GetError();
          }
     } while (CurrentError > MaxError);
}

This is implemented as follows:

public bool Train()
{
    double currentError = 0;
    int currentIteration = 0;
    NeuralEventArgs Args = new NeuralEventArgs() ;

    do
    {
        currentError = 0;
        foreach (KeyValuePair<T, double[]> p in TrainingSet)
        {
            NeuralNet.ForwardPropagate(p.Value, p.Key);
            NeuralNet.BackPropagate();
            currentError += NeuralNet.GetError();
        }
                
        currentIteration++;
    
        if (IterationChanged != null && currentIteration % 5 == 0)
        {
            Args.CurrentError = currentError;
            Args.CurrentIteration = currentIteration;
            IterationChanged(this, Args);
        }

    } while (currentError > maximumError &&
             currentIteration < maximumIteration && !Args.Stop);

    if (IterationChanged != null)
    {
        Args.CurrentError = currentError;
        Args.CurrentIteration = currentIteration;
        IterationChanged(this, Args);
    }

    if (currentIteration >= maximumIteration || Args.Stop)   
        return false;//Training Not Successful
            
    return true;
}

The ForwardPropagate(..) and BackPropagate() methods for a three layer network are shown below:

private void ForwardPropagate(double[] pattern, T output)
{
    int i, j;
    double total;
    //Apply input to the network
    for (i = 0; i < PreInputNum; i++)
    {
        PreInputLayer[i].Value = pattern[i];
    }
    //Calculate The First(Input) Layer's Inputs and Outputs
    for (i = 0; i < InputNum; i++)
    {
        total = 0.0;
        for (j = 0; j < PreInputNum; j++)
        {
            total += PreInputLayer[j].Value * PreInputLayer[j].Weights[i];
        }
        InputLayer[i].InputSum = total;
        InputLayer[i].Output = F(total);
    }
    //Calculate The Second(Hidden) Layer's Inputs and Outputs
    for (i = 0; i < HiddenNum; i++)
    {
        total = 0.0;
        for (j = 0; j < InputNum; j++)
        {
            total += InputLayer[j].Output * InputLayer[j].Weights[i];
        }

        HiddenLayer[i].InputSum = total;
        HiddenLayer[i].Output = F(total);
    }
    //Calculate The Third(Output) Layer's Inputs, Outputs, Targets and Errors
    for (i = 0; i < OutputNum; i++)
    {
        total = 0.0;
        for (j = 0; j < HiddenNum; j++)
        {
            total += HiddenLayer[j].Output * HiddenLayer[j].Weights[i];
        }

        OutputLayer[i].InputSum = total;
        OutputLayer[i].output = F(total);
        OutputLayer[i].Target = OutputLayer[i].Value.CompareTo(output) == 0 ? 1.0 : 0.0;
        OutputLayer[i].Error = (OutputLayer[i].Target - OutputLayer[i].output) *
                                       (OutputLayer[i].output) * (1 - OutputLayer[i].output);
    }
}
    
private void BackPropagate()
{
    int i, j;
    double total;
    //Fix Hidden Layer's Error
    for (i = 0; i < HiddenNum; i++)
    {
        total = 0.0;
        for (j = 0; j < OutputNum; j++)
        {
            total += HiddenLayer[i].Weights[j] * OutputLayer[j].Error;
        }
        HiddenLayer[i].Error = total;
    }
    //Fix Input Layer's Error
    for (i = 0; i < InputNum; i++)
    {
        total = 0.0;
        for (j = 0; j < HiddenNum; j++)
        {
            total += InputLayer[i].Weights[j] * HiddenLayer[j].Error;
        }
        InputLayer[i].Error = total;
    }
    //Update The First Layer's Weights
    for (i = 0; i < InputNum; i++)
    {
        for(j = 0; j < PreInputNum; j++)
        {
            PreInputLayer[j].Weights[i] +=
                LearningRate * InputLayer[i].Error * PreInputLayer[j].Value;
        }
    }
    //Update The Second Layer's Weights
    for (i = 0; i < HiddenNum; i++)
    {
        for (j = 0; j < InputNum; j++)
        {
            InputLayer[j].Weights[i] +=
                LearningRate * HiddenLayer[i].Error * InputLayer[j].Output;
        }
    }
    //Update The Third Layer's Weights
    for (i = 0; i < OutputNum; i++)
    {
        for (j = 0; j < HiddenNum; j++)
        {
            HiddenLayer[j].Weights[i] +=
                LearningRate * OutputLayer[i].Error * HiddenLayer[j].Output;
        }
    }
}
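The listings above call an activation function `F` that the article does not show. The output error is multiplied by `output * (1 - output)`, which is the derivative of the logistic sigmoid, so a sigmoid is a reasonable guess for `F`. Note also that `BackPropagate()` as listed stores the hidden and input layer errors as plain weighted sums, whereas the textbook formulation scales each of them by the derivative at that node; the article does not say whether the omission is deliberate. The sketch below shows the sigmoid and the textbook hidden-node delta. The names `BackPropSketch`, `DerivativeFromOutput`, and `HiddenDelta` are hypothetical and do not come from the article's source:

```csharp
using System;

static class BackPropSketch
{
    // Logistic sigmoid: maps any real input into the open interval (0, 1).
    // Assumed here; the article does not show its own F.
    public static double F(double x)
    {
        return 1.0 / (1.0 + Math.Exp(-x));
    }

    // Derivative of the sigmoid, expressed through its output y = F(x).
    public static double DerivativeFromOutput(double y)
    {
        return y * (1.0 - y);
    }

    // Textbook delta of one hidden node: the weighted sum of the next
    // layer's deltas, scaled by the derivative at this node's output.
    public static double HiddenDelta(double output, double[] weights, double[] nextDeltas)
    {
        double total = 0.0;
        for (int j = 0; j < nextDeltas.Length; j++)
        {
            total += weights[j] * nextDeltas[j];
        }
        return total * DerivativeFromOutput(output);
    }
}
```

With this formulation, a node whose output is 0.5 passes back a quarter of the weighted error sum, and nodes that are saturated near 0 or 1 pass back almost nothing. Whether that trains better than the listed version on a given data set would have to be checked experimentally.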

Testing the App

The program trains the network using bitmap images located in a folder. The folder must follow this format:

  • There must be one (input) folder that contains the input images [*.bmp].
  • Each image's name is the target (or output) value for the network (the pixel values of the image are, of course, the inputs).

Because testing the classes requires training the network first, a folder in this format must exist. The "PATTERNS" and "ICONS" folders [depicted below] in the Debug folder fit this format.

Screenshot - fig2_sampleInput_thinner.png Screenshot - fig3_sampleInput_thinner.png
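The folder convention can be illustrated with a short sketch of how a file name could become a target label and how a pixel could become a network input. This is not the application's actual loading code, which the article does not show: the names `PatternSketch`, `LabelFromPath`, and `PixelToInput` are hypothetical, and the grayscale weights are the common luma coefficients, chosen here as an assumption:

```csharp
using System;
using System.IO;

static class PatternSketch
{
    // The target label is the file name without directory or extension,
    // e.g. "PATTERNS/A.bmp" gives "A".
    public static string LabelFromPath(string path)
    {
        return Path.GetFileNameWithoutExtension(path);
    }

    // Converts one RGB pixel to a grayscale input value in [0, 1].
    // The weights 0.299 / 0.587 / 0.114 are an assumption, not taken
    // from the article.
    public static double PixelToInput(byte r, byte g, byte b)
    {
        double gray = 0.299 * r + 0.587 * g + 0.114 * b;
        return gray / 255.0;
    }
}
```

A black pixel maps to 0 and a white pixel to approximately 1, so every input stays in a range that suits a sigmoid-style activation.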

History

  • 30th September, 2007: Simplified the app
  • 24th June, 2007: Initial Release

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Murat Firat
Software Developer (Senior)
Turkey
Member
Has a BS degree in CS and works as a software engineer in Istanbul.

Comments and Discussions

General: great work... :) | member kireina_3012 | 29 Jan '11 - 10:00
hi.. im newbie in C#,,
 
your code is really brilliant...
thank u for sharing your code with us...
 
i have 1 question...
 
as i know,
we just only found 2 highest outputs on your program..
how to find the (five) highest outputs??
 
thank u for your response... :)
Question: Re: great work... :) | member Bikash_coder | 30 Jan '11 - 0:17
hi
can u tell me from where .11 and .33 come here. it is located in NeuralDemo.cs at  
private void InitializeSettings()
{
textBoxInputUnit.Text = ((int)((double)(networkInput + NumOfPatterns) * .33)).ToString();
textBoxHiddenUnit.Text = ((int)((double)(networkInput + NumOfPatterns) * .11)).ToString();
 
}
 
thanks in advance
Bikash
Answer: Re: great work... :) | member kireina_3012 | 1 Feb '11 - 7:59
i don't know...
but murat said nothing special about that. just hidden layers' node number should have the values between input and output layers'..
 

i thought 33 is average height of pattern...but, i don't think so.. :doh:
 
do u have any idea?? :sigh:
General: Re: great work... :) | member Murat Firat | 2 Feb '11 - 9:42
thanks :)
 
if there are 36 different patterns, then up to 36 [highest] output can be determined.
 
just need to modify the following part:
 
 
public void Recognize(double[] Input, ref T MatchedHigh, ref double OutputValueHight,
            ref T MatchedLow, ref double OutputValueLow) 
 
....
 
 //Find the [Two] Highest Outputs <--not two in this case
            for (i = 0; i < OutputNum; i++)
...
//modify here to find xx highest outputs...

...
 if (OutputLayer[i].output > max)//<-- standart find two highest value procedure, this must be changed too
                {
                    MatchedLow = MatchedHigh;
                    OutputValueLow = max;
                    max = OutputLayer[i].output;
                    MatchedHigh = OutputLayer[i].Value;
                    OutputValueHight = max;
                }
...
 
good luck,
 
Murat.
General: Target output | member SeasickSailor | 20 Dec '10 - 2:14
Apologies if this is a silly question, I'm not an expert!
 
How is the target output defined in the programme during training? As things stand, there is the input character (from the database or drawn). How does the programme 'know' the desired output?
 
Thanks in advance and pardon my potential silliness! :)
General: Displaying the image [modified] | member swathi6589 | 8 Oct '10 - 2:56
I went through your article its good.
 
I got many queries like when we browse and select a pattern the image is displayed in display panel. My query is, how it is going to display the pattern into panel

modified on Wednesday, October 13, 2010 7:45 AM

General: Finger print Reconization | member waqas munim | 27 Jul '10 - 7:02
hi dear please need ur little help please kindly tell me i am working on fingerprint scanner i am able to get image from it but not able to compare them . Is this form capable to match and tell me the result of matching?? kindly reply i need ur help ?
General: blood cell images recognition [modified] | member kushagra.thakur | 11 Jul '10 - 16:36
Hi Murat,
 
First of all, thanks for the code.
I am a fourth year student in Comp. Science. I trying to use your code for recognizing blood cell images. I kept about 5-6 different blood cell images in the patterns folder and trained the network(I have extracted the images from the following pic..
http://library.med.utah.edu/WebPath/jpeg5/HEME100.jpg ).
Then i browse 1 of the above extracted images and try to recognize it. It gave error value 15 and the iteration value kept on increasing till 10000 without reducing the error. Why is this happening? Are there any size constraint or because they are too similar??
Please let me know some way to distinctly recognize these cells...
 
Waiting eagerly for your reply.
 
Regards,
Kushagra

modified on Sunday, July 11, 2010 10:49 PM

General: Re: blood cell images recognition | member Murat Firat | 12 Jul '10 - 8:00
Hi Kushagra,
 
As I see there are two different patterns at the image. However, if 5-6 patterns are used on training, the result may not be sufficient. Anyway, only 2 training images can be used to identify two patterns on this implementation (You can change the implementation to accept more that one training image for a pattern).
 
Also, the picture got me remember kohonen's SOM, and [my opinion] this kind of problem is much more resolvable by using self organizing maps
 
Good Luck,
Murat
Question: question | member diedou | 21 Jun '10 - 3:41
i'm want ask for this project.. i'm sorry i'm newbie..
 
1. where i can find or see result of "training" during the training data is calculated??
2. when i click "load network" is an error?

neuralNetwork.LoadNetwork(FD.FileName);
error message : Object reference not set to an instance of an object.
 
what thats mean??
 
thanks..
Answer: Re: question | member Murat Firat | 23 Jun '10 - 3:04
1- In front of "Current Error" label
2- I am not exactly sure about the error, just run the program in debug mode (by F5) and then you can see the exact error (app is fully open source)
 
hope those helps..
 
good luck
Murat
Question: Network...? | member mimi2513 | 13 Jun '10 - 6:36
Hello! Thank you for your helpful program. :)
 
I have a few questions about this program.
 
1. In this program, there are 2 buttons - 'save network' button and 'load network' button.
 
what are these buttons? When I use your program, I don't need to click these buttons to operate the program.
 
And.... When I click the 'load network' button, I could load 'sample.net' file. but, there's no change in program. what these buttons are used for?
 

2. As you said in source code, to modify initial weight, 'x' and 'y' can be changed. But I don't know the meaning of these variables. Is there any meaning or standard(sth like that) to determine these variable?
Answer: Re: Network...? | member Murat Firat | 14 Jun '10 - 20:11
save network serializes trained network weights to a binary file, likewise load network operation loads network weights from a serialized binary file. They are just because to store trained network weights into a file in order not to make it necessary to train the network with the same parameters again.
 
2nd quest is related to initial weights on backpropagation network and there is not a certain rule about this. As I remember they were affecting duration of training process (gained experimentally)..
Question: Training image | member Minju87 | 13 Jun '10 - 4:05
Thank you for providing so useful things. :)
I heard that neural network uses training for find most similar image. So I tried to find similar image with simple tree images. Its' size is not that big and grayscale images which is very simple.
But it does not find well. :sigh:
Does your program have a code which do training?
or has already trained with another ways that I don't know?
Answer: Re: Training image | member Murat Firat | 14 Jun '10 - 19:55
The program itself includes a training algorithm under "lib" directory. That if the images are not recognized can be because images are so similar or the network is not trained enough. Also each training image must state a different pattern (e.g. letter)
General: hight and low? [modified] | member yeah1000 | 28 Apr '10 - 7:43
Whot do these hight and low mean?
 
when i draw a 'S' in the picturebox, Hight shows me 'G' and Low shows me '6'. Why is that so?
 
How long and with what settings should i train it to display accurate results?

modified on Wednesday, April 28, 2010 1:54 PM

General: Re: hight and low? | member Murat Firat | 29 Apr '10 - 3:05
High (output) is the best picked character and low (output) is the second matched one.
The letters that have common features may not be recognized correctly and that is normal.
Question: Train 2 by 2 and Then Join the .Net file ? | member subsari12 | 22 Apr '10 - 21:40
First off: Great Work! Greatly appreciated, have learned a whole lot and has been tremendously useful on my personal project.
 
1. Would it be plausible to pause training and save and later on resume ?
2. Would it be possible to train by only training 2 images but with great nodes and hidden layers, and then mix the .net training file ?
3. Would it be better to train only the skeleton of the pattern or include color and other partial content of the image.
4. Final thing ;-P How could I increase the accuracy for nearly 2 very similar but only different by small details images.
 
Thanks ! I know these are many questions, but if you can I would appreciate any thoughts or short comments!
 
THANKS! :-D
Answer: Re: Train 2 by 2 and Then Join the .Net file ? | member Murat Firat | 25 Apr '10 - 11:56
Thanks, here are my comments regarding to the questions
 
1- it would be; saving nnet just means saving the weights
2- it would not be; loading [saved] nnet file destroys previously trained weights. (firstly loading nnet file then training loaded weights is possible)
3- it would not be; input layer ignores any dimensional attribute like color, the standart input format is grayscale (colored images are converted to grayscale)
4- if 2 images are completely similar, there may not be a special&certain way to distinguish them
 
I hope those helps. good luck..
General: Re: Train 2 by 2 and Then Join the .Net file ? | member subsari12 | 30 Apr '10 - 4:12
Thanks, This actually did help, I was able to achieve my desired goal.
 
Again, thank you so much for you're contribution. :rolleyes: :-D
General: great work | member Mikant | 8 Feb '10 - 10:51
thank you, Murat for providing people with such a great code. your code organisation is perfect (simple and powerful)
General: Re: great work | member Murat Firat | 9 Feb '10 - 2:02
Thanks buddy, You are welcome :)
General: addition | member rasleen_13 | 6 Dec '09 - 5:04
n also what do the iterations signify ?
Question: high n low ? | member rasleen_13 | 6 Dec '09 - 5:00
it's a great application
thanks a ton for providing us with it
but i have a question:
since i am new to neural networks...i tried to understand the working with the help of the discussions...but i still have a doubt:
what do high n low signify during pattern matching ?
Answer: Re: high n low ? | member Murat Firat | 6 Dec '09 - 23:08
rasleen_13 wrote:
what do high n low signify during pattern matching ?

 
thanks but unfortunately I couldn't understand the question :confused:

Last Updated 30 Oct 2007
Article Copyright 2007 by Murat Firat