Image Recognition with Neural Networks

By Murat Firat, 30 Oct 2007
 
Screenshot - screen211.png

Introduction

Artificial Neural Networks (ANNs) are a relatively recent computing tool modeled on biological neural networks. Their strength is the ability to solve problems that are very hard to solve with traditional computing methods (e.g. with explicit algorithms). This article briefly explains ANNs and their applications, and describes how to implement a simple ANN for image recognition.

Background

I will try to make the idea clear to readers who are new to the topic.

About Artificial Neural Networks (ANNs)

Artificial Neural Networks (ANNs) take a different approach to problem solving than traditional computing methods. Conventional programs follow an algorithmic approach: if the specific steps the computer needs to follow are not known, the computer cannot solve the problem. Traditional methods can therefore only solve problems that we already understand and know how to solve. ANNs are, in a sense, more powerful because they can learn to solve problems for which we cannot spell out the exact steps. That is why their use has spread to a wide range of areas, including virus detection, robot control, intrusion detection systems, and pattern (image, fingerprint, noise, ...) recognition.

ANNs can adapt, learn, generalize, cluster, and organize data. There are many ANN structures, including Perceptron, Adaline, Madaline, Kohonen, and BackPropagation networks. The BackPropagation ANN is probably the most commonly used, as it is simple to implement and effective. This article deals with BackPropagation ANNs.

BackPropagation ANNs contain one or more layers, each of which is linked to the next layer. The first layer, the "input layer", receives the initial input (e.g. the pixels of a letter). The last layer, the "output layer", usually holds the input's identifier (e.g. the name of the input letter). The layers between the input and output layers are called "hidden layers"; they propagate the previous layer's outputs forward to the next layer and propagate the following layer's error back to the previous layer. These two operations are the core of training a BackPropagation ANN.

A typical BackPropagation ANN is depicted below. The black nodes (on the extreme left) are the initial inputs. Training such a network involves two phases. In the first phase, the inputs are propagated forward to compute the output of each output node. Each of these outputs is then subtracted from its desired output, giving an error for each output node. In the second phase, each of these output errors is passed backward and the weights are adjusted. The two phases are repeated until the sum of the squared output errors reaches an acceptable value.

Screenshot - fig1_nnet_thinner.png
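
For reference, the two phases can be written compactly in the standard textbook form. The notation is an assumption chosen for this sketch, since the article does not define symbols: o_j is the output of node j, net_j its weighted input sum, w_{ij} the weight from node i to node j, t_k the desired output of output node k, \eta the learning rate, and f the logistic sigmoid.

```latex
\begin{aligned}
\text{forward pass:}\quad & net_j = \sum_i w_{ij}\, o_i, \qquad o_j = f(net_j) = \frac{1}{1 + e^{-net_j}} \\
\text{output error:}\quad & \delta_k = (t_k - o_k)\, o_k\, (1 - o_k) \\
\text{hidden error:}\quad & \delta_j = o_j\, (1 - o_j) \sum_k w_{jk}\, \delta_k \\
\text{weight update:}\quad & w_{ij} \leftarrow w_{ij} + \eta\, \delta_j\, o_i \\
\text{stopping measure:}\quad & E = \sum_k (t_k - o_k)^2
\end{aligned}
```

The factor o(1 - o) is the derivative of the sigmoid expressed through its own output, which is why no separate derivative function is needed.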

Implementation

The network layers in the figure above are implemented as arrays of structs. The nodes of the layers are implemented as follows:

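[Serializable]
struct PreInput
{
    public double Value;       // raw input value (e.g. one pixel)
    public double[] Weights;   // one weight per node of the next (input) layer
};

[Serializable]
struct Input
{
    public double InputSum;    // weighted sum of the incoming values
    public double Output;      // F(InputSum)
    public double Error;       // error term computed during back propagation
    public double[] Weights;   // one weight per node of the next (hidden) layer
};

[Serializable]
struct Hidden
{
    public double InputSum;    // weighted sum of the incoming values
    public double Output;      // F(InputSum)
    public double Error;       // error term computed during back propagation
    public double[] Weights;   // one weight per node of the next (output) layer
};

[Serializable]
struct Output<T> where T : IComparable<T>
{
    public double InputSum;    // weighted sum of the incoming values
    public double output;      // F(InputSum)
    public double Error;       // error term computed from Target and output
    public double Target;      // desired output for the current pattern (1.0 or 0.0)
    public T Value;            // identifier of the pattern this node stands for
};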

The layers in the figure are implemented as follows (for a three-layer network):

private PreInput[] PreInputLayer;
private Input[] InputLayer;
private Hidden[] HiddenLayer;
private Output<string>[] OutputLayer;
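
The article does not show how the Weights arrays are allocated. As a minimal sketch, assuming small random starting weights (a common choice, not confirmed by the source), a helper such as the hypothetical RandomWeights below could fill them; the class name, method name, and the default range of 0.5 are all assumptions made for illustration.

```csharp
using System;

static class WeightInit
{
    // Hypothetical helper (not from the article): returns `count` weights
    // drawn uniformly from [-range, +range).
    public static double[] RandomWeights(int count, Random rng, double range = 0.5)
    {
        var weights = new double[count];
        for (int i = 0; i < count; i++)
        {
            // NextDouble() is in [0, 1); shift and scale it to [-range, +range).
            weights[i] = (rng.NextDouble() * 2.0 - 1.0) * range;
        }
        return weights;
    }
}
```

Because each node stores one weight per node of the following layer, a pre-input node would be set up with something like PreInputLayer[j].Weights = WeightInit.RandomWeights(InputNum, rng), and likewise for the other layers.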

Training the network can be summarized as follows:

  • Apply input to the network.
  • Calculate the output.
  • Compare the resulting output with the desired output for the given input. The difference is called the error.
  • Modify the weights for all neurons using the error.
  • Repeat the process until the error reaches an acceptable value (e.g. error < 1%), which means that the network was trained successfully, or until a maximum number of iterations is reached, which means that training was not successful.

This loop can be summarized in pseudocode as follows:

void TrainNetwork(TrainingSet, MaxError)
{
     do
     {
          CurrentError = 0;
          foreach (Pattern in TrainingSet)
          {
               ForwardPropagate(Pattern);   // calculate output
               BackPropagate();             // fix errors, update weights
               CurrentError += GetError();  // accumulate the error of this pass
          }
     } while (CurrentError > MaxError);
}

This is implemented as follows:

public bool Train()
{
    double currentError = 0;
    int currentIteration = 0;
    NeuralEventArgs Args = new NeuralEventArgs() ;

    do
    {
        currentError = 0;
        foreach (KeyValuePair<T, double[]> p in TrainingSet)
        {
            NeuralNet.ForwardPropagate(p.Value, p.Key);
            NeuralNet.BackPropagate();
            currentError += NeuralNet.GetError();
        }
                
        currentIteration++;
    
        if (IterationChanged != null && currentIteration % 5 == 0)
        {
            Args.CurrentError = currentError;
            Args.CurrentIteration = currentIteration;
            IterationChanged(this, Args);
        }

    } while (currentError > maximumError &&
             currentIteration < maximumIteration &&
             !Args.Stop);

    if (IterationChanged != null)
    {
        Args.CurrentError = currentError;
        Args.CurrentIteration = currentIteration;
        IterationChanged(this, Args);
    }

    if (currentIteration >= maximumIteration || Args.Stop)   
        return false;//Training Not Successful
            
    return true;
}
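
NeuralEventArgs is used by Train() but not shown. The sketch below is a hypothetical shape inferred only from how Train() reads and writes the object (CurrentError, CurrentIteration, Stop); the TrainingMonitor class and its handler are assumptions added to illustrate how a caller could report progress and cancel training.

```csharp
using System;

// Hypothetical shape, inferred only from how Train() uses the object.
public class NeuralEventArgs : EventArgs
{
    public double CurrentError;     // accumulated error of the last pass
    public int CurrentIteration;    // number of completed passes
    public bool Stop;               // a handler sets this to true to cancel training
}

static class TrainingMonitor
{
    // Example handler: prints progress and asks the trainer to stop
    // if the error is no longer a finite number.
    public static void OnIterationChanged(object sender, NeuralEventArgs e)
    {
        Console.WriteLine("iteration {0}: error {1:F4}", e.CurrentIteration, e.CurrentError);
        if (double.IsNaN(e.CurrentError) || double.IsInfinity(e.CurrentError))
        {
            e.Stop = true;
        }
    }
}
```

Since Train() checks Args.Stop in its loop condition, a handler attached to IterationChanged can end training early; Train() then returns false. Attaching the handler assumes the event's delegate type accepts an (object, NeuralEventArgs) signature, which matches the call IterationChanged(this, Args).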

The ForwardPropagate(..) and BackPropagate() methods for a three-layer network are shown below:

private void ForwardPropagate(double[] pattern, T output)
{
    int i, j;
    double total;
    //Apply input to the network
    for (i = 0; i < PreInputNum; i++)
    {
        PreInputLayer[i].Value = pattern[i];
    }
    //Calculate The First(Input) Layer's Inputs and Outputs
    for (i = 0; i < InputNum; i++)
    {
        total = 0.0;
        for (j = 0; j < PreInputNum; j++)
        {
            total += PreInputLayer[j].Value * PreInputLayer[j].Weights[i];
        }
        InputLayer[i].InputSum = total;
        InputLayer[i].Output = F(total);
    }
    //Calculate The Second(Hidden) Layer's Inputs and Outputs
    for (i = 0; i < HiddenNum; i++)
    {
        total = 0.0;
        for (j = 0; j < InputNum; j++)
        {
            total += InputLayer[j].Output * InputLayer[j].Weights[i];
        }

        HiddenLayer[i].InputSum = total;
        HiddenLayer[i].Output = F(total);
    }
    //Calculate The Third(Output) Layer's Inputs, Outputs, Targets and Errors
    for (i = 0; i < OutputNum; i++)
    {
        total = 0.0;
        for (j = 0; j < HiddenNum; j++)
        {
            total += HiddenLayer[j].Output * HiddenLayer[j].Weights[i];
        }

        OutputLayer[i].InputSum = total;
        OutputLayer[i].output = F(total);
        OutputLayer[i].Target = OutputLayer[i].Value.CompareTo(output) == 0 ? 1.0 : 0.0;
        OutputLayer[i].Error = (OutputLayer[i].Target - OutputLayer[i].output) *
                               (OutputLayer[i].output) * (1 - OutputLayer[i].output);
    }
}
    
private void BackPropagate()
{
    int i, j;
    double total;
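    //Fix Hidden Layer's Error
    //(weighted sum of the next layer's errors times the derivative
    // Output * (1 - Output); this assumes F is the logistic sigmoid,
    // as the output-layer error formula in ForwardPropagate implies)
    for (i = 0; i < HiddenNum; i++)
    {
        total = 0.0;
        for (j = 0; j < OutputNum; j++)
        {
            total += HiddenLayer[i].Weights[j] * OutputLayer[j].Error;
        }
        HiddenLayer[i].Error = total * HiddenLayer[i].Output * (1 - HiddenLayer[i].Output);
    }
    //Fix Input Layer's Error (same rule, one layer further back)
    for (i = 0; i < InputNum; i++)
    {
        total = 0.0;
        for (j = 0; j < HiddenNum; j++)
        {
            total += InputLayer[i].Weights[j] * HiddenLayer[j].Error;
        }
        InputLayer[i].Error = total * InputLayer[i].Output * (1 - InputLayer[i].Output);
    }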
    //Update The First Layer's Weights
    for (i = 0; i < InputNum; i++)
    {
        for(j = 0; j < PreInputNum; j++)
        {
            PreInputLayer[j].Weights[i] +=
                LearningRate * InputLayer[i].Error * PreInputLayer[j].Value;
        }
    }
    //Update The Second Layer's Weights
    for (i = 0; i < HiddenNum; i++)
    {
        for (j = 0; j < InputNum; j++)
        {
            InputLayer[j].Weights[i] +=
                LearningRate * HiddenLayer[i].Error * InputLayer[j].Output;
        }
    }
    //Update The Third Layer's Weights
    for (i = 0; i < OutputNum; i++)
    {
        for (j = 0; j < HiddenNum; j++)
        {
            HiddenLayer[j].Weights[i] +=
                LearningRate * OutputLayer[i].Error * HiddenLayer[j].Output;
        }
    }
}
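
The listings call an activation function F that is not shown. The term output * (1 - output) in the output-layer error is the derivative of the logistic sigmoid, so a sketch consistent with that (an assumption, not necessarily the author's exact code) could look like this. The class name Activation and the helper DerivativeFromOutput are hypothetical.

```csharp
using System;

static class Activation
{
    // Logistic sigmoid: maps any real input into the open interval (0, 1).
    public static double F(double x)
    {
        return 1.0 / (1.0 + Math.Exp(-x));
    }

    // Derivative of the sigmoid expressed through its output y = F(x):
    // F'(x) = y * (1 - y).
    public static double DerivativeFromOutput(double y)
    {
        return y * (1.0 - y);
    }
}
```

Expressing the derivative through the stored output means back propagation only needs the Output field of each node, not the InputSum.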

Testing the App

The program trains the network using bitmap images located in a folder. The folder must be organized as follows:

  • There must be one (input) folder that contains the input images [*.bmp].
  • Each image's file name is the target (or output) value for the network (the pixel values of the image are the inputs).

Because the network must be trained before the classes can be tested, such a folder must exist. The "PATTERNS" and "ICONS" folders (depicted below) in the Debug folder fit this format.

Screenshot - fig2_sampleInput_thinner.png Screenshot - fig3_sampleInput_thinner.png
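
The folder convention above maps each image to one training pair: pixel values as inputs, file name as target. The sketch below illustrates that mapping under stated assumptions: the image is already available as a grayscale byte array, pixel values are scaled to [0, 1], and the names PatternLoader, ToInputVector, and TargetFromFileName are hypothetical. The article does not say how the app reads or scales the pixels.

```csharp
using System;
using System.IO;

static class PatternLoader
{
    // Hypothetical helper: flattens a grayscale image (0..255 per pixel,
    // indexed [row, column]) into one input value per pixel, scaled to [0, 1].
    public static double[] ToInputVector(byte[,] pixels)
    {
        int rows = pixels.GetLength(0);
        int cols = pixels.GetLength(1);
        var input = new double[rows * cols];
        for (int r = 0; r < rows; r++)
        {
            for (int c = 0; c < cols; c++)
            {
                input[r * cols + c] = pixels[r, c] / 255.0;
            }
        }
        return input;
    }

    // The target value is the file name without its extension, e.g. "A.bmp" -> "A".
    public static string TargetFromFileName(string path)
    {
        return Path.GetFileNameWithoutExtension(path);
    }
}
```

With this layout the number of pre-input nodes equals width times height of the images, so all images in a training folder would need the same dimensions.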

History

  • 30th September, 2007: Simplified the app
  • 24th June, 2007: Initial Release

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Murat Firat
Software Developer (Senior)
Turkey
Has a BS degree in CS and works as a software engineer in Istanbul.


Comments and Discussions

 
General | My vote of 1 | member MAN2MAN | 24 Dec '12 - 23:49
not accurate
Question | how big my images? | member amin_nmer | 1 Dec '12 - 22:58
hi. murat. thank you for sample.
some image not recognize why do i do?
how big were my images in pattern?
how many neron are there?
please send me document?
amin_fathi65@yahoo.com
thank you?
General | My vote of 4 | member omyildirim | 30 Nov '12 - 2:39
The application is good but not so clear
Bug | [IMPORTANT] The implementation contains MAJOR BUGS | member giuseppemag | 27 Jun '12 - 23:32
The formula for the output layer errors is wrong, since it is just -(target-output) without other terms, while the formula found in the code is the error for the internal layers.
 
Weights also are updated wrongly, since the correct formula also requires the derivative of the neuron input; the update sign is also completely wrong.
 
Please DO NOT BOTHER WITH THIS IMPLEMENTATION!
General | Re: [IMPORTANT] The implementation contains MAJOR BUGS | member sachitha12345 | 5 Nov '12 - 20:26
how to solve that problem..?? pls help me ..
Question | matching algorithm | member Member 8138227 | 21 Apr '12 - 19:19
Hi Murat,
Which matching algorithm you used in this project and in which file it is written?
General | My vote of 5 | member praneshkmr | 24 Jan '12 - 18:45
Nice one man... keep it up..
Question | Please help ! | member hovu | 20 Jan '12 - 4:40
what image's feature did you use to train the network?
Question | Entering and outputting more than one character | member ibnkhaldun | 2 Jan '12 - 10:41
Hey Murat, I am trying to change your application where instead of one character it can take 6 or 7 characters and than detected one by one and display the result, can you tell me where to go about it
Thanks
Question | input value not the same as output value | member Member 3913283 | 3 Dec '11 - 5:22
HI Murat , awesome work i have been learning about ANN for a while now and your example is brilliant and easy to undestand.
 
but i have one question i drew a letter like "A" in the drawing area and tried to recognise it but my output value was not the same as my input value but before i did this I did set the layers to 3 for better recognition and let it Train but still no luck.
 
I am aware that it will train its self but how do i get it to give me the right ouput, or do i not understand something, Thanks
General | My vote of 3 | member HansiHermann | 28 Oct '11 - 3:12
explanation too superficial
Question | flow chart | member baguswahyu | 27 Oct '11 - 2:38
Hi Mr Firat
Your program & article really caught my attention.. :thumbsup:
Can you show me on the flow chart of this program?
Question | Logic behind the programm | member Member 8203782 | 13 Sep '11 - 20:35
Hi Murat,
 
It will be great if you can explain by taken one Input and explain how it is converted into matrix and then how it is applied as Input to neuron and how the Output is calculated.
 
And also how is it used to recognize the pattern.Also if you can explain the co-relation between the modules it will be of great help.
Question | little confused about PreInput and Input | member swarajs | 7 Aug '11 - 18:05
Hello,
 
Nice article! However I have a small confusion: Isn't "PreInput Layer" in the code is actually the Input layer and "Input Layer" is actually a Hidden Layer?
 
Regards,
Swaraj
Question | Great Work:-) | member chikkisherry | 20 Jul '11 - 20:56
this source is recognizing .bmp images.. can you please help me with a application which recognises .jpg images
Question | Increase probability | member RonZohan | 20 Jul '11 - 3:30
How can i increase the probability of getting the desired output of the image? i have been training lately but the outcome wouldn't go as high as 50%
Question | Problem with self coding | member spider853 | 9 Jul '11 - 15:10
Here is my code
 
http://ideone.com/did7b
 
I've tried to write a simple NN for my self using XOR training set
but it get stucked at 0.5
 
What I'm doing wrong?
 
Thanks
Question | Need your guidance... | member apepe | 23 Jun '11 - 16:55
Hi Murat Firat
 
How to show three or more matching images, include the percentages of each(total 100%)?not only two images.
In your program just "MatchedHigh" and "MatchedLow" with the percentages.
I want to try for image recognition and then show the similar image(at least 3 images).
 
thank you
Answer | Re: Need your guidance... | member Murat Firat | 24 Jun '11 - 8:56
Hi,
 
In the implementation of the Recognize method, there is a code segment that finds the two best matches:
 
 
            //Find the [Two] Highest Outputs   
            for (i = 0; i < OutputNum; i++)
            {
                total = 0.0;
                for (j = 0; j < InputNum; j++)
                {
                    total += InputLayer[j].Output * InputLayer[j].Weights[i];
                }
                OutputLayer[i].InputSum = total;
                OutputLayer[i].output = F(total);
                if (OutputLayer[i].output > max)
                {
                    MatchedLow = MatchedHigh;
                    OutputValueLow = max;
                    max = OutputLayer[i].output;
                    MatchedHigh = OutputLayer[i].Value;
                    OutputValueHight = max;
                }
            }
 
 
At the end of this segment, the OutputLayer array contains all matches (OutputLayer[..].Value) with their match percentages (OutputLayer[..].output) for all the different patterns. So, by returning the OutputLayer array from the method, all images can be shown.
 
hope that helps..
General | Re: Need your guidance... | member apepe | 25 Jun '11 - 16:32
Thank you for your quick response Murat.
I am still don't understand, i am beginner in C#.
Could you give me specific syntax, for example i want to show three similar images (ex: Matched1, Matched2, Matched3 will shown in the "picturebox" and the total percentages = 100%)
 
if (OutputLayer[i].output > max)
{
what i should write here?
}
 
Thank you again for your guidance.
best regards
apepe
General | Re: Need your guidance... | member Murat Firat | 27 Jun '11 - 0:51
Well, there is an array that contains matched percentages; think about getting three highest percentages (instead of two at the code). Now, at this point, developing a simple algorithm will also provide some experience on C#! :)
 
good luck ;)
Question | help please!!!!! | member asmita5 | 29 May '11 - 7:10
Hi Mr Firat,
 
Your work is very interesting, anyway thank you very much for uploading this project.
It is helping me a lot in my project but I'm doing with the library AForge.Can you indicate me to some links on how to do with that library.Thank you very much.
Answer | Re: help please!!!!! | member Bikash_coder | 15 Jun '11 - 23:09
what you want to do actually?
recently i develop a project using this(C# Implementation of BackPropagation Neural Network For Pattern Recognition) and Aforg.net
can you clear it to us....
Answer | Re: help please!!!!! | member asmita5 | 16 Jun '11 - 1:48
I´m also developing a project for my end of carrer with c# about Caracter recognition with Backpropagation neural network and AForge.net
I need information about this library.
 
Thank you very much for your help!!!!!!!!!!!!
General | Re: help please!!!!! | member Murat Firat | 16 Jun '11 - 2:17
if you are not handling a commercial project, [do not re-invent the wheel] use an ocr engine like MODI or Tesseract or etc..

Last Updated 30 Oct 2007
Article Copyright 2007 by Murat Firat
Everything else Copyright © CodeProject, 1999-2013