
Image Recognition with Neural Networks

30 Oct 2007, CPOL
This article gives a brief description of the BackPropagation Artificial Neural Network and its implementation for image recognition.
Screenshot - screen211.png

Introduction

Artificial Neural Networks are a relatively recent development, modeled on biological neural networks. Their strength is the ability to solve problems that are very hard to solve with traditional computing methods (i.e. explicit algorithms). This article briefly explains Artificial Neural Networks and their applications, and describes how to implement a simple ANN for image recognition.

Background

I will try to make the idea clear to readers who are simply interested in the topic.

About Artificial Neural Networks (ANNs)

Artificial Neural Networks (ANNs) solve problems in a fundamentally different way from traditional computing methods. Since conventional computers use an algorithmic approach, if the specific steps the computer needs to follow are not known, the computer cannot solve the problem; that is, traditional computing methods can only solve problems that we already understand and know how to solve. ANNs, however, are in some ways more powerful, because they can solve problems that we do not know exactly how to solve. That is why their usage has lately spread over a wide range of areas, including virus detection, robot control, intrusion detection systems, pattern (image, fingerprint, noise...) recognition, and so on.

ANNs have the ability to adapt, learn, generalize, and cluster or organize data. There are many ANN architectures, including the Perceptron, Adaline, Madaline, Kohonen networks, BackPropagation, and many others. BackPropagation is probably the most commonly used, as it is simple to implement and effective. In this work, we will deal with BackPropagation ANNs.

A BackPropagation ANN consists of one or more layers, each linked to the next. The first layer is the "input layer", which receives the initial input (e.g. the pixels of a letter); the last is the "output layer", which usually holds the input's identifier (e.g. the name of the input letter). The layers in between are the "hidden layer(s)", which propagate the previous layer's outputs forward to the next layer and propagate the following layer's errors back to the previous layer. These forward and backward passes are the main operations in training a BackPropagation ANN, which follows a few steps.

A typical BackPropagation ANN is depicted below. The black nodes (on the extreme left) are the initial inputs. Training such a network involves two phases. In the first phase, the inputs are propagated forward to compute the output of each output node. Each of these outputs is then subtracted from its desired output, yielding an error for each output node. In the second phase, each of these output errors is propagated backward and the weights are adjusted. The two phases are repeated until the sum of the squared output errors reaches an acceptable value.

Screenshot - fig1_nnet_thinner.png
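In symbols, these two phases amount to the standard delta rule for sigmoid units (the same expressions appear in the error and weight-update code later in this article): for output node k with target t_k and actual output o_k,

```latex
\delta_k = (t_k - o_k)\, o_k (1 - o_k), \qquad \Delta w_{jk} = \eta\, \delta_k\, o_j
```

where \eta is the learning rate and o_j is the output of node j in the previous layer.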

Implementation

The network layers in the figure above are implemented as arrays of structs. The nodes of the layers are implemented as follows:

[Serializable]
struct PreInput
{
    public double Value;
    public double[] Weights;
};

[Serializable]
struct Input
{
    public double InputSum;
    public double Output;
    public double Error;
    public double[] Weights;
};

[Serializable]
struct Hidden
{
    public double InputSum;
    public double Output;
    public double Error;
    public double[] Weights;
};

[Serializable]
struct Output<T> where T : IComparable<T>
{
    public double InputSum;
    public double output; // lowercase: a member cannot share its enclosing type's name
    public double Error;
    public double Target;
    public T Value;
};

The layers in the figure are implemented as follows (for a three-layer network):

private PreInput[] PreInputLayer;
private Input[] InputLayer;
private Hidden[] HiddenLayer;
private Output<string>[] OutputLayer;
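The article does not show how the weights are initialized. A minimal sketch (names, sizes, and the seed below are illustrative, not from the article) is to give each node one small random weight per node in the next layer, matching the `Weights[i]` indexing used in `ForwardPropagate`:

```csharp
using System;

// Illustrative initialization sketch: each node holds one weight per node in
// the *next* layer (so PreInputLayer[j].Weights[i] feeds InputLayer[i]).
// Small random values break symmetry so nodes can learn different features.
class WeightInitDemo
{
    static void Main()
    {
        int preInputNum = 4, inputNum = 3;      // illustrative layer sizes
        var rand = new Random(42);              // fixed seed for reproducibility
        var preInputWeights = new double[preInputNum][];
        for (int j = 0; j < preInputNum; j++)
        {
            preInputWeights[j] = new double[inputNum];
            for (int i = 0; i < inputNum; i++)
                preInputWeights[j][i] = rand.NextDouble() - 0.5; // in [-0.5, 0.5)
        }
        Console.WriteLine(preInputWeights[0].Length); // 3
    }
}
```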

Training the network can be summarized as follows:

  • Apply an input to the network.
  • Calculate the output.
  • Compare the resulting output with the desired output for the given input. The difference is called the error.
  • Modify the weights of all neurons using the error.
  • Repeat the process until the error reaches an acceptable value (e.g. error < 1%), which means the NN was trained successfully, or until a maximum iteration count is reached, which means training failed.

In pseudocode, this is:

void TrainNetwork(TrainingSet, MaxError)
{
     while(CurrentError > MaxError)
     {
          foreach(Pattern in TrainingSet)
          {
               ForwardPropagate(Pattern); //calculate output
               BackPropagate();           //fix errors, update weights
          }
     }
}

This is implemented as follows:

public bool Train()
{
    double currentError = 0;
    int currentIteration = 0;
    NeuralEventArgs Args = new NeuralEventArgs() ;

    do
    {
        currentError = 0;
        foreach (KeyValuePair<T, double[]> p in TrainingSet)
        {
            NeuralNet.ForwardPropagate(p.Value, p.Key);
            NeuralNet.BackPropagate();
            currentError += NeuralNet.GetError();
        }
                
        currentIteration++;
    
        if (IterationChanged != null && currentIteration % 5 == 0)
        {
            Args.CurrentError = currentError;
            Args.CurrentIteration = currentIteration;
            IterationChanged(this, Args);
        }

    } while (currentError > maximumError &&
             currentIteration < maximumIteration && !Args.Stop);

    if (IterationChanged != null)
    {
        Args.CurrentError = currentError;
        Args.CurrentIteration = currentIteration;
        IterationChanged(this, Args);
    }

    if (currentIteration >= maximumIteration || Args.Stop)   
        return false;//Training Not Successful
            
    return true;
}
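The article does not show `NeuralNet.GetError()`. Going by the stopping criterion described above (the sum of squared output errors), a plausible sketch, with a stand-in for the `Output<T>` struct, is:

```csharp
using System;

// Sketch of a GetError() consistent with the article's "sum of squared output
// errors" stopping criterion. OutputNode is a stand-in for Output<T>.
class ErrorDemo
{
    struct OutputNode { public double Target; public double Output; }

    static double GetError(OutputNode[] layer)
    {
        double sum = 0.0;
        foreach (var node in layer)
        {
            double diff = node.Target - node.Output;
            sum += diff * diff; // (t - o)^2, summed over all output nodes
        }
        return sum;
    }

    static void Main()
    {
        // Values chosen to be exact in binary floating point.
        var layer = new[]
        {
            new OutputNode { Target = 1.0, Output = 0.75 }, // diff^2 = 0.0625
            new OutputNode { Target = 0.0, Output = 0.5  }, // diff^2 = 0.25
        };
        Console.WriteLine(GetError(layer)); // 0.3125
    }
}
```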

where the ForwardPropagate(..) and BackPropagate() methods for a three-layer network are as shown below:

private void ForwardPropagate(double[] pattern, T output)
{
    int i, j;
    double total;
    //Apply input to the network
    for (i = 0; i < PreInputNum; i++)
    {
        PreInputLayer[i].Value = pattern[i];
    }
    //Calculate The First(Input) Layer's Inputs and Outputs
    for (i = 0; i < InputNum; i++)
    {
        total = 0.0;
        for (j = 0; j < PreInputNum; j++)
        {
            total += PreInputLayer[j].Value * PreInputLayer[j].Weights[i];
        }
        InputLayer[i].InputSum = total;
        InputLayer[i].Output = F(total);
    }
    //Calculate The Second(Hidden) Layer's Inputs and Outputs
    for (i = 0; i < HiddenNum; i++)
    {
        total = 0.0;
        for (j = 0; j < InputNum; j++)
        {
            total += InputLayer[j].Output * InputLayer[j].Weights[i];
        }

        HiddenLayer[i].InputSum = total;
        HiddenLayer[i].Output = F(total);
    }
    //Calculate The Third(Output) Layer's Inputs, Outputs, Targets and Errors
    for (i = 0; i < OutputNum; i++)
    {
        total = 0.0;
        for (j = 0; j < HiddenNum; j++)
        {
            total += HiddenLayer[j].Output * HiddenLayer[j].Weights[i];
        }

        OutputLayer[i].InputSum = total;
        OutputLayer[i].output = F(total);
        OutputLayer[i].Target = OutputLayer[i].Value.CompareTo(output) == 0 ? 1.0 : 0.0;
        OutputLayer[i].Error = (OutputLayer[i].Target - OutputLayer[i].output) *
                               (OutputLayer[i].output) * (1 - OutputLayer[i].output);
    }
}
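The activation function F(x) is not shown in the article, but the `output * (1 - output)` factor in the output-layer error formula implies the logistic sigmoid, whose derivative is F(x)(1 - F(x)). A minimal sketch:

```csharp
using System;

// Logistic sigmoid activation, the likely F(x) given the o * (1 - o)
// derivative factor used in the error formulas above.
class ActivationDemo
{
    static double F(double x) => 1.0 / (1.0 + Math.Exp(-x));

    static void Main()
    {
        Console.WriteLine(F(0.0)); // 0.5 (sigmoid is centered at 0)
    }
}
```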
    
private void BackPropagate()
{
    int i, j;
    double total;
    //Calculate the Hidden Layer's Error
    //(the back-propagated sum is scaled by the sigmoid derivative
    // Output * (1 - Output), just as in the output layer's error)
    for (i = 0; i < HiddenNum; i++)
    {
        total = 0.0;
        for (j = 0; j < OutputNum; j++)
        {
            total += HiddenLayer[i].Weights[j] * OutputLayer[j].Error;
        }
        HiddenLayer[i].Error = HiddenLayer[i].Output * (1 - HiddenLayer[i].Output) * total;
    }
    //Calculate the Input Layer's Error in the same way
    for (i = 0; i < InputNum; i++)
    {
        total = 0.0;
        for (j = 0; j < HiddenNum; j++)
        {
            total += InputLayer[i].Weights[j] * HiddenLayer[j].Error;
        }
        InputLayer[i].Error = InputLayer[i].Output * (1 - InputLayer[i].Output) * total;
    }
    //Update The First Layer's Weights
    for (i = 0; i < InputNum; i++)
    {
        for(j = 0; j < PreInputNum; j++)
        {
            PreInputLayer[j].Weights[i] +=
                LearningRate * InputLayer[i].Error * PreInputLayer[j].Value;
        }
    }
    //Update The Second Layer's Weights
    for (i = 0; i < HiddenNum; i++)
    {
        for (j = 0; j < InputNum; j++)
        {
            InputLayer[j].Weights[i] +=
                LearningRate * HiddenLayer[i].Error * InputLayer[j].Output;
        }
    }
    //Update The Third Layer's Weights
    for (i = 0; i < OutputNum; i++)
    {
        for (j = 0; j < HiddenNum; j++)
        {
            HiddenLayer[j].Weights[i] +=
                LearningRate * OutputLayer[i].Error * HiddenLayer[j].Output;
        }
    }
}
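The recognition step itself is not shown above. A plausible sketch (the helper name below is illustrative, not from the article) is to forward-propagate the unknown pattern and take the output node with the highest activation as the answer:

```csharp
using System;

// Illustrative recognition step: after training, run ForwardPropagate on the
// unknown pattern, then pick the output node with the highest activation.
class ArgMaxDemo
{
    static int BestOutputIndex(double[] outputs)
    {
        int best = 0;
        for (int i = 1; i < outputs.Length; i++)
            if (outputs[i] > outputs[best])
                best = i;
        return best;
    }

    static void Main()
    {
        // Illustrative activations, as if read from OutputLayer[i].output.
        double[] outputs = { 0.12, 0.91, 0.07 };
        Console.WriteLine(BestOutputIndex(outputs)); // 1
    }
}
```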

Testing the App

The program trains the network using bitmap images located in a folder. This folder must have the following format:

  • There must be one (input) folder that contains the input images [*.bmp].
  • Each image's file name is the target (or output) value for the network (the pixel values of the image are, of course, the inputs).

Since testing the classes requires training the network first, a folder in this format must exist. The "PATTERNS" and "ICONS" folders [depicted below] in the Debug folder fit this format.
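Going by this format, the training set can be built by using each bitmap's file name (without the extension) as the target value. A minimal sketch (the folder scan is shown as a comment, and the path below is illustrative):

```csharp
using System;
using System.IO;

// Sketch: each *.bmp file name (minus extension) becomes the target value,
// per the folder format above; the pixel inputs would be read from the bitmap.
class TrainingSetDemo
{
    static void Main()
    {
        // With a real folder:
        //   foreach (var path in Directory.GetFiles("PATTERNS", "*.bmp")) { ... }
        string path = Path.Combine("PATTERNS", "A.bmp"); // illustrative file
        string target = Path.GetFileNameWithoutExtension(path);
        Console.WriteLine(target); // A
    }
}
```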

Screenshot - fig2_sampleInput_thinner.png Screenshot - fig3_sampleInput_thinner.png

History

  • 30th September, 2007: Simplified the app
  • 24th June, 2007: Initial Release

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Murat Firat
Software Developer (Senior)
Turkey
Has a BS degree in CS; works as a software engineer in Istanbul.
