Large pattern recognition system using multi neural networks

Vietdungiitb

May 2, 2012

CPOL

A tutorial on using multiple neural networks for large pattern recognition systems such as handwriting recognition

Introduction

Nowadays, artificial neural networks are widely applied in many fields. However, creating an efficient network for a large classifier, such as a handwriting recognition system, is still a big challenge. In my last article, “Library for online handwriting recognition system using UNIPEN database”, I presented an efficient library for a handwriting recognition system that makes it simple to create and change a neural network. The demo program showed good recognition results on the digit set (97%) and the alphabet sets (93%). In this article, I continue by presenting a solution for large pattern classification in general and handwriting recognition in particular.

The recognition rate increases significantly when an additional spell checker module is used

Neural network for a recognition system

In the traditional model of pattern recognition, a hand-designed feature extractor gathers relevant information from the input and eliminates irrelevant variability. A trainable classifier (normally a standard, fully connected multi-layer neural network) then categorizes the resulting feature vectors into classes. However, this model has weaknesses that can hurt recognition results; in particular, accuracy depends heavily on how well the features were designed by hand. The convolutional neural network (CNN) addresses this shortcoming of the traditional model to achieve the best performance on pattern recognition tasks.

A CNN is a special form of multi-layer neural network. Like other networks, CNNs are trained with the back-propagation algorithm. The difference lies in their architecture. A convolutional network combines three architectural ideas to ensure some degree of shift, scale, and distortion invariance: local receptive fields, shared weights (or weight replication), and spatial or temporal sub-sampling. CNNs are designed specifically to recognize patterns directly from digital images with a minimum of pre-processing. The architecture of CNNs is described comprehensively in the articles of Dr. Yann LeCun and Dr. Patrice Simard (see my previous articles).

Figure 1: The architecture of LeNet-5

Figure 2: An input image followed by a feature map performing a 5 × 5 convolution and a 2 × 2 sub-sampling map

The recognition rates of the above networks are very high for small pattern collections such as digits, capital letters, or lower-case letters. However, when we want to create a larger neural network that can recognize a bigger collection, for example digits and English letters together (62 characters), problems begin to appear. Finding an optimized and large enough network becomes more difficult, and training the network on a large set of input patterns takes much longer. The network converges more slowly and, above all, accuracy decreases significantly because there are more badly written characters and many similar, easily confused characters. Furthermore, even if we could create a network good enough to recognize English characters accurately, it certainly could not properly recognize a character outside its output set (a Russian or Chinese character, for instance) because it has no capacity for expansion. Therefore, creating a single network for a very large pattern classifier is very difficult and may be impossible.

The proposed solution to these problems is to replace the single big network with multiple smaller networks, each of which has a very high recognition rate on its own output set. Besides its official output set (digits, letters, and so on), each network has one additional output: the unknown character. If the input pattern is not recognized as one of the network's official outputs, it is classified as unknown and passed to the next network, and so on until the system recognizes it correctly.

Figure 3: Convolution neural network with unknown output

Figure 4: Recognition System using multi neural networks
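The cascade of networks with an unknown output can be sketched in a few lines. This is only an illustration and not the library's code: `CascadeRecognizer`, the `Func<double[], char>` delegates standing in for trained component networks, and the `double[]` pattern type are all hypothetical. Only the '?' marker for the unknown character is taken from the article's own code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch: each component network returns either a character from
// its own output set or the unknown marker.
public static class CascadeRecognizer
{
    public const char Unknown = '?';

    // Passes the pattern to each network in turn and returns the first
    // answer that is not the unknown marker.
    public static char Recognize(double[] pattern,
                                 IEnumerable<Func<double[], char>> networks)
    {
        foreach (var network in networks)
        {
            char result = network(pattern);
            if (result != Unknown)
                return result;
        }
        return Unknown; // no loaded network recognized the pattern
    }
}
```

Adding support for a new character set then amounts to appending one more network to the list; the networks already in the list are not retrained.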

This solution overcomes most limits of the traditional model. The new system consists of several small networks that are simple to optimize for the best recognition results. Training these small networks takes less time than training one huge network. In particular, the new model is flexible and expandable. Depending on the requirement, we can load one or more networks; we can also add new networks to the system to recognize new patterns without changing or rebuilding the model. All of these small networks can be reused in other multi-network systems.

Experiment

The demo program is built to show all stages of a recognition system: creating a component network, training a network, testing networks on the UNIPEN dataset, and testing networks with a mouse drawing control. It serves as a tutorial to help anyone understand a recognition system. All functions are available from the program GUI, so you can create, train, and test your network at runtime without changing any code or restarting the program.

Figure 5: Handwriting recognition system interface

Creating new neural network

Figure 6: Creating new neural network Interface

A new neural network can be created entirely from the GUI. The network design depends on the input pattern size, the number of layers, the data set, and so on. On the output layer, we can tick the unknown output checkbox to add an extra unknown output to the network, or leave it unticked to create a normal network.

Of course, we can also create a network in code:

        void CreateNetwork()
        {
            network = new ConvolutionNetwork();
            network.Layers = new Layer[6];
            network.LayerCount = 6;

            // layer 0: input layer, 29 x 29 input pattern
            InputLayer inputlayer = new InputLayer("00-Layer Input", new Size(29, 29));
            network.InputDesignedPatternSize = new Size(29, 29);
            inputlayer.Initialize();
            network.Layers[0] = inputlayer;

            // layer 1: convolution with sub-sampling, 13 x 13 feature maps
            ConvolutionLayer convlayer = new ConvolutionLayer("01-Layer ConvolutionalSubsampling", inputlayer, new Size(13, 13), 10, 5);
            convlayer.Initialize();
            network.Layers[1] = convlayer;

            // layer 2: convolution with sub-sampling, 5 x 5 feature maps
            convlayer = new ConvolutionLayer("02-Layer ConvolutionalSubsampling", convlayer, new Size(5, 5), 60, 5);
            convlayer.Initialize();
            network.Layers[2] = convlayer;

            // layers 3 and 4: fully connected layers with 200 and 100 neurons
            FullConnectedLayer fulllayer = new FullConnectedLayer("03-Layer FullConnected", convlayer, 200);
            fulllayer.Initialize();
            network.Layers[3] = fulllayer;
            fulllayer = new FullConnectedLayer("04-Layer FullConnected", fulllayer, 100);
            fulllayer.Initialize();
            network.Layers[4] = fulllayer;

            // layer 5: one output per character in Letters3; the last argument
            // appears to switch on the additional unknown output
            OutputLayer outputlayer = new OutputLayer("05-Layer Output", fulllayer, Letters3.Count, true);
            outputlayer.Initialize();
            network.Layers[5] = outputlayer;
            network.TagetOutputs = Letters3;
            network.UnknownOuput = '?';
        }
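The feature map sizes used in `CreateNetwork` (29 → 13 → 5) are consistent with a 5 × 5 kernel moved in steps of 2, which folds convolution and sub-sampling into a single layer. The helper below is a sketch of that arithmetic. It rests on two assumptions not stated in the article: that the last constructor argument (5) is the kernel size, and that the step is 2. `LayerMath` and `FeatureMapSize` are hypothetical names, not part of the library.

```csharp
using System;

public static class LayerMath
{
    // Width (or height) of the output of an unpadded convolution:
    // (input - kernel) / step + 1.
    public static int FeatureMapSize(int inputSize, int kernelSize, int step)
    {
        if (step <= 0 || kernelSize <= 0 || inputSize < kernelSize)
            throw new ArgumentException("invalid sizes");
        if ((inputSize - kernelSize) % step != 0)
            throw new ArgumentException("kernel and step do not tile the input evenly");
        return (inputSize - kernelSize) / step + 1;
    }
}
```

Under these assumptions, `FeatureMapSize(29, 5, 2)` gives 13 and `FeatureMapSize(13, 5, 2)` gives 5, matching `Size(13, 13)` and `Size(5, 5)` in the code above.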

Training a network

After a neural network has been created with the "Create network" function, it is trained using the UNIPEN database.

Figure 7: Training network interface

Depending on the network size, we can choose training set 1a, 1b, or 1c in the UNIPENdata folder. The training statistics show useful information such as the epoch number, MSE, training time per epoch, and success rate.

UNIPEN data browser and recognition testing

The UNIPEN data browser control in the demo program can show all the UNIPEN data files. We can also test a trained neural network on these files by loading its trained network parameter files.

Figure 8: UNIPEN data browser and recognition interface

Mouse Drawing test

Figure 9: Mouse drawing recognition interface

The mouse drawing control is based on the excellent article “DrawTools” by Alex Fr. I changed only some of the code to fit my requirements. The cursive text in the image is divided into lines, words, and isolated characters by the same algorithm, as follows:

        private void btRecognition_Click(object sender, EventArgs e)
        {
            //recognition all characters in the drawArea
            if (bitmap != null)
            {
                bitmap.Dispose();
                bitmap = null;
            }
            bitmap = new Bitmap(drawArea.Width, drawArea.Height);
            drawArea.DrawToBitmap(bitmap, new Rectangle(0, 0, bitmap.Width, bitmap.Height));
            drawBitmap =(Bitmap) bitmap.Clone();
            if (bitmap != null)
            {
                lbRecognizedText.Items.Clear();
                List<InputPattern> lineList=null;
                List<InputPattern> wordList=null;  
                InputPattern parentPt=new InputPattern(bitmap,255,new Rectangle(0,0,bitmap.Width,bitmap.Height));
                lineList = GetPatternsFromBitmap(parentPt,500,1,true,10,10);
                if (lineList != null && lineList.Count > 0)
                {
                        
                    if (characterList != null)
                    {
                        characterList.Clear();
                        characterList = null;
                    }
                    characterList = new List<InputPattern>();
                    foreach (var line in lineList)
                    {
                        String text = "";
                        wordList = GetPatternsFromBitmap(line, 50, 10,false, 10, 10);
                        if (wordList != null)
                        {
                            if (wordList.Count > 0)
                            {
                                foreach (var word in wordList)
                                {
                                    List<InputPattern> charList = GetPatternsFromBitmap(word, 5, 5, false, 10, 10);
                                    //check if have part bitmaps
                                    if (charList != null)
                                    {
                                        if (charList.Count > 0)
                                        {
                                            panelNavigation.Visible = true;
                                            foreach (var c in charList)
                                            {
                                                characterList.Add(c);
                                                c.GetPatternBoundaries(5,5,false,10,10);
                                                Char accChar = new Char();
                                                PatternRecognition(c.OriginalBmp,out accChar);
                                                if (accChar != '\0')
                                                {
                                                    text = String.Format("{0}{1}", text, accChar.ToString());
                                                    drawBitmap = c.DrawChildPatternBoundaries(drawBitmap);
                                                }
                                            }
                                        }
                                    }
                                    text = String.Format("{0} ", text);
                                }
                                  
                            }
                        }
                        lbRecognizedText.Items.Add(text);
                    }
                }
                pbPreview.Image = drawBitmap;
                // characterList is still null when no line was found
                lblNavigation.Text = (characterList != null ? characterList.Count : 0).ToString();
                index = 0;
            }
            
        }
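The article does not show the body of `GetPatternsFromBitmap`. One common way to split a line into words or characters is to look for runs of empty pixel columns; the sketch below illustrates that general idea only and should not be read as the library's algorithm. `GapSegmenter`, `SplitOnGaps`, the `minGap` parameter, and the boolean column profile are all hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class GapSegmenter
{
    // hasInk[x] is true when pixel column x contains at least one foreground pixel.
    // Returns inclusive (start, end) column ranges of segments that are separated
    // by at least minGap consecutive empty columns.
    public static List<Tuple<int, int>> SplitOnGaps(bool[] hasInk, int minGap)
    {
        var segments = new List<Tuple<int, int>>();
        int start = -1;   // first column of the current segment, -1 when outside one
        int lastInk = -1; // last inked column seen in the current segment
        for (int x = 0; x < hasInk.Length; x++)
        {
            if (hasInk[x])
            {
                if (start < 0) start = x;
                lastInk = x;
            }
            else if (start >= 0 && x - lastInk >= minGap)
            {
                // The gap is wide enough: close the current segment.
                segments.Add(Tuple.Create(start, lastInk));
                start = -1;
            }
        }
        if (start >= 0) segments.Add(Tuple.Create(start, lastInk));
        return segments;
    }
}
```

In a scheme like this, one routine can serve several levels of segmentation by changing the threshold: a wide gap separates words, a narrow one separates characters, and the same test applied to rows instead of columns separates lines.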

Figure 10: Loading trained network parameters files

To activate the recognition function, I simply load trained network parameter files. Depending on my recognition requirement, I can load one, two, or all of the files. The recognition results are really good (above 90%) if I load only one network to recognize its own output characters. However, when I load multiple networks, the system’s accuracy rate becomes lower. The main reasons are that cursive text contains many confusable characters and that the training sets are not large enough.

A large pattern collection like handwritten characters contains many similar characters that can confuse not only a machine but also a human in some cases, such as O, 0, and o, or 9, 4, g, and q. These characters can cause the networks to misrecognize. The solution has therefore been upgraded with an additional spell checker/voting module at the output of the system, which significantly increases the recognition rate. The input pattern is recognized by all component networks. Their outputs (except unknown outputs) then become the inputs of the spell checker/voting module. The module relies on previously recognized characters, an internal dictionary, and other factors to decide which candidate is the most accurate recognized character.

Figure 11: The new recognition system using the spell checker/voting module

The new recognition system using the spell checker/voting module (internal dictionary)

The spell checker module makes the system recognize much better
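The code of the spell checker/voting module is not published with this article, so the following is only an illustrative sketch of dictionary-based voting under a deliberately simple rule: prefer the first candidate that extends the characters recognized so far into the beginning of some dictionary word. `DictionaryVoter`, `Pick`, and the rule itself are assumptions made for illustration; the real module also weighs "other factors" that are not described here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DictionaryVoter
{
    // candidates: outputs of the component networks for one input pattern,
    //             in network order, with unknown outputs already removed.
    // prefix:     characters recognized so far in the current word.
    public static char Pick(string prefix, IList<char> candidates,
                            ICollection<string> dictionary)
    {
        if (candidates.Count == 0)
            return '?'; // every network answered "unknown"

        foreach (char c in candidates)
        {
            string attempt = prefix + c;
            if (dictionary.Any(w => w.StartsWith(attempt, StringComparison.Ordinal)))
                return c; // this candidate keeps the word spellable
        }
        return candidates[0]; // no dictionary support: fall back to the first network
    }
}
```

For example, with the prefix "d" and the confusable candidates '0', 'o', and 'O', a dictionary containing "dog" selects 'o', because "d0" starts no dictionary word while "do" does.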

Conclusion

The proposed recognition model solves most problems of a large recognition system: it can recognize a large pattern collection, its design and deployment are flexible, and its component networks are expandable and reusable. Increasing the system's accuracy rate also becomes easier: we can improve the recognition rate of the component networks, use the spell checker/voting module, and so on. The demo program also demonstrates the capacity of the library, which could be used in many other applications such as prediction or face recognition.

Future work and upgrades

Some features will be updated in the library:

- Convolution and sampling layers of the LeNet model.

- Spell checker/voting module.

- Character segmentation.

At the moment, the project takes too much of my free time. It may slow down or stop temporarily until I can rearrange everything and/or find a good new sponsorship. However, the votes and comments on the article will decide whether the project continues. I would really appreciate comments and suggestions on the article, especially on the model, the spell checker module, and the character segmentation algorithm.

History

version 1.0: initial code

version 1.1: the spell checker/voting module has been added to the system, which significantly increases the recognition rate. It made me really surprised and happy. I will publish it when I complete the code rearrangement.