
EMGU Multiple Face Recognition using PCA and Parallel Optimisation

5 Oct 2011
Using EMGU to perform Principal Component Analysis (PCA), multiple face recognition is achieved. Using the .NET Parallel toolbox, real time analysis and optimisation are introduced in a user friendly application.

Source Code (External Hosting)    

Alternative Sourceforge   

 

Introduction   

This article is designed to be the first of several explaining the use of the EMGU image processing wrapper. For more information on the EMGU wrapper please visit the EMGU website. If you are new to this wrapper, see the Creating Your First EMGU Image Processing Project article. Note that you will start with 3 warnings for references not being found: expand the References folder within the Solution Explorer, delete the 3 references with yellow warning icons and add fresh references to them.

Face recognition has always been a popular subject in image processing, and this article builds upon the good work by Sergio Andrés Gutiérrez Rojas and his original article here[^]. The reason that face recognition is so popular is not only its real world application but also the common use of principal component analysis (PCA). PCA is an ideal method for recognising statistical patterns in data. Part of face recognition's popularity is that a user can apply a method easily and see whether it is working without needing to know too much about how the process works.

This article will look into PCA and its application in more detail while discussing the use of parallel processing and its future in image analysis. The source code makes some key improvements over the original, both in usability and in the way it trains, and adds the use of a parallel architecture for multiple face recognition.

 

Source Code Requirements  

The program is designed to use a web camera, so this is essential. While the program should execute on single core machines, be aware that on such machines performance may be better with the sequential frame processing method; look at the "Improving the detection performance" section for more details. The x86 source will also run on x64 machines; however, the x64 source is only for x64 architectures.

(Image: Main1.jpg, the main application window)

 

How the EMGU EigenObjectRecognizer Works

There are 3 different constructors for the EigenObjectRecognizer class. Each constructor takes an array of grayscale images; each of these images must be the same size, and it is suggested that their histograms are equalised. The reason histogram equalisation is suggested is that it can produce more desirable results when lighting within the image changes. This is common when using web cameras as they tend to depend on natural lighting that is not always uniform.

• EigenObjectRecognizer(Image<Gray, Byte>[], MCvTermCriteria)

The simplest constructor takes just an array of images; as this recogniser takes only images as its training data, it will return an image of its closest match. This can be useful, especially if you wish to use the data within the closest match to compare against that of the input image.

• EigenObjectRecognizer(Image<Gray, Byte>[], String[], MCvTermCriteria)

This constructor takes an array of images and an array of strings, both equal in size. The result from this recogniser will be a string, which can be used as an identifier to control the process flow of a program.

• EigenObjectRecognizer(Image<Gray, Byte>[], String[], Double, MCvTermCriteria)

This is the constructor we use for face recognition. Like the previous constructor it takes an array of images and an array of strings of the same size, and returns a string identifier. The additional Double controls the eigenDistanceThreshold; the suggested range is 0 to approximately 1000. In practice this is not always true: in the source you will notice a value of 5000 is actually used. The eigenDistanceThreshold sets how likely an examined image is to be treated as an unrecognised object. If the threshold is < 0, the recogniser will always treat the examined image as one of the known objects.

The MCvTermCriteria variable is the termination criteria for the training of the Eigen Recognizer. As the Eigen Recognizer is a form of Neural Network, you can set when you want it to stop searching for a perfect solution, the point at which the Neural Network is said to have converged. We do this because a Neural Network may never be able to find a perfect solution to a problem, in which case convergence would never happen. If you set this value too high then you will receive more errors; set it too low and you can end up with a loop that never terminates.

• MCvTermCriteria(Int32, Double)

We use this constructor for the termination criteria as it allows us to set a constraint on the maximum iterations as well as on epsilon. Naturally the termination criteria can also be set up using only a limit on iterations or only an epsilon value, as shown:

• MCvTermCriteria(Int32)

• MCvTermCriteria(Double)
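
As a minimal sketch of putting these pieces together (assuming trainingImages and labels have been filled elsewhere, and an EMGU version contemporary with this article):

    using Emgu.CV;
    using Emgu.CV.Structure;
    ...
    //Stop training after 0 (i.e. unlimited) iterations or once epsilon reaches 0.001
    MCvTermCriteria termCrit = new MCvTermCriteria(0, 0.001);

    //trainingImages: equal sized, histogram equalised grayscale faces
    //labels: one name per training image
    EigenObjectRecognizer recognizer = new EigenObjectRecognizer(
        trainingImages, labels, 5000, ref termCrit);

    //Returns the matching label, or an empty string when no match is found
    string match = recognizer.Recognize(inputFace);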

 

Principal Component Analysis (PCA)

The EigenObjectRecognizer class applies PCA to each image, the results of which will be an array of Eigen values that a Neural Network can be trained to recognise. PCA is a commonly used method of object recognition as its results, when used properly, can be fairly accurate and resilient to noise. The way in which PCA is applied can vary at different stages, so what will be demonstrated is one clear method that can be followed. It is up to individuals to experiment in finding the best method for producing accurate results from PCA.

To perform PCA several steps are undertaken:

  • Stage 1: Subtract the Mean of the data from each variable (our adjusted data)  
  • Stage 2: Calculate and form a covariance Matrix
  • Stage 3: Calculate Eigenvectors and Eigenvalues from the covariance Matrix
  • Stage 4: Choose a Feature Vector (a fancy name for a matrix of vectors)
  • Stage 5: Multiply the transposed Feature Vectors by the transposed adjusted data  

 

STAGE 1: Mean Subtraction

This stage is fairly simple and makes the calculation of our covariance matrix a little easier. Note that this is not the subtraction of the overall mean from each of our values: for covariance we need at least two dimensions of data. It is in fact the subtraction of the mean of each row from each element in that row.

(Alternatively, the mean of each column can be subtracted from each element in that column; however, this would change the way we calculate the covariance matrix.)
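
As an illustrative sketch (not code from the article's download), Stage 1 for a data matrix whose rows are the dimensions could be written as:

    //data[row, col]: each row is one dimension, each column one sample
    static void SubtractRowMeans(double[,] data)
    {
        int rows = data.GetLength(0);
        int cols = data.GetLength(1);
        for (int r = 0; r < rows; r++)
        {
            double mean = 0;
            for (int c = 0; c < cols; c++) mean += data[r, c];
            mean /= cols;
            //Subtract the row mean from every element in that row
            for (int c = 0; c < cols; c++) data[r, c] -= mean;
        }
    }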

 

STAGE 2: Covariance Matrix 

The basic Covariance equation for two dimensional data is:

cov(X, Y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}

This is similar to the formula for variance; however, the change in x is taken with respect to the change in y, rather than solely the change of x with respect to x. In this equation x represents the pixel value, x̄ is the mean of all x values, and n the total number of values.

The covariance matrix that is formed of the image data represents how much the dimensions vary from the mean with respect to each other. The definition of a covariance matrix is:

C^{n \times n} = (c_{i,j}), \quad c_{i,j} = \mathrm{cov}(\mathrm{Dim}_i, \mathrm{Dim}_j)

Now the easiest way to explain this is by an example, the easiest of which is a 3x3 matrix.

C = \begin{pmatrix} \mathrm{cov}(x,x) & \mathrm{cov}(x,y) & \mathrm{cov}(x,z) \\ \mathrm{cov}(y,x) & \mathrm{cov}(y,y) & \mathrm{cov}(y,z) \\ \mathrm{cov}(z,x) & \mathrm{cov}(z,y) & \mathrm{cov}(z,z) \end{pmatrix}

With larger matrices this becomes more complicated, and the use of computational algorithms becomes essential.
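
Continuing the sketch from Stage 1 (again purely illustrative), the covariance matrix of the mean-adjusted rows can be formed as:

    //Assumes data has already had its row means subtracted (Stage 1)
    static double[,] CovarianceMatrix(double[,] data)
    {
        int dims = data.GetLength(0); //rows are dimensions
        int n = data.GetLength(1);    //columns are samples
        double[,] cov = new double[dims, dims];
        for (int i = 0; i < dims; i++)
            for (int j = 0; j < dims; j++)
            {
                double sum = 0;
                for (int k = 0; k < n; k++) sum += data[i, k] * data[j, k];
                cov[i, j] = sum / (n - 1);
            }
        return cov;
    }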

 

STAGE 3: Eigenvectors and Eigenvalues

Eigenvalues are a product of multiplying matrices; however, they are a special case. An Eigenvalue is found when multiplying the covariance matrix by a particular vector in 2 dimensional space (i.e. an Eigenvector) simply rescales that vector. This makes the covariance matrix the equivalent of a transformation matrix. It is easier to show with an example:

\begin{pmatrix} 2 & 3 \\ 2 & 1 \end{pmatrix} \begin{pmatrix} 3 \\ 2 \end{pmatrix} = \begin{pmatrix} 12 \\ 8 \end{pmatrix} = 4 \times \begin{pmatrix} 3 \\ 2 \end{pmatrix}

Eigenvectors can be scaled, so ½× or 2× the vector will still produce the same type of result. A vector is a direction, and all you are doing is changing the scale, not the direction.

2 \times \begin{pmatrix} 3 \\ 2 \end{pmatrix} = \begin{pmatrix} 6 \\ 4 \end{pmatrix}, \qquad \begin{pmatrix} 2 & 3 \\ 2 & 1 \end{pmatrix} \begin{pmatrix} 6 \\ 4 \end{pmatrix} = \begin{pmatrix} 24 \\ 16 \end{pmatrix} = 4 \times \begin{pmatrix} 6 \\ 4 \end{pmatrix}

Eigenvectors are usually scaled to have a length of 1:

\begin{pmatrix} 3 \\ 2 \end{pmatrix} \to \frac{1}{\sqrt{3^2 + 2^2}} \begin{pmatrix} 3 \\ 2 \end{pmatrix} = \begin{pmatrix} 3/\sqrt{13} \\ 2/\sqrt{13} \end{pmatrix}

Thankfully, finding these special Eigenvectors is done for you and will not be explained here; there are several tutorials available on the web that explain the computation.

The Eigenvalue is closely related to the Eigenvector used: it is the value by which the original vector was scaled. In the example the Eigenvalue is 4.
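
While the computation of Eigenvectors is left to library code, the defining relationship (matrix × Eigenvector = Eigenvalue × Eigenvector) is easy to verify numerically; this small sketch checks the worked example above:

    double[,] A = { { 2, 3 }, { 2, 1 } };
    double[] v = { 3, 2 };
    double lambda = 4;

    //Multiply A by v and compare each component with lambda * v
    for (int i = 0; i < 2; i++)
    {
        double av = A[i, 0] * v[0] + A[i, 1] * v[1];
        Console.WriteLine(av + " == " + (lambda * v[i])); //prints 12 == 12, then 8 == 8
    }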

 

STAGE 4: Feature Vectors

Now, usually the results for Eigenvalues and Eigenvectors are not as clean as in the example above; in most cases the results provided are scaled to a length of 1. Here are some example values calculated using Matlab:

(Image: EQ7.jpg, example Eigenvectors and Eigenvalues calculated in Matlab)

Once Eigenvectors are found from the covariance matrix, the next step is to order them by Eigenvalue, highest to lowest. This gives the components in order of significance. Here the data can be compressed and the weaker vectors removed, producing a lossy compression method; the data lost is deemed insignificant.

\mathrm{FeatureVector} = \begin{pmatrix} \mathrm{eig}_1 & \mathrm{eig}_2 & \cdots & \mathrm{eig}_n \end{pmatrix}
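
As a sketch of this ordering step (with hypothetical argument names, and using LINQ), the eigenpairs can be sorted by descending Eigenvalue and the weakest dropped:

    using System.Linq;
    ...
    //eigenvalues[i] belongs to eigenvectors[i]; keep only the strongest components
    static double[][] OrderBySignificance(double[] eigenvalues, double[][] eigenvectors, int keep)
    {
        return eigenvalues
            .Select((val, idx) => new { val, vec = eigenvectors[idx] })
            .OrderByDescending(p => p.val)
            .Take(keep) //discarding the rest is what makes the compression lossy
            .Select(p => p.vec)
            .ToArray();
    }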

 

STAGE 5: Transposition

The final stage in PCA is to take the transpose of the feature vector matrix and multiply it on the left of the transposed adjusted data set (the adjusted data set is from Stage 1 where the mean was subtracted from the data).  
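
As a final illustrative sketch, this projection is a plain matrix multiplication of the transposed feature vectors by the transposed adjusted data:

    //features: k x n, each row one chosen Eigenvector (the transposed feature vector)
    //adjusted: n x m, the mean-adjusted data from Stage 1, one column per sample
    static double[,] Project(double[,] features, double[,] adjusted)
    {
        int k = features.GetLength(0), n = features.GetLength(1), m = adjusted.GetLength(1);
        double[,] final = new double[k, m];
        for (int i = 0; i < k; i++)
            for (int j = 0; j < m; j++)
                for (int x = 0; x < n; x++)
                    final[i, j] += features[i, x] * adjusted[x, j];
        return final;
    }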

The EigenObjectRecognizer class performs all of this and then feeds the transposed data as a training set into a Neural Network. When it is passed an image to recognise, it performs PCA and compares the generated Eigenvalues and Eigenvectors to those from the training set; the Neural Network then produces a match if one has been found, or a negative match if not. There is a little more to it than this, but the use of Neural Networks is a complex subject to cover and is not the object of this article.

 

The Source Code  

Training the EigenObjectRecognizer

Training of the EigenObjectRecognizer is now done at the start of the program and again after training data has been added or created. In this version a new training form has been introduced, along with a method for storing the training data.

(Image: Training.jpg, the training form)

 

The training form allows a face to be recognised and added individually; as the program is designed to run from a web cam, the faces are acquired in the same way. A feature to acquire 10 successful face classifications and add them all, or individual ones, to the training data has been included. This increases the collection of training data; the number of images acquired can be adjusted in the Variables region of Training Form.cs by increasing or decreasing num_faces_to_aquire to any preferred value.

    #region Variables            
    ....
        //For aquiring 10 images in a row
        List<Image<Gray, byte>> resultImages = new List<Image<Gray, byte>>();
        int results_list_pos = 0;

        int num_faces_to_aquire = 10;

        bool RECORD = false;
    ....
    #endregion

 

A Classifier_Train class is included; it has two constructors. The default takes the standard folder path of Application.StartupPath + "\\TrainedFaces", which is also the default save location of the training data. If you wish to have different sets of training data then another constructor takes a string containing the training folder; the program only makes use of the default constructor, the other being included to allow for development. The class's sole purpose is to make the Form code more readable. To alter the default path the following functions must be corrected:

    //Forms
    private bool save_training_data(Image face_data) //Training_Form.cs*
    private void Delete_Data_BTN_Click(object sender, EventArgs e) //Training_Form.cs*
    
    //Classes
    public Classifier_Train() //Classifier_Train.cs
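
A sketch of how the two constructors might be called (the alternative folder name below is purely illustrative):

    //Default: loads training data from Application.StartupPath + "\\TrainedFaces"
    Classifier_Train Eigen_Recog = new Classifier_Train();

    //Hypothetical second data set kept in a different folder
    Classifier_Train Test_Recog = new Classifier_Train(Application.StartupPath + "\\TestFaces");

    if (Eigen_Recog.IsTrained)
    {
        string name = Eigen_Recog.Recognise(face); //face: an equalised 100x100 grayscale image
    }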

 

Storing of the Training Data

The training data's default location is within the TrainedFaces folder of the application path. It has a single XML file that contains tags for the name of the person and a file name for the training image. The XML file has the following structure:

    <Faces_For_Training>
      <FACE>
        <NAME>NAME</NAME>
        <FILE>face_NAME_2057798247.jpg</FILE>
      </FACE>
    </Faces_For_Training>

 

This structure can easily be changed to work with extra data or another layout. The following functions must be adjusted to accommodate the extra information, with extra variables added where required.

    //Forms
    private bool save_training_data(Image face_data) //Training_Form.cs*
    //Classes
    private bool LoadTrainingData(string Folder_loacation) //Classifier_Train.cs
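
For reference, a minimal sketch of parsing this structure (the XML file name here is an assumption; the actual parsing lives in LoadTrainingData):

    using System.Xml;
    ...
    XmlDocument doc = new XmlDocument();
    doc.Load(folderLocation + "\\TrainedFaces.xml"); //file name assumed for illustration

    foreach (XmlNode face in doc.GetElementsByTagName("FACE"))
    {
        string name = face["NAME"].InnerText;
        string file = face["FILE"].InnerText;
        //load folderLocation + "\\" + file as an Image<Gray, byte> and pair it with name
    }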

 

Each image is saved using a random number so that unique file identifiers can be generated. This prevents images being overwritten and easily allows several images for one individual to be acquired and stored without problems.

    Random rand = new Random();
    bool file_create = true;
    string facename = "face_" + NAME_PERSON.Text + "_" + rand.Next().ToString() + ".jpg";
    while (file_create)
    {
        //Note: facename must be concatenated into the path, not embedded in the literal
        if (!File.Exists(Application.StartupPath + "/TrainedFaces/" + facename))
        {
            file_create = false;
        }
        else
        {
            facename = "face_" + NAME_PERSON.Text + "_" + rand.Next().ToString() + ".jpg";
        }
    }

 

The Training form allows data to be added to the training set; it has been noted that this process can be slow. While a quicker method would be to load and write all the data at open and close respectively, this has not been included. If such an action were taken, memory management would have to be carefully considered so that the number of training images does not cause memory problems.

A JPEG encoder is used to store the images; however, this could be changed to a bitmap encoder to prevent any data loss. See the following functions within the Training_Form.cs* file:

    //Saving The Data
    private bool save_training_data(Image face_data)
        
    private ImageCodecInfo GetEncoder(ImageFormat format)
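
GetEncoder follows the standard .NET codec lookup pattern; a typical implementation, given here as a sketch of what that function does, is:

    using System.Drawing.Imaging;
    ...
    private ImageCodecInfo GetEncoder(ImageFormat format)
    {
        //Search the installed encoders for one whose format matches the request
        foreach (ImageCodecInfo codec in ImageCodecInfo.GetImageEncoders())
        {
            if (codec.FormatID == format.Guid) return codec;
        }
        return null; //no matching encoder installed
    }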

 

Improving EigenObjectRecognizer Accuracy  

You will notice that if you run the program without alterations, train it on yourself, and then introduce another untrained face, it will be recognised as you. There are several ways to improve the accuracy of the EigenObjectRecognizer. As mentioned, this is the constructor we use:

• EigenObjectRecognizer(Image<Gray, Byte>[], String[], Double, MCvTermCriteria)

 

The Double is used to control the eigenDistanceThreshold; the suggested range is 0 to 1000. The eigenDistanceThreshold sets how likely an examined image is to be treated as an unrecognised object. In the source a value of 5000 is actually used, which means that a face will always be recognised; negative numbers have a similar effect. You can reduce this value to improve the recogniser's accuracy. For example, 500 is fairly accurate.

There are problems with this, however: if your facial position does not closely match the facial positions in your training set, an empty string will be returned. This can be overcome by producing a large amount of training data, but that is time consuming. By adjusting the size of your training set, the position of the face within the images, and the eigenDistanceThreshold, you can eventually come to a good balance.

Histogram equalisation is also used to improve accuracy; this produces a more uniform image that is more resilient to changes in lighting. Alternative methods could also be taken to produce unique training sets: you could, for example, take only the eye and mouth features and concatenate the data, though further experimentation would be required.

    result = currentFrame.Copy(face_found.rect).Convert<Gray, byte>().Resize(100, 100, Emgu.CV.CvEnum.INTER.CV_INTER_CUBIC);
    result._EqualizeHist();

 

The eigenDistanceThreshold is not the only value you can change; you could also adjust the classifier's termination criteria.

    MCvTermCriteria termCrit = new MCvTermCriteria(ContTrain, 0.001); //Classfier_Train.cs

By increasing ContTrain, the maximum number of iterations the classifier will attempt before giving up, you can give the Neural Network longer to solve the complex equation and produce more accurate results. However, you must also change 0.001 to a lower value; this float represents epsilon, the target for acceptable error. If epsilon is reached before the maximum iterations are met, the Neural Network will stop training. In the source code example ContTrain is set to 0, which means that the Neural Network will continue until its epsilon value reaches 0.001 regardless of iteration count.

Be warned, however: you could produce an eternal loop by setting the maximum iterations to an unlimited value (0) and epsilon to 0. Remember also that the more iterations and the lower the epsilon, the longer the classifier will take to train.

You could also try larger training images: the following code in both forms resizes the face to a 100x100 image, and increasing this could increase accuracy. Be warned, however, that both occurrences must be changed, and the larger the image, the longer the training and recognition time required.

    result = currentFrame.Copy(face_found.rect).Convert<Gray, byte>().Resize(100, 100, Emgu.CV.CvEnum.INTER.CV_INTER_CUBIC); //Training Form.cs* & Main Form.cs*

Trying different Haar classifiers for face detection can also have an effect; this is discussed below.

 

Improving the detection of Faces 

Included in the source are 4 different Haar cascades for face detection, all situated in the “Cascades” folder within the application start-up path. To try the others, change the following line of code to the correct filename.
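
The line in question is not reproduced here, but in the source it will resemble the following (the variable name and exact path are assumptions):

    HaarCascade face = new HaarCascade(Application.StartupPath + "/Cascades/haarcascade_frontalface_alt2.xml");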

“haarcascade_frontalface_alt2.xml” produced the best results in testing; however, you may prefer one of the alternatives, as many of these only detect faces in certain conditions, i.e. facing the camera directly. This can help improve the accuracy of the recogniser and require less training data.

 

Improving the detection performance

Rather than focusing on improving performance for slower processors, this program is designed to increase performance on modern machines. It is often desired, although impractical, to have real time image processing. Real time processing is closely linked to the accuracy of an image-processing algorithm: faster algorithms are a result of processing less data, and their ability to separate true from false data is inherently weakened. In video acquisition, 30 frames per second is deemed standard; it is faster than our eyes can cope with and thus provides smooth movement. In real world applications this is too slow for a computer to be accurate.

Modern high speed cameras can acquire 640 x 480 images at 300 fps, putting a standard web camera to shame. These high end cameras use dedicated frame grabbers that deal with image acquisition. It is unlikely that the standard user will encounter such speeds, maybe 60 fps at best, but what is important is the way in which these cameras do image pre-processing at real time speeds. The frame grabbers have FPGA (Field-Programmable Gate Array) chips integrated onto the card. These deal with producing the images but can also perform processes such as histogram equalisation and edge detection at the same time. To understand how this is achieved it is important to point out what an FPGA chip is and describe its architecture.

An FPGA is, in simple terms, a chip whose hardware can be configured to perform a specific operation. (Unlike fixed-design processors, such as the ARM chips that run smartphones, an FPGA can be reconfigured at the hardware level.) On a computer you could, in principle, design one configuration to run a word processor, another to run a browser and another to run games; obviously the complexity and practicality prevent this. An FPGA can be designed to have an extremely parallel architecture, so while performing edge detection it could also be performing object recognition.

While FPGA use is beyond the scope of this article, a parallel architecture for image-processing can still be produced in software. Many users of Visual Studio will have come across threaded applications before; this is where the processing of data is spread across the cores of your computer. This used to be complex and require a large amount of experience; however, Microsoft has invested a lot of time in parallel computing (see the homepage at http://msdn.microsoft.com/en-us/concurrency/default) and has produced a set of classes that will parallelise almost any loop you use. The performance increase depends on your machine and the number of physical cores: an i7 presenting 8 logical cores has only 4 physical cores, and performance increases of around x3.7 are seen on average.

Do not jump straight in and make everything you own parallel. Its use is hit and miss, and performance must be examined: it can actually increase execution time and can easily eat up all the memory on your computer. It is also dependent on what other applications you are running.

Remember 2 things:  

  1. Think about how the computer works: if you are only doing a small amount of processing, the computer must share out the resources, tell each processor what to process, then gather the results, deal with them and repeat until your loop is exhausted. Sometimes allowing one processor to deal with the information can be quicker. A stopwatch is your friend here: time both instances and see what happens (a timing sketch is given after the using statements below). Also remember your machine is not everyone else's: you may have 8 cores, but your end user may still be stuck with just 1.
  2. A few simple rules: each processor runs without looking at what the other processors are doing. Do not use parallel loops within parallel loops, as this will hinder performance. Do not set a task in which the output of the second loop is dependent on the first, else it will not work; similarly, if the results being recorded depend on the order of iterations, errors will occur. Non-dependent operations, such as those below, are your friend. Any parallel loop can be buggy at times, so a try catch statement is useful to keep the application stable.
    //variables
    int count = 0;
    Image<Gray, Byte> Image_Blank_Copy = My_Image.CopyBlank();
    ...

    Parallel.For(0, My_Image.Height, y =>
    {
        //shared counters should be updated through Interlocked so that no
        //iteration depends on another thread's timing
        Interlocked.Add(ref count, 10);
        Interlocked.Add(ref count, -10);
        Interlocked.Increment(ref count);

        //or for an image: each iteration writes only to its own row
        for (int x = 0; x < My_Image.Width; x++)
        {
            Image_Blank_Copy.Data[y, x, 0] += 10;
        }
    });

Also, a word of warning: setting an ROI on an image and then using that within a loop is considerably slower than simply copying that area to a new image and processing it. For example:

    //Bad and Slow
    My_Image.ROI = new Rectangle(0,0,100,100);
    Parallel.For(0, My_Image.Width, (int i) =>
    {
        for (int j = 0; j < My_Image.Height; j++)
        {
            //Do something
        }
    });

    //Good and Fast
    My_Image.ROI = new Rectangle(0,0,100,100);

    using (Image<Bgr, Byte> temporary_image = My_Image.Copy())
    {
        Parallel.For(0, temporary_image.Width, (int i) =>
        {
            for (int j = 0; j < temporary_image.Height; j++)
            {
                //Do something
            }
        });
    }

 

To access Parallel.For , Parallel.ForEach, Task and ThreadPool you will need to add the following using statements:  

    using System.Threading;
    using System.Threading.Tasks;
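
As suggested in point 1 of the list above, time both the sequential and parallel versions before committing to either; a minimal sketch (loop bodies illustrative):

    using System.Diagnostics;
    ...
    Stopwatch watch = Stopwatch.StartNew();
    for (int i = 0; i < My_Image.Height; i++) { /* process row i */ }
    watch.Stop();
    Console.WriteLine("Sequential: " + watch.ElapsedMilliseconds + " ms");

    watch = Stopwatch.StartNew();
    Parallel.For(0, My_Image.Height, i => { /* process row i */ });
    watch.Stop();
    Console.WriteLine("Parallel:   " + watch.ElapsedMilliseconds + " ms");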

 

In the source code provided, a parallel foreach loop is used. This means that each face is recognised using a separate thread: for each face detected, the information is passed to the recogniser to be classified independently. While the gain on a single face is not really noticeable, if there are several people within a room each one can be recognised independently. This is very useful if you are using a large amount of training data, as the more possibilities the EigenObjectRecognizer has for an output, the longer it will take for an accurate classification. A try catch statement is used to filter out errors that can occur sporadically; however, this does not affect performance or accuracy.

    Parallel.ForEach(facesDetected[0], face_found =>
    {
        try
        {
            //Resize the image
            result = currentFrame.Copy(face_found.rect).Convert<Gray, byte>().Resize(100, 100, Emgu.CV.CvEnum.INTER.CV_INTER_CUBIC);
            result._EqualizeHist(); //Equalise the Histogram
            //draw a rectangle around the detected face in red
            currentFrame.Draw(face_found.rect, new Bgr(Color.Red), 2);
    
            if (Eigen_Recog.IsTrained)
            {
                string name = Eigen_Recog.Recognise(result);
                //Draw the label for each face detected and recognized
                currentFrame.Draw(name, ref font, new Point(face_found.rect.X - 2, face_found.rect.Y - 2), new Bgr(Color.LightGreen));
                //Add the face detected and name to the RHS panel
                ADD_Face_Found(result, name);
            }
    
        }
        catch
        {
            //No action taken as the error is unimportant: it is simply a case of
            //no data being there to process, which occurs sporadically.
            //A correction will be made when a solution is found.
        }
    });

 

The performance increase doesn’t end there. In the main program, a display of the last 5 faces detected is shown in a panel on the right hand side. These controls are created and shown programmatically and in parallel. As this is done in a parallel loop, each variable must be independent of the actions within the loop. The important functions are shown below; you will notice that the location of each component is controlled by the variables faces_panel_X and faces_panel_Y, and that every operation on these variables is independent and works from the variable's current value.

    void Clear_Faces_Found()
    void ADD_Face_Found(Image<Gray, Byte> img_found, string name_person)
    {
        ...
        PI.Location = new Point(faces_panel_X, faces_panel_Y);
        ...
        LB.Location = new Point(faces_panel_X, faces_panel_Y + 80);
        ...
          
        faces_count++;
        if (faces_count == 2)
        {
            faces_panel_X = 0;
            faces_panel_Y += 100;
            faces_count = 0;
        }
        else faces_panel_X += 85;
        ...

    }

 

You can control the number of faces shown by adjusting the point at which the control panel is cleared. As there is a picturebox and a label per face, you must multiply the number of faces by two: in this case 10/2 = 5 faces and names are shown.

    if (Faces_Found_Panel.Controls.Count > 10)

 

Changing Between Parallel and Sequential Execution

As users may want to investigate the performance increase, both the parallel and the sequential facial recognition processing functions are included; the default is the parallel method. Within the Main Form.cs* code you will see two functions:

    //Process Frame
    void FrameGrabber_Standard(object sender, EventArgs e)//This is the Sequential
    void FrameGrabber_Parrellel(object sender, EventArgs e)//and this the Parallel 

 

Which of these is used is controlled by the camera start and stop functions, again within Main Form.cs*.

    //Camera Start Stop
    public void initialise_capture()
    {
        grabber = new Capture();
        grabber.QueryFrame();
        //Initialize the FrameGraber event
        Application.Idle += new EventHandler(FrameGrabber_Parrellel);
    }
    private void stop_capture()
    {
        Application.Idle -= new EventHandler(FrameGrabber_Parrellel);
        if(grabber!= null)
        {
            grabber.Dispose();
        }
    }

 

You must change the following two lines of code and redirect them to the sequential function.

    Application.Idle += new EventHandler(FrameGrabber_Parrellel); //initialise Capture
    Application.Idle -= new EventHandler(FrameGrabber_Parrellel);//Stop Capture
    
    //becomes
    
    Application.Idle += new EventHandler(FrameGrabber_Standard); //initialise Capture
    Application.Idle -= new EventHandler(FrameGrabber_Standard);//Stop Capture

The parallelisation of image processing code is important, and the buck does not stop with threading the code: it is easy to implement, and with practice it can be implemented well. There is also a newcomer to EMGU which utilises CUDA graphics processing. This topic is more advanced and will not be covered, as not everyone has a CUDA enabled graphics card, but in essence it allows a higher level of parallelisation: rather than 4 or 8 cores you can work with hundreds, so it is easy to speculate on the improvements that can be made in execution time. The PedestrianDetection example that ships with EMGU shows how this can be implemented.

 

Conclusion

This article, while explaining the principles of PCA, introduces the important subject of decreasing execution time by parallelisation. While the software and the article demonstrate only a small example of its implementation, its importance must be noted. With the increasing affordability of multi-core processors and CUDA based graphics cards, image processing in real time is more accessible. Advanced microelectronics are no longer required to speed up simpler image processing systems, and the decrease in development time allows image processing to be used by more individuals.

If you feel improvements or corrections can be made to this article, please post a comment and they will be addressed.

 

Acknowledgement

Many thanks to all of the 2006 research group of the Mineralogy and Crystallography department at the University of Arizona, whose group photo was used for demonstration purposes. http://www.geo.arizona.edu/xtal/group/group2006.htm

If you do not wish your image to be used, please post a comment or contact me and I'll remove any references.

Thank you to Sergio Andrés Gutiérrez Rojas whose code inspired this article. I hope you continue with your good work in Image processing.  

Many thanks to http://kiwi6.com for hosting the program files and allowing hotlinking.

History

[1] Direct links applied for an external file hosting website via kiwi6.com. Apologies if these links fail; CodeProject does not allow such large files to be hosted. To prevent downtime, Sourceforge links are also provided.

[2] Direct links set to open in a new window as their hotlinking now re-directs traffic. A new host will be looked for. The x64 version is now hosted on a private website; still looking for a host for the larger x86 version.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

A list of licenses authors might use can be found here