Click here to Skip to main content
15,885,278 members
Articles / Desktop Programming / Win32

Matching (and Analyzing) Using Principal Component Analysis

Rate me:
Please Sign up or sign in to vote.
3.25/5 (8 votes)
26 Sep 2009MIT2 min read 35.1K   1.2K   12   5
PCA is a well known algorithm to extract features from multidimentional datasets. Using a library I wrote, this is made easy.

Introduction

The PCA matching algorithm analyzes data coordinates from n dimensions, and returns an orthogonal coordinate representing the main deviation of the data.

The coordinate of the main deviation of the data represents the direction of the given coordinate's density. This information can be useful to minimize the amount of dimensions with minimal loss of data, for lossy compression.

Another use is to align two sets of data that parametrically differ, like aligned, back rotated/scaled/placed shapes.

Background

Image 1

The Principal Components Analysis (a.k.a. PCA algorithm) based matching is part of the “Shape matching framework” designed to provide core support when building a drawing - similarity/difference software using .NET.

The project uses and depends only on the Matrix library implementation provided with the “Shape matching framework” solution. We can easily isolate those two projects/DLLs to get just the functionality of this algorithm.

There are two methods of usage for this library: One is using the ‘plain’ PCA algorithm to extract drawings’ coordinate’s properties. The other is using the provided matching algorithm to optimize the target drawing closer to the source.

Using the code

In the PCA project, there is a PCAtransform implementation which is the plain, by-the-book implementation. This gives us a set of Eigen values and corresponding set of Eigenvectors returned as a DoubleMatrix. From those two parameters, we can deduce the relative angle and the relative density of the two calculated inputs.

In the following example, we will find that two rotated sets of coordinates can be identified as rotated, using two PCA transforms.

We’ll also see that density (same as scaling) factors are identical for both inputs, which will point to the fact that both inputs shouldn’t be scaled in order to be consolidated.

Note, if scaling needed, use the square roots of the scaling factors that the PCA transform finds.

C#
using System;
using System.Collections.Generic;
using System.Linq;
using System.Windows.Forms;
using LiniarAlgebra;
using Adaption;
using System.Drawing;
using System.Reflection;
using PCA;

namespace New_Project
{
    static class Program
    {
        [STAThread]
        static void Main()
        {
            //Preparing rotation matrix 30 deg. counterclockwise
            double angle = Math.PI/180 * 30; //~= 0.52...
            DoubleMatrix rotationMatrix = PrepareRotationMatrix(angle);
            //rotationMatrix seems like:
            //  cos(30) -sin(30)
            //  sin(30)  cos(30)

            //This will create 2x3 matrix of zeroes
            DoubleMatrix matrix1 = new DoubleMatrix(2, 3);
            //Lets fill some data here too
            matrix1[0, 0] = 1;
            matrix1[0, 1] = 2;
            matrix1[0, 2] = 3;
            matrix1[1, 0] = 4;
            matrix1[1, 1] = 5;
            matrix1[1, 2] = 6;
            //matrix1 looks like this:
            //         1 2 3
            //         4 5 6

            //Creating a copy of the 1st matrix , this copy will be rotated
            DoubleMatrix matrix2 = new DoubleMatrix((Matrix<double>)matrix1.Clone()); 
            //now rotating it
            matrix2 = rotationMatrix * matrix2;
            //matrix2 looks like this: (same form but rotated 30 deg. c.clockwise
            //       4.10   5.24   6.39
            //      -0.37  -1.20  -2.03

            //Now we'll see that the data of both matrices has same PCA properties
            
            //Creating PCA transform instances for both martices
            PCAtransform transform1 = new PCAtransform(matrix1);
            PCAtransform transform2 = new PCAtransform(matrix2);
            //The result from calculate is a MxM dims eigenVectors matrix
            DoubleMatrix eigenVects1 = transform1.Calculate();
            DoubleMatrix eigenVects2 = transform2.Calculate();

            //It is enough for us to get the absolute 
            //angle of the first vector (the first column)
            double absAngle1 = Math.Atan(eigenVects1[1, 0] / eigenVects1[0, 0]);
            double absAngle2 = Math.Atan(eigenVects2[1, 0] / eigenVects2[0, 0]);

            //Lets see that the relative angle is the same 
            //as the rotation angle from the beginning
            double relativeAngle = absAngle2 - absAngle1; //~=0.52...
            // = -0.00000000000000011102230246251565 ~= 0 
            double anglesDiff = relativeAngle - angle;
            //Not exactly because of precision pervertion, 
            //but this is enought to get convinced

            //Another aspect of similarity is by eigenvalues 
            //(they represen the squares of the scaling factors):
            double[] eigenVals1 = transform1.EigenValues;
            //This array looks like this: 0,2
            double[] eigenVals2 = transform2.EigenValues;
            //This array looks like this: 0,2            
        }

        private static DoubleMatrix PrepareRotationMatrix(double angle)
        {
            DoubleMatrix rotationMatrix = new DoubleMatrix(2);
            rotationMatrix[0, 0] = Math.Cos(angle);
            rotationMatrix[0, 1] = -1 * Math.Sin(angle);
            rotationMatrix[1, 0] = Math.Sin(angle);
            rotationMatrix[1, 1] = Math.Cos(angle);
            return rotationMatrix;
        }
    }
}

The PCA transform is also good for data analysis and lossy compression. Since that was not the aim of my work, that will not be explained here. More information about such usages can be found in the internet. Suppose we only want to match a drawing or a coordinates set to a given source set; it can be more easily done using PCA matching. The data we use will be from the previous example.

C#
using System;
using System.Collections.Generic;
using System.Linq;
using System.Windows.Forms;
using LiniarAlgebra;
using System.Drawing;
using System.Reflection;
using PCA;

namespace New_Project
{
    static class Program
    {
        [STAThread]
        static void Main()
        {
            //Preparing rotation matrix 30 deg. counterclockwise
            double angle = Math.PI / 180 * 30; //~= 0.52...
            DoubleMatrix rotationMatrix = PrepareRotationMatrix(angle);
            //rotationMatrix seems like:
            //  cos(30) -sin(30)
            //  sin(30)  cos(30)

            //This will create 2x3 matrix of zeroes
            DoubleMatrix matrix1 = new DoubleMatrix(2, 3);
            //Lets fill some data here too
            matrix1[0, 0] = 1;
            matrix1[0, 1] = 2;
            matrix1[0, 2] = 3;
            matrix1[1, 0] = 4;
            matrix1[1, 1] = 5;
            matrix1[1, 2] = 6;
            //matrix1 looks like this:
            //         1 2 3
            //         4 5 6


            //Creating a copy of the 1st matrix , this copy will be rotated
            DoubleMatrix matrix2 = new DoubleMatrix((Matrix<double>)matrix1.Clone());
            //now rotating it
            matrix2 = rotationMatrix * matrix2;
            //matrix2 looks like this: (same form but rotated 30 deg. c.clockwise
            //       4.10   5.24   6.39
            //      -0.37  -1.20  -2.03

            //Preparing and calculating the matching object
            PCAMatching matching = new PCAMatching(matrix1, matrix2);
            matching.Calculate();

            //Getting the transformed data of matrix2 
            //as much as possible closer to matrix1
            DoubleMatrix result = matching.Result;
            //The result is:
            //      1   2   3
            //      4   5   6

            double calculatedAngle = matching.AngleFromSource;
            //calculatedAngle = -0.5235987755982987 the opposite to the original rotation
        }
    }
}

Points of interest

This algorithm opened my mind in some ways. Before starting coding, I didn't imagine that there would be a matching algorithm that does the job with such great complexity (O(n)). Of course, it has its own disadvantages, but this is a good start.

This article and the included projects are part of the Shape-Matching framework that can be found at http://sites.google.com/site/smfmproject/. As you can see, with some additional work, it matches shapes even greater:

Image 2

License

This article, along with any associated source code and files, is licensed under The MIT License


Written By
Software Developer
Israel Israel
A student for a first degree (BSC) in Computer Science.
Working as software developer in a local software company.

Comments and Discussions

 
GeneralPrincipal Component Analysis in C# [modified] Pin
César de Souza30-Oct-09 17:51
professionalCésar de Souza30-Oct-09 17:51 
GeneralMy vote of 2 Pin
Günther M. FOIDL24-Oct-09 5:33
Günther M. FOIDL24-Oct-09 5:33 
GeneralRe: My vote of 2 Pin
yanirta26-Oct-09 3:19
yanirta26-Oct-09 3:19 
GeneralRe: My vote of 2 Pin
Günther M. FOIDL31-Oct-09 14:29
Günther M. FOIDL31-Oct-09 14:29 
GeneralMy vote of 2 Pin
Dave Kreskowiak26-Sep-09 16:06
mveDave Kreskowiak26-Sep-09 16:06 
For such a large topic, I would have expected a more in-depth discussion of the algorithm along with images depicting concepts.

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Praise Praise    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.