Click here to Skip to main content
Click here to Skip to main content

Matching (and Analyzing) Using Principal Component Analysis

, 26 Sep 2009
Rate this:
Please Sign up or sign in to vote.
PCA is a well known algorithm to extract features from multidimentional datasets. Using a library I wrote, this is made easy.

Introduction

The PCA matching algorithm analyzes data coordinates from n dimensions, and returns an orthogonal coordinate representing the main deviation of the data.

The coordinate of the main deviation of the data represents the direction of the given coordinate's density. This information can be useful to minimize the amount of dimensions with minimal loss of data, for lossy compression.

Another use is to align two sets of data that parametrically differ, like aligned, back rotated/scaled/placed shapes.

Background

The Principal Components Analysis (a.k.a. PCA algorithm) based matching is part of the “Shape matching framework” designed to provide core support when building a drawing - similarity/difference software using .NET.

The project uses and depends only on the Matrix library implementation provided with the “Shape matching framework” solution. We can easily isolate those two projects/DLLs to get just the functionality of this algorithm.

There are two methods of usage for this library: One is using the ‘plain’ PCA algorithm to extract drawings’ coordinate’s properties. The other is using the provided matching algorithm to optimize the target drawing closer to the source.

Using the code

In the PCA project, there is a PCAtransform implementation which is the plain, by-the-book implementation. This gives us a set of Eigen values and corresponding set of Eigenvectors returned as a DoubleMatrix. From those two parameters, we can deduce the relative angle and the relative density of the two calculated inputs.

In the following example, we will find that two rotated sets of coordinates can be identified as rotated, using two PCA transforms.

We’ll also see that density (same as scaling) factors are identical for both inputs, which will point to the fact that both inputs shouldn’t be scaled in order to be consolidated.

Note, if scaling needed, use the square roots of the scaling factors that the PCA transform finds.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Windows.Forms;
using LiniarAlgebra;
using Adaption;
using System.Drawing;
using System.Reflection;
using PCA;

namespace New_Project
{
    static class Program
    {
        [STAThread]
        static void Main()
        {
            //Preparing rotation matrix 30 deg. counterclockwise
            double angle = Math.PI/180 * 30; //~= 0.52...
            DoubleMatrix rotationMatrix = PrepareRotationMatrix(angle);
            //rotationMatrix seems like:
            //  cos(30) -sin(30)
            //  sin(30)  cos(30)

            //This will create 2x3 matrix of zeroes
            DoubleMatrix matrix1 = new DoubleMatrix(2, 3);
            //Lets fill some data here too
            matrix1[0, 0] = 1;
            matrix1[0, 1] = 2;
            matrix1[0, 2] = 3;
            matrix1[1, 0] = 4;
            matrix1[1, 1] = 5;
            matrix1[1, 2] = 6;
            //matrix1 looks like this:
            //         1 2 3
            //         4 5 6

            //Creating a copy of the 1st matrix , this copy will be rotated
            DoubleMatrix matrix2 = new DoubleMatrix((Matrix<double>)matrix1.Clone()); 
            //now rotating it
            matrix2 = rotationMatrix * matrix2;
            //matrix2 looks like this: (same form but rotated 30 deg. c.clockwise
            //       4.10   5.24   6.39
            //      -0.37  -1.20  -2.03

            //Now we'll see that the data of both matrices has same PCA properties
            
            //Creating PCA transform instances for both martices
            PCAtransform transform1 = new PCAtransform(matrix1);
            PCAtransform transform2 = new PCAtransform(matrix2);
            //The result from calculate is a MxM dims eigenVectors matrix
            DoubleMatrix eigenVects1 = transform1.Calculate();
            DoubleMatrix eigenVects2 = transform2.Calculate();

            //It is enough for us to get the absolute 
            //angle of the first vector (the first column)
            double absAngle1 = Math.Atan(eigenVects1[1, 0] / eigenVects1[0, 0]);
            double absAngle2 = Math.Atan(eigenVects2[1, 0] / eigenVects2[0, 0]);

            //Lets see that the relative angle is the same 
            //as the rotation angle from the beginning
            double relativeAngle = absAngle2 - absAngle1; //~=0.52...
            // = -0.00000000000000011102230246251565 ~= 0 
            double anglesDiff = relativeAngle - angle;
            //Not exactly because of precision pervertion, 
            //but this is enought to get convinced

            //Another aspect of similarity is by eigenvalues 
            //(they represen the squares of the scaling factors):
            double[] eigenVals1 = transform1.EigenValues;
            //This array looks like this: 0,2
            double[] eigenVals2 = transform2.EigenValues;
            //This array looks like this: 0,2            
        }

        private static DoubleMatrix PrepareRotationMatrix(double angle)
        {
            DoubleMatrix rotationMatrix = new DoubleMatrix(2);
            rotationMatrix[0, 0] = Math.Cos(angle);
            rotationMatrix[0, 1] = -1 * Math.Sin(angle);
            rotationMatrix[1, 0] = Math.Sin(angle);
            rotationMatrix[1, 1] = Math.Cos(angle);
            return rotationMatrix;
        }
    }
}

The PCA transform is also good for data analysis and lossy compression. Since that was not the aim of my work, that will not be explained here. More information about such usages can be found in the internet. Suppose we only want to match a drawing or a coordinates set to a given source set; it can be more easily done using PCA matching. The data we use will be from the previous example.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Windows.Forms;
using LiniarAlgebra;
using System.Drawing;
using System.Reflection;
using PCA;

namespace New_Project
{
    static class Program
    {
        [STAThread]
        static void Main()
        {
            //Preparing rotation matrix 30 deg. counterclockwise
            double angle = Math.PI / 180 * 30; //~= 0.52...
            DoubleMatrix rotationMatrix = PrepareRotationMatrix(angle);
            //rotationMatrix seems like:
            //  cos(30) -sin(30)
            //  sin(30)  cos(30)

            //This will create 2x3 matrix of zeroes
            DoubleMatrix matrix1 = new DoubleMatrix(2, 3);
            //Lets fill some data here too
            matrix1[0, 0] = 1;
            matrix1[0, 1] = 2;
            matrix1[0, 2] = 3;
            matrix1[1, 0] = 4;
            matrix1[1, 1] = 5;
            matrix1[1, 2] = 6;
            //matrix1 looks like this:
            //         1 2 3
            //         4 5 6


            //Creating a copy of the 1st matrix , this copy will be rotated
            DoubleMatrix matrix2 = new DoubleMatrix((Matrix<double>)matrix1.Clone());
            //now rotating it
            matrix2 = rotationMatrix * matrix2;
            //matrix2 looks like this: (same form but rotated 30 deg. c.clockwise
            //       4.10   5.24   6.39
            //      -0.37  -1.20  -2.03

            //Preparing and calculating the matching object
            PCAMatching matching = new PCAMatching(matrix1, matrix2);
            matching.Calculate();

            //Getting the transformed data of matrix2 
            //as much as possible closer to matrix1
            DoubleMatrix result = matching.Result;
            //The result is:
            //      1   2   3
            //      4   5   6

            double calculatedAngle = matching.AngleFromSource;
            //calculatedAngle = -0.5235987755982987 the opposite to the original rotation
        }
    }
}

Points of interest

This algorithm opened my mind in some ways. Before starting coding, I didn't imagine that there would be a matching algorithm that does the job with such great complexity (O(n)). Of course, it has its own disadvantages, but this is a good start.

This article and the included projects are part of the Shape-Matching framework that can be found at http://sites.google.com/site/smfmproject/. As you can see, with some additional work, it matches shapes even greater:

License

This article, along with any associated source code and files, is licensed under The MIT License

About the Author

yanirta
Software Developer
Israel Israel
A student for a first degree (BSC) in Computer Science.
Working as software developer in a local software company.

Comments and Discussions

 
GeneralPrincipal Component Analysis in C# [modified] PinmemberCésar de Souza30-Oct-09 17:51 
GeneralMy vote of 2 PinmemberGünther M. FOIDL24-Oct-09 5:33 
GeneralRe: My vote of 2 Pinmemberyanirta26-Oct-09 3:19 
The aim of the article as I see it , is to demonstrate usage of alignment that I built,
The algorithm is well known, it can be found in wikipedia and other free resources.
But if that wasn't enough, you are welcome to visit my dedicated website and download the latest solution with all the code, free and open.
 
Thanks.
Yanir
GeneralRe: My vote of 2 PinmemberGünther M. FOIDL31-Oct-09 14:29 
GeneralMy vote of 2 PinmvpDave Kreskowiak26-Sep-09 16:06 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.

| Advertise | Privacy | Mobile
Web04 | 2.8.140721.1 | Last Updated 26 Sep 2009
Article Copyright 2009 by yanirta
Everything else Copyright © CodeProject, 1999-2014
Terms of Service
Layout: fixed | fluid