Posted 18 Feb 2010

# Kernel Principal Component Analysis in C#

Updated 25 Feb 2010. Licensed under CPOL.
Kernel principal component analysis (kernel PCA) is an extension of principal component analysis (PCA) using techniques of kernel methods. Using a kernel, the originally linear operations of PCA are done in a reproducing kernel Hilbert space with a non-linear mapping.

KPCA is an extension of PCA to non-linear distributions. Instead of performing PCA directly on the original data points, they are first mapped into a higher-dimensional (possibly infinite-dimensional) feature space defined by a (usually nonlinear) function φ. Computations in this space are made feasible through a mathematical device called the "kernel trick".

### The Kernel trick

The kernel trick transforms any algorithm that solely depends on the dot product between two vectors. Wherever a dot product is used, it is replaced with the kernel function. Thus, a linear algorithm can easily be transformed into a non-linear algorithm. This non-linear algorithm is equivalent to the linear algorithm operating in the range space of φ. However, because kernels are used, the φ function is never explicitly computed. This is desirable, because the high-dimensional space may be infinite-dimensional (as is the case when the kernel is a Gaussian).
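To make this concrete, here is a small standalone example (my own illustration, not part of the article's framework). With a polynomial kernel of degree two, the kernel value (x · y)² equals an ordinary dot product computed after explicitly mapping both vectors through φ(x) = (x₁², √2·x₁x₂, x₂²), yet the kernel side of the computation never constructs φ(x):

```
using System;

class KernelTrickDemo
{
    // Polynomial kernel of degree 2: k(x, y) = (x · y)²
    public static double Kernel(double[] x, double[] y)
    {
        double dot = x[0] * y[0] + x[1] * y[1];
        return dot * dot;
    }

    // Explicit feature map φ(x) = (x1², √2·x1·x2, x2²) for 2D inputs
    public static double[] Phi(double[] x)
    {
        return new double[] { x[0] * x[0], Math.Sqrt(2) * x[0] * x[1], x[1] * x[1] };
    }

    static void Main()
    {
        double[] a = { 1.0, 2.0 };
        double[] b = { 3.0, 4.0 };

        // k(a, b) computed directly, without ever visiting feature space
        double viaKernel = Kernel(a, b);

        // The same value computed as an explicit dot product in feature space
        double[] pa = Phi(a), pb = Phi(b);
        double viaPhi = pa[0] * pb[0] + pa[1] * pb[1] + pa[2] * pb[2];

        Console.WriteLine("{0} == {1}", viaKernel, viaPhi); // both are 121
    }
}
```

For a Gaussian kernel the same idea applies, except that the implicit feature space is infinite-dimensional, so the explicit φ could not be computed even in principle.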

http://www.youtube.com/watch?v=3liCbRZPrZA

Interesting video by Udi Aharoni demonstrating how points that are not linearly separable in a 2D space can almost always be linearly separated in a higher dimensional (3D) space.

### Principal Components in Kernel Space

As in PCA, the overall idea is to perform a transformation that maximizes the variance of the captured variables while minimizing the overall covariance between them. Using the kernel trick, the covariance matrix is replaced by the kernel matrix and the analysis is carried out analogously in feature space. An eigenvalue decomposition is performed and the eigenvectors are sorted in descending order of eigenvalue, so that those vectors form a basis in feature space that explains most of the variance in the data in its first dimensions.

However, because the principal components are in feature space, we will not be directly performing dimensionality reduction. Suppose that the number of observations m exceeds the input dimensionality n. In linear PCA, we can find at most n nonzero eigenvalues. Using kernel PCA, on the other hand, we can find up to m nonzero eigenvalues, because we will be operating on an m × m kernel matrix.
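One practical detail of working with the kernel matrix is the centering step (the centerKernel call in the code further below, whose body is not listed in the article). Since the mean of the mapped points φ(xᵢ) cannot be subtracted directly in feature space, centering is usually done with the double-centering formula K̃ᵢⱼ = Kᵢⱼ − rowMeanᵢ − colMeanⱼ + totalMean. A standalone sketch of this idea (my own hypothetical helper, not the framework's actual implementation):

```
using System;

class KernelCentering
{
    // Double-centering of an m×m kernel matrix:
    // K~[i,j] = K[i,j] - rowMean[i] - colMean[j] + totalMean,
    // equivalent to centering the mapped points φ(x) in feature space.
    public static double[,] CenterKernel(double[,] K)
    {
        int m = K.GetLength(0);
        double[] rowMean = new double[m];
        double[] colMean = new double[m];
        double totalMean = 0;

        for (int i = 0; i < m; i++)
            for (int j = 0; j < m; j++)
            {
                rowMean[i] += K[i, j] / m;
                colMean[j] += K[i, j] / m;
                totalMean += K[i, j] / (m * (double)m);
            }

        double[,] centered = new double[m, m];
        for (int i = 0; i < m; i++)
            for (int j = 0; j < m; j++)
                centered[i, j] = K[i, j] - rowMean[i] - colMean[j] + totalMean;

        return centered;
    }

    static void Main()
    {
        double[,] K = { { 2.0, 1.0 }, { 1.0, 3.0 } };
        double[,] C = CenterKernel(K);

        // Every row and column of the centered matrix sums to zero
        Console.WriteLine(C[0, 0] + C[0, 1]); // prints 0
    }
}
```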

A more detailed discussion can be found in the works of Mika et al. (Kernel PCA and De-Noising in Feature Spaces) and Schölkopf et al. (Nonlinear Component Analysis as a Kernel Eigenvalue Problem). A more accessible explanation of the topic is also available in the work by Ian Fasel.

### The pre-image problem

In standard PCA it was possible to revert the transformation, retrieving the original data points and enabling the use of PCA for compression* and denoising. However, because the results provided by kernel PCA live in some high-dimensional feature space, they need not have pre-images in input space. In fact, even when a pre-image exists, it need not be unique.

Thus, in KPCA, it is only possible to obtain an approximate pre-image for a given set of patterns projected onto the feature space. This problem can be (and has been) tackled by many approaches, some of them iterative in nature, but closed-form solutions also exist. Typically those solutions exploit the fact that distances in feature space are related to distances in input space, and try to find an optimal transformation that embeds the feature points back into input space while respecting those distance relationships.
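For the Gaussian kernel k(x, y) = exp(−‖x − y‖² / (2σ²)), which is the kernel used in the examples later in this article, that relationship is explicit: the squared input-space distance can be recovered from a kernel value as ‖x − y‖² = −2σ² · ln k(x, y). A standalone sketch of this conversion (my own helper code, not part of the framework):

```
using System;

class DistanceFromKernel
{
    // Gaussian kernel k(x, y) = exp(-||x - y||² / (2σ²))
    public static double Gaussian(double[] x, double[] y, double sigma)
    {
        double d2 = 0;
        for (int i = 0; i < x.Length; i++)
            d2 += (x[i] - y[i]) * (x[i] - y[i]);
        return Math.Exp(-d2 / (2 * sigma * sigma));
    }

    // Invert the kernel to recover the squared input-space distance:
    // ||x - y||² = -2σ² · ln k(x, y)
    public static double SquaredDistance(double k, double sigma)
    {
        return -2 * sigma * sigma * Math.Log(k);
    }

    static void Main()
    {
        double[] x = { 0.0, 0.0 };
        double[] y = { 3.0, 4.0 };   // true squared distance: 25

        double k = Gaussian(x, y, 10.0);
        Console.WriteLine(SquaredDistance(k, 10.0)); // recovers 25
    }
}
```

Closed-form pre-image methods such as Kwok and Tsang's use exactly this kind of inversion to turn feature-space distances into input-space distance constraints.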

One of the first solutions of this form was proposed by Kwok and Tsang in their paper "The Pre-Image Problem in Kernel Methods". A better approach is described by Bakir in his thesis "Extensions to Kernel Dependency Estimation", along with a comprehensive overview of various other methods for handling the pre-image problem.

Note: in practice, the use of KPCA for compression is somewhat limited, since one needs to store the original data set in order to compute transformations and their approximate reversion.

## Code Overview

Below is the main code behind KPCA. This is a direct, naive implementation of the algorithm that may not scale very well for very large datasets. It is implemented on top of the previous standard PCA code and thus on top of the AForge.NET Framework.

```
/// <summary>Computes the Kernel Principal Component Analysis algorithm.</summary>
public override void Compute()
{
    int rows = Source.GetLength(0);
    int cols = Source.GetLength(1);

    // Center (adjust) the source matrix
    sourceCentered = adjust(Source);

    // Create the Gram (Kernel) matrix
    double[,] K = new double[rows, rows];
    for (int i = 0; i < rows; i++)
    {
        for (int j = i; j < rows; j++)
        {
            double k = kernel.Kernel(sourceCentered.GetRow(i), sourceCentered.GetRow(j));
            K[i, j] = k; // The kernel matrix is symmetric,
            K[j, i] = k; //  so only half of it needs computing.
        }
    }

    // Center the Gram (Kernel) matrix in feature space
    if (centerFeatureSpace)
        K = centerKernel(K);

    // Perform the Eigenvalue Decomposition (EVD) of the Kernel matrix
    EigenValueDecomposition evd = new EigenValueDecomposition(K);

    // Get the eigenvalues and corresponding eigenvectors
    double[] evals = evd.RealEigenValues;
    double[,] eigs = evd.EigenVectors;

    // Sort eigenvalues and eigenvectors in descending order of absolute value
    int[] indices = new int[rows];
    for (int i = 0; i < rows; i++) indices[i] = i;
    Array.Sort(evals, indices, new AbsoluteComparer(true));
    eigs = eigs.Submatrix(0, rows - 1, indices);

    // Normalize the eigenvectors in feature space
    if (centerFeatureSpace)
    {
        for (int j = 0; j < rows; j++)
        {
            double eig = System.Math.Sqrt(System.Math.Abs(evals[j]));
            for (int i = 0; i < rows; i++)
                eigs[i, j] = eigs[i, j] / eig;
        }
    }

    // Set analysis properties
    this.SingularValues = new double[rows];
    this.EigenValues = evals;
    this.ComponentMatrix = eigs;

    // Project the original data into principal component space
    double[,] result = new double[rows, rows];
    for (int i = 0; i < rows; i++)
        for (int j = 0; j < rows; j++)
            for (int k = 0; k < rows; k++)
                result[i, j] += K[i, k] * eigs[k, j];

    this.Result = result;

    // Compute additional information about the analysis and create the
    //  object-oriented structure to hold the principal components found.
    createComponents();
}

/// <summary>Projects a given matrix into the principal component space.</summary>
/// <param name="data">The matrix to be projected. The matrix should contain
///   variables as columns and observations of each variable as rows.</param>
/// <param name="components">The number of components to consider.</param>
public override double[,] Transform(double[,] data, int components)
{
    int rows = data.GetLength(0);
    int cols = data.GetLength(1);
    int N = sourceCentered.GetLength(0);

    // Center the data around the mean of the training data
    data = adjust(data);

    // Create the Kernel matrix between the new data and the training data
    double[,] K = new double[rows, N];
    for (int i = 0; i < rows; i++)
        for (int j = 0; j < N; j++)
            K[i, j] = kernel.Kernel(data.GetRow(i), sourceCentered.GetRow(j));

    // Center the Gram (Kernel) matrix
    if (centerFeatureSpace)
        K = centerKernel(K);

    // Project into the kernel principal components,
    //  summing over the N training observations
    double[,] result = new double[rows, components];
    for (int i = 0; i < rows; i++)
        for (int j = 0; j < components; j++)
            for (int k = 0; k < N; k++)
                result[i, j] += K[i, k] * ComponentMatrix[k, j];

    return result;
}
```

The code behind the reversion of the projection is a direct, naive implementation of the closed-form solution proposed by Kwok and Tsang. It is currently not optimized at all, but it gives a better view of the algorithm in its essential form. The source code available for download may be better optimized in the future.

## Using the Code

As with PCA, code usage is simple. First, choose a kernel implementing the IKernel interface, then pass it to the KernelPrincipalComponentAnalysis constructor together with the data points.

```
// Create a Gaussian kernel with sigma = 10.0
Gaussian kernel = new Gaussian(10.0);

// Create a new Kernel Principal Component Analysis object
var kpca = new KernelPrincipalComponentAnalysis(data, kernel);
```

Then, simply call Compute() to perform the analysis. Once done, information about the analysis will be available through the object's properties, and transformations of additional data can be computed by calling Transform().

```
// Compute the Kernel Principal Component Analysis
kpca.Compute();

// Perform the transformation of the data using two components
double[,] result = kpca.Transform(data, 2);
```

## Sample Application

To demonstrate the use of KPCA, I created a simple Windows Forms Application which performs simple statistical analysis and KPCA transformations. It can also perform projection reversion by solving the approximate pre-image problem.

Left: Schölkopf toy example. Middle: kernel principal components detected by the KPCA. Right: the first two dimensions obtained from the first two kernel principal components.

Left: Wikipedia example. Middle: Projection using a non-centered Gaussian kernel. Right: Reconstruction of the original data using all Kernel principal components, showing some denoising ability.

### Linear PCA as a special case

Additionally, we may check that linear PCA is indeed a special case of kernel PCA by using a linear kernel on the example given by Lindsay Smith in his A Tutorial on Principal Component Analysis.
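We can also verify the algebra behind this special case numerically: with a linear kernel, the Gram matrix is just X·Xᵀ, and its nonzero eigenvalues coincide with those of the scatter matrix Xᵀ·X that linear PCA decomposes. A small standalone check (my own code, using the trace/determinant formula for 2×2 eigenvalues; not part of the framework):

```
using System;

class LinearKernelCheck
{
    // Eigenvalues of a symmetric 2×2 matrix [[a, b], [b, d]]
    //  via the trace/determinant formula
    public static double[] Eigenvalues2x2(double a, double b, double d)
    {
        double tr = a + d, det = a * d - b * b;
        double disc = Math.Sqrt(tr * tr / 4 - det);
        return new double[] { tr / 2 + disc, tr / 2 - disc };
    }

    static void Main()
    {
        // Two centered observations of two variables: (1, 2) and (-1, -2)

        // Gram matrix K = X·Xᵀ (linear kernel between observations)
        double k11 = 1 * 1 + 2 * 2;       //  5
        double k12 = 1 * -1 + 2 * -2;     // -5
        double k22 = -1 * -1 + -2 * -2;   //  5

        // Scatter matrix S = Xᵀ·X (what linear PCA decomposes)
        double s11 = 1 + 1;               //  2
        double s12 = 2 + 2;               //  4
        double s22 = 4 + 4;               //  8

        double[] evK = Eigenvalues2x2(k11, k12, k22);
        double[] evS = Eigenvalues2x2(s11, s12, s22);

        // Both matrices share the same nonzero eigenvalue
        Console.WriteLine("{0} {1}", evK[0], evK[1]); // 10 0
        Console.WriteLine("{0} {1}", evS[0], evS[1]); // 10 0
    }
}
```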

## License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

## About the Author

Engineer, Xerox Research Center Europe (Brazil)
Computer and technology enthusiast, interested in artificial intelligence and image processing. Has a Master's degree in Computer Science, specialized in Image and Signal Processing, with expertise in Machine Learning, Computer Vision, Pattern Recognition and Data Mining systems. Author of the Accord.NET Framework for developing scientific computing applications.

If you would like to hire good developers to build your dream application, please check out DaitanGroup, one of the top outsourcing companies in Brazil. The company, located in Brazil's Silicon Valley but with US-based offices, has extensive experience developing telecommunications software for companies large and small worldwide.

Article Copyright 2010 by César de Souza