Posted 3 Jun 2013

# Standard Deviation Extension for Enumerable

Jacek Gajek, 7 Jun 2013
Calculation of a standard deviation and filtering of outliers in LINQ style.

## Introduction

The following class provides two extension methods for `IEnumerable<T>`, in the style of the .NET `Enumerable` class:

1. Standard deviation calculation.
2. Outlier removal using a k-sigma filter (which of course becomes a three-sigma rule for k=3).
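
For reference, the two operations can be written as follows. This is a sketch in notation introduced here, not taken from the article: $x_i$ is the value the selector returns for the i-th element, $N$ is the number of elements, and the standard deviation is assumed to be the population form (dividing by $N$ rather than $N-1$), which matches the source code in this article.

```latex
\mu = \frac{1}{N}\sum_{i=1}^{N} x_i,
\qquad
\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2}

% k-sigma filter: element i is kept when
|x_i - \mu| \le k\,\sigma
```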

See http://en.wikipedia.org/wiki/Three_sigma_rule for some basics. Please use the message board below to post suggestions or report bugs. Have fun!

## Using the code

Source code:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StandardDeviationEnumerableExtensions
{
    /// <summary>
    /// Calculates the population standard deviation (dividing by N) of the
    /// elements, using the specified selector. Returns 0 for an empty sequence.
    /// </summary>
    public static double StandardDeviation<T>(
        this IEnumerable<T> enumerable, Func<T, double> selector)
    {
        // Enumerable.Average throws InvalidOperationException on an empty
        // sequence, so handle that case before calling it.
        if (!enumerable.Any())
            return 0;

        double sum = 0;
        double average = enumerable.Average(selector);
        int N = 0;
        foreach (T item in enumerable)
        {
            double diff = selector(item) - average;
            sum += diff * diff;
            N++;
        }
        return N == 0 ? 0 : Math.Sqrt(sum / N);
    }

    /// <summary>
    /// Filters elements to remove outliers. The sequence is enumerated
    /// three times: first to calculate the average, second for the
    /// standard deviation, and third to yield the remaining elements.
    /// Outliers are the elements farther from the average than
    /// k * (standard deviation). Set k = 3 for the standard three-sigma rule.
    /// </summary>
    public static IEnumerable<T> SkipOutliers<T>(
        this IEnumerable<T> enumerable, double k, Func<T, double> selector)
    {
        // An empty sequence has no outliers; this also avoids the
        // exception Enumerable.Average throws on empty input.
        if (!enumerable.Any())
            yield break;

        // The SD code is duplicated here to avoid calculating the average twice.
        double sum = 0;
        double average = enumerable.Average(selector);
        int N = 0;
        foreach (T item in enumerable)
        {
            double diff = selector(item) - average;
            sum += diff * diff;
            N++;
        }
        double SD = N == 0 ? 0 : Math.Sqrt(sum / N);
        double delta = k * SD;

        // Yield only the elements within k standard deviations of the average.
        foreach (T item in enumerable)
        {
            if (Math.Abs(selector(item) - average) <= delta)
                yield return item;
        }
    }
}
```
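
The standard deviation above needs two passes over the sequence (one for the average, one for the squared differences). As an alternative, here is a minimal sketch of a single-pass calculation using Welford's online algorithm. It is not the author's method: the class name `SinglePassStatistics` and the method name `StandardDeviationSinglePass` are hypothetical, and returning 0 for an empty sequence is an assumption chosen to match the `N == 0` convention used in the article's code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the article's class.
public static class SinglePassStatistics
{
    // Population standard deviation computed in one enumeration
    // with Welford's online update of the mean and squared deviations.
    public static double StandardDeviationSinglePass<T>(
        this IEnumerable<T> source, Func<T, double> selector)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (selector == null) throw new ArgumentNullException(nameof(selector));

        long count = 0;   // number of elements seen so far
        double mean = 0;  // running mean
        double m2 = 0;    // running sum of squared deviations from the mean

        foreach (T item in source)
        {
            double x = selector(item);
            count++;
            double delta = x - mean;   // deviation from the old mean
            mean += delta / count;     // update the mean
            m2 += delta * (x - mean);  // uses both the old and the new mean
        }

        // Assumption: an empty sequence yields 0, as in the article's code.
        return count == 0 ? 0 : Math.Sqrt(m2 / count);
    }
}
```

An outlier filter would still need a second enumeration to yield the surviving elements, because the threshold depends on every element in the sequence.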

Usage:

```csharp
IEnumerable<double> results = new double[] { 1, 1.1, 1.2, 0.9, 2, 0.8 };
double[] filtered;

// k = 3: contains all elements.
filtered = results.SkipOutliers(k: 3, selector: result => result).ToArray();

// k = 2: contains all elements except 2.0, that is, filtered = { 1, 1.1, 1.2, 0.9, 0.8 }.
filtered = results.SkipOutliers(k: 2, selector: result => result).ToArray();

// k = 0.1: contains just one element, 1.2, which is closest to the average,
// that is, filtered = { 1.2 }.
filtered = results.SkipOutliers(k: 0.1, selector: result => result).ToArray();

// A singleton is always equal to its average, so it is yielded even with k == 0,
// that is, filtered = { 1.2 }.
filtered = filtered.SkipOutliers(k: 0, selector: result => result).ToArray();
```
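
The results in the comments above can be checked by hand. The following worked calculation uses the sample data and rounds to four decimal places; the symbols $\mu$ (average) and $\sigma$ (population standard deviation) are notation introduced for this note.

```latex
\mu = \frac{1 + 1.1 + 1.2 + 0.9 + 2 + 0.8}{6} = \frac{7}{6} \approx 1.1667

\sigma = \sqrt{\frac{1}{6}\sum_{i=1}^{6}(x_i - \mu)^2}
       = \sqrt{\frac{7}{45}} \approx 0.3944

% Thresholds k\sigma for the three calls:
3\sigma \approx 1.1832, \qquad 2\sigma \approx 0.7888, \qquad 0.1\sigma \approx 0.0394

% Relevant distances from the average:
|2 - \mu| \approx 0.8333, \qquad |0.8 - \mu| \approx 0.3667, \qquad
|1.1 - \mu| \approx 0.0667, \qquad |1.2 - \mu| \approx 0.0333
```

With k = 3 even the largest distance (0.8333) is below the threshold, so nothing is removed. With k = 2 only the value 2 exceeds 0.7888. With k = 0.1 only 1.2 lies within 0.0394 of the average.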

So, with the `k` parameter you can adjust how strict the filtering is. If k == 0, only the elements exactly equal to the average are yielded. Avoid k == 0 in practice, however, because doubles should not be compared for exact equality: rounding in the computed average can exclude elements that are mathematically equal to it.
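
If the goal is to keep elements that are approximately equal to the average, one option is to compare against a small absolute tolerance instead of using k == 0. The following is a minimal sketch under that assumption; the class `ToleranceFilterSketch`, the method `WhereNearAverage`, and the `tolerance` parameter are hypothetical names that are not part of the article's class.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the article's class.
public static class ToleranceFilterSketch
{
    // Keeps the elements whose selected value lies within an absolute
    // tolerance of the average, instead of testing for exact equality.
    public static IEnumerable<T> WhereNearAverage<T>(
        this IEnumerable<T> source, double tolerance, Func<T, double> selector)
    {
        // Materialize once so the source is enumerated only a single time.
        List<T> items = source.ToList();
        if (items.Count == 0)
            yield break; // Average would throw on an empty sequence.

        double average = items.Average(selector);
        foreach (T item in items)
        {
            if (Math.Abs(selector(item) - average) <= tolerance)
                yield return item;
        }
    }
}
```

For example, `new double[] { 0.1, 0.1, 0.1 }.WhereNearAverage(1e-9, x => x)` keeps all three elements even if the computed average differs from 0.1 by a rounding error. A suitable tolerance depends on the scale of the data.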

## History

• 2013-06-03 -- Original version posted.
• 2013-06-04 -- Fixed a possible unwanted division by zero.

## License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

## About the Author

Software Developer (Junior), Poland
My name is Jacek Gajek. I graduated with a master's degree in computer science from Polibuda in Wrocław. I like C# and Monty Python's sense of humour.

Last Updated 7 Jun 2013
Article Copyright 2013 by Jacek Gajek