Click here to Skip to main content
15,887,477 members
Articles / Programming Languages / C#

A C# Implementation of Douglas-Peucker Line Approximation Algorithm

Rate me:
Please Sign up or sign in to vote.
4.73/5 (24 votes)
6 Jun 2007MIT 156.8K   3.5K   60   39
DP Line approximation algorithm is a well-known method to approximate 2D lines. It is quite fast, O(nlog_2(n)) for a n-points line and can drastically compress a data curve. Here, a fully OOP implementation is given.

Introduction

I have found multiple implementations of the Douglas-Peucker algorithm but not in any .NET language, so I decided to port it over. Jonathan de Halleux has a wonderful explanation here.

Background

I needed to reduce polygons size to display on a map based on zoom levels.

Using the Code

The code included is complete and should run out of the box in Visual Studio 2005. If it does not, please let me know.

C#
/// <summary>
/// Uses the Douglas Peucker algorithm to reduce the number of points.
/// </summary>
/// <param name="Points">The points.</param>
/// <param name="Tolerance">The tolerance.</param>
/// <returns></returns>
public static List<Point> DouglasPeuckerReduction
    (List<Point> Points, Double Tolerance)
{
    if (Points == null || Points.Count < 3)
    return Points;

    Int32 firstPoint = 0;
    Int32 lastPoint = Points.Count - 1;
    List<Int32> pointIndexsToKeep = new List<Int32>();

    //Add the first and last index to the keepers
    pointIndexsToKeep.Add(firstPoint);
    pointIndexsToKeep.Add(lastPoint);

    //The first and the last point cannot be the same
    while (Points[firstPoint].Equals(Points[lastPoint]))
    {
        lastPoint--;
    }

    DouglasPeuckerReduction(Points, firstPoint, lastPoint, 
    Tolerance, ref pointIndexsToKeep);

    List<Point> returnPoints = new List<Point>();
    pointIndexsToKeep.Sort();
    foreach (Int32 index in pointIndexsToKeep)
    {
        returnPoints.Add(Points[index]);
    }

    return returnPoints;
}
    
/// <summary>
/// Douglases the peucker reduction.
/// </summary>
/// <param name="points">The points.</param>
/// <param name="firstPoint">The first point.</param>
/// <param name="lastPoint">The last point.</param>
/// <param name="tolerance">The tolerance.</param>
/// <param name="pointIndexsToKeep">The point index to keep.</param>
private static void DouglasPeuckerReduction(List<Point> 
    points, Int32 firstPoint, Int32 lastPoint, Double tolerance, 
    ref List<Int32> pointIndexsToKeep)
{
    Double maxDistance = 0;
    Int32 indexFarthest = 0;
    
    for (Int32 index = firstPoint; index < lastPoint; index++)
    {
        Double distance = PerpendicularDistance
            (points[firstPoint], points[lastPoint], points[index]);
        if (distance > maxDistance)
        {
            maxDistance = distance;
            indexFarthest = index;
        }
    }

    if (maxDistance > tolerance && indexFarthest != 0)
    {
        //Add the largest point that exceeds the tolerance
        pointIndexsToKeep.Add(indexFarthest);
    
        DouglasPeuckerReduction(points, firstPoint, 
        indexFarthest, tolerance, ref pointIndexsToKeep);
        DouglasPeuckerReduction(points, indexFarthest, 
        lastPoint, tolerance, ref pointIndexsToKeep);
    }
}

/// <summary>
/// The distance of a point from a line made from point1 and point2.
/// </summary>
/// <param name="pt1">The PT1.</param>
/// <param name="pt2">The PT2.</param>
/// <param name="p">The p.</param>
/// <returns></returns>
public static Double PerpendicularDistance
    (Point Point1, Point Point2, Point Point)
{
    //Area = |(1/2)(x1y2 + x2y3 + x3y1 - x2y1 - x3y2 - x1y3)|   *Area of triangle
    //Base = v((x1-x2)²+(x1-x2)²)                               *Base of Triangle*
    //Area = .5*Base*H                                          *Solve for height
    //Height = Area/.5/Base

    Double area = Math.Abs(.5 * (Point1.X * Point2.Y + Point2.X * 
    Point.Y + Point.X * Point1.Y - Point2.X * Point1.Y - Point.X * 
    Point2.Y - Point1.X * Point.Y));
    Double bottom = Math.Sqrt(Math.Pow(Point1.X - Point2.X, 2) + 
    Math.Pow(Point1.Y - Point2.Y, 2));
    Double height = area / bottom * 2;

    return height;
    
    //Another option
    //Double A = Point.X - Point1.X;
    //Double B = Point.Y - Point1.Y;
    //Double C = Point2.X - Point1.X;
    //Double D = Point2.Y - Point1.Y;
    
    //Double dot = A * C + B * D;
    //Double len_sq = C * C + D * D;
    //Double param = dot / len_sq;
    
    //Double xx, yy;
    
    //if (param < 0)
    //{
    //    xx = Point1.X;
    //    yy = Point1.Y;
    //}
    //else if (param > 1)
    //{
    //    xx = Point2.X;
    //    yy = Point2.Y;
    //}
    //else
    //{
    //    xx = Point1.X + param * C;
    //    yy = Point1.Y + param * D;
    //}
    
    //Double d = DistanceBetweenOn2DPlane(Point, new Point(xx, yy));
}

Points of Interest

The code is not overly complicated. It was fun to port this algorithm, since all I feel I do nowadays is hold business' hands to help them try and solve business problems that there is no consensus on.

History

  • Beta 1

License

This article, along with any associated source code and files, is licensed under The MIT License


Written By
Web Developer
United States United States
This member has not yet provided a Biography. Assume it's interesting and varied, and probably something to do with programming.

Comments and Discussions

 
QuestionMore efficient version in Loyc.Math + Bug fix Pin
Qwertie22-Nov-23 9:03
Qwertie22-Nov-23 9:03 
The above code seems to have a bug in its PerpendicularDistance method. For example, consider
PerpendicularDistance(new Point(0, 0), new Point(0, 100), new Point(20, 199))

In order for the line-simplification algorithm to work correctly, this should return 101 (because the closest point to (20, 199) on the line segment is (0, 100), and the distance between those is 101). Instead, it returns 20, which is the what the distance to the line would be if the line were infinitely long.

The version in this article also has a couple of inefficiencies:
  • The algorithm itself is done in three stages: (1) figure out which indexes to keep, but out-of-order, (2) sort the list of indexes, (3) built the output list of points. But in fact, the Douglas-Peucker algorithm can be done in a single stage that directly outputs the list of points in the correct order
  • It's not necessary to compute the distance from each point to the reference line. It's more efficient to compute the quadrance instead (quadrance is the square of distance) to avoid square root calculations.
  • It uses Math.Pow(x, 2) ... I'm pretty sure it's faster to multiply x * x.
The Loyc.Math package on NuGet avoids the bug and these inefficiencies in its LineMath.SimplifyPolyline static methods (although to be honest, the modified distance-to-line calculation, while avoiding the bug I mentioned, does use a division, which is not ideal - let me know if you figure out how to avoid that).

The Loyc version is also generic (designed to be compatible with any kind of point and any collection type) because .NET has several different Point types (e.g. WinForms points, WPF points). However, Loyc.Math doesn't take any of those GUI libraries as a dependency and therefore doesn't support them directly.

Here's the core algorithm (which supports any point type):
public static int SimplifyPolyline<List, Point, T>(
    List points, int iStart, int iStop,
    ICollection<Point> output, T tolerance,
    Func<Point, Point, Point, T> distanceToLine)
    where List : IReadOnlyList<Point> where T : IComparable<T>
{
    int iLast = iStop - 1;
    if (iStart >= iLast) {
        if (iStart == iLast)
            output.Add(points[iStart]);

        return iStop - iStart;
    }

    // Run Douglas-Peucker algorithm
    int count = SimplifyRecursively(points, iStart, iLast, output, tolerance, distanceToLine);

    // The last point is not included by <code>SimplifyRecursively</code>, so add it afterward.
    output.Add(points[iLast]);
    return count + 1;

    static int SimplifyRecursively(
        List points, int iFirst, int iLast, 
        ICollection<Point> output, T tolerance,
        Func<Point, Point, Point, T> distanceToLine)
    {
        Debug.Assert(iFirst < iLast);

        Point first = points[iFirst];
        int i = iFirst + 1;
        if (i < iLast) {
            Point last = points[iLast];
            T maxDist = tolerance;
            int iFarthest = -1;
            do {
                var dist = distanceToLine(points[i], first, last);
                if (maxDist.CompareTo(dist) < 0) {
                    maxDist = dist;
                    iFarthest = i;
                }
            } while (++i < iLast);

            if (iFarthest != -1) {
                int count = SimplifyRecursively(points, iFirst, iFarthest, output, tolerance, distanceToLine);
                count += SimplifyRecursively(points, iFarthest, iLast, output, tolerance, distanceToLine);
                return count;
            }
        }

        output.Add(first);
        return 1;
    }
}

(That's weird, CodeProject replaced `SimplifyRecursively` with <code>SimplifyRecursively</code> in a comment)

That method then needs to be specialized with a distance-to-line or quadrance-to-line calculation. For example, one specialization looks like this:
/// <summary>Simplifies a polyline using the Douglas-Peucker line 
///   simplification algorithm. This algorithm removes points that are 
///   deemed unimportant, so the output is a subset of the input.</summary>
/// <typeparam name="List">Original unsimplified polyline</typeparam>
/// <param name="output">The output polyline is added in order to this collection</param>
/// <param name="tolerance">The distance between the input polyline and the 
///   output polyline will never exceed this distance. Increase this value to 
///   simplify more aggressively.</param>
/// <returns>The number of output points.</returns>
/// <remarks>
///   The average time complexity of this algorithm is O(N log N). 
///   The worst-case time complexity is O(N^2).
/// </remarks>
public static int SimplifyPolyline<List>(List points, ICollection<Point<double>> output, double tolerance)
    where List : IReadOnlyList<Point<double>>
{
    return SimplifyPolyline(points, 0, points.Count, output, tolerance * tolerance, _quadranceToLineD);
}

public static List<Point<double>> SimplifyPolyline<List>(List points, double tolerance)
    where List : IReadOnlyList<Point<double>>
{
    var output = new List<Point<double>>();
    SimplifyPolyline(points, 0, points.Count, output, tolerance * tolerance, _quadranceToLineD);
    return output;
}

static readonly Func<Point<double>, Point<double>, Point<double>, double> _quadranceToLineD
    = (p, p0, p1) => (p - ProjectOnto(p, new LineSegment<double>(p0, p1), LineType.Segment, out _)).Quadrance();

But this is specialized for Point<double> which is not among the various standard .NET point types. If you want to use this code but make it compatible with some other kind of Point type (for example, System.Windows.Point), you'll need to copy the methods that the above code depends on (such as ProjectOnto() and Quadrance()), and change those other methods to meet your needs. For reference, here is the other code that the above code depends on:
public enum LineType { 
    Segment, // Finite line
    Ray,     // Line extended to infinity beyond end point
    Infinite // Line extended to infinity in both directions
}

/// <summary>Holds a 2D line segment.</summary>
/// <typeparam name="T">Coordinate type</typeparam>
public struct LineSegment<T> where T : IConvertible, IEquatable<T>
{
    public LineSegment(Point<T> a, Point<T> b) { A = a; B = b; }

    public Point<T> A, B;
    ...
}

// Meanwhile, in a different C# source file...
using T = System.Double;
using LineSegment = LineSegment<double>;
using Point = Point<double>;
using Vector = Vector<double>;
using System;

public static partial class LineMath
{
    /// <summary>Performs projection, which finds the point on a line segment 
    ///   or infinite line that is nearest to a specified point.</summary>
    /// <param name="seg">The line segment</param>
    /// <param name="p">The test point to be projected</param>
    /// <param name="type">Whether to treat the line segment as extended to infinite length.</param>
    /// <param name="end">Set to 0 if the point is on the line segment (including
    ///   one of the endpoints), -1 if the point is before seg.A, 1 if the point is 
    ///   after seg.B, and null if the line segment is degenerate (seg.A==seg.B)</param>
    /// <returns>The projected point.</returns>
    /// <remarks>
    ///   This algorithm is fast and accurate, and can be easily adapted to 3D.
    ///   A special advantage of this approach is that it runs fastest when the 
    ///   point is projected onto one of the endpoints (when infiniteLine is 
    /// false).
    /// <para/>
    ///   Algorithm comes from: <a href="http://geomalgorithms.com/a02-_lines.html">http://geomalgorithms.com/a02-_lines.html</a>
    ///   See section "Distance of a Point to a Ray or Segment"
    /// </remarks>
    public static Point ProjectOnto(this Point p, in LineSegment seg, LineType type, out int? end)
    {
        end = 0;
        Vector v = seg.Vector();
        Vector w = p.Sub(seg.A);
        T c1 = w.Dot(v); // c1 == |w|*|v|*cos(angle between them)
        if (c1 <= 0) { // angle between line segment and (p-seg.A) is negative (-180..0)?
            if (v.X == 0 && v.Y == 0) {
                // seg.A == seg.B
                end = null;
                return seg.A;
            } else if (c1 < 0)
                end = -1;
            if (type != LineType.Infinite)
                return seg.A;
        }
        T c2 = v.Quadrance(); // == |v|*|v|
        if (c1 >= c2) { // quadrance from seg.A to projected point >= quadrance of seg
            if (c1 > c2)
                end = 1;
            if (type == LineType.Segment)
                return seg.B;
        }
        if (c2 == 0) {
            // seg.A and seg.B are infitessimally close together; c2 was truncated to zero
            end = null;
            return seg.A;
        }

        T frac = c1 / c2;                         // == |w|/|v|*cos(angle)
        Point projected = seg.A.Add(v.Mul(frac)); // == p0 + v/|v|*|w|*cos(angle)
        return projected;
    }

    /// <summary>Returns seg.B - seg.A.</summary>
    public static Vector Vector(this in LineSegment seg) { return seg.B.Sub(seg.A); }
    ...
}

public static partial class PointMath
{
    public static T Quadrance(this Vector v) { return v.X*v.X + v.Y*v.Y; }

    public static Vector Sub(this Point a, Point b)   { return new Vector(a.X-b.X, a.Y-b.Y); }

    public static Point  Add(this Point a, Vector b)  { return new Point(a.X+b.X, a.Y+b.Y); }

    public static Vector Mul(this Vector p, T factor) { return new Vector(p.X*factor, p.Y*factor); }
}

(Yes, eventually I'll update these methods to support the new .NET Generic Numerics support in .NET 7, but for the time being it's still backward compatible all the way to .NET 4.5, even though the library is configured to use C# 10 which is much newer than .NET 4.5)

modified 22-Nov-23 17:14pm.

QuestionAddition to get rid of possible StackOverflow Exception Pin
Alexander Panfilenok12-Oct-23 4:08
Alexander Panfilenok12-Oct-23 4:08 
QuestionThanks! Pin
Keith Rule21-May-22 8:19
professionalKeith Rule21-May-22 8:19 
Questionrejecting recursion + distance calc acceleration Pin
adevder18-Feb-17 23:12
adevder18-Feb-17 23:12 
AnswerRe: rejecting recursion + distance calc acceleration Pin
TerrySpiderman21-Mar-17 2:05
professionalTerrySpiderman21-Mar-17 2:05 
Questionto save stack memory with great amount of points and tolerance near zero i prefer to reject recursion Pin
adevder18-Feb-17 0:22
adevder18-Feb-17 0:22 
PraiseNice Pin
veen_rp18-Apr-16 18:41
professionalveen_rp18-Apr-16 18:41 
BugIf (first = last) Bug! Pin
Giovani Luigi14-Dec-15 11:44
Giovani Luigi14-Dec-15 11:44 
GeneralRe: If (first = last) Bug! Pin
Keith Rule21-May-22 12:40
professionalKeith Rule21-May-22 12:40 
QuestionDouglasPeuckerN() conversion? Pin
cbordeman11-Dec-15 16:42
cbordeman11-Dec-15 16:42 
AnswerRe: DouglasPeuckerN() conversion? Pin
Mr Lorenz27-Apr-17 6:44
Mr Lorenz27-Apr-17 6:44 
GeneralUseful Pin
Dineshkumar_MCA22-Apr-15 1:18
professionalDineshkumar_MCA22-Apr-15 1:18 
QuestionBUG tolerance Pin
Member 1017584518-Feb-15 1:08
Member 1017584518-Feb-15 1:08 
QuestionThanks + pruned code Pin
Stoyan Haralampiev23-Oct-14 2:30
Stoyan Haralampiev23-Oct-14 2:30 
QuestionUsing this algorithm on data from database Pin
Saket Surya2-Jun-14 2:38
Saket Surya2-Jun-14 2:38 
QuestionPoint projected on the line but behind the segment? Bug? Pin
Sergiy Tkachuk25-Mar-14 13:31
Sergiy Tkachuk25-Mar-14 13:31 
QuestionAbout Tolerance Pin
kamlesh Prajapati8-Jan-14 22:55
kamlesh Prajapati8-Jan-14 22:55 
QuestionWhy do you create your own Point? Pin
leiyangge3-Jan-14 21:37
leiyangge3-Jan-14 21:37 
QuestionLicense Pin
Diego Catalano14-Dec-13 1:42
Diego Catalano14-Dec-13 1:42 
Question1 bug, + 1 potential bug Pin
gyuri1023-Apr-12 22:45
gyuri1023-Apr-12 22:45 
AnswerRe: 1 bug, + 1 potential bug Pin
tplokas26-Mar-13 13:46
tplokas26-Mar-13 13:46 
QuestionUseful,Thanks! Pin
masy19112-Mar-12 22:43
masy19112-Mar-12 22:43 
QuestionGreat Job! Pin
Christopher Drake19-Dec-11 7:32
Christopher Drake19-Dec-11 7:32 
Generalhow calculate an epsilon value Pin
unravel-the-sky23-May-11 5:52
unravel-the-sky23-May-11 5:52 
GeneralSome Issues with your implementation Pin
MattMi15-Jul-10 10:57
MattMi15-Jul-10 10:57 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Praise Praise    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.