Click here to Skip to main content
Click here to Skip to main content

Tagged as

Go to top

Least Squares Regression for Quadratic Curve Fitting

, 9 Mar 2010
Rate this:
Please Sign up or sign in to vote.
A C# class for Least Squares Regression for Quadratic Curve Fitting.

Introduction

A recent software project had a requirement to derive the equation of a quadratic curve from a series of data points. That is to say, to determine a, b, and c, where y = ax2 + bx + c. Having determined a, b, and c, I would also need a value for R-squared (the coefficient of determination).

A quick search of Google failed to bring up a suitable C# class; quite possibly, a more thorough search would have done so.

I had a vague recollection of something called 'Least Squares Regression', so back to Google I went.

References

Least Squares Regression:

R square:

Cramer's rule:

Using the class

Declare an instance:

LstSquQuadRegr solvr = new LstSquQuadRegr();

Pass it some point pairs (at least 3):

solvr.AddPoints(x1, y1);
solvr.AddPoints(x2, y2);
solvr.AddPoints(x3, y3);
solvr.AddPoints(x4, y4);

Get the values:

double the_a_term = solvr.aTerm();  
double the_b_term = solvr.bTerm();
double the_c_term = solvr.cTerm();
double the_rSquare = solvr.rSquare();

The Theory

y = ax^2 + bx + c

We have a series of points (x1,y1), (x2,y2) ... (xn,yn).

for i = 1 to n

We want the values of a, b, and c that minimise the sum of squares of the deviations of yi from a*xi^2 + bxi + c. Such values will give the best-fitting quadratic equation.

Let the sum of the squares of the deviations be:

               n
   F(a,b,c) = SUM (a*xi^2 + bxi + c - yi)^2.
              i=1

dF/da = SUM 2*(a*xi^2+b*xi+c-yi)*xi^2 = 0,
dF/db = SUM 2*(a*xi^2+b*xi+c-yi)*xi = 0,
dF/dc = SUM 2*(a*xi^2+b*xi+c-yi) = 0.

(Here, all sums range over i = 1, 2, ..., n.) Dividing by 2 and rearranging gives these three simultaneous linear equations containing the three unknowns a, b, and c:

(SUM xi^4)*a + (SUM xi^3)*b + (SUM xi^2)*c = SUM xi^2*yi,
(SUM xi^3)*a + (SUM xi^2)*b +   (SUM xi)*c = SUM xi*yi,
(SUM xi^2)*a +   (SUM xi)*b +    (SUM 1)*c = SUM yi.

Using notation Sjk to mean the sum of x_i^j*y_i^k:

a*S40 + b*S30 + c*S20 = S21
a*S30 + b*S20 + c*S10 = S11
a*S20 + b*S10 + c*S00 = S01

Solve the simultaneous equations using Cramer's law:

  [ S40  S30  S20 ] [ a ]   [ S21 ]
  [ S30  S20  S10 ] [ b ] = [ S11 ]
  [ S20  S10  S00 ] [ c ]   [ S01 ] 

  D = [ S40  S30  S20 ] 
      [ S30  S20  S10 ] 
      [ S20  S10  S00 ]  
  
    = S40(S20*S00 - S10*S10) - S30(S30*S00 - S10*S20) + S20(S30*S10 - S20*S20)

 Da = [ S21  S30  S20 ]
      [ S11  S20  S10 ] 
      [ S01  S10  S00 ]  

    = S21(S20*S00 - S10*S10) - S11(S30*S00 - S10*S20) + S01(S30*S10 - S20*S20)

 Db = [ S40  S21  S20 ] 
      [ S30  S11  S10 ] 
      [ S20  S01  S00 ]  

    = S40(S11*S00 - S01*S10) - S30(S21*S00 - S01*S20) + S20(S21*S10 - S11*S20)
  
 Dc = [ S40  S30  S21 ] 
      [ S30  S20  S11 ] 
      [ S20  S10  S01 ] 
  
    = S40(S20*S01 - S10*S11) - S30(S30*S01 - S10*S21) + S20(S30*S11 - S20*S21)  

a = Da/D
b = Db/D
c = Dc/D

R square

R2 = 1 - (residual sum of squares / total sum of squares).

                        n
total sum of squares = SUM (yi - y_mean)^2.
                       i=1

This is the sum of the squares of the differences between the measured y values and the mean y value.

                            n
 residual sum of squares = SUM (yi - yi_predicted)^2.
                           i=1

This is the sum of the squares of the difference between the measured y values and the values of y predicted by the equation.

The Code

A bunch of helper methods calculate all the various sums of squares. When calculating the values of a, b, and c, I used the sjk notation above as I found it easier to keep track.

/******************************************************************************
                          Class LstSquQuadRegr
     A C#  Class for Least Squares Regression for Quadratic Curve Fitting
                          Alex Etchells  2010    
******************************************************************************/

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Collections;


public  class LstSquQuadRegr
{
     /* instance variables */
    ArrayList pointArray = new ArrayList(); 
    private int numOfEntries; 
    private double[] pointpair;          

    /*constructor */
    public LstSquQuadRegr()
    {
        numOfEntries = 0;
        pointpair = new double[2];
    }

    /*instance methods */    
    /// <summary>
    /// add point pairs
    /// </summary>
    /// <param name="x">x value</param>
    /// <param name="y">y value</param>
    public void AddPoints(double x, double y) 
    {
        pointpair = new double[2]; 
        numOfEntries +=1; 
        pointpair[0] = x; 
        pointpair[1] = y;
        pointArray.Add(pointpair);
    }

    /// <summary>
    /// returns the a term of the equation ax^2 + bx + c
    /// </summary>
    /// <returns>a term</returns>
    public double aTerm()
    {
        if (numOfEntries < 3)
        {
            throw new InvalidOperationException(
               "Insufficient pairs of co-ordinates");
        }
        //notation sjk to mean the sum of x_i^j*y_i^k. 
        double s40 = getSx4(); //sum of x^4
        double s30 = getSx3(); //sum of x^3
        double s20 = getSx2(); //sum of x^2
        double s10 = getSx();  //sum of x
        double s00 = numOfEntries;
        //sum of x^0 * y^0  ie 1 * number of entries

        double s21 = getSx2y(); //sum of x^2*y
        double s11 = getSxy();  //sum of x*y
        double s01 = getSy();   //sum of y

        //a = Da/D
        return (s21*(s20 * s00 - s10 * s10) - 
                s11*(s30 * s00 - s10 * s20) + 
                s01*(s30 * s10 - s20 * s20))
                /
                (s40*(s20 * s00 - s10 * s10) -
                 s30*(s30 * s00 - s10 * s20) + 
                 s20*(s30 * s10 - s20 * s20));
    }

    /// <summary>
    /// returns the b term of the equation ax^2 + bx + c
    /// </summary>
    /// <returns>b term</returns>
    public double bTerm()
    {
        if (numOfEntries < 3)
        {
            throw new InvalidOperationException(
               "Insufficient pairs of co-ordinates");
        }
        //notation sjk to mean the sum of x_i^j*y_i^k.
        double s40 = getSx4(); //sum of x^4
        double s30 = getSx3(); //sum of x^3
        double s20 = getSx2(); //sum of x^2
        double s10 = getSx();  //sum of x
        double s00 = numOfEntries;
        //sum of x^0 * y^0  ie 1 * number of entries

        double s21 = getSx2y(); //sum of x^2*y
        double s11 = getSxy();  //sum of x*y
        double s01 = getSy();   //sum of y

        //b = Db/D
        return (s40*(s11 * s00 - s01 * s10) - 
                s30*(s21 * s00 - s01 * s20) + 
                s20*(s21 * s10 - s11 * s20))
                /
                (s40 * (s20 * s00 - s10 * s10) - 
                 s30 * (s30 * s00 - s10 * s20) + 
                 s20 * (s30 * s10 - s20 * s20));
    }

    /// <summary>
    /// returns the c term of the equation ax^2 + bx + c
    /// </summary>
    /// <returns>c term</returns>
    public double cTerm()
    {
        if (numOfEntries < 3)
        {
            throw new InvalidOperationException(
                       "Insufficient pairs of co-ordinates");
        }
        //notation sjk to mean the sum of x_i^j*y_i^k.
        double s40 = getSx4(); //sum of x^4
        double s30 = getSx3(); //sum of x^3
        double s20 = getSx2(); //sum of x^2
        double s10 = getSx();  //sum of x
        double s00 = numOfEntries;
        //sum of x^0 * y^0  ie 1 * number of entries

        double s21 = getSx2y(); //sum of x^2*y
        double s11 = getSxy();  //sum of x*y
        double s01 = getSy();   //sum of y

        //c = Dc/D
        return (s40*(s20 * s01 - s10 * s11) - 
                s30*(s30 * s01 - s10 * s21) + 
                s20*(s30 * s11 - s20 * s21))
                /
                (s40 * (s20 * s00 - s10 * s10) - 
                 s30 * (s30 * s00 - s10 * s20) + 
                 s20 * (s30 * s10 - s20 * s20));
    }
    
    public double rSquare() // get r-squared
    {
        if (numOfEntries < 3)
        {
            throw new InvalidOperationException(
               "Insufficient pairs of co-ordinates");
        }
        // 1 - (residual sum of squares / total sum of squares)
        return 1 - getSSerr() / getSStot();
    }
   

    /*helper methods*/
    private double getSx() // get sum of x
    {
        double Sx = 0;
        foreach (double[] ppair in pointArray)
        {
            Sx += ppair[0];
        }
        return Sx;
    }

    private double getSy() // get sum of y
    {
        double Sy = 0;
        foreach (double[] ppair in pointArray)
        {
            Sy += ppair[1];
        }
        return Sy;
    }

    private double getSx2() // get sum of x^2
    {
        double Sx2 = 0;
        foreach (double[] ppair in pointArray)
        {
            Sx2 += Math.Pow(ppair[0], 2); // sum of x^2
        }
        return Sx2;
    }

    private double getSx3() // get sum of x^3
    {
        double Sx3 = 0;
        foreach (double[] ppair in pointArray)
        {
            Sx3 += Math.Pow(ppair[0], 3); // sum of x^3
        }
        return Sx3;
    }

    private double getSx4() // get sum of x^4
    {
        double Sx4 = 0;
        foreach (double[] ppair in pointArray)
        {
            Sx4 += Math.Pow(ppair[0], 4); // sum of x^4
        }
        return Sx4;
    }

    private double getSxy() // get sum of x*y
    {
        double Sxy = 0;
        foreach (double[] ppair in pointArray)
        {
            Sxy += ppair[0] * ppair[1]; // sum of x*y
        }
        return Sxy;
    }

    private double getSx2y() // get sum of x^2*y
    {
        double Sx2y = 0;
        foreach (double[] ppair in pointArray)
        {
            Sx2y += Math.Pow(ppair[0], 2) * ppair[1]; // sum of x^2*y
        }
        return Sx2y;
    }

    private double getYMean() // mean value of y
    {
        double y_tot = 0;
        foreach (double[] ppair in pointArray)
        {
            y_tot += ppair[1]; 
        }
        return y_tot/numOfEntries;
    }

    private double getSStot() // total sum of squares
    {
        //the sum of the squares of the differences between 
        //the measured y values and the mean y value
        double ss_tot = 0;
        foreach (double[] ppair in pointArray)
        {
            ss_tot += Math.Pow(ppair[1] - getYMean(), 2);
        }
        return ss_tot;
    }

    private double getSSerr() // residual sum of squares
    {
        //the sum of the squares of te difference between 
        //the measured y values and the values of y predicted by the equation
        double ss_err = 0;
        foreach (double[] ppair in pointArray)
        {
            ss_err += Math.Pow(ppair[1] - getPredictedY(ppair[0]), 2);
        }
        return ss_err;
    }

    private double getPredictedY(double x)
    {
        //returns value of y predicted by the equation for a given value of x
        return aTerm() * Math.Pow(x, 2) + bTerm() * x + cTerm();
    }
}

Points of Interest

That's it really - it seems to agree pretty closely with the values given by Excel. I hope it saves someone else the bother of working it out.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Share

About the Author

Alex@UEA
Software Developer University Of East Anglia, UK
United Kingdom United Kingdom
No Biography provided

Comments and Discussions

 
QuestionIncorrect result of lat/log pairs with 8 decimal point precision PinmemberMember 1094985924-Jul-14 1:35 
AnswerRe: Incorrect result of lat/log pairs with 8 decimal point precision PinmemberAlex@UEA24-Jul-14 1:44 
GeneralRe: Incorrect result of lat/log pairs with 8 decimal point precision [modified] PinmemberMember 1094985925-Jul-14 2:14 
QuestionThank you Pinmemberlolaparra806-Feb-14 23:17 
GeneralSingle C# static function version PinmemberJuraj Lutisan5-May-13 23:43 
GeneralRe: Single C# static function version PinmemberAlex@UEA6-May-13 21:32 
GeneralRe: Single C# static function version PinmemberJuraj Lutisan6-May-13 23:52 
QuestionC++ version PinmemberRobin Imrie8-Nov-12 0:33 
GeneralMy vote of 5 PinmemberNicoD22-May-12 0:18 
GeneralMy vote of 5 Pinmembernaeem_libra25-Dec-11 4:56 
QuestionNice work, but a suggestion: Pinmemberm00se123428-Nov-11 22:18 
AnswerRe: Nice work, but a suggestion: PinmemberAlex@UEA28-Nov-11 22:57 
Generalincorrect results PinmemberPrcy9-Dec-10 11:14 
GeneralRe: incorrect results PinmemberAlex@UEA10-Dec-10 0:23 
thanks for pointing that one out Prcy!
 
(incidentally your debug lines as written are returning the C term although your illustrative values are the A term.)
 
I have taken the approach of using decimals and using multiplication instead of Math.Pow.
This also gave me the opportunity to add a generic getSxy(int xPower, int yPower) // get sum of x^xPower * y^yPower method - something which had bugged me before
 

I want to slot this new version straight into those projects which already use this class,
so data is still entered and returned as doubles.
 
Here is the updated version
/******************************************************************************
                          Class LstSquQuadRegr
     A C#  Class for Least Squares Regression for Quadratic Curve Fitting
                          Alex Etchells  2010    
          version 2 using decimals to resolve rounding errors
******************************************************************************/
 

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Collections;
 

public  class LstSquQuadRegr
{
     /* instance variables */
    ArrayList pointArray = new ArrayList(); 
    private int numOfEntries; 
    private decimal[] pointpair;
 
    /*constructor */
    public LstSquQuadRegr()
    {
        numOfEntries = 0;
        pointpair = new decimal[2];
    }
 
    /*instance methods */
    public void AddPoints(double x, double y) 
    {
        pointpair = new decimal[2]; 
        numOfEntries +=1; 
        pointpair[0] = Convert.ToDecimal(x);
        pointpair[1] = Convert.ToDecimal(y);
        pointArray.Add(pointpair);
    }
 
/*
  y = ax^2 + bx + c
 
  notation Sjk to mean the sum of x_i^j*y_i^k. 
  2a*S40 + 2c*S20 + 2b*S30 - 2*S21 = 0
  2b*S20 + 2a*S30 + 2c*S10 - 2*S11 = 0
  2a*S20 + 2c*S00 + 2b*S10 - 2*S01 = 0
  
  solve the system of simultaneous equations using Cramer's law
 
  [ S40  S30  S20 ] [ a ]   [ S21 ]
  [ S30  S20  S10 ] [ b ] = [ S11 ]
  [ S20  S10  S00 ] [ c ]   [ S01 ] 
 
  D = [ S40  S30  S20 ] 
      [ S30  S20  S10 ] 
      [ S20  S10  S00 ]  
  
      S40(S20*S00 - S10*S10) - S30(S30*S00 - S10*S20) + S20(S30*S10 - S20*S20)
 
 Da = [ S21  S30  S20 ]
      [ S11  S20  S10 ] 
      [ S01  S10  S00 ]  
 
      S21(S20*S00 - S10*S10) - S11(S30*S00 - S10*S20) + S01(S30*S10 - S20*S20)
 
 Db = [ S40  S21  S20 ] 
      [ S30  S11  S10 ] 
      [ S20  S01  S00 ]  
 
      S40(S11*S00 - S01*S10) - S30(S21*S00 - S01*S20) + S20(S21*S10 - S11*S20)
  
 Dc = [ S40  S30  S21 ] 
      [ S30  S20  S11 ] 
      [ S20  S10  S01 ] 
  
      S40(S20*S01 - S10*S11) - S30(S30*S01 - S10*S21) + S20(S30*S11 - S20*S21)  
 
 */
 
    private decimal deciATerm()
    {
        if (numOfEntries < 3)
        {
            throw new InvalidOperationException("Insufficient pairs of co-ordinates");
        }
        //notation sjk to mean the sum of x_i^j*y_i^k. 
        decimal s40 = getSxy(4, 0);
        decimal s30 = getSxy(3, 0);
        decimal s20 = getSxy(2, 0);
        decimal s10 = getSxy(1, 0);
        decimal s00 = numOfEntries;
 
        decimal s21 = getSxy(2, 1);
        decimal s11 = getSxy(1, 1);
        decimal s01 = getSxy(0, 1);
 
        //   Da / D
        return (s21 * (s20 * s00 - s10 * s10) - s11 * (s30 * s00 - s10 * s20) + s01 * (s30 * s10 - s20 * s20))
                /
                (s40 * (s20 * s00 - s10 * s10) - s30 * (s30 * s00 - s10 * s20) + s20 * (s30 * s10 - s20 * s20));
    }
 
    public double aTerm()
    {
        return Convert.ToDouble(deciATerm());
    }
 
    private decimal deciBTerm()
    {
        if (numOfEntries < 3)
        {
            throw new InvalidOperationException("Insufficient pairs of co-ordinates");
        }
        //notation sjk to mean the sum of x_i^j*y_i^k. 
        decimal s40 = getSxy(4, 0);
        decimal s30 = getSxy(3, 0);
        decimal s20 = getSxy(2, 0);
        decimal s10 = getSxy(1, 0);
        decimal s00 = numOfEntries;
 
        decimal s21 = getSxy(2, 1);
        decimal s11 = getSxy(1, 1);
        decimal s01 = getSxy(0, 1);
 
        //   Db / D
        return (s40 * (s11 * s00 - s01 * s10) - s30 * (s21 * s00 - s01 * s20) + s20 * (s21 * s10 - s11 * s20))
        /
        (s40 * (s20 * s00 - s10 * s10) - s30 * (s30 * s00 - s10 * s20) + s20 * (s30 * s10 - s20 * s20));
    }
 
    public double bTerm()
    {
       return Convert.ToDouble(deciBTerm());
    }
 
    private decimal deciCTerm()
    {
        if (numOfEntries < 3)
        {
            throw new InvalidOperationException("Insufficient pairs of co-ordinates");
        }
        //notation sjk to mean the sum of x_i^j*y_i^k.  
        decimal s40 = getSxy(4, 0);
        decimal s30 = getSxy(3, 0);
        decimal s20 = getSxy(2, 0);
        decimal s10 = getSxy(1, 0);
        decimal s00 = numOfEntries;
 
        decimal s21 = getSxy(2, 1);
        decimal s11 = getSxy(1, 1);
        decimal s01 = getSxy(0, 1);
 
        //   Dc / D
        return(s40*(s20 * s01 - s10 * s11) - s30*(s30 * s01 - s10 * s21) + s20*(s30 * s11 - s20 * s21))
                /
                (s40 * (s20 * s00 - s10 * s10) - s30 * (s30 * s00 - s10 * s20) + s20 * (s30 * s10 - s20 * s20));
    }
 
    public double cTerm()
    {
        return Convert.ToDouble(deciCTerm());
    }
    
    public double rSquare() // get rsquare
    {
        if (numOfEntries < 3)
        {
            throw new InvalidOperationException("Insufficient pairs of co-ordinates");
        }
        return Convert.ToDouble(1 - getSSerr() / getSStot());
    }
   
 
    /*helper methods*/
    private decimal getSxy(int xPower, int yPower) // get sum of x^xPower * y^yPower
    {
        decimal Sxy = 0;
        foreach (decimal[] ppair in pointArray)
        {
            decimal xToPower = 1;
            for (int i = 0; i < xPower; i++)
            {
                xToPower = xToPower * ppair[0];
            }
            
            decimal yToPower = 1;
            for (int i = 0; i < yPower; i++)
            {
                yToPower = yToPower * ppair[1];
            }
            Sxy += xToPower * yToPower;
        }
        return Sxy;
    }
 

    private decimal getYMean()
    {
        decimal y_tot = 0;
        foreach (decimal[] ppair in pointArray)
        {
            y_tot += ppair[1]; 
        }
        return y_tot/numOfEntries;
    }
 
    private decimal getSStot()
    {
        decimal ss_tot = 0;
        foreach (decimal[] ppair in pointArray)
        {
            ss_tot += (ppair[1] - getYMean()) *  (ppair[1] - getYMean());
        }
        return ss_tot;
    }
 
    private decimal getSSerr()
    {
        decimal ss_err = 0;
        foreach (decimal[] ppair in pointArray)
        {
            ss_err += (ppair[1] - getPredictedY(ppair[0])) * (ppair[1] - getPredictedY(ppair[0]));
        }
        return ss_err;
    }
 

    private decimal getPredictedY(decimal x)
    {
        return deciATerm() * x * x + deciBTerm() * x + deciCTerm();
    }
}
 
 
cheers Alex
GeneralMe Like! PinmemberJohn Devron27-Sep-10 8:03 
Generallooks nice but high order polynomial is better Pinmemberlastguy30-Aug-10 14:38 
GeneralRe: looks nice but high order polynomial is better [modified] PinmemberAlex@UEA6-Oct-10 23:27 
GeneralGood job PinmvpPete O'Hanlon4-Mar-10 20:57 
GeneralI've been looking for one of these PinmemberDon Kackman4-Mar-10 10:33 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.

| Advertise | Privacy | Mobile
Web01 | 2.8.140916.1 | Last Updated 9 Mar 2010
Article Copyright 2010 by Alex@UEA
Everything else Copyright © CodeProject, 1999-2014
Terms of Service
Layout: fixed | fluid