12,550,545 members (45,689 online)
alternative version

60K views
42 bookmarked
Posted

# Least Squares Regression for Quadratic Curve Fitting

, 9 Mar 2010 CPOL
 Rate this:
A C# class for Least Squares Regression for Quadratic Curve Fitting.

## Introduction

A recent software project had a requirement to derive the equation of a quadratic curve from a series of data points. That is to say, to determine a, b, and c, where y = ax2 + bx + c. Having determined a, b, and c, I would also need a value for R-squared (the coefficient of determination).

A quick search of Google failed to bring up a suitable C# class; quite possibly, a more thorough search would have done so.

I had a vague recollection of something called 'Least Squares Regression', so back to Google I went.

## References

Least Squares Regression:

R square:

Cramer's rule:

## Using the class

Declare an instance:

`LstSquQuadRegr solvr = new LstSquQuadRegr();`

Pass it some point pairs (at least 3):

```solvr.AddPoints(x1, y1);

Get the values:

```double the_a_term = solvr.aTerm();
double the_b_term = solvr.bTerm();
double the_c_term = solvr.cTerm();
double the_rSquare = solvr.rSquare();```

## The Theory

`y = ax^2 + bx + c`

We have a series of points (x1,y1), (x2,y2) ... (xn,yn).

`for i = 1 to n`

We want the values of a, b, and c that minimise the sum of squares of the deviations of yi from a*xi^2 + bxi + c. Such values will give the best-fitting quadratic equation.

Let the sum of the squares of the deviations be:

```               n
F(a,b,c) = SUM (a*xi^2 + bxi + c - yi)^2.
i=1

dF/da = SUM 2*(a*xi^2+b*xi+c-yi)*xi^2 = 0,
dF/db = SUM 2*(a*xi^2+b*xi+c-yi)*xi = 0,
dF/dc = SUM 2*(a*xi^2+b*xi+c-yi) = 0.```

(Here, all sums range over i = 1, 2, ..., n.) Dividing by 2 and rearranging gives these three simultaneous linear equations containing the three unknowns a, b, and c:

```(SUM xi^4)*a + (SUM xi^3)*b + (SUM xi^2)*c = SUM xi^2*yi,
(SUM xi^3)*a + (SUM xi^2)*b +   (SUM xi)*c = SUM xi*yi,
(SUM xi^2)*a +   (SUM xi)*b +    (SUM 1)*c = SUM yi.```

Using notation Sjk to mean the sum of x_i^j*y_i^k:

```a*S40 + b*S30 + c*S20 = S21
a*S30 + b*S20 + c*S10 = S11
a*S20 + b*S10 + c*S00 = S01```

Solve the simultaneous equations using Cramer's law:

```  [ S40  S30  S20 ] [ a ]   [ S21 ]
[ S30  S20  S10 ] [ b ] = [ S11 ]
[ S20  S10  S00 ] [ c ]   [ S01 ]

D = [ S40  S30  S20 ]
[ S30  S20  S10 ]
[ S20  S10  S00 ]

= S40(S20*S00 - S10*S10) - S30(S30*S00 - S10*S20) + S20(S30*S10 - S20*S20)

Da = [ S21  S30  S20 ]
[ S11  S20  S10 ]
[ S01  S10  S00 ]

= S21(S20*S00 - S10*S10) - S11(S30*S00 - S10*S20) + S01(S30*S10 - S20*S20)

Db = [ S40  S21  S20 ]
[ S30  S11  S10 ]
[ S20  S01  S00 ]

= S40(S11*S00 - S01*S10) - S30(S21*S00 - S01*S20) + S20(S21*S10 - S11*S20)

Dc = [ S40  S30  S21 ]
[ S30  S20  S11 ]
[ S20  S10  S01 ]

= S40(S20*S01 - S10*S11) - S30(S30*S01 - S10*S21) + S20(S30*S11 - S20*S21)

a = Da/D
b = Db/D
c = Dc/D```

### R square

R2 = 1 - (residual sum of squares / total sum of squares).

```                        n
total sum of squares = SUM (yi - y_mean)^2.
i=1```

This is the sum of the squares of the differences between the measured y values and the mean y value.

```                           n
residual sum of squares = SUM (yi - yi_predicted)^2.
i=1
```

This is the sum of the squares of the difference between the measured y values and the values of y predicted by the equation.

## The Code

A bunch of helper methods calculate all the various sums of squares. When calculating the values of a, b, and c, I used the sjk notation above as I found it easier to keep track.

```/******************************************************************************
A C#  Class for Least Squares Regression for Quadratic Curve Fitting
Alex Etchells  2010
******************************************************************************/

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Collections;

{
/* instance variables */
ArrayList pointArray = new ArrayList();
private int numOfEntries;
private double[] pointpair;

/*constructor */
{
numOfEntries = 0;
pointpair = new double[2];
}

/*instance methods */
/// <summary>
/// add point pairs
/// </summary>
/// <param name="x">x value</param>
/// <param name="y">y value</param>
public void AddPoints(double x, double y)
{
pointpair = new double[2];
numOfEntries +=1;
pointpair[0] = x;
pointpair[1] = y;
}

/// <summary>
/// returns the a term of the equation ax^2 + bx + c
/// </summary>
/// <returns>a term</returns>
public double aTerm()
{
if (numOfEntries < 3)
{
throw new InvalidOperationException(
"Insufficient pairs of co-ordinates");
}
//notation sjk to mean the sum of x_i^j*y_i^k.
double s40 = getSx4(); //sum of x^4
double s30 = getSx3(); //sum of x^3
double s20 = getSx2(); //sum of x^2
double s10 = getSx();  //sum of x
double s00 = numOfEntries;
//sum of x^0 * y^0  ie 1 * number of entries

double s21 = getSx2y(); //sum of x^2*y
double s11 = getSxy();  //sum of x*y
double s01 = getSy();   //sum of y

//a = Da/D
return (s21*(s20 * s00 - s10 * s10) -
s11*(s30 * s00 - s10 * s20) +
s01*(s30 * s10 - s20 * s20))
/
(s40*(s20 * s00 - s10 * s10) -
s30*(s30 * s00 - s10 * s20) +
s20*(s30 * s10 - s20 * s20));
}

/// <summary>
/// returns the b term of the equation ax^2 + bx + c
/// </summary>
/// <returns>b term</returns>
public double bTerm()
{
if (numOfEntries < 3)
{
throw new InvalidOperationException(
"Insufficient pairs of co-ordinates");
}
//notation sjk to mean the sum of x_i^j*y_i^k.
double s40 = getSx4(); //sum of x^4
double s30 = getSx3(); //sum of x^3
double s20 = getSx2(); //sum of x^2
double s10 = getSx();  //sum of x
double s00 = numOfEntries;
//sum of x^0 * y^0  ie 1 * number of entries

double s21 = getSx2y(); //sum of x^2*y
double s11 = getSxy();  //sum of x*y
double s01 = getSy();   //sum of y

//b = Db/D
return (s40*(s11 * s00 - s01 * s10) -
s30*(s21 * s00 - s01 * s20) +
s20*(s21 * s10 - s11 * s20))
/
(s40 * (s20 * s00 - s10 * s10) -
s30 * (s30 * s00 - s10 * s20) +
s20 * (s30 * s10 - s20 * s20));
}

/// <summary>
/// returns the c term of the equation ax^2 + bx + c
/// </summary>
/// <returns>c term</returns>
public double cTerm()
{
if (numOfEntries < 3)
{
throw new InvalidOperationException(
"Insufficient pairs of co-ordinates");
}
//notation sjk to mean the sum of x_i^j*y_i^k.
double s40 = getSx4(); //sum of x^4
double s30 = getSx3(); //sum of x^3
double s20 = getSx2(); //sum of x^2
double s10 = getSx();  //sum of x
double s00 = numOfEntries;
//sum of x^0 * y^0  ie 1 * number of entries

double s21 = getSx2y(); //sum of x^2*y
double s11 = getSxy();  //sum of x*y
double s01 = getSy();   //sum of y

//c = Dc/D
return (s40*(s20 * s01 - s10 * s11) -
s30*(s30 * s01 - s10 * s21) +
s20*(s30 * s11 - s20 * s21))
/
(s40 * (s20 * s00 - s10 * s10) -
s30 * (s30 * s00 - s10 * s20) +
s20 * (s30 * s10 - s20 * s20));
}

public double rSquare() // get r-squared
{
if (numOfEntries < 3)
{
throw new InvalidOperationException(
"Insufficient pairs of co-ordinates");
}
// 1 - (residual sum of squares / total sum of squares)
return 1 - getSSerr() / getSStot();
}

/*helper methods*/
private double getSx() // get sum of x
{
double Sx = 0;
foreach (double[] ppair in pointArray)
{
Sx += ppair[0];
}
return Sx;
}

private double getSy() // get sum of y
{
double Sy = 0;
foreach (double[] ppair in pointArray)
{
Sy += ppair[1];
}
return Sy;
}

private double getSx2() // get sum of x^2
{
double Sx2 = 0;
foreach (double[] ppair in pointArray)
{
Sx2 += Math.Pow(ppair[0], 2); // sum of x^2
}
return Sx2;
}

private double getSx3() // get sum of x^3
{
double Sx3 = 0;
foreach (double[] ppair in pointArray)
{
Sx3 += Math.Pow(ppair[0], 3); // sum of x^3
}
return Sx3;
}

private double getSx4() // get sum of x^4
{
double Sx4 = 0;
foreach (double[] ppair in pointArray)
{
Sx4 += Math.Pow(ppair[0], 4); // sum of x^4
}
return Sx4;
}

private double getSxy() // get sum of x*y
{
double Sxy = 0;
foreach (double[] ppair in pointArray)
{
Sxy += ppair[0] * ppair[1]; // sum of x*y
}
return Sxy;
}

private double getSx2y() // get sum of x^2*y
{
double Sx2y = 0;
foreach (double[] ppair in pointArray)
{
Sx2y += Math.Pow(ppair[0], 2) * ppair[1]; // sum of x^2*y
}
return Sx2y;
}

private double getYMean() // mean value of y
{
double y_tot = 0;
foreach (double[] ppair in pointArray)
{
y_tot += ppair[1];
}
return y_tot/numOfEntries;
}

private double getSStot() // total sum of squares
{
//the sum of the squares of the differences between
//the measured y values and the mean y value
double ss_tot = 0;
foreach (double[] ppair in pointArray)
{
ss_tot += Math.Pow(ppair[1] - getYMean(), 2);
}
return ss_tot;
}

private double getSSerr() // residual sum of squares
{
//the sum of the squares of te difference between
//the measured y values and the values of y predicted by the equation
double ss_err = 0;
foreach (double[] ppair in pointArray)
{
ss_err += Math.Pow(ppair[1] - getPredictedY(ppair[0]), 2);
}
return ss_err;
}

private double getPredictedY(double x)
{
//returns value of y predicted by the equation for a given value of x
return aTerm() * Math.Pow(x, 2) + bTerm() * x + cTerm();
}
}```

## Points of Interest

That's it really - it seems to agree pretty closely with the values given by Excel. I hope it saves someone else the bother of working it out.

## About the Author

 Software Developer University Of East Anglia, UK United Kingdom
No Biography provided

## Comments and Discussions

 First Prev Next
 Incorrect result of lat/log pairs with 8 decimal point precision Member 1094985924-Jul-14 1:35 Member 10949859 24-Jul-14 1:35
 Re: Incorrect result of lat/log pairs with 8 decimal point precision Alex@UEA24-Jul-14 1:44 Alex@UEA 24-Jul-14 1:44
 Re: Incorrect result of lat/log pairs with 8 decimal point precision Member 1094985925-Jul-14 2:14 Member 10949859 25-Jul-14 2:14
 Thank you lolaparra806-Feb-14 23:17 lolaparra80 6-Feb-14 23:17
 Single C# static function version Juraj Lutisan5-May-13 23:43 Juraj Lutisan 5-May-13 23:43
 Re: Single C# static function version Alex@UEA6-May-13 21:32 Alex@UEA 6-May-13 21:32
 Re: Single C# static function version Juraj Lutisan6-May-13 23:52 Juraj Lutisan 6-May-13 23:52
 Re: Single C# static function version Rado_24-Sep-15 7:21 Rado_ 24-Sep-15 7:21
 Re: Single C# static function version hankkauffmann9-Apr-16 11:04 hankkauffmann 9-Apr-16 11:04
 C++ version Robin Imrie8-Nov-12 0:33 Robin Imrie 8-Nov-12 0:33
 Re: C++ version PIntag24-Sep-15 7:46 PIntag 24-Sep-15 7:46
 Re: C++ version Robin Imrie23-Nov-15 22:25 Robin Imrie 23-Nov-15 22:25
 My vote of 5 NicoD22-May-12 0:18 NicoD 22-May-12 0:18
 My vote of 5 naeem_libra25-Dec-11 4:56 naeem_libra 25-Dec-11 4:56
 Nice work, but a suggestion: m00se123428-Nov-11 22:18 m00se1234 28-Nov-11 22:18
 Re: Nice work, but a suggestion: Alex@UEA28-Nov-11 22:57 Alex@UEA 28-Nov-11 22:57
 incorrect results Prcy9-Dec-10 11:14 Prcy 9-Dec-10 11:14
 Re: incorrect results Alex@UEA10-Dec-10 0:23 Alex@UEA 10-Dec-10 0:23
 Me Like! John Devron27-Sep-10 8:03 John Devron 27-Sep-10 8:03
 looks nice but high order polynomial is better lastguy30-Aug-10 14:38 lastguy 30-Aug-10 14:38
 Re: looks nice but high order polynomial is better [modified] Alex@UEA6-Oct-10 23:27 Alex@UEA 6-Oct-10 23:27
 lastguy wrote:looks nice but high order polynomial is better well that's me told! lastguy wrote:this one is 2nd order well so it is! clue's in the title mate. whilst I'm here - I'm still wondering why I wrote ```private double getSx2() // get sum of x^2 private double getSx3() // get sum of x^3 ``` etc, etc rather than ```private double getSx(int power) // get sum of x^power ``` or better still ```private double getSxy(int xPower, int yPower) // get sum of x^xPower * y^yPower ``` still as my late father always said: 'if it aint broke, don't fix it' modified on Monday, October 11, 2010 4:49 AM
 Good job Pete O'Hanlon4-Mar-10 20:57 Pete O'Hanlon 4-Mar-10 20:57
 I've been looking for one of these Don Kackman4-Mar-10 10:33 Don Kackman 4-Mar-10 10:33
 Last Visit: 31-Dec-99 18:00     Last Update: 23-Oct-16 11:15 Refresh 1

General    News    Suggestion    Question    Bug    Answer    Joke    Praise    Rant    Admin

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.