Click here to Skip to main content
13,733,345 members
Click here to Skip to main content
Add your own
alternative version

Stats

9K views
234 downloads
15 bookmarked
Posted 12 Mar 2017
Licenced CPOL

Implementation of Gauss–Newton Algorithm in Java

, 13 Mar 2017
Rate this:
Please Sign up or sign in to vote.
Implement Gauss-Newton algorithm in Java to solve non-linear least squares problems; i.e. to find minimum of a function.

Introduction

Gauss-Newton algorithm is a mathematical model to solve non-linear functions. A simple non-linear function is given below:

where a1 and a2 are the unknown parameters of this function. To find these two parameters, the values of y are measured on different values of x; y can be the rate of a chemical reaction and x is the concentration of a chemical affecting the rate. The results may look like this:

x0.0380.1940.4250.6261.2532.5003.740
y0.0500.1270.0940.21220.27290.26650.3317

Gauss-Newton algorithm is used to estimate the values of a1 and a2 using the observed data above.

This sample function and the data are taken from https://en.wikipedia.org/wiki/Gauss-Newton_algorithm.

Background

Gauss-Newton algorithm in a few steps:

  1. Create an array of unknown parameters in the function:

    b = (b1, b2,...,bn)

  2. Initialise the parameters and for each data point in matrix x, calculate the predicted values (y').
  3. Calculate the residuals:

    ri = y'i - yi

  4. Find the partial differential of residuals with respect to the parameters and generate Jacobian matrix.

  5. Following an iterative process, calculate the new values for the parameters using the following equation:

    s is the iteration number, J is the Jacobian matrix, JT is the transpose of J.

For more information, please visit https://en.wikipedia.org/wiki/Gauss-Newton_algorithm.

Using the Code

In this implementation, I will be using some matrix operations that have already been implemented in another article. For more information, please take a look at this article.

  1. Given matrix x and y, the optimization starts with an initial guess of b matrix. By default, a value of 1.0 is given to all parameters in non-linear function.
  2. Calculate the residuals:
    public double[][] calculateResiduals(double[][] x, double[] y, double[] b) {
         double[][] res = new double[y.length][1];
    
         for (int i = 0; i < res.length; i++) {
             res[i][0] = findY(x[i][0], b) - y[i];
         }
         return res;
    }

    For each data point in x matrix, function findY() is called to calculate the predicted value of y. In the code, you will find this function as abstract:

    public abstract double findY(double x, double[] b);

    The user needs to implement this function before using the optimization. For example, for the function in the introduction, the implementation will be as follows:

    GaussNewton gaussNewton = new GaussNewton() {
    
        @Override
        public double findY(double x, double[] b) {
            // y = (x * a1) / (a2 + x)
            return (x * b[0]) / (b[1] + x);
        }
    };
    
  3. Calculate the Jacobian matrix which is the partial derivatives of the residuals with respect to parameters in the function:
    public double[][] jacob(double[] b, double[][] x, int numberOfObservations) {
        int numberOfVariables = b.length;
        double[][] jc = new double[numberOfObservations][numberOfVariables];
    
        for (int i = 0; i < numberOfObservations; i++) {
            for (int j = 0; j < numberOfVariables; j++) {
                jc[i][j] = derivative(x[i][0], b, j);
            }
        }
        return jc;
    }
    

    Function derivative() is called to calculate the partial derivative:

    public double derivative(double x, double[] b, int bIndex) {
        double[] bCopy = b.clone();
        bCopy[bIndex] += alpha;
        double y1 = findY(x, bCopy);
        bCopy = b.clone();
        bCopy[bIndex] -= alpha;
        double y2 = findY(x, bCopy);
        return (y1 - y2) / (2 * alpha);
    }
    

    This function gives a good approximate of the partial derivative; i.e., the change in y after a small change in the variable.

  4. Perform (J JT)-1 JT r operations:
    public double[][] transjacob(double[][] JArray, double[][] res) throws NoSquareException {
        Matrix r = new Matrix(res); // r
        Matrix J = new Matrix(JArray); // J
        Matrix JT = MatrixMathematics.transpose(J); // JT
        Matrix JTJ = MatrixMathematics.multiply(JT, J); // JT * J
        Matrix JTJ_1 = MatrixMathematics.inverse(JTJ); // (JT * J)^-1
        Matrix JTJ_1JT = MatrixMathematics.multiply(JTJ_1, JT); // (JT * J)^-1 * JT
        Matrix JTJ_1JTr = MatrixMathematics.multiply(JTJ_1JT, r); // (JT * J)^-1 * JT * r
        return JTJ_1JTr.getValues();
    }
    
  5. Using the results of step 4, the new values of the parameters are calculated:
    IntStream.range(0, values.length).forEach(j -> b2[j] = b2[j] - gamma * values[j][0]);

    b2 is the new b matrix. gamma is a fraction of the values coming from Jacobian. If the initial values for b are far from optimum, there is a possibility of convergence problem. Applying this simple fraction seems to solve this issue. The downside of applying this fraction is the number of iterations which will increase.

  6. Having the new b matrix, repeat steps 2-5 in the next iteration. All optimisation steps are given in the function below:
    public double[] optimise(double[][] x, double[] y, double[] b) throws NoSquareException {
            int maxIteration = 1000;
            double oldError = 100;
            double precision = 1e-6;
            double[] b2 = b.clone();
            double gamma = .01;
            for (int i = 0; i < maxIteration; i++) {
                double[][] res = calculateResiduals(x, y, b2);
                double error = calculateError(res);
                System.out.println("Iteration : " + i + ", Error-diff: " + 
                        (Math.abs(oldError - error)) + ", b = "+ Arrays.toString(b2));
                if (Math.abs(oldError - error) <= precision) {
                    break;
                } 
                oldError = error;
                double[][] jacobs = jacob(b2, x, y.length);
                double[][] values = transjacob(jacobs, res);
                IntStream.range(0, values.length).forEach(j -> b2[j] = b2[j] - gamma * values[j][0]);
            }
            return b2;
    
        }

Example

Function

Observations

x0.0380.1940.4250.6261.2532.5003.740
y0.0500.1270.0940.21220.27290.26650.3317

Target

Find a1 and a2 using the above data and Gauss-Newton algorithm:

@Test
public void optimiseWithInitialValueOf1() throws NoSquareException {
    double[][] x = new double[7][1];
    x[0][0] = 0.038;
    x[1][0] = 0.194;
    x[2][0] = 0.425;
    x[3][0] = 0.626;
    x[4][0] = 1.253;
    x[5][0] = 2.500;
    x[6][0] = 3.740;
    double[] y = new double[] { 0.050, 0.127, 0.094, 0.2122, 0.2729, 0.2665, 0.3317 };
    GaussNewton gaussNewton = new GaussNewton() {

        @Override
        public double findY(double x, double[] b) {
            // y = (x * a1) / (a2 + x)
            return (x * b[0]) / (b[1] + x);
        }
    };
    double[] b = gaussNewton.optimise(x, y, 2);
    Assert.assertArrayEquals(new double[]{0.36, 0.56}, b, 0.01);
}

In the above test, the initial values for the parameters are the default value (1.0); however, with starting values of 100, we will get the same values; see the attached codes. I have also tested the algorithm on a large set of randomly generated data and it works with good time and memory efficiency; see the test codes attached.

History

This is the first version of this article.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Share

About the Author

Ata Amini
Software Developer (Senior) Private
United Kingdom United Kingdom
I have a PhD in computational chemistry from Newcastle University. I worked for Imperial College London as research scientist for 6.5 years followed by 7 years in banking in the City of London as senior software developer. Currently I do mathematical modelling and software development for a private company and spend some time in research and development in the University of Newcastle.

You may also be interested in...

Comments and Discussions

 
QuestionTypo (typing mistake) Pin
Quinzio29-Mar-17 21:30
memberQuinzio29-Mar-17 21:30 
QuestionWould you formulate it better in the abstract? Pin
Sergey Alexandrovich Kryukov13-Mar-17 11:14
memberSergey Alexandrovich Kryukov13-Mar-17 11:14 
AnswerRe: Would you formulate it better in the abstract? Pin
Ata Amini13-Mar-17 13:16
memberAta Amini13-Mar-17 13:16 
GeneralRe: Would you formulate it better in the abstract? Pin
Sergey Alexandrovich Kryukov13-Mar-17 14:48
memberSergey Alexandrovich Kryukov13-Mar-17 14:48 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Praise Praise    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.

Permalink | Advertise | Privacy | Cookies | Terms of Use | Mobile
Web01-2016 | 2.8.180920.1 | Last Updated 13 Mar 2017
Article Copyright 2017 by Ata Amini
Everything else Copyright © CodeProject, 1999-2018
Layout: fixed | fluid