F# Versus Mathematics, Part Two: Practical BLAS and LAPACK with .NET

Dan Buskirk

4.80/5 (2 votes)

Apr 24, 2011

CPOL

9 min read

17898

239

Using the standard BLAS and LAPACK libraries from F#

Download source - 4.02 MB

Introduction

The applications of linear algebra are stunning in both depth and breadth. However, three applications, not without overlap, appear over and over again.

Solution of Linear Systems
Linear Least Squares Estimation
Eigenvalue and Eigenvector Calculation

In Part Two, we will look at how the .NET developer can use BLAS and LAPACK to solve linear systems and in doing so will learn more about the organization of BLAS and LAPACK in general.

Background

In Part One, we saw that BLAS contains the fundamental workhorse routines of linear algebra. In part two, we will expand this just a little with the addition of routines to handle matrices of complex numbers. While these are common in science and engineering, economists generally deal with data that is real (or so they would like to believe.)

Our main focus for Part Two, however, is to get an idea of the organization of LAPACK. We mentioned in Part One that LAPACK sits at a higher level and calls BLAS as needed. LAPACK functions can be organized into three groups:

Drivers
Computational Routines
Auxiliary Routines

As you know, a problem like the solution of a system of linear equations requires multiple steps. The "Drivers" are responsible for accepting the necessary arguments; drivers then dole out tasks to the individual computational routines required for the steps. Auxiliary routines are provided for some common utility tasks. You could, in principal, never use a driver. Your code would simply have to call each of the necessary computational routines itself. And call them correctly, I might add.

As you gain experience with LAPACK, you will notice that some drivers are marked as "expert". These drivers make more sophisticated decisions about the computational processing required by their tasks.

Using the Code

We shall extend the fundamental .NET library organization we begin in Part One. That is, for each BLAS/LAPACK function, we declare the native function call in LapackAPI.fs, we weld the managed matrices to the native in a file sharing the name of the underlying BLAS/LAPACK function, and then we add a function for the interface to the .NET world. One in fsLAPACK for the use of F# developers, and another in clsLapack for developers using any of the CLS languages.

Adding Complex Matrices

In the lingo of BLAS/LAPACK, functions beginning with "C" operate on complex single floating-point numeric types, and functions beginning with "Z" operate on the corresponding double floating-point type. We will add ZGEMM and ZGEMV, the "Z" complex "GE" general "MM" matrix-matrix multiply, and "MV" her matrix-vector multiply sister.

This is almost trivally easy, since we will pretty much just be substituting the word "complex" for the word "double" in our F# code. Remember, however, that this is the complex type defined in the FSharp.Math namespace, and later we shall have to consider the options available to C# and VB programmers who will use our library. The .NET Framework added complex number support only with .NET 4.0. Developers using earlier versions had to construct "roll-your-own" complex solutions.

I have been unable to find a way to coax F# into accepting an CLS array with elements of type System.Numerics.Complex, though it seems like it should be feasible. I therefore fell back on the "brute force" method of copying the array data into an F# complex array. Ugly, slow, but at least it provides a working solution for the time being. It does, however, provide the opportunity to illustrate a very nice static method of the Array2D class, Array2D.iteri. The following code copies an input array with elements of type System.Numerics.Complex into an Array2D of F# complex elements.

Array2D.iteri (fun i j _ -> (A.[i,j]<- complex (A_In.[i,j]).Real 
(A_In.[i,j]).Imaginary)) A_In

The F# Array classes have an "iter" method which applies a function to each element of an array. The iteri method is similar, but as it iterates through the array provided as the second argument the indices i and j are provided as arguments for the function. Note the underscore, indicating the potential for using a third argument.

A Note About the Test Scripts

In Part One, there were only two functions to test. Now there are lots. When we load, say, fsLapack.fs into the interactive environment for testing, we must include every file on which everything in fsLapack depends, even if we only wish to test a single function. And we have to do this for every test script. In other words, if we add a new function and want to create a new test script, we would need to modify all the previous test scripts as well. For this reason, all of the #load statements and "open" statements have been placed in a separate script file called includes.fsx. This is run by the fsi batch files prior to running the particular test script of interest. This way, every time we create a new file, we load and open it in the includes files, and we are done. In Don Symes' book, he says you can invoke a script file like includes.fsx from another script file, but I have not been able to get this to work. I settle for loading two in the batch file which starts fsi.

As in Part One, each test script is matched with a batch file that starts it in fsi.

Linear Systems

For our solution of systems of linear equations, we will once again stick to the "general" case. The Powerpack unit test code provides an example, which is retained in essence in the attached code as fsLinearSystemSolve. This function uses DGETRF and DGETRS to accomplish its task. DGETRS solves the linear system after receiving the LU factors determined by DGETRF. This is valuable for illustrative purposes, but LAPACK has a driver routine, DGESV, to solve systems of linear equations with real coefficients with a single function call. The function fsLinearSystemSolve in the source code uses this technique, and is essentially the same as the example in the Powerpack code. The following code uses the DGESV driver routine:

 let fsDGESV (A: matrix) (B: vector)  = 
        // we are sticking with the simplest case, B as a vector
        // fortunately, this is also a very common case
        let ipiv = Array.zeroCreate A.NumCols
        let pA = pinM A.Transpose 
        // note: fortran will change the transposed matrix, not A
        // we will not, therefore have access to the factorization
        let pB = pinV B
        let pPivots = pinA ipiv
        let info = dgesv.DGESV_C pA.NativeArray pB.NativeArray pPivots.NativeArray 
        freeM pA; freeV pB; freeA pPivots
        (info, B)

In DGESV, info has a value greater than zero if the diagonal value of the factorization, U(i,i) is exactly zero. In this case, the matrix is exactly singular and the computation cannot proceed. You should note, however, that in the world of floating-point representations "exactly zero" doesn't happen all that often. A matrix can be mathematically singular, but round-off error prevents any diagonal element from being exactly zero. Close doesn't count for the info parameter.

let DGESV(A : FortranMatrix<double>) (B : FortranMatrix<double>) 
(ipiv : NativeArray<int>) = 
let mutable n = A.NumRows
let mutable nhrs = B.NumCols // should always be one in our work
let mutable lda = A.NumCols
let mutable info = 0
let mutable ldb = B.NumRows 
LAPACK.LapackAPI.DGESV(&&n, &&nhrs, A.Ptr, &&lda, ipiv.Ptr, B.Ptr, &&ldb, &&info);
info

let DGESV_C (A : CMatrix<double>) (B : NativeArray<double>) (ipiv : NativeArray<int>) = 
    // right now we are sticking with vectors on the RHS
    Debug.Assert(A.NumCols = B.Length);
    Debug.Assert(ipiv.Length = A.NumRows);
    DGESV A.NativeTranspose (NativeUtilities.nativeArray_as_FortranMatrix_colvec B) ipiv

Running the Code

The batch file RunTestsLinearSystems finds the solution to a simple set of three equations in three unknowns. After running the batch file, you will notice that the function returns the correct tuple (i.e. info and result vector), but that the input matrix and vector have changed. Indeed, you should observe that the input vector is the same as the function output. Almost invariably, the underlying Fortran code returns its results in the same arrays used for input, which is generally not the .NET way. If you do not want this, you must copy the input arrays. The code presented here does not copy arrays except when necessary, since this can be time- and resource-consuming for large arrays and the decision to copy should be left to the developer using the library. The best way to handle this might be a focus for refactoring at some future date.

Viewing an LU Factorization

As we have just mentioned, when using DGESV, the LU factorization is done behind the scenes and we do not have access to the results. If we wish to see the factorization, perhaps just for fun, we can.

Here we discover another small flaw in the Powerpack unit test code. DGETRF returns an info greater than one if one of the diagonal elements in the factorization is exactly zero. The Powerpack code throws an exception in this case, which is appropriate if one is using an LU factorization on a square matrix to solve a system of linear equations. There are other reasons, however, to obtain an LU factorization. It is perfectly valid to obtain LU factors for a singular matrix or a rectangular matrix. For a rectangular matrix, it is virtually certain that some diagonal element will be zero. This is not an error. I have commented out this error and returned unit, but this code should be refactored in the future. In the code presented, we will be limited to cases where the rectangular matrix has more columns than rows. We do this for an important simplification; we want to ensure that the L of LU always be square.

How to interpret the results of the LU factorization by DGETRF is not immediately evident. L and U output are returned added together in the input matrix A. Assuming more columns than rows, the diagonal elements of L will always be one and are not represented at all. So to obtain L and U, we must break A into two pieces. L is obtained by creating a square matrix, taking the elements beneath the diagonal of A and then setting the diagonal elements to one. The remaining elements are of course zero. U is read as everything from the diagonal up. Functions fsExtractL and fsExtractU have been provided in the attached code to split a result matrix. The product LU yields a matrix equivalent to the input matrix, but rows may have switched positions. The job of the pivot array is to be permit us to easily reconstruct the original matrix A. The pivot array is one-dimensional, but specifies which rows have been interchanged during the pivot operations. This information is used to construct a permutation matrix representing the row exchange.

For this purpose, we have created a helper function in fsLapack.fs. We simply start with an identity matrix, and actually perform the row operations specified in the pivot array received from Fortran. This is easy, since each row only has a single 1 in it.

let fsPermutationFromPivots (ipivot: int[]) = 
let temp = Matrix.toArray2D (Matrix.identity ipivot.Length)
let pivot_aux i j = 
   if (j = ipivot.[i]-1 && i <> ipivot.[j]-1) then 
      temp.[i,j] <- 1.0
      temp.[j,i] <- 1.0
      temp.[i,i] <- 0.0
      temp.[j,j] <- 0.0

Array2D.iteri (fun i j _ -> (pivot_aux i j)) temp 
Matrix.ofArray2D temp

The factorization can now be tested by obtaining the original matrix from PLU. Of course, matrix multiplication is defined from right to left, so in F# the multiplication would be P*(L*U).

Coming Soon to an F# Compiler Near You...

In the next installment, we will look at the calculation of eigenvalues and eigenvectors for general matrices.

History

24^th April, 2011: Initial post