Efficient Matrix Programming in C#

Emanuele Ruffaldi

4.89/5 (44 votes)

Dec 12, 2002

4 min read

459930

10949

Fast matrix expressions evaluation, based on dynamic code generation and partial evaluation

Introduction

This article shows a technique to write clear and efficient matrix math code in C# language. The matrix operations are expressed using operator redefinition, but the code is generated dynamically, following the principle of partial evaluation. This gives a solution efficient and elegant as the one obtained by the Expression Templates in C++. This article is the prosecution of my studies in dynamic code generation and operator overloading with C# (here).

The problem

An object oriented language as C#, lets you define very elegant matrix classes, that encapsulate element access, and with the mechanism of operator overloading you can write easily:

Matrix m = new Matrix(10,5);
Matrix n = new Matrix(10,5);
Matrix r;
double d = 2;	
	
r = m + n;
r = n*2;

The classic implementation of these classes is to create a temporary Matrix object for each sub expression, and the problem is much more evident in the following code where we generate two temporary objects.

Matrix a,b,c,r; // with m and n initialized

r = a+3*(b+c);

This approach introduces a big overhead in object based matrix code, and usually the math programmers use arrays and indices losing all the object orientation. This kind of code can be called C#Tran (from C#/Fortran) like the JavaTran coined by Veldhuizen.

double [] a, b, c, r;

for(int i = 0; i < a.length; i++)
    r[i] = a[i]+3*(b[i]+c[i]);

Solutions

A possible solution to this problem is the defered evaluation: when writing b+c we generate an object MatrixAdd that incapuslates the operation. It has overloaded operators too and we can write 3*(b+c) generating a MatrixAddMul object that evaluates the operation efficiently. This time the overhead is relative to all possible combinations of operations and the fact that if our specific expression is not covered by a deferred operation, we revert to the inefficient temporary based code.

My proposal is to use a generative programming approach: generate specialized code for specialized operations. I use the dynamic code capabilities of the .NET framework to generate on the fly the for-based code directly in IL. A compiled expression can be evaluated one time or stored for reuse into a delegate of type Evaluator.

These are examples of usage:

Vector a = new Vector(3);
Vector b = new Vector(3);
Vector r = new Vector(3);
	
(a+b*3).AssignTo(r)();

The IL code generated is equivalent to the following. You should notice that the n size is known at the generation moment, so the for loop has a constant value: this is the partial evaluation, because the code is specialized to specific values and the JIT can generate optimized code.

int n = a.rows();
for(int i = 0; i < <<n>>; i++)
{
    r[i] = a[i]+b[i]*3;
}

This is another example with Matrix objects:

Matrix a, b, c, r;
	
// evaluate directly
(a + Expr.Cos(b*c*3)).AssignTo(r)();

// store and reuse the compiled expression ...
Evaluator ev = (a + Expr.Cos(b*c*3)).AssignTo(r);
	
// and then evaluate
ev();

This is the equivalent code:

int n = a.rows();
int m = a.cols();

for(int i = 0; i < <<n>>; i++)
    for (int j = 0; j < <<m>>; j++)
    {
        double d = 0;
        for(int k = 0; k < <<q>>; k++)
            d += b[i,k]*q[k,j];
        r[i,j] = a[i,j]+cos(d*3);
    }

The implementation

Internally each expression is stored as a tree of objects derived from the class Expr, and assigned to an object derived from LeftExpr. We can evaluate the expression slowly, or generate dynamically the IL code, that evaluates the expression. We consider operations with vectors and two dimensional matrices but the system can be extended to more dimensions. Before generating the code we check the sizes of the matrices involved in the operations, throwing the exception SizeMismatchException when they are not correct.

An expression of class Expr has the following virtual methods:

// evaluates the expression against the two indices
float Eval(int i, int j)	
// compile the expression using the context
void  Compile(ILGenerator g, CompilerContext cc)
// a property to find the size of the expression
OpSize Size { get }

The LeftExpr is a specialized Expr that handles assignments:

// assign a value v to the Expression
void Assign(int i,  int j, float v);
// compile the assignment (two phase compilation)
void CompileAssign(ILGenerator g, CompilerContext cc, bool post);

The hierarchy of the classes that I've defined is shown in the following image:

The class generated dynamically is more complex than the one described in the previous article, this time we have to store all the array objects, the parameter values, and the indices i, j. To analyze the generated code, set the SafeMode variable in the Compiler class before generating the first Expression, and then, at the end, generate the assembly with Compiler.Save().

SubMatrix Access

One of the features obtained with this system is the fast access to submatrices. You can evaluate expression relative to parts of a matrix, like rows, columns or the main diagonal in case of a square matrix. The access to the different parts of the matrix is efficient because of the dynamic compilation. I use also an easy to read notation (I think), based upon the Tag Class approach.

The code is now much clearer than the standard array manipulation. The following code shows some submatrix accesses and their MATLAB equivalent (remember that MATLAB uses 1-based indexing).

Matrix m;

// a row, like MATLAB m(3,:)
m[2][Matrix.All]
	
// a range of rows like m(3:4, :)
m[Matrix.Range(2,3)][Matrix.All]
	
// a column, like m(:,5)
m[Matrix.All][4]
	
// columns, like m(:,3:6)
m[Matrix.Range(2,5)[Matrix.All]
	
// a submatrix m(2:3,4:5)
m[Matrix.Range(1,2)][Matrix.Range(3,4)]
	
// a single element m(3,4)
m[2][3]
	
// the third element taking the matrix as a row vector
// there is no MATLAB equivalent because MATLAB 
//uses column ordered matrices
m[2]
	
// some elements taking the matrix as a row vector
m[Matrix.Range(2,3)]
	
// the diagonal
m.Diagonal()

These operations are based on MatrixRowRef structures that stores the first array access and generates a SubMatrix object. It's a LeftExpr derived class so can be used in the assignments.

Vector v = new Vector(3);
Matrix m = new Matrix(3,5);
(v + 22).AssignTo(m[Matrix.All][2])();

Speed Considerations

The code generated by this system is as efficient as the for based code giving high speed math evaluation, but there is a little overhead during the generation phase, so this approach is effective when the expression is evaluated many times. Specifically the first expression dynamically evaluated is a little slower than the others because of the creation of the dynamic Assembly.

Conclusion

The dynamic code generation of .NET, combined with the C# custom operators, let us build elegant and efficient math expressions, that let use C# as a Math Language. Citing Jim Blinn, "we have to put as much intelligence in the class libraries as possible, to make things as easy as possible for users of those classes". The final user of this code should just know that it's fast and easy to use.

References

Expression Templates, Todd Veldhuizen, 1994 (the original article that uses C++ templates)
Expression Templates in Java, Todd Veldhuizen (a possibile solution for Java that uses a full Java Compiler)
Optimizing C++ Vector Expressions, Jim Blinn, 2000, IEEE Cg (a good explaination by the great Blinn)