Yet Another Math Parser (YAMP)

Florian Rappl, 20 Sep 2012 (CPOL)
Constructing a fast math parser using Reflection to do numerics like Matlab.
This is an old version of the currently published article.



In some projects we might need a proper math parser, either for simple math (a little calculator) or for something more complex. CodeProject offers a variety of math parsers, ranging from ports of C/C++ ones to ones written purely in C#. This article tries to create a powerful parser in pure C#, using the features that come with C# and the .NET Framework. The parser is perfectly capable of running on other platforms as well, as it has been written on Linux.

The parser itself is Open Source. The full source code of YAMP is available on GitHub. A NuGet package will also be available soon.

Personal background

Years ago I wrote a math parser using the C# compiler and some inheritance. The approach was to provide a base class that offered all possible methods (like sin(), exp(), ...) and to inherit from it. A string was then built that represented valid C# code with the custom expression in its middle, and this string had to be compiled. Of course a pre-parser was necessary to assign some data types. On the other hand, operator overloading and other C# features could then be used.

The problem with this solution is that the C# compiler is rather heavy for a single document of code. Also, the resulting assembly had to be loaded first. So after our pre-parsing process we had to deal with the C# compiler and the assembly, which means that reflection was necessary as well. The minimum time for any expression was therefore O(10^2) ms. This is far too long and does not provide a smooth experience.

There are several projects on CodeProject that aim to provide a powerful parser. All of those projects are faster than my old solution, and they are also more robust. However, they are not as complete as my project was, so I now try to follow up on this very old project with a new approach. This time I built a parser from scratch, which uses (and abuses) reflection and therefore provides an easy-to-extend code base that contains everything from operators and expressions to constants and functions.

What YAMP can do

Before we dig too deep into YAMP, let's have a look at what YAMP offers us. First a brief look at some features:

  • Assignment operators (=, +=, -=, ...)
  • Complex arithmetic using i as imaginary constant.
  • Matrix algebra including products, divisions.
  • Trigonometric functions (sin, cos, sinh, cosh) and their inverse (arcsin, arccos, arcosh, arsinh).
  • More complicated functions like Gamma and others.
  • Factorial, transpose and left divide operator.
  • Using indices to get and set fields.
  • Complex roots and powers.

The core element of YAMP is the ScalarValue class, which is an implementation of the abstract Value class. This class provides everything that is required to do numerics with a complex type. Implementations of the basic trigonometric functions with complex arguments have been included as well.

The MatrixValue class is basically a (two dimensional) list of ScalarValue instances. We might think that it should be possible to make this list n dimensional; however, that would introduce some complications, like the need to specify a way to set a value in the n-th dimension. At the moment we will avoid this issue.

Now that we have a brief overview of the main features of YAMP, let's look at some of the expressions that can be parsed by YAMP:


This is a simple expression, but it already contains elements that are not trivial. In order to parse this expression successfully, we need to be careful about operator levels (the multiplication operator * has a higher level than the subtraction operator -, and the same goes for the division operator /). We also have the minus sign used as a unary operator, i.e. there is no leading zero here.


This is often parsed wrong. Asking Matlab for this expression we get 256. That seemed kind of low and wrong. Asking YAMP first resulted in this as well, until we included the rule that even operators of the same level will be treated separately, from right to left. This rule does not make a difference in any expression, except with the power operator. Now we get the same result as Google shows us: 65536.
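The expression itself is not reproduced above. The two values quoted are consistent with a fourfold power tower of 2, so, assuming that input, the two evaluation orders work out as follows:

```latex
\begin{aligned}
\text{left to right:}\quad & ((2^2)^2)^2 = (4^2)^2 = 16^2 = 256 \\
\text{right to left:}\quad & 2^{(2^{(2^2)})} = 2^{(2^4)} = 2^{16} = 65536
\end{aligned}
```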


We could use brackets, but in this case we do not need them. This is a simple matrix assignment to the symbol x. Our own assigned symbols are case sensitive, while included functions and constants are not. Therefore we could assign our own pi and still access the original constant by using a different upper- / lowercase notation like Pi.


Again, some constants are built in like phi, e and pi. Therefore this expression is parsed successfully. Using YAMP as an external library it is easily possible to add our own constants.


This seems odd, but it will make sense. First of all we have a scalar (2) powered by a matrix (the one we assigned before). This is not defined, therefore YAMP will perform the power operation on every value in the matrix. The resulting matrix will have the same dimensions as the exponent - it will be a 2 times 3 matrix. This matrix then performs the subtraction with 5 (a scalar) on the right side. This is again not defined, therefore YAMP performs the operation on every cell again. In the end we have a 2 times 3 matrix.
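The cell-by-cell behaviour described here can be sketched outside of YAMP with plain arrays. This is only an illustration, not YAMP's implementation: the class ElementwiseSketch, its methods and the use of double[,] instead of MatrixValue are assumptions made for this example.

```csharp
using System;

static class ElementwiseSketch
{
    // Applies op(scalar, cell) to every cell; the result has the dimensions of the matrix.
    public static double[,] Broadcast(double scalar, double[,] matrix, Func<double, double, double> op)
    {
        var rows = matrix.GetLength(0);
        var cols = matrix.GetLength(1);
        var result = new double[rows, cols];

        for (var r = 0; r < rows; r++)
            for (var c = 0; c < cols; c++)
                result[r, c] = op(scalar, matrix[r, c]);

        return result;
    }

    // Mirrors "2 to the power of x, minus 5" for a real-valued matrix x.
    public static double[,] PowerMinusFive(double[,] x)
    {
        var powered = Broadcast(2.0, x, Math.Pow);            // 2^cell for every cell
        return Broadcast(5.0, powered, (s, cell) => cell - s); // cell - 5 for every cell
    }
}
```

Calling PowerMinusFive with a 2 times 3 array returns a 2 times 3 array again, which is the dimension rule stated in the text.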


We use the range operator : to specify all available indices. The row index is set to 1, the first row (we use rows before columns, as Matlab and Fortran do, and indices are 1 based). Therefore the result will be a matrix with dimension 1 times 3 (a transposed three vector). The entries will be 2, 3, 4.


Here we reassign every value of the matrix to a new value - given in the vector 8, 9, 10, 11, 12, 13. If we just use one index it will be a combined one. Therefore a 2 times 3 matrix might have the pair 2,3 as last entry (and the pair 1,1 as first entry), but might also have the index 6 as last entry (and the index 1 as first entry).


The adjoint operator either conjugates scalars (a reflection across the real axis, i.e. changing the sign of the imaginary part) or transposes the matrix along with conjugating its entries. Therefore xt now stores a 3 times 2 matrix. If we just want to transpose (without conjugating) we can use the .' operator.

z=sin(pi * (a = eye(4)))

Here we assign the value of eye(4) to the symbol a. Afterwards the value is multiplied by pi and then evaluated by the sin() function. The outcome is saved in the symbol z. The eye() function generates an n times n identity matrix. If we apply trigonometric functions to matrices we end up (as with the power function and others) with a matrix in which each value is the outcome of the function applied to the former value.


Here two things are happening. First we create a new symbol (called whatever) and then we assign the value 8+3i to the cell in the 10th row and 10th column. So we are able to use indices even on non-existing symbols. We can also expand matrices by assigning higher indices. This is only possible when setting values; we are not able to read indices beyond the current dimensions. In that case an exception will be thrown.


Here we will get information about all even rows (2:2:end means: start at index 2 and go to the end (a short macro) with a step of 2) with the first half of their columns. The displayed matrix will be a 5 times 5 one with rows 2,4,6,8,10 and columns 1,2,3,4,5.
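The expansion of a range such as 2:2:end into concrete indices can be sketched as below. This is a minimal illustration and not YAMP's code: the RangeSketch class and its Expand method are hypothetical, and the end macro is assumed to have been resolved to the number of rows (10 in this example) before the call.

```csharp
using System;
using System.Collections.Generic;

static class RangeSketch
{
    // Expands start:step:end into a list of 1-based indices; both bounds are inclusive.
    public static List<int> Expand(int start, int step, int end)
    {
        if (step <= 0)
            throw new ArgumentOutOfRangeException("step", "Only positive steps are handled in this sketch.");

        var indices = new List<int>();

        for (var i = start; i <= end; i += step)
            indices.Add(i);

        return indices;
    }
}
```

Expand(2, 2, 10) yields the rows 2, 4, 6, 8, 10 and Expand(1, 1, 5) yields the columns 1 to 5, which matches the 5 times 5 selection described above.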

YAMP does take matrices seriously. Multiplying matrices is done with respect to the proper dimensions. Let's have a look at a sample console application:

Output in the sample console application

The index operator allows us to select (multiple) columns and / or rows from matrices. The following output shows the selection of the inner 3x3 matrix (from the 5x5 matrix).

Output in the sample console application

This concludes our short look at YAMP. Of course YAMP is also able to perform queries like i^i or (2+3i)^(7-i) or 17! correctly.

How YAMP does it

YAMP contains a few important classes:

  • ParseTree
  • Value
  • AbstractExpression
  • Operator
  • IFunction

The most important datatype for accessing YAMP from outside the library is the Parser class. We'll introduce this concept later on.

Right now we want to have a short look at the class diagram for the whole library. First of all the class diagram in principle:

Short class diagram of YAMP

The Parser generates a ParseTree, that has full access to the singleton class Tokens. The singleton is used to find operators, functions, expressions and constants, as well as resolving symbols.

YAMP relies heavily on reflection. The first time YAMP is used might be a little bit slow (compared to further usages). Therefore calling the static Load() method of the Parser should be done before any measurements or user input. This guarantees the shortest possible execution time for the given expression.

Any expression will always consist of a number of sub-expressions and operators. Operators can either be binary (like +, -, ...) or unary (like !, [], ', ...). The binary ones need two operands (left and right), while unary ones just need one. An expression might be one of the following:

  • A bracket expression, which contains a full ParseTree again.
  • A number - it can either be positive, negative, real or complex.
  • An absolute value, which is something like a bracket expression that calls the abs() function after being evaluated.
  • A symbolic value - either to be resolved (could be a constant or something that has been stored previously) or to be set.
  • A function expression can be viewed as a symbolic value that has a bracket expression directly attached (without an operator between the expression).

This results in the following class diagram:

Expression class diagram

Let's have a look at the class diagram for the operators next:

Operator class diagram

Here we see that the AssignmentOperator is also nothing more than another BinaryOperator. For simplicity and code-reuse another abstract class called AssignmentPrefixOperator has been added as an additional layer between any combined assignment operator (like += or -=) and the original assignment operator. Overall the following assignment operators have been added: +=, -=, *=, /=, \=, ^=.

The difference between right division (/) and left division (\) is that, while right division means A = B / C = B * C^-1, left division means A = B \ C = B^-1 * C. The difference between both can be crucial for matrices, where it matters whether we multiply from the left or from the right. The two operands are always of type Value. This type is an abstract class, which forms the basis for the following derived datatypes:

  • ScalarValue for numbers (can be imaginary)
  • MatrixValue for matrices (can be only vectors), which consist of ScalarValue entries
  • StringValue for strings
  • ArgumentsValue for a list of arguments

The functions are also quite an important part of YAMP. The class diagram for the functions looks like this:

Function class diagram

Here we create a standard type StandardFunction. This type has to implement the interface IFunction, as required. The only thing that will be called is the Perform() method. The StandardFunction already contains a framework that works only with numeric types like MatrixValue and ScalarValue. If the Perform() method is not overridden, every number of a given matrix will be replaced by the result of the function that is being called.

Another useful type here is the ArgumentFunction type. This one can be used to create a fully operational overload machine. Once we derive from this class we only have to create one or more functions that are called Function() with return type Value. Reflection will find out how many arguments are required for each function and call the right one (or return an error to the user) for the number of given arguments. Let's consider the RandFunction:

class RandFunction : ArgumentFunction
{
    static readonly Random ran = new Random();

    public Value Function()
    {
        return new ScalarValue(ran.NextDouble());
    }

    public Value Function(ScalarValue dim)
    {
        var k = (int)dim.Value;
        if (k <= 1)
            return new ScalarValue(ran.NextDouble());
        var m = new MatrixValue(k, k);
        for (var i = 1; i <= k; i++)
            for (var j = 1; j <= k; j++)
                m[j, i] = new ScalarValue(ran.NextDouble());
        return m;
    }

    public Value Function(ScalarValue rows, ScalarValue cols)
    {
        var k = (int)rows.Value;
        var l = (int)cols.Value;
        var m = new MatrixValue(k, l);
        for (var i = 1; i <= l; i++)
            for (var j = 1; j <= k; j++)
                m[j, i] = new ScalarValue(ran.NextDouble());
        return m;
    }
}

With this code YAMP will return results for zero, one or two arguments. If the user provides more than two arguments, YAMP will throw an exception with the message that no overload has been found for the number of arguments the user provided.
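How such a lookup by argument count can work is sketched below. This is not the actual ArgumentFunction code: OverloadSketch, Call() and AddSketch are hypothetical names, and plain object and double stand in for YAMP's Value types. Only standard reflection calls (GetMethods, GetParameters, Invoke) are used.

```csharp
using System;
using System.Linq;
using System.Reflection;

abstract class OverloadSketch
{
    // Picks the public instance method named "Function" whose parameter count
    // equals the number of supplied arguments and invokes it.
    public object Call(params object[] args)
    {
        var method = GetType()
            .GetMethods(BindingFlags.Public | BindingFlags.Instance)
            .FirstOrDefault(m => m.Name == "Function" && m.GetParameters().Length == args.Length);

        if (method == null)
            throw new ArgumentException("No overload of Function takes " + args.Length + " argument(s).");

        return method.Invoke(this, args);
    }
}

// A derived class only declares its overloads; the base class does the dispatching.
class AddSketch : OverloadSketch
{
    public object Function() { return 0.0; }
    public object Function(double a) { return a; }
    public object Function(double a, double b) { return a + b; }
}
```

With this sketch, new AddSketch().Call(1.0, 2.0) reaches the two-argument overload, while a call with three arguments fails with an ArgumentException, analogous to the missing-overload error described above.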

Implementing new standard functions, i.e., those functions which just have one argument, is easy as well. Here the implementation of the ceil() function is presented:

class CeilFunction : StandardFunction
{
    protected override ScalarValue GetValue(ScalarValue value)
    {
        var re = Math.Ceiling(value.Value);
        var im = Math.Ceiling(value.ImaginaryValue);
        return new ScalarValue(re, im);
    }
}

The implementation of the BinaryOperator class is quite straightforward. Since any binary operator requires two expressions, the implementation of the Evaluate() method throws an exception if the given number of expressions is different from two. Two new methods have been introduced: one that can be overridden, called Handle(), and another one that has to be implemented, called Perform(). The latter is called by the default implementation of the former.

abstract class BinaryOperator : Operator
{
    public BinaryOperator(string op, int level) : base(op, level)
    {
    }

    public abstract Value Perform(Value left, Value right);

    public virtual Value Handle(Expression left, Expression right, Hashtable symbols)
    {
        var l = left.Interpret(symbols);
        var r = right.Interpret(symbols);
        return Perform(l, r);
    }

    public override Value Evaluate(Expression[] expressions, Hashtable symbols)
    {
        if (expressions.Length != 2)
            throw new ArgumentsException(Op, expressions.Length);

        return Handle(expressions[0], expressions[1], symbols);
    }
}

Let's now discuss how YAMP actually generates the available objects. Most of the magic is done in the RegisterTokens() method of the Tokens class.

void RegisterTokens()
{
    var assembly = Assembly.GetExecutingAssembly();
    var types = assembly.GetTypes();
    var ir = typeof(IRegisterToken).Name;
    var fu = typeof(IFunction).Name;

    foreach (var type in types)
    {
        // Abstract types cannot be instantiated, so they are skipped.
        if (type.IsAbstract)
            continue;

        if (type.GetInterface(ir) != null)
            (type.GetConstructor(Type.EmptyTypes).Invoke(null) as IRegisterToken).RegisterToken();

        if (type.GetInterface(fu) != null)
            AddFunction(type.Name.Replace("Function", string.Empty), (type.GetConstructor(Type.EmptyTypes).Invoke(null) as IFunction).Perform, false);
    }
}

The code goes over all types that are available in the executing assembly (YAMP). If the type is abstract, we skip it. Otherwise we check whether the type implements the IRegisterToken interface. If this is the case we call the constructor, treat the new instance as the interface and call the RegisterToken() method. This method usually looks like the following (for operators):

public void RegisterToken()
{
    Tokens.Instance.AddOperator(_op, this);
}

Now consider the case that our ParseTree looks for an operator. In this case the FindOperator() method of the Tokens class will be called.

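public Operator FindOperator(string input)
{
    var maxop = string.Empty;
    var notfound = true;

    foreach (var op in operators.Keys)
    {
        // Skip operators that cannot beat the current match or do not fit into the input.
        if (op.Length <= maxop.Length || op.Length > input.Length)
            continue;

        notfound = false;

        // Compare character by character and stop at the first mismatch.
        for (var i = 0; i < op.Length; i++)
            if (notfound = (input[i] != op[i]))
                break;

        if (notfound == false)
            maxop = op;
    }

    if (maxop.Length == 0)
        throw new ParseException(input);

    return operators[maxop].Create();
}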

This method will find the longest operator that matches the current input. If no operator was found (the length of the longest match is still zero) an exception is thrown. Otherwise the Create() method is called on the operator that was found. This method just returns another instance of the operator.
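The longest-match idea can be tried in isolation with the following sketch. It is a simplified stand-in rather than the Tokens class itself: LongestMatchSketch and FindLongest are hypothetical names, the operators are plain strings, and a FormatException replaces YAMP's ParseException.

```csharp
using System;
using System.Collections.Generic;

static class LongestMatchSketch
{
    // Returns the longest operator that the input starts with.
    public static string FindLongest(IEnumerable<string> operators, string input)
    {
        var best = string.Empty;

        foreach (var op in operators)
        {
            // Only a strictly longer operator can replace the current best match.
            if (op.Length > best.Length && input.StartsWith(op, StringComparison.Ordinal))
                best = op;
        }

        if (best.Length == 0)
            throw new FormatException("No operator found at: " + input);

        return best;
    }
}
```

For the operators +, += and - the input "+=3" resolves to +=, while "+3" resolves to +. Preferring the longest match is what keeps += from being read as + followed by =.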

Parsing a (simple?) expression

In the following paragraphs we are going to figure out how YAMP parses a simple expression. The query is given in the form of

3 - (2-5 * 3++(2 +- 9)/7)^2.

We deliberately included some whitespace, as well as some (obvious) mistakes in the expression. First, "++" should just be expressed as "+". Second, the operation "+-" should be replaced by just "-". The problem is then divided into two specific tasks:

  1. Generate the expression tree, i.e. separate operators from expressions and mind the operator levels.
  2. Interpret the expression tree, i.e. evaluate the expressions by using the operators.

The parser then ends up with an expression tree that looks like the following image.

The expression tree after the invocation of the parser

In the second step we call the method that starts the interpretation of the elements. The interpreter calls the Interpret() method on each bracket. Every bracket looks at its number of operators. If no operator is available, then there has to be exactly one expression, and the value of that expression is returned. Otherwise the Evaluate() method of the operator is called, passing the array of Expression instances. Since the operator is either a specialized BinaryOperator or a specialized UnaryOperator, the method will always call the right function with the right number of arguments.

The value tree after the invocation of the interpreter

In this example our expression tree consists of five layers. Even though the interpretation starts at the top level (1), it requires the information of the lowest level (5) to be executed. Once 5 has been interpreted, level 4 can be interpreted, then level 3 and so on. The final result of the given expression is -193.
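The result can be checked by hand. With "++" read as "+" and "+-" read as "-", and working from the innermost bracket outwards, the evaluation goes as follows:

```latex
\begin{aligned}
3 - \left(2 - 5 \cdot 3 + \frac{2 - 9}{7}\right)^{2}
  &= 3 - \left(2 - 15 + \frac{-7}{7}\right)^{2} \\
  &= 3 - (2 - 15 - 1)^{2} \\
  &= 3 - (-14)^{2} \\
  &= 3 - 196 = -193
\end{aligned}
```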

YAMP against other math parsers

We can guess that YAMP is probably slower than most parsers written in C/C++. This is just the native advantage (or managed disadvantage) we have to live with. Therefore benchmarks against parsers written in C++ may be unfair. Since marshalling / interop takes some time as well, parsing small queries and comparing the results might also be unfair (this time probably unfair for the C/C++ parsers). A fair test might therefore only consist of parsers written purely in C#. Having a look at CodeProject we can find the following parsers:

We set up the following four tests:

  • Parse and interpret 100000 randomly generated queries*
  • Parse and interpret 10000 times one pre-defined query (long with brackets)
  • Parse and interpret 10000 times one pre-defined query (medium size)
  • Parse and interpret 10000 times one pre-defined query (small)

* The expression random generator works like this: First we generate a number of (binary) operators, then we alternate between expression and binary operator, always randomly picking one. The expression is just a randomly picked number (integer) and the operator is a randomly chosen one from the list of operators (+, -, *, /, ^). In the end we have a complete expression with n operators and n+1 numbers.
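A generator along these lines could look like the sketch below. It is an illustration of the described procedure, not the benchmark code: RandomQuerySketch and Generate are hypothetical names, and the number range of 1 to 99 is an assumption, since the text only says that integers are picked.

```csharp
using System;
using System.Text;

static class RandomQuerySketch
{
    static readonly string[] Operators = { "+", "-", "*", "/", "^" };

    // Builds an expression with operatorCount binary operators and operatorCount + 1 integers.
    public static string Generate(Random random, int operatorCount)
    {
        var query = new StringBuilder();
        query.Append(random.Next(1, 100)); // assumed range: 1 to 99

        for (var i = 0; i < operatorCount; i++)
        {
            // Alternate: one randomly picked operator, then one randomly picked number.
            query.Append(Operators[random.Next(Operators.Length)]);
            query.Append(random.Next(1, 100));
        }

        return query.ToString();
    }
}
```

Passing the Random instance in from outside makes a run reproducible when a fixed seed is used, which is convenient when several parsers have to see the same queries.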

The problem with this benchmark is that only three parsers (YAMP, MathParser.NET, and LL Mathematical Parser) have been able to parse all queries (even excluding the ones using the power operator). Therefore this benchmark was only able to give a speed comparison between these three. However, we can use this test to show that YAMP is really able to parse a lot of different queries.

If we look at the time that is required per equation as a function of the number of operators, we see that YAMP actually scales pretty well. MFP and LLMP are significantly better here; however, MFP could not parse much at all, and LLMP supports neither complex arguments nor matrices.

Time per equation in dependency of the number of operators

The biggest problem with the fastest implementations is the constraints they impose. They are hard to extend and less flexible than YAMP. While YAMP contains a lot of useful features (indices, assignments, overloaded functions, ...), none of the other parsers comes even close to this level of detail.

Total time for each benchmark

The longer (and more complex) the benchmark, the better for YAMP. This means that YAMP performs solidly in the region of short expressions (where no one will notice much of a performance difference), while it performs great in the region of long expressions (where most would otherwise notice delays and other annoying effects).

100000 (R)57163728471611xxx
10000 (L)543145081778505444x
10000 (M)39847951675635122147
10000 (S)131338581136519454

In the table shown above we see that YAMP can parse all equations and does perform quite well. Considering the early stage of the development process and the set of features, YAMP is certainly a good choice for any problem involving numerical mathematics.

How to use YAMP in your code

If we use YAMP as an external library we need to hook up every change we want to make. Luckily for us, this is not a hard task. Consider the case of adding a new constant.

YAMP.Parser.AddCustomConstant("R", 2.53);

This way we can override existing constants. If we remove our custom constant later on, any previously overridden constant will be available again. In the same way we can also add or remove custom functions. Here our own function needs to have a special kind of signature, taking one argument (of type Value) and returning an instance of type Value. The following example uses a lambda expression to generate a really simple function in one line:

YAMP.Parser.AddCustomFunction("G", v => new YAMP.ScalarValue((v as YAMP.ScalarValue).Value * Math.PI) );

What does it take to actually use YAMP? Let's look at a sample first:

try {
    var parser = YAMP.Parser.Parse("2+(2,2;1,2)");
    var result = parser.Execute();
}
catch (Exception ex) // this is quite important
{ /* handle the error here, e.g. show ex.Message to the user */ }

So everything we need to do is to call the static Parse() method in the Parser class (contained in the YAMP namespace). The method results in a new Parser instance that contains the full ParseTree. If we want to obtain the result of the given expression, we only need to call the Execute() method.

Points of interest

YAMP makes use of reflection to make extending the code as easy as possible. This way adding an operator can be achieved by just adding a class that inherits from the abstract Operator, BinaryOperator or UnaryOperator class or any other operator that is available.

YAMP is independent of the culture that has been set (the US numbering style has been set explicitly as the number format). Strings and other data are stored with UTF8 encoding. However, symbols can only start with a letter (A-Za-z) and can then only contain letters and numbers (A-Za-z0-9).
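Both rules can be expressed compactly with standard .NET facilities. The following is a sketch under assumptions, not code taken from YAMP: NamingSketch and its methods are hypothetical, and CultureInfo.InvariantCulture is used here as a stand-in for the explicitly set US number format.

```csharp
using System.Globalization;
using System.Text.RegularExpressions;

static class NamingSketch
{
    // A letter first, then only letters and digits; \z anchors at the very end of the input.
    static readonly Regex Symbol = new Regex(@"^[A-Za-z][A-Za-z0-9]*\z");

    public static bool IsValidSymbol(string name)
    {
        return Symbol.IsMatch(name);
    }

    // Parses with a fixed culture, so "2.5" is read the same way on every machine.
    public static double ParseNumber(string text)
    {
        return double.Parse(text, NumberStyles.Float, CultureInfo.InvariantCulture);
    }
}
```

With these rules x1 is accepted as a symbol while 1x and a_b are rejected, and "2.5" always parses to 2.5, even on systems that use a decimal comma.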

The project is currently under heavy development, which results in a quite fast update cycle (one to two new versions per week at least). This means that you should definitely consider taking a look at the GitHub page (and maybe this article), if you are interested in the project.

Generally YAMP will require more interested people working on it (mostly on some numeric functions). If you think you can contribute, then feel free to do so! Everyone is more than welcome to contribute to this project.


History

  • v1.0.0 | Initial release | 19.09.2012
  • v1.0.1 | Added a table with benchmark data | 19.09.2012


This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


About the Author

Florian Rappl
Chief Technology Officer
Germany
Florian is from Regensburg, Germany. He started his programming career with Perl. After programming C/C++ for some years he discovered his favorite programming language C#. He did work at Siemens as a programmer until he decided to study Physics. During his studies he worked as an IT consultant for various companies.

Florian is also giving lectures in C#, HTML5 with CSS3 and JavaScript, and other topics. Having graduated from University with a Master's degree in theoretical physics he is currently busy doing his PhD in the field of High Performance Computing.

Last Updated 20 Sep 2012
Article Copyright 2012 by Florian Rappl