Yet Another Math Parser (YAMP)

By Florian Rappl, 20 Sep 2012
 
This is an old version of the currently published article.

Introduction

In some projects we might need a proper math parser, either for simple math (a little calculator) or for more complex tasks. CodeProject offers us a variety of math parsers, ranging from ports of C/C++ ones to ones written purely in C#. This article tries to create a powerful parser in pure C#, using the features that come with C# and the .NET Framework. The parser itself is perfectly capable of running on other platforms as well, as it has been written on Linux.

The parser itself is Open Source. The full source code of YAMP is available on Github. A NuGet package will also be available soon.

Personal background

Years ago I wrote a math parser using the C# compiler and some inheritance. The idea was to provide a base class that offered all possible methods (like sin(), exp(), ...) to be inherited. Then a string was built that represented valid C# code with a custom expression in its middle, which then had to be compiled. Of course a pre-parser was necessary to supply some data types. On the other hand, operator overloading and other C# features could then be used.

The problem with this solution is that the C# compiler is kind of heavy for a single document of code. Also the resulting assembly had to be loaded first. So after our pre-parsing process we had to deal with the C# compiler and the assembly. This means that reflection was necessary as well. The minimum time for any expression was therefore O(10^2) ms. This is far too long and does not provide a smooth experience.

There are several projects on CodeProject which aim to provide a powerful parser. All of those projects are faster than my old solution - they are also more robust. However, they are not as complete as my project was - therefore I now try to supersede this very old project with a new approach. This time I built a parser from scratch, which uses (and abuses) reflection and therefore provides an easy-to-extend code base that contains everything from operators and expressions to constants and functions.

What YAMP can do

Before we dig too deep into YAMP we might have a look at what YAMP offers us. First a brief look at some features:

  • Assignment operators (=, +=, -=, ...)
  • Complex arithmetic using i as imaginary constant.
  • Matrix algebra including products, divisions.
  • Trigonometric functions (sin, cos, sinh, cosh) and their inverse (arcsin, arccos, arcosh, arsinh).
  • More complicated functions like Gamma and others.
  • Factorial, transpose and left divide operator.
  • Using indices to get and set fields.
  • Complex roots and powers.

The core element of YAMP is the ScalarValue class, which is an implementation of the abstract Value class. This class provides everything that is required to do numerics with a complex type. Implementations of the basic trigonometric functions with complex arguments have been included as well.

The MatrixValue class is basically a (two dimensional) list of ScalarValue instances. We might think that it should be possible to make this list n dimensional; however, that would introduce some complications, like the need to specify a way to set a value in the n-th dimension. At the moment we will avoid this issue.

Now that we have a brief overview of the main features of YAMP, let's look at some of the expressions that can be parsed by YAMP:

-2+3+5/7-3*2

This is a simple expression, but it already contains elements that are not trivial. In order to parse this expression successfully, we need to be careful about operator levels (the multiply operator * has a higher level than the one for subtraction -, same goes for the division operator /). We also have the minus sign used as a unary operator, i.e. we do not have a leading zero here.

2^2^2^2

This is often parsed incorrectly. Asking Matlab for this expression we get 256. That seemed kind of low and wrong. Asking YAMP first resulted in this as well - until we included the rule that even operators of the same level are treated separately - from right to left. This rule does not make a difference in any expression - except with the power operator. Now we get the same result as Google shows us - 65536.
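The difference between the two evaluation orders can be checked with a few lines of C#. This is only a sketch built on Math.Pow; the class and method names are made up for this illustration and are not part of YAMP.

```csharp
using System;

static class PowerAssociativity
{
    // Folds the operands from right to left: a ^ (b ^ (c ^ d)).
    public static double RightToLeft(params double[] operands)
    {
        var result = operands[operands.Length - 1];
        for (var i = operands.Length - 2; i >= 0; i--)
            result = Math.Pow(operands[i], result);
        return result;
    }

    // Folds the operands from left to right: ((a ^ b) ^ c) ^ d.
    public static double LeftToRight(params double[] operands)
    {
        var result = operands[0];
        for (var i = 1; i < operands.Length; i++)
            result = Math.Pow(result, operands[i]);
        return result;
    }

    static void Main()
    {
        Console.WriteLine(LeftToRight(2, 2, 2, 2)); // 256
        Console.WriteLine(RightToLeft(2, 2, 2, 2)); // 65536
    }
}
```

Reading the chain from the right gives 2^(2^(2^2)) = 2^16, while reading it from the left gives ((2^2)^2)^2 = 2^8.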

x=2,3,4;5,6,7

We could use brackets, but in this case we do not need them. This is a simple matrix assignment to the symbol x. Our own assigned symbols are case sensitive, while included functions and constants are not. Therefore we could assign our own pi and still access the original constant by using a different upper- / lowercase notation like Pi.

e^pi

Again, some constants are built in like phi, e and pi. Therefore this expression is parsed successfully. Using YAMP as an external library it is easily possible to add our own constants.

2^x-5

This seems odd, but it will make sense. First of all we have a scalar (2) powered by a matrix (the one we assigned before). This is not defined, therefore YAMP will perform the power operation on every value in the matrix. The resulting matrix will have the same dimensions as the exponent - it will be a 2 times 3 matrix. This matrix then performs the subtraction with 5 (a scalar) on the right side. This is again not defined, therefore YAMP performs the operation on every cell again. In the end we have a 2 times 3 matrix.
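The element-wise fallback described here can be sketched with plain double arrays. The helper below is hypothetical and only mirrors the idea; YAMP itself works on MatrixValue and ScalarValue instances.

```csharp
using System;

static class ElementWise
{
    // Applies op to every cell and returns a new matrix with the same dimensions.
    public static double[,] Map(double[,] m, Func<double, double> op)
    {
        var rows = m.GetLength(0);
        var cols = m.GetLength(1);
        var result = new double[rows, cols];

        for (var r = 0; r < rows; r++)
            for (var c = 0; c < cols; c++)
                result[r, c] = op(m[r, c]);

        return result;
    }

    static void Main()
    {
        var x = new double[,] { { 2, 3, 4 }, { 5, 6, 7 } };
        var powered = Map(x, v => Math.Pow(2, v)); // 2^x, cell by cell
        var shifted = Map(powered, v => v - 5);    // (2^x) - 5, cell by cell

        Console.WriteLine(shifted[0, 0]); // 2^2 - 5 = -1
        Console.WriteLine(shifted[1, 2]); // 2^7 - 5 = 123
    }
}
```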

y=x[1,:]

We use the range operator : to specify all available indices. The row index is set to 1 (rows come before columns, as in Matlab and Fortran, and indices are 1-based) - the first row. Therefore the result will be a matrix with dimension 1 times 3 (a transposed three vector). The entries will be 2, 3, 4.

x[:]=8:13

Here we reassign every value of the matrix to a new value - given in the vector 8, 9, 10, 11, 12, 13. If we just use one index it will be a combined one. Therefore a 2 times 3 matrix might have the pair 2,3 as last entry (and the pair 1,1 as first entry), but might also have the index 6 as last entry (and the index 1 as first entry).

xt=x'

The adjoint operator either conjugates scalars (a reflection across the real axis, i.e. changing the sign of the imaginary value) or transposes the matrix along with conjugating its entries. Therefore xt now stores a 3 times 2 matrix. If we just want to transpose (without conjugating) we can use the .' operator.

z=sin(pi * (a = eye(4)))

Here we assign the value of eye(4) to the symbol a. Afterwards the value is multiplied by pi and then evaluated by the sin() function. The outcome is saved in the symbol z. The eye() function generates an n times n identity matrix. If we apply trigonometric functions to matrices we end up (as with the power function and others) with a matrix where each value is the outcome of the former value used as an argument for the specific function.

whatever[10,10]=8+3i

Here two things are happening. First we create a new symbol (called whatever) and then we assign the value 8+3i to the cell in the 10th row and 10th column. So we are able to use indices even on non-existing symbols. We can also expand matrices by assigning to higher indices. This is only possible when setting values - we are not able to read indices beyond the current dimensions. In that case an exception will be thrown.

whatever[2:2:end,1:5]

Here we will get information about all even rows (2:2:end means: start at index 2 and go to the end (a short macro) with a step of 2) with the first half of their columns. The displayed matrix will be a 5 times 5 one with rows 2,4,6,8,10 and columns 1,2,3,4,5.
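How a start:step:end range turns into a list of 1-based indices can be sketched as follows. The function name and signature are assumptions for this illustration; here the caller has already replaced end by the actual dimension (10 in the example above).

```csharp
using System;
using System.Collections.Generic;

static class RangeSketch
{
    // Expands start:step:end into the list of indices it covers (end inclusive).
    public static List<int> Expand(int start, int step, int end)
    {
        if (step <= 0)
            throw new ArgumentOutOfRangeException("step");

        var indices = new List<int>();

        for (var i = start; i <= end; i += step)
            indices.Add(i);

        return indices;
    }

    static void Main()
    {
        Console.WriteLine(string.Join(",", Expand(2, 2, 10))); // 2,4,6,8,10
        Console.WriteLine(string.Join(",", Expand(1, 1, 5)));  // 1,2,3,4,5
    }
}
```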

YAMP does take matrices seriously. Multiplying matrices is done with respect to the proper dimensions. Let's have a look at a sample console application:

Output in the sample console application

The index operator allows us to select (multiple) columns and / or rows from matrices. The following output shows the selection of the inner 3x3 matrix (from the 5x5 matrix).

Output in the sample console application

This concludes our short look at YAMP. Of course YAMP is also able to perform queries like i^i or (2+3i)^(7-i) or 17! correctly.

How YAMP does it

YAMP contains a few important classes:

  • ParseTree
  • Value
  • AbstractExpression
  • Operator
  • IFunction

The most important datatype for accessing YAMP outside the library is the Parser class. We'll introduce this concept later on.

Right now we want to have a short look at the class diagram for the whole library. First of all the class diagram in principle:

Short class diagram of YAMP

The Parser generates a ParseTree, that has full access to the singleton class Tokens. The singleton is used to find operators, functions, expressions and constants, as well as resolving symbols.

YAMP relies heavily on reflection. The first time YAMP is used might be a little bit slow (compared to further usages). Therefore calling the static Load() method of the Parser should be done before any measurements or user input. This guarantees the shortest possible execution time for the given expression.

Any expression will always consist of a number of sub-expressions and operators. Operators can either be binary (like +, -, ...) or unary (like !, [], ', ...). The binary ones need two operands (left and right), while unary ones just need one. An expression might be one of the following:

  • A bracket expression, which contains a full ParseTree again.
  • A number - it can either be positive, negative, real or complex.
  • An absolute value, which is something like a bracket expression that calls the abs() function after being evaluated.
  • A symbolic value - either to be resolved (could be a constant or something that has been stored previously) or to be set.
  • A function expression can be viewed as a symbolic value that has a bracket expression directly attached (without an operator between the expression).

This results in the following class diagram:

Expression class diagram

Let's have a look at the class diagram for the operators next:

Operator class diagram

Here we see that the AssignmentOperator is also nothing more than another BinaryOperator. For simplicity and code-reuse another abstract class called AssignmentPrefixOperator has been added as an additional layer between any combined assignment operator (like += or -=) and the original assignment operator. Overall the following assignment operators have been added: +=, -=, *=, /=, \=, ^=.

The difference between the left division (\) and right division (/) is that, while left division means A = B \ C = B^-1 * C, right division means A = B / C = B * C^-1. The difference between both can be crucial for matrices, where it matters whether we multiply from the left or from the right. The two operands are always of type Value. This type is an abstract class, which forms the basis for the following derived datatypes:
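To see that the two divisions really differ for matrices, here is a small sketch restricted to 2 times 2 matrices. It assumes the Matlab convention (B / C is B times the inverse of C, B \ C is the inverse of B times C); all names are hypothetical and the code is independent of YAMP's own MatrixValue implementation.

```csharp
using System;

static class DivisionSketch
{
    // Product of two 2x2 matrices.
    public static double[,] Multiply(double[,] a, double[,] b)
    {
        var result = new double[2, 2];

        for (var r = 0; r < 2; r++)
            for (var c = 0; c < 2; c++)
                result[r, c] = a[r, 0] * b[0, c] + a[r, 1] * b[1, c];

        return result;
    }

    // Inverse of a 2x2 matrix via the adjugate; throws for singular input.
    public static double[,] Inverse(double[,] m)
    {
        var det = m[0, 0] * m[1, 1] - m[0, 1] * m[1, 0];

        if (det == 0.0)
            throw new InvalidOperationException("Matrix is singular.");

        return new double[,]
        {
            {  m[1, 1] / det, -m[0, 1] / det },
            { -m[1, 0] / det,  m[0, 0] / det }
        };
    }

    // B / C = B * C^-1
    public static double[,] RightDivide(double[,] b, double[,] c)
    {
        return Multiply(b, Inverse(c));
    }

    // B \ C = B^-1 * C
    public static double[,] LeftDivide(double[,] b, double[,] c)
    {
        return Multiply(Inverse(b), c);
    }

    static void Main()
    {
        var b = new double[,] { { 1, 2 }, { 3, 4 } };
        var c = new double[,] { { 0, 1 }, { 1, 0 } };

        var right = RightDivide(b, c); // rows (2, 1) and (4, 3)
        var left = LeftDivide(b, c);   // rows (1, -2) and (-0.5, 1.5)

        Console.WriteLine(right[0, 0] + " vs " + left[0, 0]);
    }
}
```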

  • ScalarValue for numbers (can be imaginary)
  • MatrixValue for matrices (can be only vectors), which consist of ScalarValue entries
  • StringValue for strings
  • ArgumentsValue for a list of arguments

The functions are also quite an important part of YAMP. The class diagram for the functions looks like this:

Function class diagram

Here we create a standard type StandardFunction. This type has to implement the interface IFunction, as required. The only thing that will be called is the Perform() method. The StandardFunction already contains a framework that works only with numeric types like MatrixValue and ScalarValue. If the Perform() method is not overridden, every number of a given matrix will be replaced by the result of the function that is being called.

Another useful type here is the ArgumentFunction type. This one can be used to create a fully operational overload machine. Once we derive from this class we only have to create one or more functions that are called Function() with return type Value. Reflection will find out how many arguments are required for each function and call the right one (or return an error to the user) for the number of given arguments. Let's consider the RandFunction:

class RandFunction : ArgumentFunction
{
	static readonly Random ran = new Random();
	
	public Value Function()
	{
		return new ScalarValue(ran.NextDouble());
	}
	
	public Value Function(ScalarValue dim)
	{
		var k = (int)dim.Value;
		
		if(k <= 1)
			return new ScalarValue(ran.NextDouble());
		
		var m = new MatrixValue(k, k);
		
		for(var i = 1; i <= k; i++)
			for(var j = 1; j <= k; j++)
				m[j, i] = new ScalarValue(ran.NextDouble());
		
		return m;
	}
	
	public Value Function(ScalarValue rows, ScalarValue cols)
	{
		var k = (int)rows.Value;
		var l = (int)cols.Value;
		var m = new MatrixValue(k, l);
		
		for(var i = 1; i <= l; i++)
			for(var j = 1; j <= k; j++)
				m[j, i] = new ScalarValue(ran.NextDouble());
		
		return m;
	}
}

With this code YAMP will return results for zero, one or two arguments. If the user provides more than two arguments, YAMP will throw an exception with the message that no overload has been found for the number of arguments the user provided.
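The overload selection by argument count can be sketched in isolation. The following is not YAMP's ArgumentFunction, only a minimal stand-in with hypothetical names (OverloadDispatcher, MaxFunction) that uses the same reflection idea: look for a public method called Function whose parameter count matches the number of supplied arguments.

```csharp
using System;
using System.Linq;
using System.Reflection;

abstract class OverloadDispatcher
{
    // Picks the "Function" overload with a matching parameter count and invokes it.
    public object Call(params object[] args)
    {
        var method = GetType()
            .GetMethods(BindingFlags.Public | BindingFlags.Instance)
            .FirstOrDefault(m => m.Name == "Function" && m.GetParameters().Length == args.Length);

        if (method == null)
            throw new ArgumentException("No overload takes " + args.Length + " argument(s).");

        return method.Invoke(this, args);
    }
}

class MaxFunction : OverloadDispatcher
{
    public double Function() { return 0.0; }
    public double Function(double a) { return a; }
    public double Function(double a, double b) { return Math.Max(a, b); }
}

static class Program
{
    static void Main()
    {
        var max = new MaxFunction();
        Console.WriteLine(max.Call());         // 0
        Console.WriteLine(max.Call(3.0, 7.0)); // 7

        try
        {
            max.Call(1.0, 2.0, 3.0);
        }
        catch (ArgumentException ex)
        {
            Console.WriteLine(ex.Message); // no overload with three arguments
        }
    }
}
```

This sketch only looks at the number of arguments. Matching on argument types as well would need an additional check of each parameter's type.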

Implementing new standard functions, i.e., those functions which just have one argument, is easy as well. Here the implementation of the ceil() function is presented:

class CeilFunction : StandardFunction
{
    protected override ScalarValue GetValue(ScalarValue value)
    {
        var re = Math.Ceiling(value.Value);
        var im = Math.Ceiling(value.ImaginaryValue);
        return new ScalarValue(re, im);
    }	
}

The implementation of the BinaryOperator class is quite straightforward. Since any binary operator requires two expressions, the implementation of the Evaluate() method throws an exception if the given number of expressions is different from two. Two new methods have been introduced: one that can be overridden, called Handle(), and another one that has to be implemented, called Perform(). The latter is used by the standard implementation of the former.

abstract class BinaryOperator : Operator
{
	public BinaryOperator (string op, int level) : base(op, level)
	{
	}

	public abstract Value Perform(Value left, Value right);

	public virtual Value Handle(Expression left, Expression right, Hashtable symbols)
	{
		var l = left.Interpret(symbols);
		var r = right.Interpret(symbols);
		return Perform(l, r);
	}

	public override Value Evaluate (Expression[] expressions, Hashtable symbols)
	{
		if(expressions.Length != 2)
			throw new ArgumentsException(Op, expressions.Length);

		return Handle(expressions[0], expressions[1], symbols);
	}
}

Let's now discuss how YAMP actually generates the available objects. Most of the magic is done in the RegisterTokens() method of the Tokens class.

void RegisterTokens()
{
	var assembly = Assembly.GetExecutingAssembly();
	var types = assembly.GetTypes();
	var ir = typeof(IRegisterToken).Name;
	var fu = typeof(IFunction).Name;

	foreach(var type in types)
	{
		// Abstract types cannot be instantiated
		if(type.IsAbstract)
			continue;

		// Operators, expressions etc. register themselves
		if(type.GetInterface(ir) != null)
			(type.GetConstructor(Type.EmptyTypes).Invoke(null) as IRegisterToken).RegisterToken();

		// Functions are registered under their class name without the "Function" suffix
		if(type.GetInterface(fu) != null)
			AddFunction(type.Name.Replace("Function", string.Empty), (type.GetConstructor(Type.EmptyTypes).Invoke(null) as IFunction).Perform, false);
	}
}

The code goes over all types that are available in the executing assembly (YAMP). If the type is abstract, we skip it. Otherwise we check whether the type implements the IRegisterToken interface. If this is the case we call the constructor, treat the instance as the interface and call the RegisterToken() method. This method usually looks like the following (for operators):

public void RegisterToken ()
{
	Tokens.Instance.AddOperator(_op, this);
}

Now consider the case that our ParseTree needs to look up an operator. In this case the FindOperator() method of the Tokens class will be called.

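public Operator FindOperator(string input)
{
	var maxop = string.Empty;
	var notfound = true;

	foreach(var op in operators.Keys)
	{
		// Only operators longer than the best match so far are of interest
		if(op.Length <= maxop.Length)
			continue;

		// An operator longer than the remaining input cannot match
		if(op.Length > input.Length)
			continue;

		notfound = false;

		// Compare character by character and stop at the first mismatch
		for(var i = 0; i < op.Length; i++)
			if(notfound = (input[i] != op[i]))
				break;

		if(notfound == false)
			maxop = op;
	}

	if(maxop.Length == 0)
		throw new ParseException(input);

	return operators[maxop].Create();
}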

This method will find the longest operator that matches the beginning of the current input. If no operator was found (the length of the longest match is still zero) an exception is thrown. Otherwise the Create() method of the found operator is called. This method just returns another instance of the operator.
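The longest-match rule matters whenever one operator is a prefix of another, e.g. + and +=. A compact sketch of the same idea, using string.StartsWith instead of the manual character loop (the class name and the null return value are choices made for this illustration, not YAMP's behavior):

```csharp
using System;
using System.Collections.Generic;

static class LongestMatch
{
    // Returns the longest operator the input starts with, or null if none matches.
    public static string Find(IEnumerable<string> operators, string input)
    {
        var best = string.Empty;

        foreach (var op in operators)
            if (op.Length > best.Length && input.StartsWith(op, StringComparison.Ordinal))
                best = op;

        return best.Length == 0 ? null : best;
    }

    static void Main()
    {
        var operators = new[] { "+", "+=", "-", "-=" };

        Console.WriteLine(Find(operators, "+=3"));       // +=
        Console.WriteLine(Find(operators, "+3"));        // +
        Console.WriteLine(Find(operators, "*3") == null); // True
    }
}
```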

Parsing a (simple?) expression

In the following paragraphs we are going to figure out how YAMP is parsing a simple expression. The query is given in form of

3 - (2-5 * 3++(2 +- 9)/7)^2.

We deliberately included some whitespace, as well as some (obvious) mistakes, in the expression. First, "++" should just be expressed as "+". Second, the operation "+-" should be replaced by just "-". The problem is divided into two specific tasks:

  1. Generate the expression tree, i.e. separate operators from expressions and mind the operator levels.
  2. Interpret the expression tree, i.e. evaluate the expressions by using the operators.

The parser then ends up with an expression tree that looks like the following image.

The expression tree after the invocation of the parser

In the second step we call the method that starts the interpretation of the elements. The interpreter calls the Interpret() method on each bracket. Every bracket looks at its number of operators. If no operator is available, then there has to be only one expression. The value of that expression is then returned. Otherwise the Handle() method of the operator is called, passing the array of Expression instances. Since the operator is either a specialized BinaryOperator or a specialized UnaryOperator, the method will always call the right function with the right number of arguments.

The value tree after the invocation of the interpreter

In this example our expression tree consists of five layers. Even though the interpretation starts at the top level (1), it requires the information of the lowest level (5) to be executed. Once 5 has been interpreted, level 4 can be interpreted, then level 3 and so on. The final result of the given expression is -193.
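The bottom-up evaluation can be retraced by hand. The following sketch simply spells out the intermediate values in plain C#, with "++" read as "+" and "+-" read as "-" as explained above; the variable names are only labels for the individual steps and do not correspond to anything in YAMP.

```csharp
using System;

static class WorkedExample
{
    public static double Evaluate()
    {
        var inner = 2.0 + -9.0;                 // (2 +- 9)        -> -7
        var quotient = inner / 7.0;             // (...) / 7       -> -1
        var product = 5.0 * 3.0;                // 5 * 3           -> 15
        var bracket = 2.0 - product + quotient; // 2 - 15 + (-1)   -> -14
        var squared = Math.Pow(bracket, 2.0);   // (...)^2         -> 196
        return 3.0 - squared;                   // 3 - 196         -> -193
    }

    static void Main()
    {
        Console.WriteLine(Evaluate()); // -193
    }
}
```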

YAMP against other math parsers

We can guess that YAMP is probably slower than most parsers written in C/C++. This is just the native advantage (or managed disadvantage) we have to live with. Therefore benchmarks against parsers written in C++ may be unfair. Since marshalling / interop takes some time as well, parsing small queries and comparing the results might also be unfair (this time probably unfair for the C/C++ parsers). A fair test might therefore only include parsers written purely in C#. Having a look at CodeProject we can find several such parsers, which are compared below.

We set up the following four tests:

  • Parse and interpret 100000 randomly generated queries*
  • Parse and interpret 10000 times one pre-defined query (long with brackets)
  • Parse and interpret 10000 times one pre-defined query (medium size)
  • Parse and interpret 10000 times one pre-defined query (small)

* The expression random generator works like this: First we generate a number of (binary) operators, then we alternate between expression and binary operator, always randomly picking one. The expression is just a randomly picked number (integer) and the operator is a randomly chosen one from the list of operators (+, -, *, /, ^). In the end we have a complete expression with n operators and n+1 numbers.
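A generator along these lines could look like the following sketch. The number range (1 to 99) and all names are assumptions for this illustration; the benchmark code itself is not shown in this article.

```csharp
using System;
using System.Text;

static class RandomExpression
{
    static readonly string[] Operators = { "+", "-", "*", "/", "^" };

    // Builds an expression with operatorCount operators and operatorCount + 1 integers.
    public static string Generate(Random random, int operatorCount)
    {
        var sb = new StringBuilder();
        sb.Append(random.Next(1, 100));

        for (var i = 0; i < operatorCount; i++)
        {
            sb.Append(Operators[random.Next(Operators.Length)]);
            sb.Append(random.Next(1, 100));
        }

        return sb.ToString();
    }

    static void Main()
    {
        var random = new Random();
        Console.WriteLine(Generate(random, 5));
    }
}
```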

The problem with this benchmark is that only three parsers (YAMP, MathParser.NET and LL Mathematical Parser) were able to parse all queries (even when excluding the ones using the power operator). Therefore this benchmark could only give a speed comparison between these three. However, the test does show that YAMP is really able to parse a lot of different queries.

If we look at the time that is required per equation as a function of the number of operators, we see that YAMP actually scales pretty well. MFP and LLMP are significantly better here; however, MFP could not parse much at all, and LLMP supports neither complex arguments nor matrices.

Time per equation as a function of the number of operators

The biggest problem with the fastest implementations is the constraints they impose. They are hard to extend and less flexible than YAMP. While YAMP contains a lot of useful features (indices, assignments, overloaded functions, ...), none of the other parsers comes even close to this level of detail.

Total time for each benchmark

The longer (and more complex) the benchmark, the better for YAMP. This means that YAMP performs solidly in the region of short expressions (where no one will notice much of a performance difference), while it performs great in the region of long expressions (where most would otherwise notice delays and other annoying effects).

YAMPMPLLMPMPTKMP.NETMFP
100000 (R)57163728471611xxx
10000 (L)543145081778505444x
10000 (M)39847951675635122147
10000 (S)131338581136519454

In the table shown above we see that YAMP can parse all equations and does perform quite well. Considering the early stage of the development process and the set of features, YAMP is certainly a good choice for any problem involving numerical mathematics.

How to use YAMP in your code

If we use YAMP as an external library we need to hook up every change we want to make. Luckily for us, this is not a hard task. Consider the case of adding a new constant.

YAMP.Parser.AddCustomConstant("R", 2.53);

This way we can override existing constants. If we remove our custom constant later on, any previously overridden constant will be available again. In the same way we can also add or remove custom functions. Here our own function needs to have a specific signature, taking one argument (of type Value) and returning an instance of type Value. The following example uses a lambda expression to generate a really simple function in one line:

YAMP.Parser.AddCustomFunction("G", v => new YAMP.ScalarValue((v as YAMP.ScalarValue).Value * Math.PI) );

What does it take to actually use YAMP? Let's look at a sample first:

try
{
    var parser = YAMP.Parser.Parse("2+(2,2;1,2)");
    var result = parser.Execute();
}
catch (Exception ex) // this is quite important
{
    //ex.Message
}

So everything we need to do is to call the static Parse() method in the Parser class (contained in the YAMP namespace). The method results in a new Parser instance that contains the full ParseTree. If we want to obtain the result of the given expression, we only need to call the Execute() method.

Points of interest

YAMP makes use of reflection to make extending the code as easy as possible. This way adding an operator can be achieved by just adding a class that inherits from the abstract Operator, BinaryOperator or UnaryOperator class or any other operator that is available.

YAMP is independent of the set culture (the US number format has been set explicitly). Strings and others are stored with UTF8 encoding. However, symbols can only start with a letter (A-Za-z) and can then only contain letters and numbers (A-Za-z0-9).
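Both rules can be illustrated with standard .NET calls. This is only a sketch: the class and method names are hypothetical, and CultureInfo.InvariantCulture is used here as a stand-in for an explicitly configured US number format (both use the dot as decimal separator).

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

static class SymbolRules
{
    // One letter followed by any number of letters or digits.
    static readonly Regex NamePattern = new Regex(@"^[A-Za-z][A-Za-z0-9]*\z");

    public static bool IsValidSymbol(string name)
    {
        return name != null && NamePattern.IsMatch(name);
    }

    // Parses a number with "." as decimal separator, regardless of the current culture.
    public static double ParseNumber(string text)
    {
        return double.Parse(text, NumberStyles.Float, CultureInfo.InvariantCulture);
    }

    static void Main()
    {
        Console.WriteLine(IsValidSymbol("x1"));          // True
        Console.WriteLine(IsValidSymbol("1x"));          // False
        Console.WriteLine(IsValidSymbol("my_var"));      // False
        Console.WriteLine(ParseNumber("2.5") == 2.5);    // True
    }
}
```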

The project is currently under heavy development, which results in a quite fast update cycle (one to two new versions per week at least). This means that you should definitely consider taking a look at the GitHub page (and maybe this article), if you are interested in the project.

Generally YAMP will require more interested people working on it (mostly on some numeric functions). If you think you can contribute, then feel free to do so! Everyone is more than welcome to contribute to this project.

History

  • v1.0.0 | Initial release | 19.09.2012.
  • v1.0.1 | Added a table with benchmark data | 19.09.2012

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Florian Rappl
rsi Software
Germany
Florian is from Regensburg, Germany. He started his programming career with Perl. After programming C/C++ for some years he discovered his favorite programming language C#. He did work at Siemens as a programmer until he decided to study Physics. During his studies he worked as an IT consultant for various companies.
 
Florian is also giving lectures in C#, HTML5 with CSS3 and JavaScript, and other topics. Having graduated from University with a Master's degree in theoretical physics he is currently busy doing his PhD in the field of High Performance Computing.


Comments and Discussions

Question: Error with subtract | Christian Indra | 19 Apr '13 - 2:53
Hi
There is an error when I try to solve a string like "10 - 3 -4".
The result should be 3 but YAMP returns 11!
Answer: Re: Error with subtract | Florian Rappl | 19 Apr '13 - 6:21
Are you using this version or the current version over NuGet / Github?
 
This is just a sample and not a production release. Computing your query with the current release yields the correct result (3).
 
I recommend you to use the version available over NuGet, since this will automatically give you updates and bug-fixes.
General: My vote of 5 | DotNetMastermind | 24 Nov '12 - 13:18
very detailed and well-written article
General: Re: My vote of 5 | Florian Rappl | 24 Nov '12 - 22:45
Thank you very much!
General: My vote of 5 | Ben3.Tyo | 15 Oct '12 - 15:42
very nice~~thx
General: Re: My vote of 5 | Florian Rappl | 15 Oct '12 - 20:01
Thanks!
Suggestion: Function description in help() output | pmarchois | 7 Oct '12 - 3:39
hi,
 
it should be useful to add function description after its name in the output of help() function.
 
for example:
[Description("Shows detailed help for all functions.")]
class HelpFunction : ArgumentFunction
{
  const string SPACING = "   ";
 
  [Description("Shows a list of all out-of-the-box provided functions.")]
  [Example("help()", "Lists all functions.")]
  public StringValue Function()
  {
    var sb = new StringBuilder();
    //sb.AppendLine("List of available methods ...");
    //sb.AppendLine("---");
    var methods = Tokens.Instance.Methods.Select(m => m.GetType().Name.RemoveFunctionConvention().ToLower()).OrderBy(m => m).AsEnumerable();
 
    foreach (var method in methods)
    {
      var element = Tokens.Instance.Methods.Where(m => m.GetType().Name.Equals(method + "Function", StringComparison.InvariantCultureIgnoreCase)).Select(m => m.GetType()).FirstOrDefault();
      if (element == null)
        throw new FunctionNotFoundException(method);
 
      sb.Append(method).Append("\t : ").AppendLine(GetDescription(element).Replace("Description:\r\n", string.Empty));
    }
 
    //sb.AppendLine("---");
    return new StringValue(sb.ToString());
  }
 
philippe
General: Re: Function description in help() output | Florian Rappl | 15 Oct '12 - 20:01
The latest version now contains this feature!
General: My vote of 5 | DrABELL | 6 Oct '12 - 16:59
Hi Florian,
Excellent article, solid 5*!
I am just wondering, if this parser can be modified to recognize/process fractions and mixed numbers?
Thanks and regards,
Alex
General: Re: My vote of 5 | Florian Rappl | 6 Oct '12 - 22:33
I do not know what you mean with mixed numbers (rational numbers?), but fractions would be no problem. All you need to do is modify the division operator (in this case you would also need to modify the Divide() methods of datatypes like MatrixValue or ScalarValue) and create a new datatype, called FractionValue. One could then imagine that a ScalarValue divided by a ScalarValue would result in a FractionValue (which is a derived type of ScalarValue or something like this), which contains the numerator and denominator values. You would then be required to specify the operations etc.
 
If you meant rational numbers, then this is harder, since you have to store the basic operation in a new datatype (called "RationalValue"). This datatype also includes a lambda expression for the operation that should have been performed. Like the Sqrt(2) stores 2 and the Sqrt. If you power that to 2, the outcome is 2 again (Sqrt()) vanishes. But this is just one case. You would need to include an algorithm to determine if the principle outcome is a rational number, and if a new operation applied to the rational number results again in a rational number or a normal number.
General: Re: My vote of 5 | DrABELL | 7 Oct '12 - 3:25
Hi Florian,
Thanks for your response. Mixed numbers, or improper fractions are, indeed a subclass of rational numbers. Their typical notation is like 5 1/3, which is equivalent to 16/3. I've built a Fraction Calculator, which operates on any combinations of fractions, decimals or mixed numbers; it operates on a new data type "Fraction" that I am using (or rational number as per your definition), includes a parser and a bunch of operator/function overloads (+, -, ToString(), etc.). You can see it on Codeproject: Edumatter M12: School Math Calculators and Equation Solvers. So, this part I pretty much understand.
My question was very specific: if your parser in its current version can handle complex math expressions that in addition to decimals/integers also contain fractions and mixed numbers, let's say tan(5 1/3) + tan(16/3) +sin(1/3)?
Thanks and regards,
Alexander
General: Re: My vote of 5 | Florian Rappl | 7 Oct '12 - 5:50
In its current state the parser can handle everything totally fine, however, all numbers will be evaluated, resulting in 0.3333... (up to double precision) for 1/3 etc.
 
To avoid something like this you could do as I specified and write your own FractionValue class. The expression you wrote won't do different with such a class, however, since sin(...) will be evaluated numerically. You could also alter the sin function (and others), to support Fractions explicitly (without taking their Scalar value, which would be the evaluated value), but here you will be in big trouble, since sin (as cos) is an infinite series.
 
Hope my answer helps you!
Florian
General: Re: My vote of 5 | DrABELL | 7 Oct '12 - 7:00
Thanks, Florian, I got the point! Actually, it's already done in my Fraction Calculator; there are two interrelated classes: "Fraction" and derived class "MixedNum" (that contains whole and fractional parts) which store the value of any rational number and perform the operators' overloads (including mix-to-improper fraction conversion), so it returns the result of math operations shown in both fractional (if the result is rational) and decimal forms. I was just wondering how that parser in your article will process mixed numbers, like for e.g. 5 1/3, or 5     1/3 with multiple blanks between whole and fractional parts, or 5   1   / 3 with more extra blanks? Does it have the ability to recognize it as corresponding rational number?
General: Re: My vote of 5 | Florian Rappl | 7 Oct '12 - 11:16
Spaces are being ignored by the parser (right now). Right now I do have to say that I have no clue how one could teach the parser to recognize 5 1/3 as 16/3 without changing too much. Problem with a special space operator (which I was thinking of first) is that it could probably violate the fixed EXP OP EXP or EXP OP OP etc. pattern that binary / unary operators have. The only thing that would work then is to make the parser more strict, i.e. spaces can only be used before fractions. But that is against my will, that the parser is able to parse a lot of "valid" expressions, even though there are wild whitespaces and expressions like 3//2 (is 3 / (1 / 2) = 3 * 2) and 3++2 (is 3+ 0 + 2= 3+2) etc.
[General] Re: My vote of 5 | DrABELL (member) | 7 Oct '12 - 11:40
The problem is actually solved in my Calculator for relatively simple computations. As I pointed out before, there are two additional classes, so the parser first converts the input string (e.g. 5 1/3) into the appropriate "MixedNum" class, and then the computational engine does the rest. That "Fraction/MixedNum" computation engine has been around for several years and works just fine. I will extend it to handle more complex expressions sometime soon and post a couple of articles on this topic. Anyway, thanks for your response and kind attention. Have a great weekend. Best regards, Alex
[Suggestion] Percent operator | pmarchois (member) | 3 Oct '12 - 4:43
Another suggestion: add the % operator.
 
3% = 0.03
 
This is not the modulo operator.
 
Is there another way to send you suggestions?
 
class PercentOperator : UnaryOperator
  {
    public PercentOperator()
      : base("%", 1000)
    {
    }
 
    public override Operator Create()
    {
      return new PercentOperator();
    }
 
    public override Value Perform(Value left)
    {
      if (left is ScalarValue)
        return new ScalarValue(((ScalarValue)left).Value / 100.0);
 
      throw new OperationNotSupportedException("%", left);
    }
  }

[General] Re: Percent operator | Florian Rappl (member) | 3 Oct '12 - 5:24
You can either make contributions directly over GitHub or post issues on GitHub. All you need is a GitHub account!
 
I am still uncertain about the percent operator. I was first thinking of a modulo operator, but I am not quite sure. I think I will leave this topic open for now!
 
Your implementation seems to be really great for a calculator for prices, tax, etc. - so leaving it open probably allows everyone to use the percent operator for their own purpose.
[Suggestion] Re: Percent operator | Zac Greve (member) | 25 Oct '12 - 10:00
I have been thinking on this, and the percent operator only has a value on the left (e.g. 3% = 0.03 or 3% * 100 = 3), while the modulo operator has values on the left and right (e.g. 3 % 2 = 1.) How hard would it be to check if there is a value on the right and select an operator accordingly? (e.g. no value on right = percent/value on right = modulo).
I think computer viruses should count as life. I think it says something about human nature that the only form of life we have created so far is purely destructive. We've created life in our own image.
Stephen Hawking

[General] Re: Percent operator | Florian Rappl (member) | 25 Oct '12 - 10:53
Right now there are just two types of operators - binary and unary. There is no such thing as an "either unary or binary" operator, and I have no clue how difficult the implementation would be. Right now I am thinking: it is possible and opens up interesting opportunities.
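One way the check suggested above (value on the right means modulo, no value means percent) could be sketched is a simple look-ahead on the raw expression string. Everything here is hypothetical: the class, the enum, and the Classify helper are made up for illustration and are not part of YAMP.

```csharp
using System;

static class PercentOrModulo
{
    public enum Kind { Percent, Modulo }

    // Hypothetical helper: decides how to read the '%' found at the given index
    // by looking at the next non-whitespace character.
    public static Kind Classify(string expression, int index)
    {
        if (expression[index] != '%')
            throw new ArgumentException("No '%' at the given index.", "index");

        for (int i = index + 1; i < expression.Length; i++)
        {
            char c = expression[i];

            if (char.IsWhiteSpace(c))
                continue;

            // Assumption: an operand starts with a letter, a digit, '.', '(' or '['.
            bool startsOperand = char.IsLetterOrDigit(c) || c == '.' || c == '(' || c == '[';
            return startsOperand ? Kind.Modulo : Kind.Percent;
        }

        // End of input: nothing on the right, so read it as percent.
        return Kind.Percent;
    }
}
```

For "3% * 100" the character after % is an operator, so the sketch picks percent; for "3 % 2" it picks modulo. A known weak spot is a signed right operand such as 3 % -2, which this look-ahead would read as percent followed by a subtraction.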
 
I will definitely think about it - right now I am busy with the Sumerics application; which kind of involves development on YAMP, but only when YAMP needs to evolve for Sumerics.
[Suggestion] Functions outside YAMP assembly | pmarchois (member) | 3 Oct '12 - 4:24
hi,
 
it would be great to scan all loaded assemblies in RegisterTokens() to be able
to add functions from outside the YAMP assembly (to integrate new YAMP versions easily).
 
i've done this by modifying Token.RegisterTokens():
void RegisterTokens()
    {
      var ir = typeof(IRegisterToken).Name;
      var fu = typeof(IFunction).Name;
      var af = typeof(ArgumentFunction);
      var mycst = new Constants();
      var props = mycst.GetType().GetProperties();
 
      Assembly[] asms = AppDomain.CurrentDomain.GetAssemblies();
      foreach (var assembly in asms)
      {
        //var assembly = Assembly.GetExecutingAssembly();
        var types = assembly.GetTypes();
 
        foreach (var type in types)
        {
          if (type.IsAbstract)
            continue;
 
          if (type.GetInterface(ir) != null)
            (type.GetConstructor(Type.EmptyTypes).Invoke(null) as YAMP.IRegisterToken).RegisterToken();
 
          if (type.GetInterface(fu) != null)
          {
            var name = type.Name.RemoveFunctionConvention().ToLower();
            var method = type.GetConstructor(Type.EmptyTypes).Invoke(null) as YAMP.IFunction;
            if (method != null)
            {
              methods.Add(method);
              AddFunction(name, method.Perform, false);
 
              if (type.IsSubclassOf(af))
                argumentFunctions.Add(name);
            }
          }
        }
      }
 
      foreach (var prop in props)
        AddConstant(prop.Name, prop.GetValue(mycst, null) as Value, false);
 
      sanatizers.Add("++", "+");
      sanatizers.Add("--", "+");
      sanatizers.Add("+-", "-");
      sanatizers.Add("-+", "-");
      sanatizers.Add("//", "*");
      sanatizers.Add("**", "*");
      sanatizers.Add("^*", "^");
      sanatizers.Add("*^", "^");
      sanatizers.Add("*/", "/");
      sanatizers.Add("/*", "/");
    }
 
perhaps there are better solutions...
 
Philippe
[General] Re: Functions outside YAMP assembly | Florian Rappl (member) | 3 Oct '12 - 5:21
Exactly what I plan to do (at least from the thinking).
 
I will today (in the next minutes / hours) write a function that will probably be called something like "RegisterFromAssembly(Assembly)". There you can add whatever assembly you want to. It will perform the register token part again (on that assembly) and add / overwrite existing functions and operators and stuff.
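A per-assembly registration of this kind could look roughly like the sketch below. It is a minimal illustration under stated assumptions: IFunctionSketch, DoubleItFunction and RegistrySketch are hypothetical stand-ins, not YAMP types, and only the name RegisterFromAssembly comes from the post above.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical stand-in for a function interface; YAMP's real interface may differ.
public interface IFunctionSketch
{
    double Perform(double argument);
}

// Example function that would live in a user assembly.
public class DoubleItFunction : IFunctionSketch
{
    public double Perform(double argument) { return 2.0 * argument; }
}

public static class RegistrySketch
{
    static readonly Dictionary<string, IFunctionSketch> functions =
        new Dictionary<string, IFunctionSketch>();

    // Scans one assembly and adds (or overwrites) every concrete function type found.
    // Returns the number of registered functions.
    public static int RegisterFromAssembly(Assembly assembly)
    {
        int count = 0;

        foreach (var type in assembly.GetTypes())
        {
            // Interfaces and abstract classes cannot be instantiated.
            if (type.IsAbstract || !typeof(IFunctionSketch).IsAssignableFrom(type))
                continue;

            // Skip types without a public parameterless constructor.
            var ctor = type.GetConstructor(Type.EmptyTypes);
            if (ctor == null)
                continue;

            // Assumed naming convention: "DoubleItFunction" is registered as "doubleit".
            var name = type.Name.Replace("Function", string.Empty).ToLowerInvariant();
            functions[name] = (IFunctionSketch)ctor.Invoke(null);
            count++;
        }

        return count;
    }

    public static bool TryGet(string name, out IFunctionSketch function)
    {
        return functions.TryGetValue(name, out function);
    }
}
```

A caller would pass its own assembly, for example RegistrySketch.RegisterFromAssembly(typeof(DoubleItFunction).Assembly). Taking the assembly as a parameter avoids scanning every loaded assembly, where GetTypes() can fail for assemblies with unresolvable dependencies.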
[General] My vote of 5 | RajuBhupathi (member) | 30 Sep '12 - 22:29
Very good article, solved many of my daily problems.
[General] Re: My vote of 5 | Florian Rappl (member) | 30 Sep '12 - 23:59
Thanks! :rose:
[General] My vote of 5 | JF2015 (member) | 26 Sep '12 - 2:06
Again an excellent article!
[General] Re: My vote of 5 | Florian Rappl (member) | 26 Sep '12 - 5:33
Thanks a lot! :thumbsup:
[General] My vote of 5 | Kenneth Haugland (member) | 25 Sep '12 - 22:02
Like the article.
 
I have written something that just parses complex numbers using RegEx only, but you seem to use both RegEx and SubString?
[General] Re: My vote of 5 | Florian Rappl (member) | 26 Sep '12 - 0:59
Yes, I use a combination to allow more complex expressions and also to make the parser more extensible.
 
I also tried using StringBuilder instead of always passing string instances, which are immutable. However, the performance stays the same. What makes a difference is the usage of "StartsWith()". This method is somehow implemented inefficiently (for my case), since I could gain a lot of performance by avoiding it with easier tests.
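A likely reason for the StartsWith() cost: in .NET, the single-argument string overload performs a culture-sensitive comparison using the current culture, which does more work than a plain character comparison. The sketch below shows the alternatives; the class and method names are made up for illustration, and the speed difference should be measured for the case at hand.

```csharp
using System;

static class PrefixCheck
{
    // Culture-sensitive comparison: this is what StartsWith(string) does by default.
    public static bool CultureSensitive(string text, string prefix)
    {
        return text.StartsWith(prefix);
    }

    // Ordinal comparison: compares the raw char values only.
    public static bool Ordinal(string text, string prefix)
    {
        return text.StartsWith(prefix, StringComparison.Ordinal);
    }

    // Hand-written test at an offset, which also avoids creating substrings.
    public static bool MatchesAt(string text, int offset, string token)
    {
        if (offset < 0 || offset + token.Length > text.Length)
            return false;

        for (int i = 0; i < token.Length; i++)
        {
            if (text[offset + i] != token[i])
                return false;
        }

        return true;
    }
}
```

For a tokenizer that walks through an expression, something like MatchesAt(expression, position, "sin") checks for a token at the current position without cutting the string first.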
[General] My vote of 5 | LWessels (member) | 25 Sep '12 - 7:22
Very good article.
 
The links to MP.NET and MFP don't seem to work for me.
 
For the 2^2^2^2:
"Now we get the same result as Google shows us - 65356." You probably meant 65536
[General] Re: My vote of 5 | Florian Rappl (member) | 25 Sep '12 - 9:04
Thanks a lot!
I fixed the typo regarding the links (the attribute was spelled hef instead of href) and I also fixed the typo regarding the result of 2^2^2^2 - 65536 is of course correct.
 
Thanks again! :rose:
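For reference, the value 65536 corresponds to reading the power operator as right-associative, which is assumed here to be what the article means by matching Google's result. Evaluated from the right:

```latex
2^{2^{2^{2}}} = 2^{2^{4}} = 2^{16} = 65536
```

Reading the same expression from the left would instead give

```latex
\left(\left(2^{2}\right)^{2}\right)^{2} = \left(4^{2}\right)^{2} = 16^{2} = 256
```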
[Question] Functions with StringValue arguments | Member 1563995 (member) | 24 Sep '12 - 8:26
hi,
 
thank you for sharing such a great job.
 
i'm trying to add a class to execute functions
having (for example) 3 arguments :
Foo(string t, double x, double i)
 
i've declared a class FooFunction
with 1 member Function(StringValue t, ScalarValue x, ScalarValue i)
 
but when i enter Foo("titt", 12, 0.02), the parser complains about the ',' operator
(code = OperationNotSupportedException.cs line 14).
=> it seems that StringValue arguments are not allowed for functions having more
than 1 argument.
 
is it correct ?
what's wrong with my code ?
 
i've tried to debug :
the parser evaluates args as Foo ["titt", [12,0.02]]
it doesn't understand the comma between string and number.
if i put the string argument at the end, the error remains the same
 
best regards,
philippe
[Answer] Re: Functions with StringValue arguments | Florian Rappl (member) | 24 Sep '12 - 9:25
Is the class where you want to specify the argument function also derived from ArgumentFunction?
 
Using this code:
 
using System;
 
namespace YAMP
{
    class FooFunction : ArgumentFunction
    {
        public Value Function(StringValue t, ScalarValue x, ScalarValue i)
        {
            return t;
        }
    }
}
 
I was able to do the following statement:
 
foo("a", 2, 3)
 
which resulted in the string "a" being returned. Maybe you are inheriting from a different function. Hope this helps you!
 
Florian
 
PS: In ArgumentFunction the Comma is treated differently than in StandardFunctions. The reason for this is the following: I still wanted to write matrices etc. in a simple way using commas - but I also needed a way to separate different arguments for argument functions... Therefore the parser decides which kind of method it is and switches from matrix mode to argument mode for ArgumentFunctions. If the parser does not switch it is still in matrix mode, i.e. commas are separating columns.
 
This behavior results in an exception in your case (though a wanted exception) - since only numeric values can be placed in a matrix (like the last 2 columns).
[General] Re: Functions with StringValue arguments | pmarchois (member) | 24 Sep '12 - 23:08
i've downloaded the latest version (published this weekend) on github. it now works :)
 
i've made a change to allow '_' in function names :
FunctionExpression.cs(14):
public FunctionExpression () : base(@"[A-Za-z]+[A-Za-z0-9_]*\(.*\)")
SymbolExpression.cs(33):
public SymbolExpression () : base(@"[A-Za-z]+[A-Za-z0-9_]*\b")
 
if parameters entered by the user are not of the correct type, an exception is thrown (ArgumentFunction.cs(80)).
i've made these changes to be more user friendly :
if (functions.ContainsKey(args))
{
    var method = functions[args];

    try
    {
#if PHM
        // PhM 25/09/2012 : Check arguments type
        ParameterInfo[] pis = method.GetParameters();
        for (int i = 0; i < pis.Length; i++)
        {
            // Compare against the declared parameter type; pis[i].GetType() would
            // only return the runtime type of the ParameterInfo object itself.
            if (!pis[i].ParameterType.IsInstanceOfType(arguments[i]))
                throw new Exception(method.DeclaringType.Name.Replace("Function", "") + " : parameter #" + (i + 1).ToString() + " is invalid !");
        }
#endif
        // call the function
        return method.Invoke(this, arguments) as Value;
    }
    catch (Exception ex)
    {
#if PHM
        if (ex.InnerException == null)
            throw; // rethrow without resetting the stack trace
        else
#endif
        throw ex.InnerException;
    }
}
[General] Re: Functions with StringValue arguments | Florian Rappl (member) | 25 Sep '12 - 6:38
Looks great!
I will include those changes in the code, since they seem very appealing :)
 
Thanks :rose:
[General] My vote of 5 | SamarRizvi (member) | 23 Sep '12 - 10:00
Nice article, good explanation
[General] Re: My vote of 5 | Florian Rappl (member) | 23 Sep '12 - 20:33
Thanks! :rose:
[General] My vote of 5 | Akram El Assas (member) | 23 Sep '12 - 3:52
Great article. I like the matrix syntax, simple and to the point.
[General] Re: My vote of 5 | Florian Rappl (member) | 23 Sep '12 - 4:00
Thanks a lot! I tried to make it simple and effective, so your comment means a lot to me :thumbsup:
[General] My vote of 5 | DrABELL (member) | 21 Sep '12 - 6:15
Excellent article in every aspect, and just in time! Solid 5*
[General] Re: My vote of 5 | Florian Rappl (member) | 21 Sep '12 - 10:21
Thanks a lot!
[General] Re: My vote of 5 | DrABELL (member) | 21 Sep '12 - 11:12
You are welcome! I look forward to reading more of your articles. Best, AB
[General] My vote of 5 | Aditya_Pandey (member) | 21 Sep '12 - 2:16
Excellent!
[General] Re: My vote of 5 | Florian Rappl (member) | 21 Sep '12 - 5:21
Thanks! :thumbsup:
[Question] My vote of 5 | AlexCode (member) | 20 Sep '12 - 20:36
I still have to properly put it to work, but it looks like one of the best parsers around.
 
Nice work! :)
[Answer] Re: My vote of 5 | Florian Rappl (member) | 20 Sep '12 - 20:50
Thanks a lot - I appreciate your kind words! :rose:
[Question] IF, LOGIC, DATE and For Loop Function | Michael Moreno (member) | 20 Sep '12 - 19:10
Hello,
 
This seems very impressive! I tried opening the project in MonoDevelop on a Mac but it fails with one library saying 'not built in active configuration'. So I cannot try it until I get my hands on a PC later on. Hence a few questions, if you do not mind too much.
 
I browsed the code quickly and could not find some key functions which I expected to see out of the box given how advanced YAMP seems to be:
- IF Statement which would let us do things such as IF(X>0,1/X,0)
- Logical function: AND, OR, XOR
- Dates functions where we could set a DateTime variable and then query it such as: IF(AND(MONTH(MYDATE)=2,DAY(MYDATE)=29),1,0)
 
Going a bit further and again with really high expectations from what seems to be an amazing parser, I could not find the following out of the box:
- Breaking Statement to throw exception from the formula itself as in IF(X<0, THROWEX("X CANNOT BE NEGATIVE")) - this makes it far easier to know why something fails when the expression is large
- FOR LOOP functions with LOOPINDEX built in variable as in FORLOOP(0,10,MyOtherFunction(X,LOOPINDEX))
 
Perhaps most of these functions could be added by the user but then maybe it might make sense to have them defined out of the box.
 
Many thanks for your help and for sharing your work.
 
Michael M
[Answer] Re: IF, LOGIC, DATE and For Loop Function | Florian Rappl (member) | 20 Sep '12 - 20:23
Thanks for your kind comment!
 
Now to your questions:
- I programmed most of the library in MonoDevelop on Snow Leopard; the problem here (same with VS, by the way) is that projects deactivated for a specific build type are still required. To get around this you can (one of many options) go to the project options and change the mapping of the output configuration, so that all projects (in this case YAMP.Compare is the missing one) are built.
 
- I am thinking about a kind of scripting language for YAMP (like MATLAB has). At the moment only single-line statements are possible, so if, else, ... do not make much sense. But they will be included in a future version, when you can write multi-line queries.
 
- The previous answer also covers AND, OR, XOR, ...
 
- I will think about date functions and / or a date datatype!
 
- Also break; loops etc. are part of more advanced (scripting) languages - I will include them for sure once I have included multiple line statements.
 
Thanks and all the best,
 
Florian
[General] Re: IF, LOGIC, DATE and For Loop Function | Michael Moreno (member) | 21 Sep '12 - 9:47
Florian,
 
no, thanks to you for sharing this little gem.
 
If you look at Excel or any spreadsheet app, you can only write one single-line statement per cell, but that does not mean that the statement cannot be really long. All spreadsheet apps provide IF, AND, OR, XOR and DATE functions, which is part of the reason why they are so useful and powerful. The IF function is key IMHO to a math parser in the 'real world'. As an example, the indicator function 1[X>b] is often needed in integral or derivative calculus; it is simply written in Excel as IF(X>b,1,0).
 
Math Parsers are used to extend Apps, not to calculate simple formulas. Whereas the algorithms in C# are compiled and fixed, the Math expression thanks to the Parser is not and this is why the IF function and logic functions are key in a Parser. They solve issues that the C# coder cannot solve in advance and make the app truly extensible.
 

Personally I see no reason why you would want to do scripting and would suggest you do not try to go down that route. IronPython and the DLR already solve this extremely complex problem.
 

Your work so far is very humbling and I thank you very much for showing it to us.
 

Regards,
Michael M
[General] Re: IF, LOGIC, DATE and For Loop Function | Florian Rappl (member) | 21 Sep '12 - 10:30
I partially agree with you. But in my opinion "if" in Excel is a bad joke. There are several reasons for my opinion. The first is that Excel functions are localized (who did implement that?!). The second one is that it is actually a function - therefore "if" results in statements like if(bool,do that, do otherwise), which still looks clean; however, who needs one if, will most probably need another and therefore you get some nasty, unreadable statements out of it.
 
Why would I want to go for scripting? Well, just look at MATLAB (which is the reference for numerics). Without the scripting possibility, MATLAB would not be as successful as it is. I need multiline statements for that, since I currently think of a kind of mix between the MATLAB syntax and Python like statements, i.e. new lines and indentations are part of the syntax. However, I did not sort my thoughts completely (and I did not start implementing anything regarding this issue), which means there is some discussion left.
 
I really thank you for your interesting comment. I will think about it carefully before I take any step towards a solution.
 
BTW: Logical statements are already included (including logical subscripting). If you do the following statements:
 
x = randi(5, 0, 20)
x[isprime(x)] = 0
 
then every prime number that is included in this 5x5 integer matrix is set to 0. Also consider this
 
x = randi(5, -10, 10)
x[x > 0] = 0
 
then every positive number in this 5x5 integer matrix is set to 0.
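What such a masked assignment does can be sketched in plain C#. This is only an illustration of the semantics, not YAMP's implementation; MaskSketch, AssignWhere and IsPrime are hypothetical names, and an int[,] array stands in for the matrix type.

```csharp
using System;

static class MaskSketch
{
    // Sets every element for which the predicate holds to the given value (in place).
    // This mirrors what x[condition(x)] = value expresses in the statements above.
    public static void AssignWhere(int[,] matrix, Func<int, bool> predicate, int value)
    {
        for (int r = 0; r < matrix.GetLength(0); r++)
        {
            for (int c = 0; c < matrix.GetLength(1); c++)
            {
                if (predicate(matrix[r, c]))
                    matrix[r, c] = value;
            }
        }
    }

    // Simple trial-division primality test, good enough for small matrix entries.
    public static bool IsPrime(int n)
    {
        if (n < 2)
            return false;

        for (int d = 2; d * d <= n; d++)
        {
            if (n % d == 0)
                return false;
        }

        return true;
    }
}
```

In this sketch, MaskSketch.AssignWhere(x, v => v > 0, 0) plays the role of x[x > 0] = 0, and MaskSketch.AssignWhere(x, MaskSketch.IsPrime, 0) plays the role of x[isprime(x)] = 0.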
 
Kind regards,
Florian
[Suggestion] Re: IF, LOGIC, DATE and For Loop Function | Zac Greve (member) | 22 Sep '12 - 13:52
I was thinking about this, and I would like to suggest having a core library with the one line statement support, basic functions, and the core interfaces, and several other libraries that implement the scripting, more advanced functions, and various other things.
 
The application using this project could just reference what it needs, and the referenced libraries could have an initialization function that registers them into a static plugin manager class.
 
This way, applications could include only the functionality that they need, along with keeping the resulting output size down somewhat.
 
You don't have to do it this way, of course, but it might be beneficial in the long run.
 
(Please forgive errors, my iPad's spell checker doesn't work correctly)

[General] Re: IF, LOGIC, DATE and For Loop Function | Florian Rappl (member) | 22 Sep '12 - 22:11
Thanks for your comment Zac - it goes in the direction I wanted to go and I will definitely do it like that. The only question now is what other functions will be included in the core library. At the moment I have already started a YAMP.Numerics namespace that will contain some of the (more advanced?) numeric methods. I will probably include a basic list of those functions in the core library.

Last Updated 20 Sep 2012
Article Copyright 2012 by Florian Rappl