(untagged)

Math Expressions Evaluator

DetoX83

0.00/5 (No votes)

28 Mar 2010

Library that evals complex mathematical expressions into values

Download source - 157.28 KB

Introduction

In this article, I'll describe my C# math evaluator called "HInterpreter". I won't go deep into the algorithm in a step-by-step manner, instead I'll try to describe as much information as possible to enable you to have a clear view as to how it works and how developers can extend it.

Background

Here's what is most important - what is this for? Well - from the programmer point of view, the "brain" of the entire library is the Interpreter class which has one public method:

public double Calculate(string input)

By giving the input string, which is supposed to be a math expression, the function returns the calculated value.

Before I start with the implementation details, here are some facts that can give more view about possibilities:

Library can evaluate any math expression - there is no max-length limit
Spaces insensitive, eg: "10 - (2*8) / (20 +1)"
Evaluate any type of brackets, e.g.: "2+3*[7-1] / {10%2}"
Evaluate float numbers, e.g.: "3.65 * 2 / (1-0,1)"
Evaluate functions, e.g.: "1 + sin(4*(sqrt(9) + abs(-2))) + Exp(7)"
Evaluate mathematical constants, e.g.: "3 + cos(pi*2) + e"
Easy for extensions (see Extensions chapter for more info.)

To calculate the expression - the expression itself must be valid! Otherwise exception will occur.

Algorithm

To obtain such logic, I used RPN (Reverse Polish notation). It is a mathematical notation wherein every operator follows all of its operands. It is also known as Postfix notation and is parenthesis-free as long as operator arities are fixed.

That means - given the expression:

"5 + ((1 + 2) * 4) - 3"

Program transforms it to:

5 1 2 + 4 * + 3 -

The expression is evaluated left-to-right. An interesting fact is that RPN is quite simple to achieve. What needs to be done first is create the postfix notation from typical "infix". In this case - second algorithm is used, called Shunting-yard algorithm. It is quite complicated, if you are interested in details, check here.

Implementation

The entire algorithm is written within the Interpreter class. However this is not the only class available in the library. To achieve easy extensibility, the entire library is split into logic classes:

Interpreter class
This is the main high-level class that is used to calculate values from expression. It contains algorithm implementation. Public Calculate() method has already been described. If you only want to use library (not extend), that's the only class you should know.
Token class
Token is the abstract class. It represents any single, atomic part of expression. That means: function, operator, number value... but also left and right brackets.
TokenFactory
This class is used to recognize part of the expressions. By given string, TokenFactory returns the appropriate object (more information in the next chapter).

Class Hierarchy

As said, the library is built for extension. Before we get to the actual "how to extend" scenario, have a quick look at the below diagram. It shows the library class hierarchy (well, part of it - that inherits from Token abstract class).

Token Class

As you can see, Token is the root class for all expression parts. Every atomic part of the expression, that we want to distinguish, must inherit from Token. Token code implementation is very short:

public abstract class Token {

       public readonly string Entry;

       public Token(string entry) {
           Entry=entry;
       }
   }

Nothing fancy here - just one readonly field called "Entry". That field contains a string corresponding to a particular expression.

Example: In expression "1+sin(4)", we can distinguish 4 tokens:

value 1
operator +
sinus function
function argument

For these tokens, their Entries are : "1", "+", "sin", "4".

NumberBase Class

This is the abstract class for all constant values written in the expression. For instance: it has an implementation for the "pi" number (3.14), "e" number, as well as typical math value - Number class. Number class is the representation of any number simply written in the expression. If you want to add your own math contants, you should definitely inherit from the NumberBase class.

And below is an example of implementation - PI number:

class Pi: NumberBase {

        public override double Value {
            get {
             &nbs

p;  return Math.PI;
            }
        }

        public Pi(string value)
            :base(value){
        }
    }

OperatorBase Class

No puzzle here, this class is to manage operators behaviour. Let's look on the implementation:

    abstract class OperatorBase: Token{

        public virtual int Precedence {
            get {
             &nbs

p;  return 1;
            }
        }

        public virtual Associativity 

Associativity {
            get {
             &nbs

p;  return Associativity.Left;
            }
        }

        public OperatorBase(string value)
            :base(value) {
        }

        public abstract double Calculate(double 

left,double right);
    }

As you can see, some more complicated stuff is present here. The major part is abstract Calculate method that every operator must implement. Besides that, we see:

Precedence. Value of that field manages operators precedence - for example: multiplication must be done before addition (talking about parameterless expression). Default value of the precedence is 1. For instance: multiplication has precedence of 2, and power has 3. Thus, such expression: "1+2*4^3" acts properly (first power, then multiplication and then addition).
Associativity. This field is of type Enum with two values possible: Left and Right (default is Left). That mean - when calculating following: "1+2+3", first 1+2 is calculated, and result is added to 3. Particular operators can override it. Example: power operator has Right associativity.

Example of implementation - operator "-" (Subtraction):

  class Subtraction:OperatorBase {
        public Subtraction(string value): base

(value) {
        }

        public override double Calculate(double 

left,double right) {
            return left-

right;
        }
    }

FunctionBase Class

This is the base class for all functions (like sin, cos, round, abs, sqrt, exp, etc..). Let's look at the code:

abstract class FunctionBase: Token {

        public virtual int OperandsCount {
            get {
             &nbs

p;  return 1;
            }
        }

        public FunctionBase(string value)
            : base(value) {
        }

        public abstract double Calculate(params 

double[] args);
    }

Similar to operators, the main function that needs to be overridden is the Calculate function. It takes 1 parameter: array of double. This parameter represents operands of a function (although currently library can manage only functions of 1 operand, this parameter was designed for the future. That means OperandsCount property is not implemented yet, and should not be overridden).

A Quick Look At the Sample Function

class Cosinus: FunctionBase {

        public Cosinus(string value)
            : base(value) {
        }

        public override double Calculate(params 

double[] args) {
            return 

Math.Cos(args[0]);
        }
    }

Extension

If you already read the entire article above, it should be clear as to how to extend the library.

If you want to add a new operator - inherit from OperatorBase class
If you want to add a new function - inherit from FunctionBase class
If you want to add a new math constant value - inherit from NumberBase class

But declaration of new classes is not enough. If you want them to be used in Interpreter, one more class needs to be extended - it's the TokenFactory class.

TokenFactory is responsible for creation of appropriate Token object based on part of the expression. Extending it is very simple. Here's a part of the TokenFactory code that should explain every unknown:

RegisterToken<Power>(x => x=="^");
RegisterToken<E>(x => x=="e");
RegisterToken<Sinus>(x => x=="sin");
RegisterToken<Cosinus>(x => x=="cos");

What is shown above is the part of the static constructor of TokenFactory class. You don't have to go deep into the RegisterToken function - simply add a new entry and the corresponding expression. In the example, we see registration of power operator (that is "^" char), registration of Euler number ("e" char) and function -sin and cos. That means: where "sin" expression is found - it's considered as a sinus function. Notice that you don't have to specify whether a particular token is a number, function or operator. Just specify a token type (as a generics type parameter), and expression representation as string. That's it.

Conclusion

What do I say .. I hope you like my library. A small request before comments / rating - please don't blame my English grammar skills (or lack of it...) - it's not perfect yet.

History

2^nd September, 2009: Initial post
27^th March, 2010: Updated source code

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

A list of licenses authors might use can be found here