## Introduction

I remember the good old college days, in particular the first numerical algorithm class of mine. The first assignment of that class
was to create a mathematical calculator which took in expressions to evaluate. The application would then be used for other assignments during the semester
(note this was the first assignment of the semester on the first day of class) in which we would substitute common mathematical functions for algorithms we wrote.
With only taking one programming class so far in college, the assignment felt daunting. The first question that came to mind was, "If this is a math class,
why can't we just write the numerical algorithms in MatLab and just show its simple usages there". I felt the
task of creating a calculator just like my precious TI 92 to be totally overwhelming.

In retrospect, if I had taken the time to define requirements for the assignment, I would have realized the assignment was pretty trivial. The amount of work to be done
would not have been scary, and the code produced for the assignment would not have been as ugly and full of bugs, as it ended up being.

The issue I faced when I was given that assignment was that I had no base point at which
to begin thinking of a solution (yes, I remember BODMAS (bracket, off, division, multiplication, addition, and subtraction)). I however didn't know how to structure
the requirements needed into a suitable form or grammar, so that I could focus only on fulfilling the rules created for the grammar.
I was more bothered about functionality and how to parse every corner case I could think of.

A few years later, while reading the book "Algorithms and Data Structures in C++ by Leendert Ammeraal", specifically Chapter 11 (which discusses
about the fundamentals of interpreters and compilers), I decided to revisit that dreadful assignment of mine.

## Background

There are two concepts I used in creating the mathematical expression evaluator detailed in this article. One is the use of BNF to come
up with a grammar suitable for the creation of the mathematical expression and the second is recursive decent parsing which I used to translate the grammar to code.

I will not claim to be a grammar expert, as such the rules I defined can be refined as the reader wishes (and any flaws I make can be forwarded to me and I will fix it).
I sort of followed the rules from the Wikipedia page on EBNF http://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_Form.
The reader should however note that the implementation in this article matches my grammar rules.

Below is the grammar implemented. I will presume the grammar is pretty straightforward, as such I will only mention a few assumptions I made within the grammar.

Firstly, I presume that *Variable*s do not contain *Digit*s as part of the name (Variables can be 1 or more *Alphabet*ical letters).
I presume that the Unary operator "-" (e.g., -1 or -ab) is a factor defined by ( "(-*Expression*", e.g., (- 1) ,
((-a) + b), or (-ab)). I presume that a *Function* is part of *Term* and it can be decomposed back to a factor. The only other assumption
I believe worth mentioning is the factorial "!" operator, that is made optional for *Term*s. I had a choice of either moving this
to *Factor* or *Term*, my final decision was to place it within the definition of *Term*.

INPUT = EXPRESSION
EXPRESSION = TERM { "+" | "-" TERM }
TERM = (FUNCTION | FACTOR) { "*" | "/" | "%" | "^" FUNCTION | FACTOR } { "!" }
FUNCTION = ( "COS"| "SIN" | "TAN" | "EXP" |"LN" | "LOG" "(" FACTOR ")" )
FACTOR = NUMBER|VARIABLE| ( "(" ["-"] EXPRESSION ")" )
VARIABLE = ALPHABHET,{ALPHABHET}
NUMBER= DIGIT,{["."] DIGIT}
DIGIT = 0|1|..|9
ALPHABHET = a|b|..|z|A|B|..|Z

The technique used to transform the grammar to code is called recursive descent parsing (RDP), which is a form of top down parsing
built from a set of functions that are written in terms of each other. With recursion in mind, I created a function for all Non Terminal
identifiers that focuses on only aspects of their defined grammar.

Below are the main methods used to represent the grammar.

ISymbol Expression()
Symbol Term()
Symbol FunctionEvaluator()
Symbol Factor()

Utilizing the code discussed underneath, I created an example mathematical expression parser. As shown by the image below:

## Using the Code

The first thing that came to mind while thinking of how to implement the expression parser was how I wanted the parser to be used.
I figured that the user of the `Evaluator`

class would want to be able to set and get the `Expression`

to parse,
the user would want to get the floating point `Result`

if the Expression could be evaluated. If it can not be evaluated (that is if it contains
undefined variables), the user would want to get a processed string result. With that in mind, I created the `IExpression`

interface. I then created
the `ISymbol`

interface for the intermediate types of information I would need as I processed each symbol. For numbers, the `FloatingValue`

variable is used;
for variables, operations, and functions `StringValue`

is used. To differentiate between all variables, operations, functions, and parenthesis,
I created the `SymbolType`

enumeration (in the Expression calculator example
attached, the symbol type helped with implementing situations such as "-1", "1--1", "1*-1" ).

public interface IExpression
{
string Expression { get; set; }
double Result { get; set; }
bool CanEvaluate { get; set; }
string StrResult { get; set; }
}
public enum SymbolType {expression, operation, function, variable, digit,brackets,endTag}
public interface ISymbol:IExpression
{
SymbolType SymbolType { get; set; }
double FloatingValue { get; set; }
string StringValue { get; set; }
}

The `Symbol`

class implements the `ISymbol`

interface and includes the method `TypedClone()`

which implements a strongly typed member-wise clone of the
`Symbol`

object itself. The clone is needed to maintain the original expression passed in. The extension method in the static class `ExtensionLib`

adds a means to clone an `IExpression`

type to a `Symbol`

type.

public class Symbol:ISymbol
{
public string Expression { get; set; }
public double Result { get; set; }
public bool CanEvaluate { get; set; }
public string StrResul { get; set; }
public SymbolType SymbolType { get; set; }
public double FloatingValue { get; set; }
public string StringValue { get; set; }
public Symbol TypedClone()
{
return this.MemberwiseClone() as Symbol;
}
}
public static class ExtensionLib
{
public static Symbol TypedClone(this IExpression v)
{
return ((Symbol) v).TypedClone();
}
}

The field _`currentPositionCnt`

maintains the position in the `Expression`

string indicating where to continue parsing from.
_`processVariableExpression`

is the flag that allows a check to be made to see if a `Variable`

can be mapped to a value. The _`lastReadSymbol`

holds
the last `Symbol`

parsed that is yet to be consumed. The _`expression`

object holds a copy of the original object that can then be modified.
The _`originalExpression`

object holds the original object with no modifications made to it. _`listOfFunctions`

holds the list of supported functions.
_`listOfBinaryOperators`

holds the operators that take the two parameters in performing calculations. _`listOfUnaryOperators`

holds the operators
that act on a single parameter (or `Expression`

) and the `VariableMapper`

dictionary holds a mapping of Variables to Values.

private int _currentPositionCnt;
private bool _processVariableExpression;
private Symbol _lastReadSymbol;
private IExpression _expression;
private IExpression _originalExpression;
private readonly List<string> _listOfFunctions = new List<string>();
private readonly List<string> _listOfBinaryOperators = new List<string>();
private readonly List<string> _listOfUnaryOperators = new List<string>();
public readonly Dictionary<string, double> VariableMapper { get; private set; }

The constructor initializes the _`listOfFunctions`

, _`listOfBinaryOperators`

, _`listOfUnaryOperators`

list
and `VariableMapper`

dictionary with their respective values (functions, operators, and a dictionary of variables to values).

public Evaluator()
{
_listOfFunctions.AddRange(new [] { "COS", "SIN", "TAN", "EXP", "LN", "LOG" });
_listOfBinaryOperators.AddRange(new [] { "*", "/", "^", "%", "+", "-" });
_listOfUnaryOperators.Add("!");
VariableMapper = new Dictionary<string, double> {{"PI", Math.PI}};
}

The `Resolve()`

method takes in an object which adheres to the `IExpression`

interface, and a boolean argument which tells us we should evaluate mappable variables.
The method trims out the space in between characters, it also factors expressions with variables to make for a compact calculation the next go around when we actually match variables
with values. The latter portion can be removed and the `Factor()`

method updated to always just check if a `Variable`

can be mapped to a value.

public IExpression Resolve(IExpression exp, bool resolveExp=false)
{
_currentPositionCnt = 0;
_originalExpression = exp;
_expression = exp.TypedClone();
_expression.Expression = _expression.Expression.Replace(" ", "");
_expression.Expression += "#";
var result = Expression();
if (!NextIs("#"))
throw new Exception
("An error occurred while procession the expression [ "
+ _originalExpression.Expression + " ] At position "
+ _currentPositionCnt.ToString());
if (resolveExp && !result.CanEvaluate && VariableMapper.Count > 0)
{
_processVariableExpression = true;
_lastSymbolType = SymbolType.expression;
_currentPositionCnt = 0;
_lastReadSymbol = null;
_expression.Expression = result.StrResult.Replace(" ", "");
result = Expression();
}
return result;
}

The `ResolveAsString()`

method takes in an object which adheres to the `IExpression`

interface, and a boolean argument which tells us we should evaluate
mappable variables. The method returns to the caller a string result of either a floating point value or a string expression.

public string ResolveAsString(IExpression exp, bool resolveExp=false)
{
var result = Resolve(exp, resolveExp);
return result.CanEvaluate ? result.Result.ToString() : result.StrResult;
}

The `Expression()`

method calls the `Term()`

method. The resultant from the call can either be a value that can
be evaluated, a *Variable*, or an *Expression*. A check is then made to see if the next following `Symbol`

retrieved
via the `NextIs()`

method call is an operator such as "`+`

", "`-`

",
in which case we call `Term()`

again to get a second parameter needed for evaluation. Else we return with the X `Symbol`

.

private ISymbol Expression()
{
var x = Term();
if (x == null) return null;
for (; ; )
{
if(NextIs("+"))
{
var y = Term();
if (y.CanEvaluate && x.CanEvaluate)
x.Result = x.FloatingValue = x.FloatingValue + y.FloatingValue;
else
{
x.StringValue =
"("+(x.CanEvaluate ? x.FloatingValue.ToString()
: x.StringValue) +
" + " + (y.CanEvaluate ? y.FloatingValue.ToString()
: y.StringValue) + ")";
x.CanEvaluate = false;
}
}
else if (NextIs("-"))
{
var y = Term();
if (y.CanEvaluate && x.CanEvaluate)
x.Result = x.FloatingValue = x.FloatingValue - y.FloatingValue;
else
{
x.StringValue = "("+(x.CanEvaluate ? x.FloatingValue.ToString()
: x.StringValue) +
" - "+ (y.CanEvaluate ? y.FloatingValue.ToString()
: y.StringValue) + ")";
x.CanEvaluate = false;
}
}
else
{
break;
}
}
x.Result = x.FloatingValue;
x.StrResult = x.StringValue;
return x;
}

The `Term()`

method calls the `Factor()`

method. The resultant can either be a value that can be evaluated, a variable, or an expression.
A check is then made to see if the next following `Symbol`

retrieved via the `NextIs()`

method call is an operator such as "`*`

",
"`/`

", "`%`

", "`^`

", "`!`

" or a *Function*, else we return with the X symbol.
If it is one of the above operators, we call the `Factor()`

method again to get the next `Symbol`

(which again can be a value, a variable, or an expression).
Note the factorial case does not call the `Factor()`

method again. Also worthy of pointing out is the fact that operator
precedence are all equal within this method. To raise precedence of any operation, parenthesis would need to be used, to encompass that portion of the expression.

private Symbol Term()
{
var x = Factor();
if (x == null) return null;
for (; ; )
{
if (NextIs("*"))
{
var y = Factor();
if (y.CanEvaluate && x.CanEvaluate)
x.Result = x.FloatingValue = x.FloatingValue * y.FloatingValue;
else
{
x.StringValue = "("+(x.CanEvaluate? x.FloatingValue.ToString()
: x.StringValue) +
" * " + (y.CanEvaluate ? y.FloatingValue.ToString()
: y.StringValue) + ")";
x.CanEvaluate = false;
}
}
else if (NextIs("/"))
{
var y = Factor();
if (y.CanEvaluate && x.CanEvaluate)
x.Result = x.FloatingValue = x.FloatingValue / y.FloatingValue;
else
{
x.StringValue = "("+(x.CanEvaluate ? x.FloatingValue.ToString()
: x.StringValue) +
" / " + (y.CanEvaluate ? y.FloatingValue.ToString()
: y.StringValue) + ")";
x.CanEvaluate = false;
}
}
else if (NextIs("%"))
{
var y = Factor();
if (y.CanEvaluate && x.CanEvaluate)
{
x.Result = x.FloatingValue = x.FloatingValue % y.FloatingValue;
}
else
{
x.StringValue = "("+(x.CanEvaluate ? x.FloatingValue.ToString()
: x.StringValue) +
" % " + (y.CanEvaluate ? y.FloatingValue.ToString()
: y.StringValue) + ")";
x.CanEvaluate = false;
}
}
else if (NextIs("^"))
{
var y = Factor();
if (y.CanEvaluate && x.CanEvaluate)
{
x.Result = x.FloatingValue = Math.Pow(x.FloatingValue , y.FloatingValue);
}
else
{
x.StringValue = "( (" + (x.CanEvaluate ? x.FloatingValue.ToString()
: x.StringValue) +
")^ " + (y.CanEvaluate ? y.FloatingValue.ToString()
: y.StringValue) + ")";
x.CanEvaluate = false;
}
}
else if (NextIs("!"))
{
if (x.CanEvaluate)
{
x.Result = x.FloatingValue = Factorial(x.FloatingValue);
}
else
{
x.CanEvaluate = false;
x.StringValue = "(" + (x.CanEvaluate ? x.FloatingValue.ToString()
: x.StringValue) +")! ";
}
}
else if (_listOfFunctions.Contains(x.StringValue))
{
x = FunctionEvaluator(x.StringValue);
}
else
{
break;
}
}
return x;
}

The `Factor()`

method calls the `Next()`

method to get the next `Symbol`

. If the `Symbol`

can be evaluated, it is immediately
returned else we check if the `Symbol`

is a parenthesis. If the `Symbol`

is a parenthesis, we go on to call the expression again.
Note the `PreviewNextifTrueMoveTrackerByOne()`

method. This is used to handle the case where a negative sign follows the opening parenthesis.
If we can process Variables, we look up the variable names in the `VariableMapper`

dictionary and get the corresponding value. Note: one of the things
I would have loved to have implemented here is, the ability for Variables to point to Variables.

private Symbol Factor()
{
var x = Next();
_lastReadSymbol = null;
if (x != null && x.CanEvaluate)
return x;
if (x != null)
{
if (x.StringValue == "(")
{
bool isunaryMinus = PreviewNextifTrueMoveTrackerByOne("-");
x = Expression() as Symbol;
if (x == null || !NextIs(")"))
{
throw new
Exception("Expression needs to end with a parenthesis "
+ "[ " + _originalExpression.Expression
+ " ] Error at position "
+ _currentPositionCnt.ToString());
}
if (isunaryMinus)
{
if (x.CanEvaluate)
x.FloatingValue *= -1;
else
x.StringValue = "((-1) * " + x.StringValue+")";
}
}
if(_processVariableExpression && x.SymbolType == SymbolType.variable)
{
if (VariableMapper.ContainsKey(x.StringValue))
{
x.FloatingValue = VariableMapper[x.StringValue];
x.StringValue = string.Empty;
x.CanEvaluate = true;
}
}
}
_lastReadSymbol = null;
return x;
}

The `FunctionEvaluator()`

method calls the `Factor()`

method to get the expected value, variable, or expression that is the argument passed into
the function. If the `Symbol`

returned by the `Factor()`

method can be evaluated, the appropriate Math function is called, else it is presumed
that the argument composition includes Variables and a string combination of the supposed Math function and the argument is composed.

public Symbol FunctionEvaluator(string value)
{
Symbol x;
if (NextIs("("))
{
_lastReadSymbol = new Symbol
{
SymbolType = SymbolType.brackets,
StringValue = "("
};
x = Factor();
}
else
{
throw new Exception(
"Function must be imediately followed by parenthesis. Error : ["
+ _originalExpression.Expression
+ " ] At position "
+ _currentPositionCnt.ToString());
}
if (x != null && x.CanEvaluate)
{
switch(value.ToLower())
{
case "sin":
{
x.FloatingValue = Math.Sin(Math.PI/180 * x.FloatingValue);
break;
}
case "cos":
{
x.FloatingValue = Math.Cos(Math.PI / 180 * x.FloatingValue);
break;
}
case "tan":
{
x.FloatingValue = Math.Tan(Math.PI / 180 * x.FloatingValue);
break;
}
case "exp":
{
x.FloatingValue = Math.Exp(x.FloatingValue);
break;
}
case "ln":
{
x.FloatingValue = Math.Log(x.FloatingValue, Math.E);
break;
}
case "log":
{
x.FloatingValue = Math.Log10(x.FloatingValue);
break;
}
}
}
else if (x != null && !x.CanEvaluate)
{
x.StringValue = "( " + value + "( " + x.StringValue + " ))";
}
return x;
}

The `Next()`

method is what handles the parsing of the string characters to symbols. It deals with the separation of string letters to either
Functions or Variables. Note: you will see that the restriction of Variables to only string characters is enforced by how
we parse for Variables and Integers. The beginning `If`

statement is put in place so as to avoid advancing to the next
`Symbol`

if the previous set `Symbol`

has yet to be claimed.

private Symbol Next()
{
if (_lastReadSymbol != null)
{
var sym = _lastReadSymbol;
_lastReadSymbol = null;
return sym;
}
if (_expression.Expression != null &&
_expression.Expression.Length > _currentPositionCnt)
{
var length = _expression.Expression.Length;
if (IsNumber())
{
var val = _expression
.Expression[_currentPositionCnt]
.ToString();
_currentPositionCnt++;
while (_currentPositionCnt < length && IsNumber())
{
val += _expression.Expression[_currentPositionCnt++];
}
_lastReadSymbol =
new Symbol
{
SymbolType = SymbolType.digit,
FloatingValue = double.Parse(val),
CanEvaluate = true
};
return _lastReadSymbol;
}
if (IsString())
{
var val = _expression
.Expression[_currentPositionCnt]
.ToString();
_currentPositionCnt++;
while (_currentPositionCnt < length && IsString())
{
val += _expression.Expression[_currentPositionCnt++];
}
if (_listOfFunctions.Contains(val))
{
_lastReadSymbol =
new Symbol
{
SymbolType = SymbolType.function,
StringValue = val
};
return _lastReadSymbol;
}
_lastReadSymbol =
new Symbol
{
SymbolType = SymbolType.variable,
StringValue = val ,
CanEvaluate = false
};
return _lastReadSymbol;
}
if ((_expression.Expression[_currentPositionCnt] == '.'
&& (_expression.Expression[_currentPositionCnt + 1] < '0'
&& _expression.Expression[_currentPositionCnt + 1] > '9')))
{
throw new
Exception("Expression can not have a decimal point"
+" without an accompanying digits."
+" Expression [ "
+ _originalExpression.Expression
+ " ] At position "
+ _currentPositionCnt.ToString());
}
if (_expression.Expression[_currentPositionCnt] == '(' ||
_expression.Expression[_currentPositionCnt] == ')')
{
_lastReadSymbol = new Symbol
{
SymbolType = SymbolType.brackets,
StringValue = _expression
.Expression[_currentPositionCnt++]
.ToString()
};
return _lastReadSymbol;
}
if (_listOfBinaryOperators.Contains(_expression
.Expression[_currentPositionCnt]
.ToString()))
{
_lastReadSymbol =
new Symbol
{
SymbolType = SymbolType.operation,
StringValue = _expression
.Expression[_currentPositionCnt++]
.ToString()
};
return _lastReadSymbol;
}
if (_listOfUnaryOperators.Contains(_expression
.Expression[_currentPositionCnt]
.ToString()))
{
_lastReadSymbol =
new Symbol
{
SymbolType = SymbolType.operation,
StringValue = _expression
.Expression[_currentPositionCnt++]
.ToString()
};
return _lastReadSymbol;
}
if (_expression.Expression[_currentPositionCnt] == '#')
{
_lastReadSymbol =
new Symbol
{
SymbolType = SymbolType.endTag,
StringValue = _expression
.Expression[_currentPositionCnt++]
.ToString()
};
return _lastReadSymbol;
}
}
return null;
}

The `Factorial()`

method was written to allow me to add factorial as a function within the expression evaluator.

private double Factorial(double number)
{
var val = (long)number;
double retVal=1;
for (long i =val; i>0; i--)
{
retVal *= i;
}
return retVal;
}

The `NextIs()`

method picks the next `Symbol`

by calling `Next()`

and if its value (`FloatingValue`

for `Symbol`

s that can be evaluated
and `StringValue`

for `Symbol`

s that can not be evaluated) matches the argument passed in, _`lastReadSymbol`

is set to null, to allow advancement
of picking a new `Symbol`

via the `Next()`

method.

private bool NextIs(string val)
{
var t = Next();
if (t != null)
{
if (t.CanEvaluate)
if (t.FloatingValue.ToString() == val)
{
_lastReadSymbol = null;
return true;
}
else
{
_lastReadSymbol = t;
}
else
if (t.StringValue == val)
{
_lastReadSymbol = null;
return true;
}
else
_lastReadSymbol = t;
}
return false;
}

The preview next and advance tracker by one method was created to handle the case where a negative sign follows the start of a
parenthesis "(-" (it is used in the `Factor()`

function).

private bool PreviewNextifTrueMoveTrackerByOne(string val)
{
if (_expression.Expression != null
&& _expression.Expression.Length > _currentPositionCnt)
{
if (_expression.Expression[_currentPositionCnt].ToString() ==val)
{
_currentPositionCnt++;
return true;
}
}
return false;
}

The `IsNumber()`

method is used within the `Next()`

method to check that the current character is a number and not a
letter. The check done works for floating point numbers, by checking that if a dot occurs, the next character would have to be a number.

private bool IsNumber()
{
return ((_expression.Expression[_currentPositionCnt] >= '0'
&& _expression.Expression[_currentPositionCnt] <= '9') ||
(_expression.Expression[_currentPositionCnt] == '.' &&
(_expression.Expression[_currentPositionCnt + 1] >= '0'
&& _expression.Expression[_currentPositionCnt + 1] <= '9')));
}

The `IsString()`

method is used within the `Next()`

method to check that the current character is a letter and not a number.

private bool IsString()
{
return ((_expression.Expression[_currentPositionCnt] >= 'A'
&& _expression.Expression[_currentPositionCnt] <= 'Z') ||
(_expression.Expression[_currentPositionCnt] >= 'a'
&& _expression.Expression[_currentPositionCnt] <= 'z'));
}

## Points of Interest

The example attached is a mathematical expression calculator, it utilizes a variant of the code above to implement a functioning calculator. It allows the user to write
mathematical expressions that can contain variables, and factors these expressions in term of variables. It also allows for mapping of variables to values so as to enable
a complete evaluation of the expression. It allows for the storing of expressions and the recalling of expressions (this functionality goes in a round robin form).
The Define button allows you to define variable mapping.

Some of the fun things that can be done to enhance the example code given is:

- Implement a differential calculator just like the one in the following
CodeProject work: http://www.codeproject.com/Articles/13731/Symbolic-Differentiation
- Change the implementation to use postfix.
- Implement graphing functionality (probably my favorite because it means you will have to handle the variable within variables case).

## History

No revisions were made yet.