Introduction
This code shows a very simple mathematical expression parser capable of computing the values of expressions like 3*5.3E-2/2. I wrote this code for didactic purposes to show the algorithm's internal working, so I have tried to keep the code as simple and short as possible. That's why I used straight functions and not classes.
To interpret a mathematical expression, simply call the function evalExpression
.
Example:
Result= evalExpression("2*3-5^2");
will put the value -19 into Result
.
The parser understands the following operators: +, -, *, /, ^. The expression is evaluated from left to right with the following priorities:
Operator
| Priority
|
+ -
| 0 (lowest)
|
* /
| 1
|
^ (raise to power)
| 2 (highest)
|
The operators with the highest priority are evaluated first. This ensures that 2+3*4^2+5 is interpreted correctly as 2+(3*(4^2))+3 and not (((2+3)*4)^2)+5.
To keep the code as simple as possible, the handling of parenthesis is not implemented.
Parser used to make a calculator
The code to make a simple calculator using the evalExpression
function is elementary: each time the <ENTER> key is pressed, the expression to evaluate is read from the input box, and is added to the result box followed by "=", and the result of the evaluation computed by a call to evalExpression
.
private void InputBox_KeyDown(object sender, KeyEventArgs e)
{
if (e.KeyCode == Keys.Enter)
{
try
{
ResultBox.AppendText(InputBox.Text + "=" +
evalExpression(InputBox.Text) + "\r\n");
InputBox.Text = "";
}
catch (Exception ex)
{
MessageBox.Show(ex.Message);}
}
}
In case of problems, an exception is thrown by the evalExpression
function.
How does the parser work (internally)
The mathematical expressions we parse have the following grammar:
- expressionLevel0::= expressionLevel1 (('+'|'-') expressionLevel1)*
- expressionLevel1::= expressionLevel2 (('*'|'/') expressionLevel2)*
- expressionLevel2::= expressionLevel3 (('^') expressionLevel3)*
- expressionLevel3::= value
- value::= '+'|'-'|'' [0-9]([0-9])* (.([0-9])*) E('+'|'-'|'')[0-9]+
Where | indicates OR (example, '+' | '-' means plus or minus), and * means repeat the expression zero, once, or many times.
The expression specified by the user is a level 0 expression, i.e., made of a series of expressions in level 1 separated by + or - (e.g., A+B-C+D where A, B, C, D are level 1 expressions). To get the value of the first expression in level 1 within the expression level 0 and the position of the first + or - operator, we call the level 1 expression evaluator.
This will result in the value of the first level 1 expression (i.e., the left argument of the first addition/subtraction), and the position where the '+'or '-' should be is returned in CurrentPos
. We the extract this operator:
char Operation = expression[currentPos];
and ask for the addition (subtraction) right side argument.
expression = expression.Substring(currentPos + 1);
double value2 = evalExpression(expression, 1, out currentPos);
Finally, we perform the addition (or subtraction):
switch (Operation)
{
case '+':
result += value2;
break;
...
And, repeat this process till the end of the string.
The evaluation of level 1 and level 2 expressions (i.e., multiply/divide, and raise to power) is totally identical to the evaluation of level 0 expressions; that's why, instead of implementing three functions evalExpressionLvl1
, evalExpressionLvL2
, evalExpressionLvl3
, I have written one function evalExpression(expression,level,out currentPos)
, which according to the value of argument level
handles the operators '+', '-' or '*', '/', or '^'.
Since there are no level three operators, level
=3 this means that we are not handling expressions anymore, but values (i.e., numbers), and consequently, the function double value(string expression, out Int32 currentPos)
must be called.
The function value
simply returns the value (in a double
format) of the left part of the string 'expression'. This is done by searching for the first char not part of the number (typically an operator) and calling the .NET function double.parse
.
Detailed call trace
To make things clearer, this is a detailed description of the function calls of the evaluation of "2*30+4":
Step 1: Call evalExpression
at level 0.
evalExpression(expression = "2*30+4",level = 0, currentPos = 0)
Step 2: Call evalExpression
at level 1.
evalExpression(expression = "2*30+4",level = 0, currentPos = 0)
->evalExpression(expression = "2*30+4",level = 1, currentPos = 0)
Step 3: Go on with level 2 and level 3.
evalExpression(expression = "2*30+4",level = 0, currentPos = 0)
->evalExpression(expression = "2*30+4",level = 1, currentPos = 0)
->evalExpression(expression = "2*30+4",level = 2, currentPos = 0)
->evalExpression(expression = "2*30+4",level = 3, currentPos = 0)
Step 4: Level 3 makes a call to the value
function.
evalExpression(expression = "2*30+4",level = 0, currentPos = 0)
->evalExpression(expression = "2*30+4",level = 1, currentPos = 0)
->evalExpression(expression = "2*30+4",level = 2, currentPos = 0)
->evalExpression(expression = "2*30+4",level = 3, currentPos = 0)
->value(expression = "2*30+4", currentPos = 0)
Step 5: value
returns 2 and advances the current Pos
to 1 to points to a * which is an operator out of scope for level 2, so we go up to level 1 and call level 2 again, but this time for the value on the right side of the * operator.
evalExpression(expression = "2*30+4",level = 0, currentPos = 0)
->evalExpression(expression = "2*30+4",level = 1, currentPos = 0)
->evalExpression(expression = "30+4", level = 2, currentPos = 0)
Step 6: As before, we go down to the call to value
.
evalExpression(expression = "2*30+4",level = 0, currentPos = 0)
->evalExpression(expression = "2*30+4",level = 1, currentPos = 0)
->evalExpression(expression = "30+4", level = 2, currentPos = 0)
->evalExpression(expression = "30+4",level = 3, currentPos = 0)
->value(expression = "30+4", currentPos = 0)
Step 7: value
returns 30 and we set currenPos
to 2 (i.e., point to the "+"). This value 30 is then in cascade returned till level 1 where the multiplication is done. Since "+" is a level zero operator. Level 1 exits and returns the value 60 (i.e., 2*30), and we set currentPos
to 4 (i.e., 1 (for the 2) + 1 (for the operator *) + 2 (for the 30)).
Step 8: Level 0 calls once again all the following levels to get the addition right side operand.
evalExpression(expression = "2*30+4",level = 0, currentPos = 0)
->evalExpression(expression = "4",level = 1, currentPos = 0)
->evalExpression(expression = "4", level = 2, currentPos = 0)
->evalExpression(expression = "4",level = 3, currentPos = 0)
->value(expression = "4", currentPos = 0)
Step 9: We have reached the end of the string. value
returns 4, Level 3, Level 2, level1
returns 4, Level 0 performs the final addition and returns the final value: 64.
Conclusion
The code presented here is very simple, but I hope it will help you to understand how the parser works. Now, as an exercise, you can modify it to make it handle the parenthesis correctly.
This member has not yet provided a Biography. Assume it's interesting and varied, and probably something to do with programming.