Click here to Skip to main content
15,887,746 members
Articles / Desktop Programming / Win32

Simple Guide to Mathematical Expression Parsing

Rate me:
Please Sign up or sign in to vote.
4.14/5 (25 votes)
18 Jun 2010GPL33 min read 87.5K   1.4K   29   24
In this article, I introduce a very simple way to parse an expression.

This article is dedicated to my love Aida.

Introduction

This article describes a practical mathematical parser - an interactive program that can be used to perform basic arithmetic operations. In fact, in order to keep understanding of this article simple, I prefer to avoid any unnecessary items so you cannot rely on this program as a real calculator and it just works with integers and it can parse an expression with + , - , * , / , ( , ).

When I first started thinking about writing code to calculate a mathematical expression, I soon realized that it wasn't that easy [at least for me]. I had to delve into such concepts as parsing, tokens, lexical analysis, node trees and tables, etc.

The rules of the algorithm were the same ones we learned in algebra. For example: multiply (or divide) before adding or subtracting; start with inner parentheses and work outwards; if a pair of binary operators have equal precedence, perform the left operator of the pair first.

Here is its screen shot:

csharpcalc.jpg

Using the Code

The idea of this calculator or better to say math parser is think in tree and node way. I think the following picture can explain the whole idea:

a+b*c*(d-e)

img004.gif

You can see the algorithm of multiplying (or dividing) before adding or subtracting; starting with inner parentheses and working outwards; if a pair of binary operators have equal precedence, performing the left operator of the pair first.

At first, I made ParseExpr() function. This function + or - two operands with each other but before that checks if there is any * or / after each operand or not so it calls ParseFactor() function which is made to * or / two operands. Of course it checks if there is any expression in a pair of parenthesis or not, so it calls ParseTerm() function to check those expressions between braces, then at this point it reaches the value so we should parse it so then it calls ParseNumber().

For example above at first it calls ParseExpr(Expression) -> ParseFactor(Expression) -> ParseTerm(Expression) -> ParseNumber(Expression) and it returns a and then it cuts the Expression to +b*c*(d-e) so the returned value from ParseNumber goes to op and then it checks the first character to see if it is plus and there it is, so it cuts the Expression to b*c*(d-e) and then it repeats the last actions and makes up the above tree.

Blocks of code:

C#
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace csharpcalc
{
    class Expression
    {
        public int ParseExpr(ref string expr)
        {
            int op , op1;
            op = ParseFactor(ref expr);
            if (expr.Length != 0 )
            {
                if (expr[0] == '+')
                {
                    expr = expr.Substring(1, expr.Length - 1);
                    op1 = ParseExpr(ref expr);
                    op += op1;
                }
                else if (expr[0] == '-')
                {
                    expr = expr.Substring(1, expr.Length - 1);
                    op1 = ParseExpr(ref expr);
                    op -= op1;
                }
            }
            return op;
        }
        public int ParseFactor(ref string expr)
        {
            int op, op1;
            op = ParseTerm(ref expr);
            if (expr.Length != 0)
            {
                if (expr[0] == '*')
                {
                    expr = expr.Substring(1, expr.Length - 1);
                    op1 = ParseFactor(ref expr);
                    op *= op1;
                }
                else if (expr[0] == '/')
                {
                    expr = expr.Substring(1, expr.Length - 1);
                    op1 = ParseFactor(ref expr);
                    op /= op1;
                }
            }
            return op;
         }
        public int ParseTerm(ref string expr)
        {
            int returnValue = 0;
            if (expr.Length != 0)
            {
                if (char.IsDigit(expr[0]))
                { 
                    returnValue = ParseNumber(ref expr);
                    return returnValue;
                }
                else if (expr[0] == '(')
                {
                    expr = expr.Substring(1, expr.Length - 1);
                    returnValue = ParseExpr(ref expr);
                    return returnValue;
                }
                else if (expr[0] == ')')
                    expr = expr.Substring(1, expr.Length - 1);
            }
            return returnValue;
        }
        public int ParseNumber(ref string expr)
        {
            string numberTemp = "";
            for (int i = 0; i < expr.Length && char.IsDigit(expr[i]); i++)
            {
                if (char.IsDigit(expr[0]))
                {
                    numberTemp += expr[0];
                    expr = expr.Substring(1, expr.Length - 1);
                }
            }
            return int.Parse(numberTemp);
        }
    }
}

If You Want To ...

If you want to complete this project in order to accept double numbers too, you can add if-block in ParseNumber() method that even if char.IsDigit() was false, check if it is "." and if it is so, add it to the numberTemp and then cast it.

If you want to add some other mathematical functions to this project such as sin(), cos(), etc., you can check the first char of Expression in the ParseTerm function to see if it is a letter or not (char.IsLetter) and then call a function like ParseWord() and thereafter splitting the word, you can use a switch case block.

Points of Interest

I am a Control Engineer, but I have always loved computer programming and this simple project brings life to my past dreams.

History

Since I was a 16 year old student, I had a dream of writing a math parser but I have always had some other work to do. Finally I decided to write that. After searching the literature, I came across an outline of an algorithm for parsing arithmetic expressions in Principles of Systems Programming by Robert M. Graham (Wiley, 1975).

License

This article, along with any associated source code and files, is licensed under The GNU General Public License (GPLv3)


Written By
Iran (Islamic Republic of) Iran (Islamic Republic of)
This member has not yet provided a Biography. Assume it's interesting and varied, and probably something to do with programming.

Comments and Discussions

 
QuestionError checking Pin
Member 115043477-Mar-15 9:30
Member 115043477-Mar-15 9:30 
With the addition of the fix supplied in the comment section by kayakdog24, the code properly parses correctly-formatted expressions. It fails, though, to detect expressions that have errors, returning an incorrect result without warning. This can be fixed with the following additions:

The first error to handle is a missing right bracket. In these circumstances, kayakdog24's fix will try to remove the expected right bracket from an empty string. This can be fixed by changing kayakdog24's suggested fix from:
C#
else if (expr[0] == '(')
 {
    expr = expr.Substring(1, expr.Length - 1);
    returnValue = ParseExpr(ref expr);
    expr = expr.Substring(1, expr.Length - 1); //removing closing parenthesis 
    return returnValue;
 }

to
C#
else if (expr[0] == '(')
 {
    expr = expr.Substring(1, expr.Length - 1);
    returnValue = ParseExpr(ref expr);
    if (expr.Length != 0)
      {
        expr = expr.Substring(1, expr.Length - 1); //removing closing parenthesis
      }
    else 
      { 
        // generate error message "Missing right bracket"
      }
    return returnValue;
  }

The second error to handle is when the code comes across something it can't interpret (e.g. in the expression 2+P+4 it doesn't know how to handle "P"). In such a case it returns early with an incorrect value. This can be detected because expr still contains the unparsed part. Therefore, in the calling routine, immediately after the call, test the value of expr and, if it's not empty, generate an error message.
C#
theValue = parseExpr(&expr); // <- calling programme calls the parsing routine
if (expr.Length != 0)
  {
    if (expr[0] == ')')
      {
         // generate error message "Unpaired right bracket"
      }
    else
      {
        // generate error message "Can't interpret (remaining expr)"
      }
  }

The third error is when the code gets to the end of the expression while something additional is expected (e.g. in the expression 2*3* a value is expected after the second multiply). In such a case, the final operator (in this case the second multiply) calls the routine recursively with an empty string. This can be handled by placing a check for an empty string at the beginning of the PassTerm routine. Therefore, in ParseTerm, make the first statement:
C#
if (expr.Length == 0)
  {
    // generate error message "Expression ends unexpectedly"
  }

The fourth error is where a second operator is placed next to the preceding operator (e.g. in the expression 2** two multiplies are placed together without a value separating them).
This can be handled if the fix to ParseTerm, above, is enhanced as follows:
C#
if (expr.Length == 0)
  {
    // generate error message Expression ends unexpectedly"
  }
else
  {
    if ( (expr[0] == ")") || (expr[0] == "*") || (expr[0] == "/")
      {
        // generate error message "Missing value before (expr)"
      }
  }


....Jeff

modified 7-Mar-15 15:37pm.

QuestionThanks for the code Pin
Member 115043476-Mar-15 7:41
Member 115043476-Mar-15 7:41 
GeneralMy vote is 4 Pin
Sturms6-Apr-14 7:03
Sturms6-Apr-14 7:03 
QuestionNot working for some expression Pin
Xylph1-Apr-13 2:54
Xylph1-Apr-13 2:54 
GeneralVery clear article, I gave it a 5 Pin
Willie Lassiter21-Nov-12 1:34
Willie Lassiter21-Nov-12 1:34 
GeneralMy vote of 5 Pin
Willie Lassiter21-Nov-12 1:29
Willie Lassiter21-Nov-12 1:29 
GeneralMy vote of 1 Pin
ReDll18-Dec-11 9:25
ReDll18-Dec-11 9:25 
GeneralGood work... Pin
Vivek Narayanan Namboothiri14-Feb-11 17:16
Vivek Narayanan Namboothiri14-Feb-11 17:16 
GeneralMy vote of 5 Pin
kayakdog2429-Jan-11 11:15
kayakdog2429-Jan-11 11:15 
GeneralMy vote of 4 Pin
CPMOliveira22-Jul-10 2:50
CPMOliveira22-Jul-10 2:50 
GeneralMy vote of 2 Pin
KentuckyEnglishman22-Jun-10 2:41
KentuckyEnglishman22-Jun-10 2:41 
GeneralRe: My vote of 2 Pin
Mahyar Etedal23-Jun-10 5:28
Mahyar Etedal23-Jun-10 5:28 
GeneralRe: My vote of 2 [I put 5] Pin
kayakdog2429-Jan-11 11:23
kayakdog2429-Jan-11 11:23 
QuestionWhy ? Pin
Erlend Robaye21-Jun-10 20:41
Erlend Robaye21-Jun-10 20:41 
AnswerRe: Why ? Pin
Mahyar Etedal23-Jun-10 5:22
Mahyar Etedal23-Jun-10 5:22 
AnswerRe: Why and Why Pin
baranils10-Jul-10 21:49
baranils10-Jul-10 21:49 
GeneralMy vote of 2 Pin
Pascal Ganaye20-Jun-10 22:31
Pascal Ganaye20-Jun-10 22:31 
GeneralRe: My vote of 2 Pin
Mahyar Etedal23-Jun-10 4:41
Mahyar Etedal23-Jun-10 4:41 
GeneralExpressions are not about strings Pin
Alois Kraus20-Jun-10 11:43
Alois Kraus20-Jun-10 11:43 
GeneralRe: Expressions are not about strings Pin
Mahyar Etedal23-Jun-10 4:39
Mahyar Etedal23-Jun-10 4:39 
GeneralRe: Expressions are not about strings Pin
baranils10-Jul-10 21:39
baranils10-Jul-10 21:39 
GeneralRe: Expressions are not about strings Pin
Mahyar Etedal11-Jul-10 3:48
Mahyar Etedal11-Jul-10 3:48 
GeneralMy vote of 1 Pin
alijoongolgoli18-Jun-10 17:36
alijoongolgoli18-Jun-10 17:36 
GeneralRe: My vote of 1 Pin
Mahyar Etedal23-Jun-10 4:36
Mahyar Etedal23-Jun-10 4:36 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Praise Praise    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.