Click here to Skip to main content
15,879,239 members
Articles / Programming Languages / C#

General Expression Parser and Evaluator

Rate me:
Please Sign up or sign in to vote.
4.10/5 (12 votes)
26 Jun 2011CPOL4 min read 69.1K   1.8K   46   19
A user configurable expression parser and evaluator

Introduction

Most of the expression parsers I have seen are designed to parse and evaluate mathematical expressions. While I want to parse math expressions, I also wanted the ability to handle different types of expressions. I wanted a parser that would return a list of the base elements in an expression and let me apply my own evaluator to that list.

Using the Code

In order to parse an expression, a user supplied tokenizer is required. The tokenizer provides two functions:

  1. It parses the expression into its base elements. These elements are of one of the following types:
    • Literal - A string encoded in double or single quote marks
    • Identifier - A variable name to be resolved by the evaluator
    • Constant - An numeric or hex constant
    • Function - The name of a function to be invoked by the evaluator
    • Argument - The function argument delimiter, i.e. a comma
    • GroupStart - i.e. a left perenthesis
    • GroupEnd - i.e. a closing right perenthesis
    • Operator - An expression operator
    • CondTrue - Marks the start of elements to be evaluated when a condition is true.
      The CondTrue and CondFalse operators come from conditional expressions, i.e. (a>b ? a : b)
    • CondFalse - Marks the start of elements to be evaluated when a condition is false
    • Assignment - An assignment operator

    The tokenizer is free to determine which type an element belongs to. The tokenizer also assigns the precedence to the operators. It also assigns the association of the operator. A operator is either Left or Right associative. Most operators are Left associative but the exponent, "**", and conditional, "?", operators are right associative.

  2. It provides a function to create an operator from a token element. An operator falls into one of the following groups:
    • Arithmetic - A mathematical operator, i.e. +, - or *
    • Comparison - A comparison operator, i.e. ==, > or <
    • Logical - A logical operator, i.e. && (AND) or || (OR)
    • Bitwise - A bitwise operator, i.e. & (AND), | (OR) or ^ (XOR)
    • Conditional - True or False element of a conditional expression, i.e. (a>b ? ... : ...)
    • Assignment - Assign result to a variable, i.e. =, +=, or -+

Once the expression has been tokenized, an RPN list elements is created using the "Shunting Yard" algorithm.

Below are some examples of expressions and the parsing results using the supplied "MathParser" as the tokenizer.

x = a+b*c         -a+(b+c)*2       a>b ? a-b : a+b    a+min(x, y)
0: Identifier x   0: Identifier a  0: Identifier a    0: Identifier a
1: Identifier a   1: Operator -    1: Identifier b    1: Identifier x
2: Identifier b   2: Identifier b  2: Operator >      2: Identifier y
3: Identifier c   3: Identifier c  3: CondTrue ? 8    3: Function min 2
4: Operator *     4: Operator +    4: Identifier a    4: Operator +
5: Operator +     5: Constant 2    5: Identifier b
6: Assignment =   6: Operator *    6: Operator -
                  7: Operator +    7: CondFalse : 11
                                   8: Identifier a
                                   9: Identifier b
                                  10: Operator +

What is shown above is the result of applying the "ToString" method to the elements returned by the parser. The "Identifier", "Constant" and "Operator" items are self explanatory. The "CondTrue" and CondFalse" elements act as an "If, Else" construct. The number on the "CondTrue" element is the index of the first element to be evaluated if the condition is false. The number of the "CondFalse" element is the index of the first element following the conditional expression. The "CondFalse" element in effect acts as a "GoTo" the element specified. The number on the "Function" element is the number of arguments to be passed to the named function.

The list of elements in an expression is generated by calling the parser. For example, from the test application:

C#
. . .
ITokeniser parser = new MathParser();
List<RpnElement> tokens = Parser.ParseExpression(parser,rpnExpression.Text);
. . .

Evaluating the Expression

The act of evaluating an expression involves the following steps:

  1. If the next element on the expression list is a "Literal" or "Constant" then push its value onto the operand stack.
  2. If the next element is an "Identifier" then get the current value of the identifier and push it on to the stack.
  3. If the next element is a function or an operator, then pop the required function arguments or the Left and Right sides of the operator from the stack. Then evaluate the function or the operator. Push the result onto the operand stack.
  4. Repeat until all elements in the expression list have been processed.
  5. When all elements have been processed, there should be one operand left on the operand stack, this is the result of the expression.

Below is the main loop of the sample evaluator included with the test application:

C#
private RpnParser.RpnOperand RpnEval(List<RpnElement> tokens) {
  Stack<RpnOperand> oprndStack = new Stack<RpnOperand>();
  RpnOperand oprnd = null;
  for (int nextElem = 0; nextElem < tokens.Count; ++nextElem) {
    RpnElement token = tokens[nextElem];
    switch (token.ElementType) {
      // create an operand from the token and push onto the operand stack
      case ElementType.Literal:
      case ElementType.Constant:
        oprnd = new RpnOperand((RpnToken)token);
        break;
      // Get the current value of the identifier and
      // push it onto the operand stack
      case ElementType.Identifier:
        oprnd = EvalIdentifier((RpnToken)token);
        break;
      // Pop the left and right sides of the operator from operand stack
      // Perform the operation and push the results onto the operand stack
      case ElementType.Operator:
        RpnOperator oper = (RpnOperator)token;
        RpnOperand lhs, rhs;
        lhs = rhs = null;
        if (oper.IsMonadic && oprndStack.Count >= 1) {
          lhs = new RpnOperand(0);
          rhs = oprndStack.Pop();
        } else
          if (oprndStack.Count >= 2) {
            rhs = oprndStack.Pop();
            lhs = oprndStack.Pop();
          }
        if (lhs == null)
          StackError("operator", token.StrValue);
        else
          oprnd = EvalOperator(oper, lhs, rhs);
        break;
      // Determine if True or False condition has be met
      // Set element index to the next element to be processed
      case ElementType.CondTrue:
      case ElementType.CondFalse:
        oper = (RpnOperator)token;
        if (oprndStack.Count < 1)
          StackError("operator", token.StrValue);
        if (oper.OpType == OperatorType.CondTrue) {
          lhs = oprndStack.Pop();
          if (lhs.NumValue == 0)
            nextElem = oper.CondGoto - 1;  // for loop will add one
        } else
          nextElem = oper.CondGoto - 1; // for loop will add one
        continue;
      // Pop the function arguments from the operand stack
      // and place into an array
      // Evaluate the function and push the result onto operand stack
      case ElementType.Function:
        RpnFunction func = (RpnFunction)token;
        if (oprndStack.Count >= func.ArgCount) {
          RpnOperand[] args = new RpnOperand[func.ArgCount];
          for (int i = func.ArgCount - 1; i >= 0; --i)
            args[i] = oprndStack.Pop();
          if (evalCBox.Checked)
            oprnd = EvalFunction(func, args);
        } else
          StackError("function", token.StrValue);
        break;
      // Pop Identifier and value to be assigned from the stack
      // Push the resulting operand onto the operand stack
      case RpnParser.ElementType.Assignment:
        if (oprndStack.Count > 1) {
          rhs = oprndStack.Pop();   // value to be assigned
          lhs = oprndStack.Pop();   // identifier to be assigned to
          oprnd = AssignOperator((RpnOperator)token, lhs, rhs);
        } else
          StackError("assignment", token.StrValue);
        break;

    }
    oprndStack.Push(oprnd);
  }
  if (oprndStack.Count != 1)
    StackError("result", "");
  return oprndStack.Pop();
}

private void StackError(string token, string value) {
  string text = String.Format("Insufficient operands on stack for {0}: {1}",
     token, value);
  throw new ApplicationException(text);
}

Summary

Hopefully this article demonstrates a generalized expression parser, where the elements that make up an expression and the evaluation of that expression are controlled by the user. The only job of the parser is to present the elements in a standard, consistent manner.

References

History

  • July 9, 2008 - Version 1 released
  • July 11, 2008 - Removed signing requirement from project
  • June 26, 2011 - Updated source code

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
United States United States
This member has not yet provided a Biography. Assume it's interesting and varied, and probably something to do with programming.

Comments and Discussions

 
BugBracket Problems Pin
zipfelklatscher31-May-13 4:52
zipfelklatscher31-May-13 4:52 
QuestionString Expression Evaluator Pin
specialdreamsin13-Jun-12 23:39
specialdreamsin13-Jun-12 23:39 
QuestionLogical Operator Precedence Pin
shamlen25-Apr-12 14:04
shamlen25-Apr-12 14:04 
QuestionWhat abt Nested CondFalse or CondTrue Statements ? Pin
Binoy Rajan20-Jun-11 0:35
Binoy Rajan20-Jun-11 0:35 
AnswerRe: What abt Nested CondFalse or CondTrue Statements ? Pin
WBurgMo20-Jun-11 10:35
WBurgMo20-Jun-11 10:35 
GeneralRe: What abt Nested CondFalse or CondTrue Statements ? Pin
Binoy Rajan20-Jun-11 21:15
Binoy Rajan20-Jun-11 21:15 
GeneralRe: What abt Nested CondFalse or CondTrue Statements ? Pin
Binoy Rajan11-Jul-11 6:11
Binoy Rajan11-Jul-11 6:11 
GeneralRe: What abt Nested CondFalse or CondTrue Statements ? Pin
WBurgMo11-Jul-11 11:05
WBurgMo11-Jul-11 11:05 
GeneralRe: What abt Nested CondFalse or CondTrue Statements ? Pin
Binoy Rajan11-Jul-11 17:25
Binoy Rajan11-Jul-11 17:25 
AnswerRe: What abt Nested CondFalse or CondTrue Statements ? Pin
WBurgMo16-Jul-11 10:25
WBurgMo16-Jul-11 10:25 
I have submitted an update to correct this.
I did this on a late on a friday, so it may
take till monday or tuesday for it to show up
on CodeProject.

The updated version should handle statements
like the following:

a==2 ? 12 : ((a==3 ? 34 : 56)+ 2)

James Johnson
GeneralMy vote of 5 Pin
Carlos Andrés Valencia Pérez28-Dec-10 11:09
Carlos Andrés Valencia Pérez28-Dec-10 11:09 
Generalerror detected in logical compare Pin
Michel Bock8-Dec-08 3:11
Michel Bock8-Dec-08 3:11 
Generaltrouble with point in float numbers Pin
pasha-e22-Jul-08 21:22
pasha-e22-Jul-08 21:22 
GeneralRe: trouble with point in float numbers Pin
WBurgMo23-Jul-08 6:25
WBurgMo23-Jul-08 6:25 
GeneralRe: trouble with point in float numbers Pin
pasha-e23-Jul-08 20:52
pasha-e23-Jul-08 20:52 
Generaltrouble with sign Pin
pasha-e22-Jul-08 0:22
pasha-e22-Jul-08 0:22 
GeneralRe: trouble with sign Pin
WBurgMo22-Jul-08 10:40
WBurgMo22-Jul-08 10:40 
GeneralSign for library Pin
Oleksii Prosiankin10-Jul-08 21:26
Oleksii Prosiankin10-Jul-08 21:26 
GeneralRe: Sign for library Pin
WBurgMo11-Jul-08 10:37
WBurgMo11-Jul-08 10:37 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Praise Praise    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.