Click here to Skip to main content
12,951,034 members (60,367 online)
Click here to Skip to main content
Add your own
alternative version


221 bookmarked
Posted 17 Dec 2012

Building a General Purpose Interpreter, Math Engine and Parser in C#

, 13 Feb 2015 GPL3
Rate this:
Please Sign up or sign in to vote.
An easy to understand and use language interpreter, run time parser and calculation engine from scratch in C#.

October 8, 2014: Math Processor is now Open Source.

Table of Contents

  1. Introduction
  2. Using the Sample Applications
  3. Introduction to Math Processor
  4. Using the Math Processor Library
  5. Understanding the Core Engine

1. Introduction

First, the best thing about this article is that we will try to make everything extremely simple. You are NOT required to be a master of, or even familiar with, the details about parsers, compilers etc.

Before we get into details, let's have a look at a few examples of the commands that the parser/engine we are going to create can execute.

>> primes(1000000)2 3 5 ...The first million prime numbers
>> gcd(24,12, 124, 136)4GCD of the given numbers
>> a = array(50, 10, 50)
>> b = array(60, 40, 50)
>> c = array(80, 60, 10)
>> m = matrix(a, b, c)
>> d = det(m)
-76000Determinant of a matrix
t = vectorin(-16, 0.05, 16);:
x = sin(8*t/5) * cos(t);:
y = sin(8*t/5) * sin(t);:
plot(x, y);:
A polar curve.

You can also create more interesting/advanced graphs like the following Spirograph patterns created by one of MP users (To learn more about Spirograph math and how these patterns were created, have a look at these Spirograph math tutorials/examples):


2. Using the Sample Applications

Update August 28, 2013: This section was written for the old versions. Since a new software with complete documentation became available (top downloads), I deleted the contents of this section so as to make the article easier/smaller for new readers. The old downloads are still there at these links to preserve the history:

3. Introduction to Math Processor

I hope that by this time you have played with the Math Processor application a bit. If not, I recommend that you do so now and try to get familiar with its working.

The Math Processor engine provides a lot more functions and commands than discussed in this article. If you are interested in knowing more about the facilities provided by Math Processor, you can have a look at the Math Processor Documentation.

3.1. The Road Ahead

We will be building an interpreter for math expressions as well as a simple little language MPCL (but still more powerful than you probably think). More on MPCL and what we will be implementing in next section. Before we get into that, here is what we will NOT be doing:

  • Support symbolic math
  • Perform arbitrary precision calculations
  • Implement things the hard way

3.2. Math Processor Command Language (MPCL)

MPCL is a simple loosely typed language the Math Processor parser/calculating engine can interpret at run time to carry out the supported operations. MPCL supports:

  • Basic math expressions containing the common operations like +, -, *, /, % etc.
  • Boolean/logical operations
  • Operations on numbers (including lists of numbers)
  • Operations on Matrices
  • Conditional Processing using the if and if-else constructs
  • Loops (repeat and while)
  • Certain directives to control a few aspects of the processing environment
  • Inbuilt functions to expose the major facilities of the System.Math class of the .Net framework as well as a lot of other operations. For the complete list of supported functions, visit the Math Processor Functions page.
  • User defined functions
  • An extremely simple mechanism to allow programmers to write more useful functions and plug them into the main engine at run time.

4. Using the Math Processor Library

Math Processor provides a unified application storage and communication mechanism within the internal modules as well as for the programming interface through the use of class Token. A Token is capable of storing text, numbers, matrices, Boolean data and a lot more things that MP uses internally. What is contained within a Token depends upon its type, which is determined by a public property TokenType of type TokenType. This property is just an enum defined as under:

public enum TokenType 
    Void, Error, Operator, Bool, Vector, Matrix, Text, Function, UserFunction, Directive, 
    Block, LoopOrCondition, FunctionDefiner, Break 

To the user of Math Processor, not all members of this enum are of interest. A few are for internal use only. We will learn about the important ones later in this text. The different fields defined in class Token are declared as under:

TokenType type;
string name = "";
string data = "";
List<double> vector = new List<double>();
double extra; 

And that's all that we need to support all the kinds of data we want to process. From numbers to Boolean values, from lists to matrices and all that. Amazing! Isn't it? The trick is to have different interpretations of the fields according to the type of the token. So for a Token of type Void, most fields mean nothing. It is just what it is called: Void. For a Boolean type, a 0 in the list would mean false and a 1 would mean true. For a matrix, we will make use of 'extra' to store the number of rows. We will store the actual matrix values in the List<double> field. We will compute the rows and columns for the matrix at run time.

Please note that by stuffing everything inside a Token, we have traded some efficiency (in terms of speed and memory) for simplicity. I believe the trade-off here is worth the cost, as we have avoided the need to create complex data structures. You can devise a more complex model if you wished. For me, life is much simpler without parse tables, expression trees and a thousand look up fields!

Now that we have understood the role as well as form of the class Token it is time to discover how to make use of the Math Processor library. There are two possibilities:

  1. Directly send the raw expressions to the parsing engine, obtain the result and do whatever is needed with it.
  2. Create instances of class Token in code and then use different public methods of the class library to manipulate them.

Both the approaches are quite simple. However, the first one is the preferred way to do things in most situations. In the sample applications we have primarily used the first approach. However, in the Console based sample we have also shown how to use the second method.

The second method is needed only when you want to extend the functionality of the application by providing more functions or when you want to directly work with the already defined functions inside the main class library. We shall see a use of this approach a bit later.

The use of the first method is just as easy as calling the following static method of MathProcessor.Calculator class:

public static Token ProcessCommand (string expression) { ... }

The parameter expression is any text that is a valid MPCL command. For a broad list of what constitutes a valid MPCL command please refer to Section 3.2. The result returned from this method would be a Token of any of the following types:

  • TokenType.Void The operation completed successfully but nothing remarkable was returned to the caller. Note that the operation itself may have produced variables or functions etc. depending on the nature of the operation performed.
  • TokenType.Error The operation resulted in an error.
  • TokenType.Text The resulting Token is just plain text.
  • TokenType.Vector The returned Token contains one or more numeric values.
  • TokenType.Matrix The returned Token is a matrix.
  • TokenType.Bool The returned Token contains Boolean data (true/false).

Another notable fact is that some operations produce intermediate results before the whole block of code is parsed and executed. For example, the following block of MPCL code produces output multiple times by use of the repeat loop (paste it in the demo application to see what it does!):

    a = 1;:
        echo("4 x ", a, "= ", 4*a):
        a = a + 1;:

In order to inform the front-end about the intermediate output, the Calculator class raises events using the following delegate:

public delegate void IntermediateResult(Token result);

Now that we have understood the communication mechanism, let's build a very simple front end for the MP engine. Perform the following steps:

  • Create a new Console application in Visual Studio.
  • Add a reference to "MathProcessor.dll" (available with downloads of this article)
  • Modify your Program class as under:
    class Program
        static void Main(string[] args)
            Calculator.IntermediateResultProduced += new IntermediateResult(IntermediateResultProduced);
            string input = "";
            Console.ForegroundColor = ConsoleColor.White;
            Console.WriteLine("Enter MP expressions to process. Enter 'exit' to quit:");
            Console.ForegroundColor = ConsoleColor.Gray;
            while (true)
                input = Console.ReadLine();
                if (input.Trim().ToLower() == "exit")
                Token result = Calculator.ProcessCommand(input);
        static void IntermediateResultProduced(Token result)
        static private void DisplayResult(Token result)
            if (result.TokenType == TokenType.Error)
                Console.ForegroundColor = ConsoleColor.Red;
                Console.WriteLine(">> " + result.GetString());
                Console.ForegroundColor = ConsoleColor.Gray;
            else if (result.TokenType == TokenType.Matrix)
                int rows = (int)result.Extra;
                if (rows > 0)
                    List<double> data = result.Vector.ToList();
                    Token temp;
                    temp = new Token(TokenType.Vector, data.GetRange(0, result.Count / rows).ToArray());
                    Console.WriteLine(">> " + temp.GetString());
                    for (int i = result.Count / rows; i < result.Count; i += result.Count / rows)
                        temp = new Token(TokenType.Vector, data.GetRange(i, result.Count / rows).ToArray());
                        Console.WriteLine("   " + temp.GetString());
            else if (result.TokenType != TokenType.Void)
                Console.WriteLine(">> " + result.GetString());
  • That's it!

After writing this little program, almost all the MP functions are now available to us on the command line. However the function plot will not be available, as it creates a Windows Form and calls its Show() method, which requires a message loop not available to us in this example. The sample application uses a workaround. Please have a look at the downloadable code to see how the workaround is implemented.

4.1. Extending the Core by Adding More Command Line Functions

Another problem we have is the lack of our ability to paste multi-line code blocks in the Console application. Such an operation is not directly available. Let's write a new MP command line function to allow us to 'paste' text content directly from the Clipboard (the sample application contains this functionality). Add a reference to Windows.Forms assembly to the project and put the following method in your Program class:

public static Token Paste(string operation, List<Token> arguments)
        string text = System.Windows.Forms.Clipboard.GetText();
        return Calculator.ProcessCommand(text);
    catch (Exception)
        return new Token(TokenType.Error, "Could not paste text from Clipboard.");

Now modify your Main() method as under:

static void Main(string[] args)
    Function.AddDirective("paste", Paste);
    /* Everthing as before */

Please note the use of [STAThread] Attribute with the Main() method. This is needed for working with the Clipboard. After making the above changes, you will have a new directive called "paste" available to you. You can also make it a function, if you so desired, like this:

Function.AddFunction("paste", Paste);

In MPCL, the difference between a directive and a function is that a function is always followed by a pair of parentheses containing zero or more parameters. Also the parameters to functions are processed before the function is called. For a directive, the parameters are not per-processed. For more detail about directives, please refer to Math Processor Directives page.

Most of the times you will be working with functions. The directives are just a few in number and mostly deal with the configuration of the engine and are not meant for real processing.

5. Understanding the Core Engine

Let's begin with a simple diagram showing the workflow of Math Processor.

The expression from the client is first converted into tokens by the parser/tokenizer. The list of tokens is then converted from Infix to Postfix (Reverse Polish) notation. After that, the tokens are processed by the calculation engine and the result is sent back to the client.

The conversion from Infix to Postfix form is done to facilitate the evaluation phase. Due to the way expressions are represented in Postfix notation, changing the priority of operations does not require use of parentheses, which simplifies the evaluation process. You can see the conversion from Infix to Postfix notation using the postfix directive of Math Processor. Here are a couple of examples:

>> postfix 2*9+3
>> 2 9 * 3 +
>> postfix 2*(9+3)
>> 2 9 3 + *

In the above examples, we can see there is not need to use parentheses in the postfix form, as the priority of operations is inbuilt in the very expressions. In the first example the multiplication takes precedence according to general rules of mathematics. However, in the second example, the Infix form changes the priority by use of parentheses. This changed priority is reflected in the Postfix form through re-ordering of the tokens, which enables us to get rid of the parentheses.

So far so good. But how does the process start and continue and end? Actually it is done in different stages. First, we need to parse the ordinary expression to create the tokens. Then we need to arrange these tokens in a suitable way for processing. After that comes the actual calculation. Finally the result is returned back to the user.

The whole process takes place inside 7 core classes consisting of just a few hundred lines of code. The details of these classes are as under:

5.1. Token

This is the ubiquitous data holder of Math Processor. We already discussed Token in detail earlier in this text in Part-4.

5.2. Tokenizer

This class is responsible for parsing the input and converting it into Token objects. The entire process takes place inside a static function:

public static Token TokenizeString(String expression, out List<token> tokens)

The first parameter 'expression' is the expression to be tokenized. The expression is any valid MPCL command (consisting of math expressions, functions, directives and so on). The second parameter out List<Token> tokens would contain the result of what tokens were found in the input expression. Lastly, the return value will tell about the result of the parsing and can be either of type TokenType.Void in case of success or TokenType.Error in case of erroneous input expression.

5.3. Variables

This class is responsible for storing reserved words and named constants and variables. It defines a small inner class NamedToken to store the named Token objects. A variable can be created using the assignment operator. Some MP functions can also create variables. The name of a variable must start with a letter or underscore and can consist of letters and numbers. Here is how a variable can be created and used:

>> a = 10
>> 10
>> b = a + 100
>> 110

5.4. Function

This class stores all the functions that can be used using MPCL function calls. Every MP command-line function should be of the form defined by the following delegate:

public delegate Token FunctionDelegate(string operation, List<token> arguments); 

Let's see the whole process in action. MP provides a function named contains, which is used to check if a number is present in a matrix or array of numbers. This is how this function is implemented:

public static Token Contains(string operation, List<Token> arguments)
    if (arguments.Count != 2)
        return Token.Error("Exactly two arguments expected");

    if ((arguments[0].TokenType != TokenType.Matrix && arguments[0].TokenType != TokenType.Vector) ||
        (arguments[1].TokenType != TokenType.Matrix && arguments[1].TokenType != TokenType.Vector) ||
         arguments[1].Count != 1)
        return Token.Error("First argument should be an array or matrix and second a single numeric value");

    if (arguments[0].Vector.Contains(arguments[1].FirstValue))
        return new Token(TokenType.Bool, 1, 1);
        return new Token(TokenType.Bool, 1, 0);

Let's now see how we can use this function:

>> a = array(1,2,3,4);
>> contains(a, 1)
>> true
>> contains(a, 5)
>> false
>> if (contains(a, 2)) { echo("Number found in the array."): }
>> Number found in the array.

5.5. FunctionDefiner

This class is responsible for creating, managing and executing User Defined Functions, which are a very important means to allow the users to create small amounts of code to extend the functionality of the engine at run time. User defined functions are not really efficient due to their interpreted nature. However, they are still useful for doing small chunks of work you need to perform repeatedly.

We already saw a very interesting example of a User Defined Function earlier in this text when we created some very interesting curves (Spirograph curves). To get more details and see other examples of User Defined functions, you can refer to the User Defined Functions page of the online MP documentation.

5.6. BlockCommands

This class is responsible for managing and executing the if/if-else, while and repeat constructs. The reason that these constructs are put together is their common syntactical form:

key_word(expression) { one or more statements }

For an explanation of each of these see the Conditional Statements and Loops pages.

5.7. Calculator

We already saw the Calculator.ProcessCommand() method in action, which was used to send all the expressions to the engine for processing. Besides providing that facility, this class also contains all the code to carry out basic arithmetic calculations involving the supported operators like +, -, * etc. It also handles the conversion from Infix to Postfix notation, determines the priority of operators and applies these operators to operands to obtain the results.


Finally, we have reached the end. I hope you found the article, the main engine and the sample applications helpful. I also hope I was able to deliver my promise of creating a simple parsing library and calculation engine for you to use in your own projects.


This article, along with any associated source code and files, is licensed under The GNU General Public License (GPLv3)


About the Author

Technical Lead
Pakistan Pakistan
I am a full-stack developer. My skills include JavaScript, C#/.Net, MS Azure cloud etc. I love to work on complex programming tasks requiring deep analysis, planning and use of efficient algorithms and data structures.

You may also be interested in...


Comments and Discussions

QuestionCould you clarify the relationship between your math processor and the equation editor? Pin
Karl 200017-Mar-15 8:05
memberKarl 200017-Mar-15 8:05 
AnswerRe: Could you clarify the relationship between your math processor and the equation editor? Pin
Kashif_Imran17-Mar-15 9:08
memberKashif_Imran17-Mar-15 9:08 
GeneralMy vote of 5 Pin
Gun Gun Febrianza5-Nov-14 1:55
member Gun Gun Febrianza5-Nov-14 1:55 
GeneralRe: My vote of 5 Pin
Kashif_Imran5-Nov-14 3:41
memberKashif_Imran5-Nov-14 3:41 
GeneralWow, nice job! Pin
Cryptonite21-Oct-14 8:22
memberCryptonite21-Oct-14 8:22 
GeneralRe: Wow, nice job! Pin
Kashif_Imran21-Oct-14 9:52
memberKashif_Imran21-Oct-14 9:52 
GeneralMy vote of 5 Pin
Bruno Sprecher18-Oct-14 1:29
memberBruno Sprecher18-Oct-14 1:29 
GeneralRe: My vote of 5 Pin
Kashif_Imran18-Oct-14 14:49
memberKashif_Imran18-Oct-14 14:49 
QuestionPraise Pin
JimboReynolds16-Jul-14 13:48
memberJimboReynolds16-Jul-14 13:48 
AnswerRe: Praise Pin
Kashif_Imran16-Jul-14 14:04
memberKashif_Imran16-Jul-14 14:04 
QuestionAlways thank you. Pin
WuRunZhe18-Nov-13 14:36
memberWuRunZhe18-Nov-13 14:36 
AnswerRe: Always thank you. Pin
Kashif_Imran18-Nov-13 14:53
memberKashif_Imran18-Nov-13 14:53 
QuestionDownload (a complete software that uses the new class library) Pin
rpraver3-Sep-13 8:45
memberrpraver3-Sep-13 8:45 
AnswerRe: Download (a complete software that uses the new class library) Pin
Kashif_Imran3-Sep-13 19:18
memberKashif_Imran3-Sep-13 19:18 
AnswerRe: Download (a complete software that uses the new class library) Pin
Kashif_Imran15-Nov-13 4:48
memberKashif_Imran15-Nov-13 4:48 
QuestionBug with using double.Parse() Pin
Member 376142129-Aug-13 7:23
memberMember 376142129-Aug-13 7:23 
AnswerRe: Bug with using double.Parse() Pin
Kashif_Imran29-Aug-13 7:32
memberKashif_Imran29-Aug-13 7:32 
AnswerRe: Bug with using double.Parse() Pin
PieterDeRycke2-Jan-14 8:51
memberPieterDeRycke2-Jan-14 8:51 
GeneralRe: Bug with using double.Parse() Pin
Jrgen Schildmann17-Feb-15 10:10
memberJrgen Schildmann17-Feb-15 10:10 
GeneralWell done Pin
Espen Harlinn28-Aug-13 20:16
mvpEspen Harlinn28-Aug-13 20:16 
GeneralRe: Well done Pin
Kashif_Imran28-Aug-13 23:05
memberKashif_Imran28-Aug-13 23:05 
QuestionA question Pin
Mizan Rahman29-Jul-13 22:25
memberMizan Rahman29-Jul-13 22:25 
AnswerRe: A question Pin
Kashif_Imran30-Jul-13 0:01
memberKashif_Imran30-Jul-13 0:01 
QuestionCongratulations! And a little tip Pin
victorbos15-Jan-13 13:34
membervictorbos15-Jan-13 13:34 
AnswerRe: Congratulations! And a little tip Pin
Kashif_Imran15-Jan-13 17:38
memberKashif_Imran15-Jan-13 17:38 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Praise Praise    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.

Permalink | Advertise | Privacy | Terms of Use | Mobile
Web02 | 2.8.170525.1 | Last Updated 13 Feb 2015
Article Copyright 2012 by Kashif_Imran
Everything else Copyright © CodeProject, 1999-2017
Layout: fixed | fluid