13,005,991 members (95,517 online)
alternative version

#### Stats

62.4K views
222 bookmarked
Posted 17 Dec 2012

# Building a General Purpose Interpreter, Math Engine and Parser in C#

, 13 Feb 2015
 Rate this:
An easy to understand and use language interpreter, run time parser and calculation engine from scratch in C#.

October 8, 2014: Math Processor is now Open Source.

## 1. Introduction

Before we get into details, let's have a look at a few examples of the commands that the parser/engine we are going to create can execute.

CommandsResultComment
>> primes(1000000)2 3 5 ...The first million prime numbers
>> gcd(24,12, 124, 136)4GCD of the given numbers
>> a = array(50, 10, 50)
>> b = array(60, 40, 50)
>> c = array(80, 60, 10)
>> m = matrix(a, b, c)
>> d = det(m)
-76000Determinant of a matrix
{
t = vectorin(-16, 0.05, 16);:
x = sin(8*t/5) * cos(t);:
y = sin(8*t/5) * sin(t);:
plot(x, y);:
}
A polar curve.

You can also create more interesting/advanced graphs like the following Spirograph patterns created by one of MP users (To learn more about Spirograph math and how these patterns were created, have a look at these Spirograph math tutorials/examples):

## 2. Using the Sample Applications

Update August 28, 2013: This section was written for the old versions. Since a new software with complete documentation became available (top downloads), I deleted the contents of this section so as to make the article easier/smaller for new readers. The old downloads are still there at these links to preserve the history:

## 3. Introduction to Math Processor

I hope that by this time you have played with the Math Processor application a bit. If not, I recommend that you do so now and try to get familiar with its working.

The Math Processor engine provides a lot more functions and commands than discussed in this article. If you are interested in knowing more about the facilities provided by Math Processor, you can have a look at the Math Processor Documentation.

We will be building an interpreter for math expressions as well as a simple little language MPCL (but still more powerful than you probably think). More on MPCL and what we will be implementing in next section. Before we get into that, here is what we will NOT be doing:

• Support symbolic math
• Perform arbitrary precision calculations
• Implement things the hard way

### 3.2. Math Processor Command Language (MPCL)

MPCL is a simple loosely typed language the Math Processor parser/calculating engine can interpret at run time to carry out the supported operations. MPCL supports:

• Basic math expressions containing the common operations like +, -, *, /, % etc.
• Boolean/logical operations
• Operations on numbers (including lists of numbers)
• Operations on Matrices
• Conditional Processing using the if and if-else constructs
• Loops (repeat and while)
• Certain directives to control a few aspects of the processing environment
• Inbuilt functions to expose the major facilities of the `System.Math` class of the .Net framework as well as a lot of other operations. For the complete list of supported functions, visit the Math Processor Functions page.
• User defined functions
• An extremely simple mechanism to allow programmers to write more useful functions and plug them into the main engine at run time.

## 4. Using the Math Processor Library

Math Processor provides a unified application storage and communication mechanism within the internal modules as well as for the programming interface through the use of class `Token`. A `Token` is capable of storing text, numbers, matrices, Boolean data and a lot more things that MP uses internally. What is contained within a `Token` depends upon its type, which is determined by a public property TokenType of type `TokenType`. This property is just an enum defined as under:

```public enum TokenType
{
Void, Error, Operator, Bool, Vector, Matrix, Text, Function, UserFunction, Directive,
Block, LoopOrCondition, FunctionDefiner, Break
}```

To the user of Math Processor, not all members of this enum are of interest. A few are for internal use only. We will learn about the important ones later in this text. The different fields defined in class `Token` are declared as under:

```TokenType type;
string name = "";
string data = "";
List<double> vector = new List<double>();
double extra; ```

And that's all that we need to support all the kinds of data we want to process. From numbers to Boolean values, from lists to matrices and all that. Amazing! Isn't it? The trick is to have different interpretations of the fields according to the type of the token. So for a Token of type Void, most fields mean nothing. It is just what it is called: Void. For a Boolean type, a 0 in the list would mean false and a 1 would mean true. For a matrix, we will make use of 'extra' to store the number of rows. We will store the actual matrix values in the `List<double>` field. We will compute the rows and columns for the matrix at run time.

Please note that by stuffing everything inside a `Token`, we have traded some efficiency (in terms of speed and memory) for simplicity. I believe the trade-off here is worth the cost, as we have avoided the need to create complex data structures. You can devise a more complex model if you wished. For me, life is much simpler without parse tables, expression trees and a thousand look up fields!

Now that we have understood the role as well as form of the class `Token` it is time to discover how to make use of the Math Processor library. There are two possibilities:

1. Directly send the raw expressions to the parsing engine, obtain the result and do whatever is needed with it.
2. Create instances of class `Token` in code and then use different public methods of the class library to manipulate them.

Both the approaches are quite simple. However, the first one is the preferred way to do things in most situations. In the sample applications we have primarily used the first approach. However, in the Console based sample we have also shown how to use the second method.

The second method is needed only when you want to extend the functionality of the application by providing more functions or when you want to directly work with the already defined functions inside the main class library. We shall see a use of this approach a bit later.

The use of the first method is just as easy as calling the following static method of `MathProcessor.Calculator` class:

`public static Token ProcessCommand (string expression) { ... }`

The parameter expression is any text that is a valid MPCL command. For a broad list of what constitutes a valid MPCL command please refer to Section 3.2. The result returned from this method would be a `Token` of any of the following types:

• `TokenType.Void` The operation completed successfully but nothing remarkable was returned to the caller. Note that the operation itself may have produced variables or functions etc. depending on the nature of the operation performed.
• `TokenType.Error` The operation resulted in an error.
• `TokenType.Text` The resulting `Token` is just plain text.
• `TokenType.Vector` The returned `Token` contains one or more numeric values.
• `TokenType.Matrix` The returned `Token` is a matrix.
• `TokenType.Bool` The returned `Token` contains Boolean data (true/false).

Another notable fact is that some operations produce intermediate results before the whole block of code is parsed and executed. For example, the following block of MPCL code produces output multiple times by use of the repeat loop (paste it in the demo application to see what it does!):

```{
format(G);:
a = 1;:
repeat(10)
{
echo("4 x ", a, "= ", 4*a):
a = a + 1;:
}:
}```

In order to inform the front-end about the intermediate output, the `Calculator` class raises events using the following delegate:

`public delegate void IntermediateResult(Token result);`

Now that we have understood the communication mechanism, let's build a very simple front end for the MP engine. Perform the following steps:

• Create a new Console application in Visual Studio.
• Modify your Program class as under:
```class Program
{
static void Main(string[] args)
{
Calculator.IntermediateResultProduced += new IntermediateResult(IntermediateResultProduced);
string input = "";
Console.ForegroundColor = ConsoleColor.White;
Console.WriteLine("Enter MP expressions to process. Enter 'exit' to quit:");
Console.ForegroundColor = ConsoleColor.Gray;
while (true)
{
if (input.Trim().ToLower() == "exit")
{
break;
}
Token result = Calculator.ProcessCommand(input);
DisplayResult(result);
}
}

static void IntermediateResultProduced(Token result)
{
DisplayResult(result);
}

static private void DisplayResult(Token result)
{
if (result.TokenType == TokenType.Error)
{
Console.ForegroundColor = ConsoleColor.Red;
Console.WriteLine(">> " + result.GetString());
Console.ForegroundColor = ConsoleColor.Gray;
}
else if (result.TokenType == TokenType.Matrix)
{
int rows = (int)result.Extra;
if (rows > 0)
{
List<double> data = result.Vector.ToList();
Token temp;
temp = new Token(TokenType.Vector, data.GetRange(0, result.Count / rows).ToArray());
Console.WriteLine(">> " + temp.GetString());
for (int i = result.Count / rows; i < result.Count; i += result.Count / rows)
{
temp = new Token(TokenType.Vector, data.GetRange(i, result.Count / rows).ToArray());
Console.WriteLine("   " + temp.GetString());
}
}
}
else if (result.TokenType != TokenType.Void)
{
Console.WriteLine(">> " + result.GetString());
}
}
} ```
• That's it!

After writing this little program, almost all the MP functions are now available to us on the command line. However the function plot will not be available, as it creates a Windows Form and calls its Show() method, which requires a message loop not available to us in this example. The sample application uses a workaround. Please have a look at the downloadable code to see how the workaround is implemented.

### 4.1. Extending the Core by Adding More Command Line Functions

Another problem we have is the lack of our ability to paste multi-line code blocks in the Console application. Such an operation is not directly available. Let's write a new MP command line function to allow us to 'paste' text content directly from the Clipboard (the sample application contains this functionality). Add a reference to `Windows.Forms` assembly to the project and put the following method in your `Program` class:

```public static Token Paste(string operation, List<Token> arguments)
{
try
{
string text = System.Windows.Forms.Clipboard.GetText();
return Calculator.ProcessCommand(text);
}
catch (Exception)
{
return new Token(TokenType.Error, "Could not paste text from Clipboard.");
}
} ```

Now modify your `Main()` method as under:

```[STAThread]
static void Main(string[] args)
{
/* Everthing as before */
}```

Please note the use of `[STAThread]` Attribute with the `Main()` method. This is needed for working with the Clipboard. After making the above changes, you will have a new directive called "paste" available to you. You can also make it a function, if you so desired, like this:

`Function.AddFunction("paste", Paste);`

In MPCL, the difference between a directive and a function is that a function is always followed by a pair of parentheses containing zero or more parameters. Also the parameters to functions are processed before the function is called. For a directive, the parameters are not per-processed. For more detail about directives, please refer to Math Processor Directives page.

Most of the times you will be working with functions. The directives are just a few in number and mostly deal with the configuration of the engine and are not meant for real processing.

## 5. Understanding the Core Engine

Let's begin with a simple diagram showing the workflow of Math Processor.

The expression from the client is first converted into tokens by the parser/tokenizer. The list of tokens is then converted from Infix to Postfix (Reverse Polish) notation. After that, the tokens are processed by the calculation engine and the result is sent back to the client.

The conversion from Infix to Postfix form is done to facilitate the evaluation phase. Due to the way expressions are represented in Postfix notation, changing the priority of operations does not require use of parentheses, which simplifies the evaluation process. You can see the conversion from Infix to Postfix notation using the postfix directive of Math Processor. Here are a couple of examples:

>> postfix 2*9+3
>> 2 9 * 3 +
>> postfix 2*(9+3)
>> 2 9 3 + *

In the above examples, we can see there is not need to use parentheses in the postfix form, as the priority of operations is inbuilt in the very expressions. In the first example the multiplication takes precedence according to general rules of mathematics. However, in the second example, the Infix form changes the priority by use of parentheses. This changed priority is reflected in the Postfix form through re-ordering of the tokens, which enables us to get rid of the parentheses.

So far so good. But how does the process start and continue and end? Actually it is done in different stages. First, we need to parse the ordinary expression to create the tokens. Then we need to arrange these tokens in a suitable way for processing. After that comes the actual calculation. Finally the result is returned back to the user.

The whole process takes place inside 7 core classes consisting of just a few hundred lines of code. The details of these classes are as under:

### 5.1. Token

This is the ubiquitous data holder of Math Processor. We already discussed `Token` in detail earlier in this text in Part-4.

### 5.2. Tokenizer

This class is responsible for parsing the input and converting it into `Token` objects. The entire process takes place inside a static function:

`public static Token TokenizeString(String expression, out List<token> tokens)`

The first parameter 'expression' is the expression to be tokenized. The expression is any valid MPCL command (consisting of math expressions, functions, directives and so on). The second parameter `out List<Token> tokens` would contain the result of what tokens were found in the input expression. Lastly, the return value will tell about the result of the parsing and can be either of type `TokenType.Void` in case of success or `TokenType.Error` in case of erroneous input expression.

### 5.3. Variables

This class is responsible for storing reserved words and named constants and variables. It defines a small inner class `NamedToken` to store the named `Token` objects. A variable can be created using the assignment operator. Some MP functions can also create variables. The name of a variable must start with a letter or underscore and can consist of letters and numbers. Here is how a variable can be created and used:

>> a = 10
>> 10
>> b = a + 100
>> 110

### 5.4. Function

This class stores all the functions that can be used using MPCL function calls. Every MP command-line function should be of the form defined by the following delegate:

`public delegate Token FunctionDelegate(string operation, List<token> arguments); `

Let's see the whole process in action. MP provides a function named contains, which is used to check if a number is present in a matrix or array of numbers. This is how this function is implemented:

```public static Token Contains(string operation, List<Token> arguments)
{
if (arguments.Count != 2)

if ((arguments[0].TokenType != TokenType.Matrix && arguments[0].TokenType != TokenType.Vector) ||
(arguments[1].TokenType != TokenType.Matrix && arguments[1].TokenType != TokenType.Vector) ||
arguments[1].Count != 1)
return Token.Error("First argument should be an array or matrix and second a single numeric value");

if (arguments[0].Vector.Contains(arguments[1].FirstValue))
return new Token(TokenType.Bool, 1, 1);
else
return new Token(TokenType.Bool, 1, 0);
} ```

Let's now see how we can use this function:

>> a = array(1,2,3,4);
>> contains(a, 1)
>> true
>> contains(a, 5)
>> false
>> if (contains(a, 2)) { echo("Number found in the array."): }
>> Number found in the array.

### 5.5. FunctionDefiner

This class is responsible for creating, managing and executing User Defined Functions, which are a very important means to allow the users to create small amounts of code to extend the functionality of the engine at run time. User defined functions are not really efficient due to their interpreted nature. However, they are still useful for doing small chunks of work you need to perform repeatedly.

We already saw a very interesting example of a User Defined Function earlier in this text when we created some very interesting curves (Spirograph curves). To get more details and see other examples of User Defined functions, you can refer to the User Defined Functions page of the online MP documentation.

### 5.6. BlockCommands

This class is responsible for managing and executing the if/if-else, while and repeat constructs. The reason that these constructs are put together is their common syntactical form:

key_word(expression) { one or more statements }

For an explanation of each of these see the Conditional Statements and Loops pages.

### 5.7. Calculator

We already saw the `Calculator.ProcessCommand()` method in action, which was used to send all the expressions to the engine for processing. Besides providing that facility, this class also contains all the code to carry out basic arithmetic calculations involving the supported operators like +, -, * etc. It also handles the conversion from Infix to Postfix notation, determines the priority of operators and applies these operators to operands to obtain the results.

## Conclusion

Finally, we have reached the end. I hope you found the article, the main engine and the sample applications helpful. I also hope I was able to deliver my promise of creating a simple parsing library and calculation engine for you to use in your own projects.

## Share

I am a full-stack developer. My skills include JavaScript, C#/.Net, MS Azure cloud etc. I love to work on complex programming tasks requiring deep analysis, planning and use of efficient algorithms and data structures.

## You may also be interested in...

 First PrevNext
 Could you clarify the relationship between your math processor and the equation editor? Karl 200017-Mar-15 8:05 Karl 2000 17-Mar-15 8:05
 Re: Could you clarify the relationship between your math processor and the equation editor? Kashif_Imran17-Mar-15 9:08 Kashif_Imran 17-Mar-15 9:08
 My vote of 5 Gun Gun Febrianza5-Nov-14 1:55 Gun Gun Febrianza 5-Nov-14 1:55
 Re: My vote of 5 Kashif_Imran5-Nov-14 3:41 Kashif_Imran 5-Nov-14 3:41
 Wow, nice job! Cryptonite21-Oct-14 8:22 Cryptonite 21-Oct-14 8:22
 Re: Wow, nice job! Kashif_Imran21-Oct-14 9:52 Kashif_Imran 21-Oct-14 9:52
 My vote of 5 Bruno Sprecher18-Oct-14 1:29 Bruno Sprecher 18-Oct-14 1:29
 Re: My vote of 5 Kashif_Imran18-Oct-14 14:49 Kashif_Imran 18-Oct-14 14:49
 Praise JimboReynolds16-Jul-14 13:48 JimboReynolds 16-Jul-14 13:48
 Re: Praise Kashif_Imran16-Jul-14 14:04 Kashif_Imran 16-Jul-14 14:04
 Always thank you. WuRunZhe18-Nov-13 14:36 WuRunZhe 18-Nov-13 14:36
 Re: Always thank you. Kashif_Imran18-Nov-13 14:53 Kashif_Imran 18-Nov-13 14:53
 Download MathProcessor_1.0.1.0.zip (a complete software that uses the new class library) rpraver3-Sep-13 8:45 rpraver 3-Sep-13 8:45
 Re: Download MathProcessor_1.0.1.0.zip (a complete software that uses the new class library) Kashif_Imran3-Sep-13 19:18 Kashif_Imran 3-Sep-13 19:18
 Re: Download MathProcessor_1.0.1.0.zip (a complete software that uses the new class library) Kashif_Imran15-Nov-13 4:48 Kashif_Imran 15-Nov-13 4:48
 Bug with using double.Parse() Member 376142129-Aug-13 7:23 Member 3761421 29-Aug-13 7:23
 Re: Bug with using double.Parse() Kashif_Imran29-Aug-13 7:32 Kashif_Imran 29-Aug-13 7:32
 Re: Bug with using double.Parse() PieterDeRycke2-Jan-14 8:51 PieterDeRycke 2-Jan-14 8:51
 Re: Bug with using double.Parse() Jrgen Schildmann17-Feb-15 10:10 Jrgen Schildmann 17-Feb-15 10:10
 Well done Espen Harlinn28-Aug-13 20:16 Espen Harlinn 28-Aug-13 20:16
 Re: Well done Kashif_Imran28-Aug-13 23:05 Kashif_Imran 28-Aug-13 23:05
 A question Mizan Rahman29-Jul-13 22:25 Mizan Rahman 29-Jul-13 22:25
 Re: A question Kashif_Imran30-Jul-13 0:01 Kashif_Imran 30-Jul-13 0:01
 Congratulations! And a little tip victorbos15-Jan-13 13:34 victorbos 15-Jan-13 13:34
 Re: Congratulations! And a little tip Kashif_Imran15-Jan-13 17:38 Kashif_Imran 15-Jan-13 17:38
 Last Visit: 31-Dec-99 18:00     Last Update: 28-Jun-17 11:08 Refresh 12 Next »