## Introduction

This library evaluates formulas in strings to a value, e.g. "2+4*6+2" returns 28.

## Background

### Infix

Simply put infix notation is the notation used by use mere mortals, e.g. 2+4*6+2. In infix notation the operator (e.g. + or *) is inside the operands (e.g. 2, 4, and 6). While this notation is simple for a human to understand it is not so convienient for processing, as the computer typically needs to know what to do before it gets the specifics of what the operand values actually are, hence prefix notation.

### Prefix

Prefix notation or a variation of is used by most programming languages and processing arithmetic logic units (ALU), and can be defined as operator operand_1 operand_2 operand_3 and so on. Consider the following infix formula 2 + 4 * 6 + 2, this can be shown in prefix notation as + 2 + * 4 6 2, or perhaps more logically as a tree;

Processing is then performed as + 2 (+ (* 4 6) 2), with evaluation of the formula proceeding up the tree;

The tree structure representation of prefix forms the basis for this library, however it is undesirable to ask user input in this form due to limit comprehension by the general lay-person.

### Infix -> Prefix Conversion

Infix to prefix conversation is performed on a stack using a tokenisor (in this case generating tokens via space edges).

- Reverse the input string.
- Examine the next element in the input.
- If it is operand, add it to output string.
- If it is Closing parenthesis, push it on stack.
- If it is an operator, then
- If stack is empty, push operator on stack.
- If the top of stack is closing parenthesis, push operator on stack.
- If it has same or higher priority than the top of stack, push operator on stack.
- Else pop the operator from the stack and add it to output string, repeat step 5.

- If it is a opening parenthesis, pop operators from stack and add them to output string until a closing parenthesis is encountered. Pop and discard the closing parenthesis.
- If there is more input go to step 2.
- If there is no more input, unstack the remaining operators and add them to output string.
- Reverse the output string.

Courtesy of Expressions, Conversion and Evaluation with C.

Suppose we want to convert 2 + 4 * 6 + 2 into Prefix form: Reversed Expression: 2 + 6 * 4 + 2

Current Token | Stack Contents (Top on Right) | Prefix Expression (Right to Left) |
---|

2 | | 2 |

+ | + | 2 |

4 | + | 2 4 |

4 | + | 2 4 |

* | + * | 2 4 |

6 | + * | 2 4 6 |

+ | + + | 2 4 6 * |

2 | + + | 2 4 6 * 2 |

| + | 2 4 6 * 2 + |

| | 2 4 6 * 2 + + |

Reverse the output string : + + 2 * 4 6 2

## Processing a Dynamic Formula

Each mathematical operator is encapsulated via a class containing the operator and all operands needed by the operator (remembering that the operands may be other mathematical operators and operands), thus allowing the tree pattern as above.

Note: in this implementation of dynamic formula processing, not only may the standard mathematical operators and values be processed, but constants (e.g. x and y) as well. In terms of infix to prefix conversion, values and constants are treated in the same way as during this process the value of the token is not important in so much as the type is more important.

Each operator class contains the following methods;

- String parse(String s); - parses (and sub parses the string); returns the remainder of the string to process
- double evaluate(Hashtable variables); - returns the evaluation, using the appropriate variables as defined in the variable hash table
- bool nextIs(String s); - returns true if the next token is one of these operators
- ArrayList getVariableNames(); - returns a list of all required constants in this operator and sub-operators

The parse method proceeds as follows (* assuming an operator requiring two operands);

- Read and remove first token; i.e. the operator
- Read first token (i.e. an operand)
- If the operand is a value or constant remove the token and save as operand 1
- Else determine the type of operator and call parse using the remaining input string (saving as operand 1)
- Using the remaining string (or result of operand 1's parse if operand 1 is really an operator) remove and save as operand 2 if token is a value or constant
- Else determine the type of operator and call parse using the remaining input string (saving as operand 2)
- Return the remaining string (or result of operand 2's parse if operand 2 is really an operator).

Thus, processing may complete as follows;

- Get formula string
- Convert to prefix notation
- Determine operator type of first token
- Parse whole formula string using relevant operator type

Once complete a tree data structure is formed as follows for the example 2 + 4 * 6 + 2 or + + 2 * 4 6 2;

<add><add>2, <multiply>4, 6</multiply></add>, 2</add>

And hence calling the recursive evaluate method performs;

- * 4 6
- + 2 24
- + 2 26
- 28

In cases where constants are performed, the evaluate method performs a lookup in the supplied Hashtable.

## Using the Code

The library supplied can evaluate mathematic functions + - * / ^(power).

- + encapsulated by Add
- - encapsulated by Subtract
- * encapsulated by Multiply
- / encapsulated by Divide
- ^ encapsulated by Power

In addition;

- values encapsulated by Value
- constants encapsulated by Variable

To use the library simply create a new Formula giving the formula string in the constructor. Calling evaluate with a Hashtable on constants (or new Hashtable() if none exist) returns the formula result.

## Points of Interest

Is this the best way to evaluated formulas dynamically? No, the performance of this approach is not good when many formulas are to be processed due to the high number of method calls when evaluating. However, this method is particularly beneficial when using in conjunction with genetic algorithms (evolutionary algorithms) due to the ease of mutation and crossover. My next article on the subject will show how to use this library to generate a curve function that automatically fits ANY trend.

## History

Version 1.0.0.0 - Initial build

## Additional Licensing Notes

Please feel free to use this in your work, however please be aware that a modified The Code Project Open License (CPOL) is in use; basically it is the same as the standard license except that this code must not be used for commercial or not-for-profit commercial use without prior authorisation. Please see license.txt or license.pdf in the included source and demo files.

I originally studied for a masters in engineering of software engineering at The University of Birmingham during 2000-2004, of which I received a 2:1. I continued at Birmingham University working with Civil Engineering and Rail Research UK where I am currently in my final year of a 3 year PhD project developing a Computational Intelligent Approach to Railway Intervention Planning. Although my work has a significant focus on railway engineering and associated practices much of my work is with data mining (on SQL Server 2008) and computational intelligence (CI) techniques. My key areas of expertise in CI are clustering algorithms (including Rival Penalised Competitive Learning) and evolutionary algorithms.

Outside of my formal work I enjoy testing the latest technologies such as .NET 3.5 and the many frameworks of which it comprises (mainly WPF). I have several projects on the go including a .NET and DirectX port of Quake 3 and many utility libraries. I also maintain an extensive website coded in Cold Fusion which is regularly updated; more information is available about me there.