## Introduction

Recently, I came across the need to create binary trees based on a mathematical expression as its input. In my math class we were covering a lesson on graph theory, and as an assignment we were to take expressions, put them in trees and evaluate them. Also part of the lesson was to convert from several notations to several others- a daunting task, especially without the help of a tree.

It was because of this that I wrote this program. It is a very basic, short class that implements the functionality needed to solve expression trees, as well as output their structure in prefix, postfix, and infix format. Though there are many features that are left unimplemented, this example was based on algorithms I have learned (and am learning) as I go. If anyone has anything to add (or change) to this class, please let me know!

## About notations

Let's quickly go over to the different notations used in this class. Three functions are provided to output expressions in postfix, prefix, and infix formats. Two of these are almost identical, and the third one everyone is familiar with, as it is how we write equations normally.

### Infix

Of the three notations, infix is the one we are all most familiar with. In this notation, all binary operators appear between the two operands. This provides several advantages to the other two notations- mainly, it is easy to see which operators are bound to which operands. At a glance (at least in my opinion) it is also easier to solve these equations mentally. Infix does create a few problems, however. Because operators appear where they do, when they are stacked and equations become more complex, it's hard to distinguish one operand from another. The solution is to use parentheses to group operations. Infix appears as:

((1+2) * (3-4))

**Note**: The way this code was implemented, expressions represented in infix notation will always be fully parenthesized. I can't get around this, and though they aren't necessary, they obviously do not affect the outcome.

Expressions in infix are solved by starting from the innermost set of parentheses and working outwards. Rules of precedence must also be followed, due to the possible ambiguity in interpretation.

Graphing infix expressions in a tree is fairly complicated because of the order of operations. You must start with a fully parenthesized equation, as this will make it clear what the "main" operator is (this is the operator that adds the other two sides of the expression, which makes up the whole expression). In the example above, the multiplication sign is the main operator. This will make up the root node of the tree. Each node under it will be determined by the groups that are most tightly bound to the main operator.

### Prefix

This type of notation is arguably the easiest to use. In prefix, all operators appear near the beginning of the equation, with the two operands appearing right after. When operations are stacked, it becomes difficult to see how these operands are grouped, but because of the nature of prefix notation, no parentheses are needed to group them. To stop nodes (the operands) from running together, spaces are added between each one of them. Prefix appears as:

* + 1 2 - 3 4

Expressions in prefix are solved by scanning the equation for an operator with two immediate values to the right of it. This is continued and repeated until there are no more operators left.

Graphing prefix expressions follow a simple algorithm, which is as follows:

- Write operator.
- Go left unless the node is an immediate value.
- When going left is unavailable, go to the node above you and go right.

This algorithm is left-recursive.

### Postfix

Quite simply, postfix is just a variant of prefix. Instead of appearing at the beginning of the expression, the operators will appear near the end. This type of notation is very common in programming languages and such, as it is easily solvable using stacks. Postfix appears as:

1 2 + 3 4 - *

Expressions in postfix are solved by traveling down the tree (to the left) until an immediate value is reached. The idea is that an operator can't be written until all the values under it are present. When moving left can't be done, move right. When both left and right values of a node are written down, the operator binding them can be written.

## Using the code

The code itself is encapsulated in a simple class with methods for returning the expression in all three notations, as well as a function to solve a tree.

`Node`

class methods:

class Node
{
public int Solve()
public string Prefix()
public string Postfix()
public string Infix()
public Node(char op, Node l, Node r)
public Node(string value)
private Node left;
private Node right;
private string Value;
}

The `Node`

class contains everything needed to set up and evaluate expressions. To create a tree, simply call the constructor for the class. Sorry, I haven't added a parser, so this must be done manually:

Node tree = new Node('*', new Node('+', new Node("1"), new Node("2")),
new Node('-', new Node("3"), new Node("4"))
);

Notice that the constructor takes the operator first, then the left and right values. Alternatively, the constructor will also take a single string value for leaves (immediate values).

Now, to solve the tree, simply call the `Solve`

method:

Console.WriteLine("The answer is: {0}", tree.Solve());

Similarly, to find the prefix, infix, or postfix notations for the expression, call the respective functions, which all return strings.

Console.WriteLine("The prefix notation of the expression is\n{0}",
tree.Prefix());

Simple, huh?

## Improvements

The program as it stands could do with many improvements. Here are a few I have thought of, and might get around to adding (though if you would like to implement them, please share with us!):

**Parser**- The program lacks the major functionality of an expression parser, taking an expression and building a tree from it. This could be accomplished with regular expressions, or even stack manipulation.
**Error checking**- Presently, no validation is performed on the expression to test whether it is solvable. Non-numeric values will confuse the program, as will expressions with too many signs, wrong placement, etc.. The responsibility is currently placed on the user to provide a good expression.
**General purpose**- The application of these types of trees can be extended to a broader field of data manipulation. It would be interesting to see classes developed to handle any type of data, not just numeric.
**Node data**- The tree is currently only able to support whole numbers (integers). Though easily changed, this was done intentionally to keep the example plain and easy to understand.
**Limited size**- Because of the way the solving algorithm and the notation functions work, trees are limited in size. This is due to the fact that the functions recurs to find their results, and as such, the internal stack fills up quite quickly. Very large trees will cause an overflow and the program will stop.

## Conclusion

Writing this class proved easier than I had originally thought, and it has definitely sped up my school work! Using simple recursion techniques, it is almost trivial to manipulate binary trees in any fashion.

Ideally, the class could use a few other functions (as listed above), but as it stands I thought it was quite helpful. The code is commented and very detailed, so you should have no problem understanding it- a rather simple class anyway.

If you should have any questions or comments, I always love feedback :) Post a thread and I'll be sure to check it out. Also remember, to make changes and add to the code presented here--if you post it, I'll add it in and re-upload the code as soon as possible.

## History

- 7
^{th} May, 05- Initial release.

James Brannan is a 16 year old programmer attending high school in Lake Saint Louis, Missouri. He enjoys writing code in Perl, C++, and .NET in his free time, but also enjoys the finer things in life -- sleeping and eating! His homepage is at http://www.binary-craft.com and http://www.domainjames.com