We will create a simple descend parser that is capable of evaluating mathematical expressions, using U++ CParser class for lexical analysis, and also add in an interactive level to render a graph of function. The example demonstrates how simple and clean descend parsing can be made if using well thought out lexical layer.

## Introduction

Descend parsing is a well known simple approach to evaluating complex syntaxes. In this article, we shall create a simple descend parser capable of evaluating mathematical expressions, using U++ `CParser`

class for lexical analysis. And to have some fun, we shall add an interactive level to render a graph of function.

## Descend Parsing Basics

Descend parsing usually works by providing a function for each level of syntax. If evaluating expressions, our main concern usually is to maintain the priority of operators. In descent parser, each priority level is represented by a single function. Usually, expressions can be represented as addition/subtraction of **terms** (*descend level*), terms are formed by multiplying or dividing **exponentials** (*descend level*), exponentials combine **factors** (*descend level*). Factors usually represent the final descend level. Expression in parenthesis is considered a factor and resolved using recursion.

Usually, syntax definitions provide some formal syntax graphs or definitions, however in this simple example, we shall avoid this - as you will see, the properly coded descent parser is very self-explanatory to the point of serving as syntax definition.

## Lexical Parsing with CParser

Most syntax parsing systems employ 'two-level' parsing architecture, with lower level *lexical* parser used to resolve items like numbers or identifiers. U++ framework provides a handy lexical parser class `CParser`

for parsing syntax that has similar lexical rules of identifiers and numbers to C language. `CParser`

operates on a stream of zero-terminated characters (`const char *`

). Fundamental methods are tests on existence of various lexical elements, usually also in "advance if true" variant. For example, `Char`

method returns `true`

if input position is on specified character and if `true`

, advances position to the next element. Whitespaces are automatically skipped in default mode. Some of methods throw `CParser::Exc`

exception in case of failure.

## Expression Descend Parser

With the theory explained, let us start writing the code. We shall use a helper processing class...

struct ExpressionEvaluator {
CParser p; double x;
double Factor();
double Exponent();
double Term();
double Expression();
};

...so that we do not have to pass `CParser`

and our function variable `x`

into each method as parameters. We shall start with Expression descend level:

double ExpressionEvaluator::Expression()
{ double y = Term(); for(;;)
if(p.Char('+'))
y = y + Term(); else
if(p.Char('-'))
y = y - Term(); else
return y; }

What we say here is quite simple: Expression is a list of at least one **term**, separated by operators `+`

or `-`

. Of course, as our aim is to evaluate the value of expression, we perform required calculations (adding or subtracting) as needed.

The next two levels (`Term`

, `Exponent`

) are quite similar:

double ExpressionEvaluator::Term()
{ double y = Exponent(); for(;;)
if(p.Char('*'))
y = y * Exponent(); else
if(p.Char('/'))
y = y / Exponent(); else
return y; }
double ExpressionEvaluator::Exponent()
{ double y = Factor(); for(;;)
if(p.Char('^'))
y = pow(y, Factor()); else
return y; }

Final level, `Factor`

, has to resolve numeric constants, unary `-`

, variable `x`

and functions. We can use `Id`

method of `CParser`

to detect various function names and other meaningful identifiers:

double ExpressionEvaluator::Factor()
{
if(p.Char('-')) return -Factor();
if(p.Id("x")) return x;
if(p.Id("e")) return M_E;
if(p.Id("pi")) return M_PI;
if(p.Id("abs")) return fabs(Factor());
if(p.Id("sqrt"))
return sqrt(Factor());
if(p.Char('(')) { double y = Expression();
p.PassChar(')'); return y;
}
return p.ReadDouble(); }

The interesting part here is the recursion used to resolve parenthesis (and also arguments to functions and unary `-`

). `PassChar`

method throws an exception if required character is missing. `ReadDouble`

throws an exception if input cannot be interpreted as floating point number.

However, the list of functions here is tedious and scanning through long lists of identifiers could be slow so the better solution is to use map. We can use C++11 lambdas to simplify coding:

double ExpressionEvaluator::Factor()
{
if(p.Char('-')) return -Factor();
if(p.IsId()) { static ArrayMap<String, std::function<double (ExpressionEvaluator *ee)> > map;
ONCELOCK { map ("x", [](ExpressionEvaluator *ee) { return ee->x; })
("e", [](ExpressionEvaluator *) { return M_E; })
("pi", [](ExpressionEvaluator *) { return M_PI; })
("ln", [](ExpressionEvaluator *ee) { return log(ee->Factor()); })
#define FN_(id) (#id, [](ExpressionEvaluator *ee) { return id(ee->Factor()); })
FN_(abs) FN_(sqrt)
FN_(log)
FN_(log2)
FN_(sin)
FN_(cos)
FN_(tan)
FN_(asin)
FN_(acos)
FN_(atan)
#undef FN_
;
}
int q = map.Find(p.ReadId());
if(q < 0)
p.ThrowError("Unrecognized identifier");
return map[q](this);
}
if(p.Char('(')) { double y = Expression();
p.PassChar(')'); return y;
}
return p.ReadDouble(); }

Instead of checking individual identifiers, we just check that identifier is present with `IsId`

and then load it with `ReadId`

. For map, we are using U++ `ArrayMap`

. `ArrayMap::operator()`

is a convenient alternative for adding key-value pairs in chain (`return *this`

). `ONCELOCK`

macro only allows initialization code to be performed once (if multiple threads encounter the same `ONCELOCK`

at the same time, one thread runs the initialization code while the other waits until it is completed). We need to pass `ExpressionEvaluator`

pointer as parameter to lambdas as we want to have single identifier map for all instances of `ExpressionEvaluator`

. Trick with temporary `FN_`

macro is used to reduce verbosity when the C function name matches our required function id.

Working directly with `ExpressionEvaluator`

would not be comfortable, so we add a convenience function:

double Evaluate(const char *s, double x)
{ ExpressionEvaluator v;
v.x = x;
v.p.Set(s);
try {
double y = v.Expression();
return y >= -1e200 && y < 1e200 ? y : Null;
}
catch(CParser::Error) {}
return Null;
}

`Null`

is a U++ concept that represents **null value**. It is in fact defined as a very high negative number that is by definition excluded from valid double value range (same concept is used for `int`

). We use it here to signal that either the syntax is wrong or result is not a valid number.

## Drawing the Graph of a Function

Now we can evaluate text expression for `x`

, let us add some GUI. Window will be trivial, just single input text field and area to draw the graph. We shall setup the widget position manually:

struct FnGraph : public TopWindow {
virtual void Paint(Draw& w);
EditString expression;
FnGraph();
};
FnGraph::FnGraph()
{
Title("Graph of a function");
Add(expression.TopPos(0).HSizePos()); expression << [&] { Refresh(); }; Sizeable().Zoomable(); }

The only moderately complex thing to do is to paint the graph:

void FnGraph::Paint(Draw& w_)
{
Size sz = GetSize();
DrawPainter w(w_, sz); w.Clear(White()); int ecy = expression.GetSize().cy; w.Offset(0, ecy); sz.cy -= ecy; if(sz.cy < 1) return;
sz = sz / 2 * 2 - 1; double pixels_per_unit = sz.cy / 9; double xaxis = sz.cy / 2.0; double yaxis = sz.cx / 2.0; w.Move(0, xaxis).Line(sz.cx, xaxis).Stroke(1, Blue()); w.Move(yaxis, 0).Line(yaxis, sz.cy).Stroke(1, Blue()); Font font = Serif(15);
if(pixels_per_unit > 20) for(int i = 1; i < 2 * sz.cx / pixels_per_unit; i++)
for(int sgn = -1; sgn < 2; sgn += 2) {
String n = AsString(sgn * i);
Size tsz = GetTextSize(n, font);
double x = yaxis + sgn * i * pixels_per_unit;
w.Move(x, xaxis - 5).Line(x, xaxis + 5).Stroke(1, Blue());
w.Text(int(x - tsz.cx / 2.0), int(xaxis + 6), n, font).Fill(Blue());
double y = xaxis - sgn * i * pixels_per_unit;
w.Move(yaxis - 5, y).Line(yaxis + 5, y).Stroke(1, Blue());
w.Text(int(yaxis + 6), int(y - tsz.cy / 2.0), n, font).Fill(Blue());
}
double y0 = Null; for(int i = 0; i < sz.cx; i++) { double x = (i - sz.cx / 2.0) / pixels_per_unit;
double y = Evaluate(~~expression, x);
if(!IsNull(y)) {
double gy = sz.cy / 2.0 - y * pixels_per_unit;
if(IsNull(y0)) w.Move(i, gy);
else
w.Line(i, gy);
}
y0 = y;
}
w.Stroke(1, Black()); }

We are using `Painter`

package to provide smooth anti-aliased rendering. About the most complex part of painting, the graph is to draw axes, the rest is simple. As this is more or less an illustration example, we do not bother about GUI for scaling and moving the origin, just draw +/- 4.5 in y axis and place the origin into the center.

## Conclusion

The example demonstrates how simple and clean descend parsing can be made if using well thought lexical layer. Similar approach can be applied to most syntaxes encountered in general computing.

## Useful Links

## Article History

- 2016-02-01: New version of
`Factor`

method using C++11 lambdas and map

Mirek Fidler is C/C++ programmer for more than 20 years. He is a coauthor of U++ framework.