Click here to Skip to main content
14,454,263 members

Evaluating expression with descend parser and U++ CParser

Rate this:
3.50 (5 votes)
Please Sign up or sign in to vote.
3.50 (5 votes)
1 Feb 2016BSD
Simple descend parser capable of evaluating mathematical expressions used to draw a graph of a function.

Introduction

Descend parsing is a well known simple approach to evaluating complex syntaxes. In this article we shall create a simple descend parser capable of evaluating mathematical expressions, using U++ CParser class for lexical analysis. And to have some fun, we shall add an interactive level to render a graph of function.

Image 1

 

Descend parsing basics

Descend parsing usually works by providing a function for the each level of syntax. If evaluating expressions, our main concern usually is to maintain the priority of operators. In descent parser, each priority level is represented by a single function. Usually, expressions can be represented as addition/subtraction of terms (descend level), terms are formed by multiplying or dividing exponentials (descend level), exponentials combine factors (descend level). Factors usually represent the final descend level. Expression in parenthesis is considered a factor and resolved using recursion.
Usually, syntax definitions provide some formal syntax graphs or definitions, however in this simple example, we shall avoid this - as you will see the properly coded descent parser is very self-explaining to the point of serving as syntax definition.

Lexical parsing with CParser

Most syntax parsing systems employ 'two-level' parsing architecture, with lower level lexical parser used to resolve items like numbers or identifiers. U++ framework provides a handy lexical parser class CParser for parsing syntax that has similar lexical rules of identifiers and numbers to C language. CParser operates on a stream of zero-terminated characters (const char *). Fundamental methods are tests on existence of various lexical elements, usually also in "advance if true" variant. For example, Char method returns true if input position is on specified character and if true, advances position to the next element. Whitespaces are automatically skipped in default mode. Some of methods throw CParser::Exc exception in case of failure.

Expression descend parser

With the theory explained, let us start writing the code. We shall use a helper processing class

struct ExpressionEvaluator {
    CParser p; // this will hold our lexical context
    double  x; // 'variable'

    double Factor();
    double Exponent();
    double Term();
    double Expression();
};

so that we do not have to pass CParser and our function variable x into each method as parameters. We shall start with Expression descend level:

double ExpressionEvaluator::Expression()
{ // resolve + -
    double y = Term(); // at least one term
    for(;;)
        if(p.Char('+'))
            y = y + Term(); // add another term
        else
        if(p.Char('-'))
            y = y - Term(); // subtract another term
        else
            return y; // no more + - operators
}

what we say here is quite simple: Expression is a list of at least one term, separated by operators + or -. Of course, as our aim is to evaluate the value of expression, we perform required calculations (adding or subtracting) as needed.

Next two levels (Term, Exponent) are quite similar:

double ExpressionEvaluator::Term()
{ // resolve * /
    double y = Exponent(); // at least one member
    for(;;)
        if(p.Char('*'))
            y = y * Exponent(); // multiply by another member
        else
        if(p.Char('/'))
            y = y / Exponent(); // divide by another member
        else
            return y; // no more * / operators
}

double ExpressionEvaluator::Exponent()
{ // resolve power ^
    double y = Factor(); // at least one factor
    for(;;)
        if(p.Char('^'))
            y = pow(y, Factor()); // power by another factor
        else
            return y; // no more power ^ operators
}

Final level, Factor, has to resolve numeric constants, unary -, variable x and functions. We can use Id method of CParser to detect various function names and other meaningful identifiers:

double ExpressionEvaluator::Factor()
{
    if(p.Char('-')) // unary -
        return -Factor();

    if(p.Id("x")) // our variable
        return x;
    if(p.Id("e")) // e constant
        return M_E;
    if(p.Id("pi")) // pi constant
        return M_PI;

    if(p.Id("abs")) // some functions...
        return fabs(Factor());
    if(p.Id("sqrt"))
        return sqrt(Factor());
    //... add more functions as needed
    if(p.Char('(')) { // resolve parenthesis - recurse back to Expression (+ - operators)
        double y = Expression();
        p.PassChar(')'); // make sure there is closing parenthesis
        return y;
    }
    return p.ReadDouble(); // last possibility is that we are at number...
}

Interesting part here is the recursion used to resolve parenthesis (and also arguments to functions and unary -). PassChar method throws an exception if required character is missing. ReadDouble throws an exception if input cannot be interpreted as floating point number.
However, the list of functions here is tedious and scaning through long lists of identifiers could be slow so the better solution is to use map. We can use C++11 lamdas to simplify coding:

double ExpressionEvaluator::Factor()
{
    if(p.Char('-')) // unary -
        return -Factor();
    
    if(p.IsId()) { // we are at some identifier
        static ArrayMap<String, std::function<double (ExpressionEvaluator *ee)> > map;
        ONCELOCK {
            map // initialize the map
                ("x", [](ExpressionEvaluator *ee) { return ee->x; })
                ("e", [](ExpressionEvaluator *) { return M_E; })
                ("pi", [](ExpressionEvaluator *) { return M_PI; })
                ("ln", [](ExpressionEvaluator *ee) { return log(ee->Factor()); })
            #define FN_(id) (#id, [](ExpressionEvaluator *ee) { return id(ee->Factor()); })
                FN_(abs) // some functions...
                FN_(sqrt)
                FN_(log)
                FN_(log2)
                FN_(sin)
                FN_(cos)
                FN_(tan)
                FN_(asin)
                FN_(acos)
                FN_(atan)
            #undef FN_
            ;
        }
        int q = map.Find(p.ReadId());
        if(q < 0)
            p.ThrowError("Unrecongnized identifier");
        return map[q](this);
    }

    if(p.Char('(')) { // resolve parenthesis - recurse back to Sum (+ - operators)
        double y = Expression();
        p.PassChar(')'); // make sure there is closing parenthesis
        return y;
    }
    return p.ReadDouble(); // last possibility is that we are at number...
}

Instead of checking individual identifiers, we just check that identifier is present with IsId and then load it with ReadId. For map, we are using U++ ArrayMap. ArrayMap::operator() is a convenient alternative for adding key-value pairs in chain (return *this). ONCELOCK macro only allows initilization code to be performed once (if multiple threads encounter the same ONCELOCK at the same time, one thread runs the initialization code while other waits until it is completed). We need to pass ExpressionEvaluator pointer as parameter to lambdas as we want to have single identifier map for all instances of ExpressionEvaluator. Trick with temporary FN_ macro is used to reduce verbosity when the C function name matches our required function id.

Working directly with ExpressionEvaluator would not be comfortable, so we add an convenience function:

double Evaluate(const char *s, double x)
{ // evaluate expression for given variable, return Null on any failure or out-of-bounds result (NaN)
    ExpressionEvaluator v;
    v.x = x;
    v.p.Set(s);
    try {
        double y = v.Expression();
        return y >= -1e200 && y < 1e200 ? y : Null;
    }
    catch(CParser::Error) {}
    return Null;
}

Null is U++ concept that represents null value. It is in fact defined as very high negative number that is by definition excluded from valid double value range (same concept is used for int). We use it here to signal that either the syntax is wrong or result is not a valid number.

Drawing the Graph of a function

Now we can evaluate text expression for x, let us add some GUI. Window will be trivial, just single input text field and area to draw the graph. We shell setup the widget position manually:

struct FnGraph : public TopWindow {
    virtual void Paint(Draw& w);

    EditString expression; // function to display

    FnGraph();
};

FnGraph::FnGraph()
{
    Title("Graph of a function");
    Add(expression.TopPos(0).HSizePos()); // place widget to the top, horizontally fill the window
    expression << [&] { Refresh(); }; // when expression changes, repaint the graph
    Sizeable().Zoomable(); // make the window resizable
}

The only moderately complex thing to do is to paint the graph:

void FnGraph::Paint(Draw& w_)
{
    Size sz = GetSize();
    DrawPainter w(w_, sz); // Use Painter for smooth sw rendering
    w.Clear(White()); // clear the background
    int ecy = expression.GetSize().cy; // query the height of widget
    w.Offset(0, ecy); // move coordinates out of widget
    sz.cy -= ecy; // and reduce the size
    if(sz.cy < 1) // if too small, do nothing (avoid div by zero)
        return;
    sz = sz / 2 * 2 - 1; // this trick will force axes to .5, results in sharper AA rendering
    double pixels_per_unit = sz.cy / 9; // we want to display y range -4.5 .. 4.5
    double xaxis = sz.cy / 2.0; // vertical position of x axis
    double yaxis = sz.cx / 2.0; // horizontal position of y axis
    w.Move(0, xaxis).Line(sz.cx, xaxis).Stroke(1, Blue()); // draw x axis
    w.Move(yaxis, 0).Line(yaxis, sz.cy).Stroke(1, Blue()); // draw y axis
    Font font = Serif(15);
    if(pixels_per_unit > 20) // if big enough, paint some axis markers and numbers...
        for(int i = 1; i < 2 * sz.cx / pixels_per_unit; i++)
            for(int sgn = -1; sgn < 2; sgn += 2) {
                String n = AsString(sgn * i);
                Size tsz = GetTextSize(n, font);

                double x = yaxis + sgn * i * pixels_per_unit;
                w.Move(x, xaxis - 5).Line(x, xaxis + 5).Stroke(1, Blue());
                w.Text(int(x - tsz.cx / 2.0), int(xaxis + 6), n, font).Fill(Blue());

                double y = xaxis - sgn * i * pixels_per_unit;
                w.Move(yaxis - 5, y).Line(yaxis + 5, y).Stroke(1, Blue());
                w.Text(int(yaxis + 6), int(y - tsz.cy / 2.0), n, font).Fill(Blue());
            }
    double y0 = Null; // store previous value
    for(int i = 0; i < sz.cx; i++) { // now iterate through all x pointes and draw the graph
        double x = (i - sz.cx / 2.0) / pixels_per_unit;
        double y = Evaluate(~~expression, x);
        if(!IsNull(y)) {
            double gy = sz.cy / 2.0 - y * pixels_per_unit;
            if(IsNull(y0)) // previous value was defined
                w.Move(i, gy);
            else
                w.Line(i, gy);
        }
        y0 = y;
    }
    w.Stroke(1, Black()); // finally paint the graph line
}

We are using Painter package to provide smooth anti-aliased rendering. About the most complex part of painting the graph is to draw axes, the rest is simple. As this is more or less illustration example, we do not bother about GUI for scaling the and moving the origin, just draw +/- 4.5 in y axis and place the origin into the center.

Conclusion

Example demonstrates how simple and clean can descend parsing be made if using well thought lexical layer. Similar approach can be applied to the most syntaxes encountered in general computing.

Useful Links

Article history

2016-02-01 New version of Factor method using C++11 lambdas and map

License

This article, along with any associated source code and files, is licensed under The BSD License

Share

About the Author

Miroslav Fidler
Czech Republic Czech Republic
Mirek Fidler is C/C++ programmer for more than 20 years. He is a coauthor of U++ framework.

Comments and Discussions

 
QuestionU++ in IDE Visual Studio? Pin
TSchind8-Feb-16 9:13
MemberTSchind8-Feb-16 9:13 
AnswerRe: U++ in IDE Visual Studio? Pin
Miroslav Fidler8-Feb-16 23:26
MemberMiroslav Fidler8-Feb-16 23:26 
QuestionYou could use Func<double, double> to represent functions. Pin
Philippe Mori25-Jan-16 16:02
MemberPhilippe Mori25-Jan-16 16:02 
AnswerRe: You could use Func<double, double> to represent functions. Pin
Miroslav Fidler25-Jan-16 20:41
MemberMiroslav Fidler25-Jan-16 20:41 
GeneralRe: You could use Func<double, double> to represent functions. Pin
Philippe Mori26-Jan-16 3:39
MemberPhilippe Mori26-Jan-16 3:39 
AnswerRe: You could use Func<double, double> to represent functions. Pin
Miroslav Fidler25-Jan-16 21:39
MemberMiroslav Fidler25-Jan-16 21:39 
GeneralRe: You could use Func<double, double> to represent functions. Pin
Philippe Mori26-Jan-16 3:45
MemberPhilippe Mori26-Jan-16 3:45 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Praise Praise    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.

Article
Posted 25 Jan 2016

Tagged as

Stats

12.7K views
432 downloads
9 bookmarked