Flee is a .NET library that allows you to parse and evaluate arbitrary expressions. The main feature that distinguishes it from other expression evaluators is that it uses a custom compiler to convert expressions into IL and then, using Lightweight CodeGen, emits the IL to a dynamic method at runtime. This means that expression evaluation is several orders of magnitude faster than when using interpretive expression evaluators. In fact, the entire design of the library and the expression language is geared towards making evaluation as fast as possible.
Here a list of the major features:
- Fast and efficient expression evaluation
- Compiles expressions to IL using a custom compiler
- Small, lightweight library
- Code generated for expressions can be garbage-collected
- Does not create any dynamic assemblies that stay in memory
- Supports all arithmetic operations including the power (^) operator
- Supports string, char, boolean, and floating-point literals
- Supports 32/64 bit, signed/unsigned, and hex integer literals
- Features a true conditional operator
- Emits short-circuited logical operations
- Allows for customization of expression compilation
About Lightweight CodeGen
Lightweight CodeGen (LCG) is a feature introduced in .NET 2.0 to allow for efficient runtime code generation. Using its
DynamicMethod class, you are able to create a method at runtime, set the code of the method body, and finally call the created method. Its main advantages are that the generated method and its code can be garbage collected when no longer referenced, and it does not generate any dynamic assemblies that stay in memory. One of its weaknesses is that the only way to create the method body is to emit IL. Since IL is basically a form of assembly language, performing even simple tasks requires lots of instructions and an understanding of the low-level workings of the CLR. For example, here is the IL required to evaluate the expression
L_0001: call valuetype [mscorlib]System.DateTime [mscorlib]System.DateTime::get_Now()
L_0007: ldloca.s dt
L_0009: constrained [mscorlib]System.DateTime
L_000f: callvirt instance string [mscorlib]System.Object::ToString()
Automating the translation of expressions to IL is where this library comes in. Essentially, it does the same thing as a C# compiler except that it compiles at runtime, emits its IL to memory instead of an assembly, and is designed to work with expressions (which are a subset of a complete programming language).
Walkthrough: Creating and Evaluating an Expression
For this walkthrough, we are going to compute the hypotenuse of a triangle with sides a and b, using the expression
sqrt(a^2 + b^2). To start, we will need to add a reference to the ciloci.Flee.dll and import the
ciloci.Flee namespace. First, we need to declare our expression owner. This is a class to which the dynamic method generated by the expression will be attached. Once attached, the dynamic method will behave as a static function on the class and will have access to all of its members (even non-public ones). We will declare a class and give it two fields to represent the a and b sides of the triangle:
public class ExpressionOwner
public int a;
public int b;
To be able to use the
sqrt function in our expression, we have to import the
Math class. Importing determines what members, besides the ones on the owner, are visible to an expression. By default, no types are imported, and references to types using their fully-qualified names are not allowed. This is a security feature: since expressions are compiled and evaluated at runtime, and could be entered by users, it would not be a good idea to allow them to call arbitrary methods on any type loaded into the program. Once we import the
Math class, we can use all of its constants and methods in the expression. We will use the
sqrt function, but we don't need the
pow function since the power operator is built in to the expression language. We will need to create an instance of the
ExpressionOptions class and use its
Imports property to do the actual import:
ExpressionOptions options = new ExpressionOptions();
Next, we create an instance of our expression owner and initialize the
ExpressionOwner owner = new ExpressionOwner();
owner.a = 4;
owner.b = 5;
Now, we create the expression, passing it our owner and the options:
Expression e = new Expression("sqrt(a^2 + b^2)", owner, options);
catch (ExpressionCompileException e)
There is now a method in memory that contains the code for the expression. The expression has an
Evaluator property that exposes a delegate pointing to this method. The delegate will always be an instance of
ExpressionEvaluator, which is a generic delegate defined in the library. To evaluate our expression, we need to retrieve the delegate and cast it to an
ExpressionEvaluator with the correct return type. Although this seems more work than just returning an object, it ensures that there will be no boxing for expressions that evaluate to value types. Applying the above, we can now evaluate our expression:
ExpressionEvaluator<double> evaluator = (ExpressionEvaluator<double>)e.Evaluator;
double result = evaluator();
That's it! You can now update the fields on the owner, and evaluate the expression again to get the updated result. If you are curious as to what IL was generated by the expression, you can use its
EmitToAssembly() method to have it dump the IL to an assembly on disk. You can then use ILDasm (or better yet, Reflector) to inspect it. Here is the IL for our hypotenuse expression:
L_0001: ldfld int32 [ConsoleTester]ConsoleTester.ExpressionOwner::a
L_0009: call float64 [mscorlib]System.Math::Pow(float64, float64)
L_000f: ldfld int32 [ConsoleTester]ConsoleTester.ExpressionOwner::b
L_0017: call float64 [mscorlib]System.Math::Pow(float64, float64)
L_001d: call float64 [mscorlib]System.Math::Sqrt(float64)
Note how the
b fields, which are integers, were implicitly converted to
doubles for the
^ operator, and how the
^ operator was translated to a call to the
Pow function. The IL stream above is only 35 bytes long, and will be JITed to native processor instructions. Compare this with interpretive evaluators that have to go through several function (and possibly Reflection) calls and
if statements to evaluate the same expression.
There is also a demo application that shows a practical usage of the library. The concept was inspired by Pascal Ganaye's demo for his expression evaluator. Essentially, the demo generates an image by evaluating expressions for the red, green, and blue components where the expressions can reference the x and y co-ordinates of the current pixel. This is, basically, a color version of plotting a graph for a function. Using the demo will give you a good sense of how fast expressions can be evaluated. Even on my outdated computer at home, I still manage over a million evaluations per second. This means that, on a modern computer, you could generate an 800x600 image in under a second.
The Expression Language
The expression language that this library parses is a mix of elements of C# and VB.NET. Since the aim of this library is speed, the language is strongly typed (same rules as C#) and there is no late binding. Unlike C#, the language is not case-sensitive. Here is a breakdown of the language elements:
100 + a
|*, /, %||Multiplicative|
100 * 2 / (3 % 2)
2 ^ 16
-6 + 10
"abc" + "def"
0x80 >> 2
|=, <>, <, >, <=, >=||Comparison|
2.5 > 100
|And, Or, Xor, Not||Logical|
(1 > 10) and (true or not false)
|And, Or, Xor, Not ||Bitwise|
100 And 44 or (not 255)
If(a > 100, "greater", "less")
|Cast||Cast and conversion|
1 + arr[i+1]
true AND false
|Real literal||Double and single|
100.25 + 100.25f
|Integer literal||Signed/unsigned 32/64 bit|
100 + 100U + 100L + 100LU
0xFF + 0xABCDU + 0x80L + 0xC9LU
The library is licensed under the LGPL. This means that as long as you dynamically link (i.e., add a reference) to the assemblies from the official releases, you are free to use the library for any purpose.
Implementing an expression evaluator requires two things: a parser that converts an expression string into a syntax tree, and a compiler to take the tree and convert it into IL.
I chose not to write my own parser, and used Grammatica instead because it is written by a guy who specializes in parsing and language theory. Grammatica is a parser generator: a program that takes a grammar (set of rules for a language) and outputs a class that will parse it. Writing grammars for it is not difficult if you are comfortable with recursion and Regular Expressions. It has great error handling, and will report any invalid grammatical constructs. The source code package for this library includes the grammar file for the expression language, so you can see how it all works. Grammatica is a great tool if you want to do parsing that requires more power than Regular Expressions can provide. Don't be fooled by its alpha status, it is solid as a rock, and has never given me problems.
Once I wrote the grammar for the expression language, I had Grammatica generate a parser for me. The parser class has callback methods that I can hook into to let me know when the parser has hit a certain language element. From there on, it's up to me to decide what to do with that element. As it turns out, the best way to represent the language elements is via a tree structure. An expression like
1 + 2 will be represented as an ADD element with two children representing the operands. Every element validates its children as they are set to make sure that they are valid for its particular operation, and throws an
ExpressionCompileException if they are not. As you keep parsing, the tree gets built up until you reach the end of the expression and you are left with one root node representing the whole expression. At this point, we have a valid expression and its compiled representation; we are finished with parsing and have to move on to code generation.
We now have a tree consisting of various operators and operands that we have to translate to IL. Every element in our tree derives from the
ExpressionElement class, which requires that the element be able to emit IL. From here, the main challenge is to emit IL that is as compact as possible and follows the type rules of the CLR. This means that you always have to watch for whether you are working with a value type or reference type. The IL emitted for the expression
mystring.func() is different than the IL for
myint.func(). To generate the IL for our method, we let the root element emit its IL and the IL of all its children. We then call the
CreateDelegate method on our
DynamicMethod to "burn" the code into memory and get a delegate that points to it. From here, the method's code is set and cannot be changed.
One thing that is essential for a project like this is unit testing. As you tweak the grammar for your expression, you have to make sure that it doesn't invalidate previously valid expressions and that the result of evaluating an expression is correct. Fortunately, this type of project lends itself perfectly to unit testing. All you have to do is create a test consisting of a text file with an expression on each line and a loop that feeds the expression into the evaluator. If you don't get an exception and the result is correct, the test passes. As you build up your language and add new elements, you keep adding to your text file, so that at any point, you can be sure that you haven't broken any previous expressions.
The idea for this project came from the lessons I learned from my previous project. It turns out that people mainly want to be able to define variables and evaluate them in expressions. The previous project lets them do that, but since it is Excel compatible, it has to evaluate expressions in a particular way and support a more dynamic type system. For this project, I wanted to remove those limitations and try to create a fast and efficient expression evaluator. I also wanted to try my hand at building a compiler since it's something every programmer should do. I look forward to your feedback, and plan on actively maintaining and improving this library based on it. Thanks!
- Jul 26, 2007: Initial article publication.
- Jul 30, 2007: Revisions to make article better.
- Aug 19, 2007: Updates for dynamic variables and
- Aug 26, 2007: Added paragraph about ExpressionOptions.
- Oct 11, 2007: Added operator precedence table.