Well, the general idea is to give you the possibility to write something like this:
Func<Order,bool> filterOrders =
XXX("not(Supplier.City<>\"London\" or Supplier.Status>15)");
And then use it anywhere a Func<Order,bool> is required.
This article is about implementing a converter from strings like "not(Supplier.City<>\"London\" or Supplier.Status>15)" to Func<Order, bool>.
The idea to build such an engine came when I was developing a desktop application for managing goods and customers. There were many entities, such as customer, good, order, etc., which were presented to the user in table form for manipulation. The important part was that the user had to be able to set up custom filters, sort items according to their own criteria, and add special checking conditions, such as marking a good in a different color when its shelf life is about to end.
When creating this application, I decided to write a universal engine that would allow me to construct predicates and sorting criteria from string expressions. It was .NET 2.0 with no Linq, so I built my own string expression parser and interpreter builder (I used the Interpreter pattern). To access objects' properties and functions, I used reflection, and it all worked just great... with one exception: it was a bit slow because of reflection. Only a bit, because I spent some time optimizing it.
Now, with .NET 4.0 and Linq Expressions, I decided to refactor it to build Linq Expressions. Basically, what changed is that I got rid of the Interpreter pattern (Linq Expressions replace it). More importantly, you can compile an expression, so performance is very good and no reflection is involved when the functor runs.
I use C# generic collection extension methods and Linq Expressions to demonstrate the result.
For the engine itself, I use runtime construction and compilation of Expressions (Expression.Lambda followed by the Compile method). I use reflection during expression construction to look up an object's properties and methods and to perform type conversions. Of course, I use generics.
All parsing of the expression string and building of the interpreter is handmade; no special knowledge is required.
By functor, in this article I mean the generic delegates Func<T1,T2,TResult>, etc. The following example shows a simple use of a functor:
Func<int,bool> func = a => a + 1 == 7;
int arg0 = 6;
bool result = func(arg0); // true, because 6 + 1 == 7
As you can see, to use a functor you have to pass arguments to it. In .NET 3.5, Func is declared for up to 4 arguments (.NET 4.0 extends this to 16). The functor refers to the passed arguments to calculate the result.
Finally, the problem definition sounds like this:
Construct functors Func<...,TResult> from a string expression satisfying some grammar.
My solution to the problem described above is a library that allows creation of Linq Expressions and functors from string expressions.
One of the ideas was to keep the expression grammar user friendly, so that the user does not have to think about the types of variables, constants, etc. For developers, however, expressions make it possible to indicate arguments, perform type conversions, etc. With all this, I do not claim to build a full-scale compiler, so I omit operators such as ?: and ??. No "else" or looping is possible. Oh yes, Nullable types are supported!
Approximate grammar (similar to C#, but not 100%):
const = any valid constant expression,
strings can be taken in ""
argN //case insensitive
bin_operator = +|-|*|/|=|<|<=|<>|>|>=|and|or|xor //case insensitive
un_operator = not //case insensitive
DateTime //case insensitive
expression bin_operator expression
expression, ... ,
Generation is done in several steps:
- The string expression is parsed by the ExpressionParser class and a list of Tokens is created
- Tokens are analyzed by the ExpressionBuilder class and the resulting Linq Expression is created
- A Linq Lambda function is generated from the expression and compiled to a functor
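To make the last two steps more concrete, here is a minimal sketch of how an expression tree can be assembled by hand and compiled into a functor. It does not use the library's parser or builder; the class name PipelineSketch and the hard-coded expression (the same a + 1 == 7 used earlier) are only illustrative assumptions.

```csharp
using System;
using System.Linq.Expressions;

public static class PipelineSketch
{
    // Builds the functor a => a + 1 == 7 by hand, roughly the way a builder
    // could after analysing the tokens of a string such as "arg0 + 1 = 7".
    public static Func<int, bool> Build()
    {
        // The functor argument.
        ParameterExpression arg0 = Expression.Parameter(typeof(int), "arg0");
        // arg0 + 1
        Expression sum = Expression.Add(arg0, Expression.Constant(1));
        // (arg0 + 1) == 7
        Expression body = Expression.Equal(sum, Expression.Constant(7));
        // Wrap the body in a typed lambda and compile it to a delegate.
        Expression<Func<int, bool>> lambda =
            Expression.Lambda<Func<int, bool>>(body, arg0);
        return lambda.Compile();
    }

    public static void Main()
    {
        Func<int, bool> func = Build();
        Console.WriteLine(func(6)); // True
        Console.WriteLine(func(5)); // False
    }
}
```

Once compiled, the delegate is called like any other Func, so the cost of building the tree is paid only once per expression string.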
Along with building the parser, which is fairly simple, and the compiler (builder), which is much more complex, I faced several problems, described below.
Types (solved partially)
To build a Linq Expression, I need to supply arguments of specific types to it. For example, when you call the Expression.And(Expression arg1, Expression arg2) method, arg1 and arg2 must be of a type for which the operation is defined, such as bool, and when you call the Expression.Add(Expression arg1, Expression arg2) method, arg1 and arg2 can be of any type, but that type should have the + operator defined.
When an operand is obtained from a function and is not of a base type, I cannot do anything with it, so I just throw an exception (and I think that is the correct handling). However, when there is a constant (especially when there are constants on both sides of an operator, as in 3 >= 4.5), I have to guess which type to select.
I don't like the current solution, but it looks like it works for most cases: I just probe every basic type in turn (float, etc.) for both operands. In the case mentioned above, the engine decides to use float, because 4.5 is a float while 3 is an int and can be converted to float.
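The probing idea can be sketched roughly as follows. This is not the library's code: the class ConstantProbing, its method names, and the choice to probe only int and float are assumptions made to keep the illustration short.

```csharp
using System;
using System.Globalization;
using System.Linq.Expressions;

public static class ConstantProbing
{
    // Tries int first, then float, and returns a typed constant expression.
    public static ConstantExpression ParseConstant(string text)
    {
        int i;
        if (int.TryParse(text, NumberStyles.Integer, CultureInfo.InvariantCulture, out i))
            return Expression.Constant(i);
        float f;
        if (float.TryParse(text, NumberStyles.Float, CultureInfo.InvariantCulture, out f))
            return Expression.Constant(f);
        throw new FormatException("Unsupported constant: " + text);
    }

    // Builds "left >= right", widening an int operand to float
    // when the other operand turned out to be a float.
    public static Expression BuildGreaterOrEqual(string left, string right)
    {
        Expression l = ParseConstant(left);
        Expression r = ParseConstant(right);
        if (l.Type != r.Type)
        {
            if (l.Type == typeof(int)) l = Expression.Convert(l, typeof(float));
            if (r.Type == typeof(int)) r = Expression.Convert(r, typeof(float));
        }
        return Expression.GreaterThanOrEqual(l, r);
    }

    public static void Main()
    {
        Expression body = BuildGreaterOrEqual("3", "4.5");
        Func<bool> f = Expression.Lambda<Func<bool>>(body).Compile();
        Console.WriteLine(f()); // False, because 3 >= 4.5 does not hold
    }
}
```

Without the Expression.Convert step, Expression.GreaterThanOrEqual would throw for an int and a float operand, which is why some common type has to be chosen before the binary node is created.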
Function Calls (solved partially)
When an expression contains a function call, the arguments for the function are passed in (). The engine checks the number of arguments and the argument types and, if anything does not match, throws an exception. However, it currently does not take function overloading into consideration, and as a result construction of a valid expression may fail. This is because argument types are not analyzed before the MethodInfo for the function is obtained. It is partially related to the problem with Types described above.
In the end it can be solved, but it requires more work.
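One possible direction, shown here only as a sketch and not as what the library does, is to build the argument expressions first and then let reflection pick the overload from their types. The class OverloadSketch and the method BuildCall are hypothetical names.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;
using System.Reflection;

public static class OverloadSketch
{
    // Builds a call to instance.methodName(args...), letting reflection pick
    // the overload whose parameter types match the argument expression types.
    public static MethodCallExpression BuildCall(
        Expression instance, string methodName, params Expression[] args)
    {
        Type[] argTypes = args.Select(a => a.Type).ToArray();
        MethodInfo method = instance.Type.GetMethod(methodName, argTypes);
        if (method == null)
            throw new MissingMethodException(instance.Type.Name, methodName);
        return Expression.Call(instance, method, args);
    }

    public static void Main()
    {
        ParameterExpression s = Expression.Parameter(typeof(string), "arg0");
        // string.IndexOf has many overloads; the argument type selects IndexOf(string).
        Expression call = BuildCall(s, "IndexOf", Expression.Constant("don"));
        Func<string, int> f = Expression.Lambda<Func<string, int>>(call, s).Compile();
        Console.WriteLine(f("London")); // 3
    }
}
```

This only works when the argument types are already known, so it still depends on resolving the constant typing problem from the previous section.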
Using the Code
The project targets .NET 4.0 and C#, but it can be compiled for .NET 3.5 with small corrections (default parameters in some constructors need to be replaced by a second constructor).
For all samples in the demo project, I use several classes related to each other. The Order entity describes the number of specified Details ordered from a specified Supplier.
Func<Order,bool> londonFilter =
    ExpressionBuilder.BuildFunctor<Order,bool>(
        "Project.City=\"London\" and Detail.City=Project.City");
Func<Order,string> ordStrConv =
List<string> privelegeSuppliers =
Func<Supplier, Supplier, int> suppsorter =
ExpressionBuilder.BuildFunctor<Supplier, Supplier, int>("Status-arg1.Status");
In the RequestEngineTest project, you can find more interesting examples of usage.
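Once a functor exists, it plugs into the standard collection methods like any other delegate. The sketch below uses hand-written lambdas as stand-ins for generated functors, so that it runs without the library. The Supplier class shown here is a hypothetical reduction of the demo entity, and it is my assumption that an unqualified Status in "Status-arg1.Status" refers to the first argument.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the demo entity; only the members used here.
public class Supplier
{
    public string City { get; set; }
    public int Status { get; set; }
}

public static class UsageSketch
{
    // Stand-in for the functor expected from
    // BuildFunctor<Supplier, Supplier, int>("Status-arg1.Status").
    public static readonly Func<Supplier, Supplier, int> SuppSorter =
        (arg0, arg1) => arg0.Status - arg1.Status;

    // Stand-in for a filter functor such as one built from "City=\"London\"".
    public static readonly Func<Supplier, bool> LondonFilter =
        s => s.City == "London";

    public static List<Supplier> FilterAndSort(IEnumerable<Supplier> suppliers)
    {
        List<Supplier> result = suppliers.Where(LondonFilter).ToList();
        // List<T>.Sort takes a Comparison<T>, so the functor is wrapped in a lambda.
        result.Sort((a, b) => SuppSorter(a, b));
        return result;
    }

    public static void Main()
    {
        var suppliers = new List<Supplier>
        {
            new Supplier { City = "London", Status = 20 },
            new Supplier { City = "Paris", Status = 10 },
            new Supplier { City = "London", Status = 5 }
        };
        foreach (Supplier s in FilterAndSort(suppliers))
            Console.WriteLine(s.Status); // prints 5, then 20
    }
}
```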
Points of Interest
I definitely learned much more about Linq and Expressions while creating this library. At the beginning, the final solution was not supposed to be so tightly integrated with the built-in Expressions and collection extensions. I was actually surprised to see how well it integrates.
There is a class ListSegment<T> in the library, which I think could be useful to refactor and include in a generic collections library. It is an analogue of ArraySegment<T>, but much more useful.
In general, it transparently wraps a List<T> to provide access to only part of its elements. As it was not my main purpose, I haven't finished it and there is still work to do.
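To illustrate the idea of such a wrapper, here is a minimal sketch. It is not the library's ListSegment<T>; the name ListSegmentSketch<T> and its members are assumptions, and it covers only indexing and enumeration.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Exposes only the elements [offset, offset + count) of an underlying list.
public class ListSegmentSketch<T> : IEnumerable<T>
{
    private readonly List<T> list;
    private readonly int offset;
    private readonly int count;

    public ListSegmentSketch(List<T> list, int offset, int count)
    {
        if (list == null) throw new ArgumentNullException("list");
        if (offset < 0 || count < 0 || offset + count > list.Count)
            throw new ArgumentOutOfRangeException("offset");
        this.list = list;
        this.offset = offset;
        this.count = count;
    }

    public int Count { get { return count; } }

    // Indexes are relative to the start of the segment; writes go
    // through to the underlying list.
    public T this[int index]
    {
        get { CheckIndex(index); return list[offset + index]; }
        set { CheckIndex(index); list[offset + index] = value; }
    }

    private void CheckIndex(int index)
    {
        if (index < 0 || index >= count)
            throw new ArgumentOutOfRangeException("index");
    }

    public IEnumerator<T> GetEnumerator()
    {
        for (int i = 0; i < count; i++)
            yield return list[offset + i];
    }

    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

public static class ListSegmentDemo
{
    public static void Main()
    {
        var numbers = new List<int> { 10, 20, 30, 40, 50 };
        var middle = new ListSegmentSketch<int>(numbers, 1, 3);
        middle[0] = 21;                      // writes through to numbers[1]
        foreach (int n in middle)
            Console.WriteLine(n);            // prints 21, 30, 40
    }
}
```

Because the segment holds a reference to the list and not a copy, changes made through it are visible in the original list, and the segment becomes invalid if the list shrinks below offset + count.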
What is Different?
You may ask: why not use a script engine like Jint?
The answer is that each tool is best for its purpose. My purpose was to have a user-friendly grammar for entering custom criteria for sorting, filtering, etc. I think I achieved it.
Performance is the second answer. Because the Expression is ultimately compiled to IL code and then native code, its performance is much better than that of a scripting engine.
History & Future
I used the previous version of ExpressionsGenerator (built on the Interpreter pattern and reflection) quite successfully more than a year ago, and I plan to use this new version in the future.
I plan to support some more things like:
- Static Method calls
- Improved Function call
- Maybe add
- Add building of
string expression, so that it was possible to use custom comparers for sorting (was in previous version)
- Refactor: separate
Expression by using
AbstractFactory pattern (will allow to build other constructs based on the same grammar)
- Refactor: Add possibility to add new operators easier
- Review work with types
Oh yeah, it's a lot of work, but since I plan to use the library, I will probably do it step by step.