Click here to Skip to main content
Email Password   helpLost your password?
Using the engine to evaluate an expression

Introduction

FormulaEngine is a .NET assembly that enables you to add formula support to your application. It takes care of parsing and evaluating formulas, tracking their dependencies, and recalculating in natural order. The formula syntax and much of the engine's functionality are direct replicas of Excel ensuring a low learning curve for users. The library is licensed under the LGPL and the project is hosted here on SourceForge.

Features

Here's a list of the engine's major features:

Motivation

I wrote this library for the following reasons:

Yet another expression evaluator?

Seeing that expression evaluators are very popular here on CodeProject, what makes this one different? The two main differences are that this library implements many features found in Excel and that it does more than just evaluate expressions.

Overview

In this article, I will give a brief overview of the three major things that this library enables you to do:

Formula parsing and evaluation

The first thing this library allows you to do is evaluate formula expressions. The supported syntax is based on Excel and 95% of existing formulas should be able to be used without any modification. The engine provides the Formula class, which represents a compiled formula expression. You call the engine's CreateFormula method with an expression and it will return a Formula instance that you can evaluate:
' Create an instance of the engine

Dim engine As New FormulaEngine
' Create a formula

Dim f As Formula = engine.CreateFormula("=sqrt(3^2 + 4^2)")
' Evaluate the formula to get a result (5.0)

Dim result As Double = f.Evaluate()

The method will throw an InvalidFormulaException exception if it cannot create a formula from the expression. This is usually (but is not limited to) due to a syntax error in the expression. The inner exception of the thrown exception will have more details.

The engine also has the ever-popular Evaluate method for when you quickly want to evaluate an expression. Let's try to evaluate the "mega" formula found here:

' Create an instance of the engine

Dim engine As New FormulaEngine
' Assume cell A1 contains "http://j-walk.com/ss/books"

' Call Evaluate to get a result: "books"

Dim result As String = engine.Evaluate("=RIGHT(A1,LEN(A1)-FIND(CHAR(1),_
                                       SUBSTITUTE(A1,""/"",CHAR(1)" _
                              & ",LEN(A1)-LEN(SUBSTITUTE(A1,""/"","""")))))")

Data types and calculation

The engine supports the following data types when evaluating expressions: Integer, Double, String, Boolean, DateTime, Null, Error values, and References. Just like Excel, operands are loosely typed meaning that any data type is valid as long as it can be converted to the desired data type. For example: the expression ="123" + 10 is valid since the string "123" can be converted to a number. One major difference from Excel is that DateTime values are not treated as numbers. If you want to add/subtract dates, you will have to use a function.

When an error is encountered during formula evaluation, an ErrorValueWrapper instance will be returned. This class wraps one of the seven Excel error values and allows you to get the specific error as well as format it.

Formulas and the result type property

The formula class has a property on it called ResultType that allows you to specify the desired type of the formula's result. This is useful when you have an expression like =A1 which could validly evaluate to either the contents of cell A1 or a reference to it. By setting the result type you can control which of the two results you get. The formula will attempt to convert its result to the specified type. If the conversion is not possible, then the #VALUE! error will be returned.
Dim f As Formula = engine.CreateFormula("=A1")
' Make the formula evaluate to any value except a reference

f.ResultType = OperandType.Primitive
' result will be the contents of cell A1

Dim result As Object = f.Evaluate()
' Make the formula evaluate to a sheet reference

f.ResultType = OperandType.SheetReference
' result will be a reference to cell A1

result = f.Evaluate()

Natural order recalculation

The second thing the library allows you to do is natural order recalculation. For those unfamiliar with the term, recalculating in natural order means that a formula is recalculated after any formulas that it depends on. Consider the a worksheet with the following values and formulas:
(A1): 15
(B2): =A1 + 10
(C1): =A1 + B2
(D2): =C1 * 2
When the contents of cell A1 change, the three formulas need to be recalculated. The formula at B2 must be recalculated first since it only depends on A1. The formula at C1 is recalculated second since it depends on the value of B2. Finally, the formula at D2 is recalculated last since it depends on C1.

For the engine to be able to recalculate in natural order, it must keep track of the dependencies between formulas. It does this by acting as a container for formulas. As formulas are added to the engine, their dependencies are analyzed and a dependency graph is built. You then tell the engine to recalculate and it will use the graph to build up a calculation list, sort it in natural order, and recalculate each formula.

References

The basic unit that the engine uses to track dependencies is the reference. There are various types of references, they all implement the IReference interface, and the ReferenceFactory class creates them all. When you add a formula to the engine, you need to specify a reference that the formula will be bound to. The formula will then "live" at that reference. By changing the type of reference you bind the formula to, you can change how that formula is referenced from other formulas. For example: by binding a formula to a named reference, you allow other formulas to reference it by using a name.
' Associate the name Root2 with a formula

engine.AddFormula("=sqrt(2)", engine.ReferenceFactory.Named("Root2"))
' Use the name in an expression (result is 2.0)

Dim result As Double = engine.Evaluate("=root2 ^ 2")

Now that we've seen how the engine tracks dependencies, let's see how the initial example above would be set up using code:

' Assume we've already added a worksheet to the engine

' Add a formula at B2

engine.AddFormula("=A1 + 10", engine.ReferenceFactory.Cell(2, 2))
' Add a formula at C1

engine.AddFormula("=A1 + B2", engine.ReferenceFactory.Parse("C1"))
' Add a formula at D2

engine.AddFormula("=C1 * 2", engine.ReferenceFactory.Parse("D2"))

Our engine now contains 3 formulas and a graph describing their dependencies. All we need to do is tell the engine that a reference has changed and that all its dependents need to be recalculated. We do this using the Recalculate method:

' Create a reference to cell A1

Dim a1Ref As ISheetReference = engine.ReferenceFactory.Parse("A1")
' Recalculate all dependents of A1

engine.Recalculate(a1Ref)

Custom functions

The last thing the engine allows you to do is define your own functions for use in formulas. To do this we must use the FunctionLibrary class, which is accessible through a property on the engine. The extensibility mechanism I used is based on delegates. I felt that this makes it easier to add many functions because you don't have to define a new class for each function as with the alternative interface/subclass based mechanism. It also allows the engine to use reflection to add all the methods of a class in bulk. Defining a custom function requires three steps: Let's define a function that returns the length of the hypotenuse given the length of the other two sides: First we must define a method with the correct signature:
Public Sub Hypotenuse(ByVal args() As Argument, ByVal result As FunctionResult,_
                      ByVal engine As FormulaEngine)

End Sub

Explanation of the three arguments:

Second, we have to adorn our method with the proper attribute so that the function library can recognize it:
<FixedArgumentFormulaFunction(2, New OperandType() {OperandType.Double, _
 OperandType.Double})> _
Public Sub Hypotenuse(ByVal args() As Argument, ByVal result As FunctionResult,_
                      ByVal engine As FormulaEngine)

End Sub
We have now declared our method as requiring 2 arguments, both of type Double. The engine will only call our method if exactly 2 arguments were specified in the formula and both arguments can be converted to Double. This frees us from having to write argument validation code for every function we want to implement.

Finally, we must write the actual implementation of our function:

<FixedArgumentFormulaFunction(2, New OperandType() {OperandType.Double, _
 OperandType.Double})> _
Public Sub Hypotenuse(ByVal args() As Argument, ByVal result As FunctionResult,_
                      ByVal engine As FormulaEngine)
    ' Get the value of the first argument as a double

    Dim a As Double = args(0).ValueAsDouble
    ' Get the value of the second argument as a double

    Dim b As Double = args(1).ValueAsDouble
    ' Compute the hypotenuse

    Dim hyp As Double = System.Math.Sqrt(a ^ 2 + b ^ 2)
    ' Set the function's result

    result.SetValue(hyp)
End Sub

We get the value of each of our arguments as a double, compute the hypotenuse, and set the value into the FunctionResult.

Next we add our custom function to the library:

engine.FunctionLibrary.AddFunction(AddressOf Hypotenuse)

And now we can use it in a function:

dim result as Double = engine.Evaluate("=10 + Hypotenuse(3, 4)")

Please note that all functions must return a value and you cannot define/undefine functions while formulas are defined.

Demo application

The demo application is a poor man's version of Excel. It is meant to be a reference implementation showing how to use all of the engine's features. It shows off the following:

Implementation details

Formula Parsing

To implement parsing of formulas, I used the excellent Grammatica parser generator. I wrote a grammar describing the syntax of a formula and let Grammatica generate a parser. I then let the parser parse, listen to callbacks, and fill out the parse tree with my own objects. At the end, I have a root element representing the entire parse tree of the formula. I re-arrange the tree into postfix form and save that into a formula instance. Evaluating a formula then simply consists of iterating through each element and having it push/pop values off a stack. In the end, there should be one value left on the stack, which is the formula's result.

I chose Grammatica because it has a clean separation between grammar and parser code, has easy to use grammar syntax, and it outputs VB .NET code. The project hasn't had any activity in a while but it is not dead and even though the version I'm using is an alpha, I found it to be very stable: no crashes and no incorrect functionality. I highly recommend it if you, like me, are new to parsers and grammars.

Also, since the grammatica parser is created at runtime, it is very easy to dynamically change the decimal and argument separator patterns to use the values of the current culture. This means that instead of =1.2 + sum(1,2,3), a user in Estonia can enter =1,2 + sum(1;2;3) and have it be a valid formula.

Natural order recalculation

The engine maintains a dependency graph for all formulas. When a recalculate is required, a temporary graph is built starting at the root node of the recalculate. Once all circular references are removed, a topological sort is performed on the graph to get a list of formulas in natural order. With this calculation list, it is simply a matter of iterating through it and re-evaluating each formula.

Not Implemented

The following things are not implemented because they are obscure or advanced features that most people won't know about or find useful: I will certainly look into implementing them if there is enough demand.

Conclusion

I found that coding this project gave me lots of insight into how Excel works. Having to implement 100+ Excel functions makes you very familiar with all their little quirks. For example: the formula =Offset(A1,1,1) entered into cell A1 is not treated as a circular reference by Excel even though it depends on its own cell. Another example is that the concatenate formula does not work with non-cell ranges. Whereas you can say =Sum(A1:B2), you cannot say =Concatenate(A1:B2).

This project is currently in the alpha phase because it hasn't had any real-world testing/usage. As mentioned in the introduction, the project is hosted on SourceForge. Any bugs or feature requests should be reported there using the provided tools. Any new releases will be also posted there.

Well, I hope you guys find this project useful!

History

You must Sign In to use this message board.
 
 
Per page   
 FirstPrevNext
Generalproblem with named reference dependencies
DaveJones
1:31 9 Oct '09  
Hi Eugene,

I am using formula engine 0.9.2 in my application, and I've stumbled upon a strange case.

Suppose I have 3 named references: A, B and C, with the following formulas: A = 10, B = 1 + A and C = 2 * B.

When evaluating formula C, I get an error or not, depending on the order I am adding the formulas into the engine. For example if the order is A, B, C everything works OK, while if the order is B, A, C the evaluation of formula C will fail with a #NAME error. Please find the code to reproduce the error below:

FormulaEngine engine = new FormulaEngine();

INamedReference refB = engine.ReferenceFactory.Named("B");
Formula formulaB = engine.CreateFormula("1 + A");
engine.AddFormula(formulaB, refB);

INamedReference refA = engine.ReferenceFactory.Named("A");
Formula formulaA = engine.CreateFormula("10");
engine.AddFormula(formulaA, refA);

INamedReference refC = engine.ReferenceFactory.Named("C");
Formula formulaC = engine.CreateFormula("2 * B");
engine.AddFormula(formulaC, refC);

Console.WriteLine("Dependencies : " + Environment.NewLine + engine.Info.DependencyDump);
Console.WriteLine();

object res = formulaA.Evaluate();
Console.WriteLine("Formula A = " + res);

res = formulaB.Evaluate();
Console.WriteLine("Formula B = " + res);

res = formulaC.Evaluate();
Console.WriteLine("Formula C = " + res);

Console.ReadKey();

Please tell me how can I solve this problem since in my application, which is more complex, I cannot control the order the formula references are added to the engine.

Best Regards,
Mihai
GeneralRe: problem with named reference dependencies
DaveJones
1:44 9 Oct '09  
I just found out that using
engine.Recalculate(refA); would make the previous code work.

However, my question still stands, since in my application I cannot determine which is the root reference.
GeneralRe: problem with named reference dependencies
DaveJones
1:57 9 Oct '09  
I came up with a solution that looks like a RecalculateAll:

foreach (INamedReference namedRef in engine.GetNamedReferences()) {
engine.Recalculate(namedRef);
}

GeneralPlease help
marketgeek
10:29 26 Jun '09  
I need to be able to do the following:

input: "MAX(4,MIN(5,6)) for 4 MIN(10,12) for 2"

output: "5 5 5 5 10 10"

where "for" indicates repeat the result of the expression for number of times indicated by the value on the right.

I know I can add custom functions like MIN and MAX but "for" in here is acting more like operator. Can I add that too? "for" is just one of the many keywords I would need to add.

Thanks for your help.
GeneralGrammar
6opuc
4:06 13 Mar '09  
Great work!

Could You please share your grammar file for Grammatica?
AnswerRe: Grammar
Eugene Ciloci
9:05 14 Mar '09  
The grammar is included in the source packages (look for the .grammar files).
GeneralRe: Grammar
6opuc
19:28 15 Mar '09  
Yes, found it!

Thanks a lot!
GeneralBug in range reference handling [modified]
YvanA
7:18 5 Jan '09  
If you set:

A1 <- =C2
B2 <- =A1+SUM(A1:E1)

Then you get an exception "head already in list" (assert from DependencyManager.AddDependency).

This is due to range reference optimization: the graph is, step by step, as follows

C2 -> A1

then

C2 -> A1 -> B2 (processing the A1 term in B2)

then

C2 -> A1 -> B2 <- A1:E1 (processing the A1:E1 term in B2)

Then, the method AddRangeLinks adds an edge to B2 for each non-source
intersecting A1:E1 ie it adds the edge A1 -> B2 which is already present => Exception.

An easy solution would be to remove the A1 dependency of the formula (because we have it in A1:E1)
but what to do with intersecting ranges e.g B2 <- = SUM(A1:A2) + SUM(A1:E1) ??

Removing the assertion generates double edges:
Sheet1!A1 -> Sheet1!B2, Sheet1!B2
Sheet1!C2 -> Sheet1!A1
Sheet1!A1:E1 -> Sheet1!B2
and seems to work fine.

[EDIT Typo]

modified on Monday, March 16, 2009 6:35 AM

GeneralRecalculateAll() ?
Member 573587
7:03 30 Sep '08  
Hi Eugene,
Firtly, thank you for an excellent tool. My application will load quite a lot of data that will be only sparsely populated with formulae. I would like to load all the data and then do a single "RecalculateAll()" which would only recalc references that have dependents. Would it be possible to add such a method to the engine? My (brief) inspection of the code suggests that one would only need to recalculate references that are held as the dictionary keys in the dependency map - but I haven't found a way to get access to the map other than via the DependencyDump.
Regards
Peter
GeneralFormula Engine + Bussiness Objects for Payroll Rules [modified]
ARUVAL
11:05 4 Sep '08  
Hi Eugene,
Can you please advise on how to integrate a bussiness object list with the engine.
i whish to Implement this in order to Bussiness object like an argument and use a function like this

BOLookup(MyBO,MyBO.FieldName) kind like the LOOKUP function already implemented

Should I Implement the ISheet for a Bussiness Object List

any toughts will be appreciated

Best Regards
Arturo Ruvalcaba Gómez

modified on Thursday, September 11, 2008 12:33 AM

GeneralImplement in VBA macro
takedownca
18:25 31 Aug '08  
To build on Rashul's question from 8-18-08, how (if possible) do I implement this in VBA? I'm just an engineer that likes to write macros for SolidWorks, so my programming skills aren't phenomenal. Currently, I can't even get off the ground because the VBA editor won't let me add the dll as a reference. Does the dll not have an interface for VBA? Does a wrapper need to be written? If so, any luck of that happening <hint,>? This looks to be a very useful library if I can use it Smile
GeneralUse with Excel VBA
rashul
23:51 17 Aug '08  
I am still a little unclear on how to use the engine.Will i need to code in VB.net .
Can i use the engine with excel and VBA ? Can you tell me on how to go about it ?
Thanks !!!!
GeneralIntegrate with databound control or a .NET datatable
kadongre
7:31 15 Aug '08  
How/What is required to have the Formula engine integrated with databound grid or a datatable.
My data is stored in a datatable that is retrieved from a web service and is displayed on a 3rd party .NET grid control.
How would I accomplish supporting formula engine in this scenario. Do i need to implement the ISpreadSheet interface on the DataTable or on the Grid control. Also how do I handle the formula persistance and the calculated value display to the users in this scenario?
GeneralRe: Integrate with databound control or a .NET datatable
Eugene Ciloci
18:36 17 Aug '08  
You would implement the ISheet interface on the data grid control. The engine will query cell values through the ISheet interface and it is up to you to get the value from the underlying control.

Similarly, when a formula is calculated, the engine will use the interface to give you a value (object) and the row/cell of where to place the value in your control. You store the value and use the value's ToString() method to get the display text.

Formula persistence is handled through .NET serialization.

The best thing to do is look at the sample application which does pretty much exactly what you need to do.
GeneralCircular references in formulas
Member 4494123
0:37 11 Jul '08  
Hi Eugene,

Good project you have here. I am in the progress of porting an Excel spreadsheet calculation program to ASP.NET. There is one case in the calculation where I have two variables which depend on eachother so they have a circular reference in formulas.
Do you have a hint how I could workaround this? In Excel the results are OK but I don't know how to convert this to ASP.NET code.

Any help greatly appriciated.

Marc
info@fit2page.nl
GeneralRe: Circular references in formulas
Eugene Ciloci
16:24 11 Jul '08  
You can just add the two variables to the engine as normal (you can try this in the demo app). It will note that they have a circular reference and when you do a recalculate, the engine will raise the CircularReferenceDetected event when it hits one of the two variables. The event is for informational purposes (ie: tell the user that they have a circular reference). After the event is raised, the engine will ignore the two variables and do the rest of the recalculate.
QuestionHow To Implements "Array/Matrix formulas"
jmptrader
21:27 6 Jul '08  
Eugene,
How To can implement this functionality like Excel ?
Example:

Call:

={BondCashFlows(A1)}

Returns:

01/12/2008 1.0000 2.000 3.0000 LIVE
01/12/2009 2.0000 2.000 4.0000 LIVE

(implicit record types are: date,double,double,string)

Please helpme,
Thanks,
Jorge
AnswerRe: How To Implements "Array/Matrix formulas"
Eugene Ciloci
16:17 11 Jul '08  
Unfortunately, I decided not to support array/matrix formulas for this project.
GeneralSource does not function...
dherrmann
5:46 24 Apr '08  
Hallo,
as a beginner of .net i have some problems with your solution. but first i must say, that i am very interested on it, because i will use it in a commercial project.
i use VB2008Express and i had unzipped all, then i started the demo which functioned good. but when i try the source-projects (there are 4..) then i get 57 errors...
there are such as:
- type "Test" not defined
- name "Assert" not declared
and so on...

i tried to find somme information about which references i have to set, but found nothing.

can you please tell me, how to get it going?

greetings from austria
Dietrich
GeneralRe: Source does not function...
Eugene Ciloci
7:11 25 Apr '08  
Sounds like you don't have NUnit installed. You can either install it and then add a reference to nunit.framework.dll in the Tests project or just remove the Tests project from the solution.
GeneralRe: Source does not function...
dherrmann
14:07 25 Apr '08  
thank you for the answer. i have installed NUnit and i have removed the tests project. the reference to NUnit is set.
now i have the following message at start (i translate it from german for you):

a project with type of "class library" can not be started.

i tried to start gridsample project.

Dietrich
GeneralRe: Source does not function...
Eugene Ciloci
12:47 27 Apr '08  
Sounds like the grid sample project is not set as the start-up project. You'll have to right-click on the grid sample project and pick "Set as start-up project" and then it should start up.
GeneralRe: Source does not function...
dherrmann
0:42 2 May '08  
that was it- thank you for the tipp.
i will try the whole thing now...

greetings-
dietrich
QuestionGetting notified of dependency value changing [modified]
cobb_michael
7:56 31 Jan '08  
Hi
Congrats on a great project!
I have read the long thread with Jeremy Fuller, so I believe I understand how to implement this engine without really tying to a spreadsheet-like object (by using Variables, NamedReferences and Formulas and the General GrammarType).
My question is this: When a value is changed (on a Variable or Formula), how do you get the updated values for the Formulas/References that depend on this Variable? The only examples I've seen so far require you to call Recalculate on the Variable, and then to access the Result property on the References.
Isn't there some way to catch an event on all References that get their values changed, so you can realize their values were updated and handle it accordingly? Or is there some way when a Variable value is changed to get a list of dependent Formulas/References, so you can call the Result property on just those objects, instead of checking every single object in the engine?
Thanks

modified on Thursday, January 31, 2008 4:46:02 PM

QuestionRe: Getting notified of dependency value changing
cobb_michael
11:46 31 Jan '08  
I've noticed that each of the different types of Reference objects handles formula recalcs slightly differently (OnFormulaRecalculate).

CellReference:
Dim result As Object = target.Evaluate()
Me.Sheet.SetFormulaResult(result, MyRowIndex, MyColumnIndex)

NamedReference:
MyValueOperand = target.EvaluateToOperand()
ExternalReference
MyResult = target.Evaluate()
RaiseEvent Recalculated(Me, EventArgs.Empty)

Is the most elegant way of handling my issue (getting notified when NamedReference values get updated due to changes in any Variables/NamedReferences that it depends on) just to raise an event in the NamedReference class? Maybe each of these Reference objects should raise an event in OnFormulaRecalculate?


Last Updated 17 Mar 2007 | Advertise | Privacy | Terms of Use | Copyright © CodeProject, 1999-2010