![]() |
Languages »
VB.NET »
Parsers and Interpreters
Intermediate
License: The GNU Lesser General Public License
Implementing an Excel-like formula engineBy Eugene CilociA library to parse and evaluate Excel-style formulas and recalculate in natural order |
VB 8.0.NET 2.0, Win2K, WinXP, WinForms, VS2005, Dev
|
|
Advanced Search Add to IE Search |
|
|
||||||||||||||||||
Formula class, which represents a compiled formula expression. You call the engine's CreateFormula method with an expression and it will return a Formula instance that you can evaluate:
' Create an instance of the engine
Dim engine As New FormulaEngine
' Create a formula
Dim f As Formula = engine.CreateFormula("=sqrt(3^2 + 4^2)")
' Evaluate the formula to get a result (5.0)
Dim result As Double = f.Evaluate()
The method will throw an InvalidFormulaException exception if it cannot create a formula from the expression. This is usually (but is not limited to) due to a syntax error in the expression. The inner exception of the thrown exception will have more details.
The engine also has the ever-popular Evaluate method for when you quickly want to evaluate an expression. Let's try to evaluate the "mega" formula found here:
' Create an instance of the engine
Dim engine As New FormulaEngine
' Assume cell A1 contains "http://j-walk.com/ss/books"
' Call Evaluate to get a result: "books"
Dim result As String = engine.Evaluate("=RIGHT(A1,LEN(A1)-FIND(CHAR(1),_
SUBSTITUTE(A1,""/"",CHAR(1)" _
& ",LEN(A1)-LEN(SUBSTITUTE(A1,""/"","""")))))")
="123" + 10 is valid since the string "123" can be converted to a number. One major difference from Excel is that DateTime values are not treated as numbers. If you want to add/subtract dates, you will have to use a function.
When an error is encountered during formula evaluation, an ErrorValueWrapper instance will be returned. This class wraps one of the seven Excel error values and allows you to get the specific error as well as format it.
ResultType that allows you to specify the desired type of the formula's result. This is useful when you have an expression like =A1 which could validly evaluate to either the contents of cell A1 or a reference to it. By setting the result type you can control which of the two results you get. The formula will attempt to convert its result to the specified type. If the conversion is not possible, then the #VALUE! error will be returned.
Dim f As Formula = engine.CreateFormula("=A1")
' Make the formula evaluate to any value except a reference
f.ResultType = OperandType.Primitive
' result will be the contents of cell A1
Dim result As Object = f.Evaluate()
' Make the formula evaluate to a sheet reference
f.ResultType = OperandType.SheetReference
' result will be a reference to cell A1
result = f.Evaluate()
(A1): 15
(B2): =A1 + 10
(C1): =A1 + B2
(D2): =C1 * 2
When the contents of cell A1 change, the three formulas need to be recalculated. The formula at B2 must be recalculated first since it only depends on A1. The formula at C1 is recalculated second since it depends on the value of B2. Finally, the formula at D2 is recalculated last since it depends on C1.
For the engine to be able to recalculate in natural order, it must keep track of the dependencies between formulas. It does this by acting as a container for formulas. As formulas are added to the engine, their dependencies are analyzed and a dependency graph is built. You then tell the engine to recalculate and it will use the graph to build up a calculation list, sort it in natural order, and recalculate each formula.
IReference interface, and the ReferenceFactory class creates them all. When you add a formula to the engine, you need to specify a reference that the formula will be bound to. The formula will then "live" at that reference. By changing the type of reference you bind the formula to, you can change how that formula is referenced from other formulas. For example: by binding a formula to a named reference, you allow other formulas to reference it by using a name.
' Associate the name Root2 with a formula
engine.AddFormula("=sqrt(2)", engine.ReferenceFactory.Named("Root2"))
' Use the name in an expression (result is 2.0)
Dim result As Double = engine.Evaluate("=root2 ^ 2")
Now that we've seen how the engine tracks dependencies, let's see how the initial example above would be set up using code:
' Assume we've already added a worksheet to the engine
' Add a formula at B2
engine.AddFormula("=A1 + 10", engine.ReferenceFactory.Cell(2, 2))
' Add a formula at C1
engine.AddFormula("=A1 + B2", engine.ReferenceFactory.Parse("C1"))
' Add a formula at D2
engine.AddFormula("=C1 * 2", engine.ReferenceFactory.Parse("D2"))
Our engine now contains 3 formulas and a graph describing their dependencies. All we need to do is tell the engine that a reference has changed and that all its dependents need to be recalculated. We do this using the Recalculate method:
' Create a reference to cell A1
Dim a1Ref As ISheetReference = engine.ReferenceFactory.Parse("A1")
' Recalculate all dependents of A1
engine.Recalculate(a1Ref)
FunctionLibrary class, which is accessible through a property on the engine. The extensibility mechanism I used is based on delegates. I felt that this makes it easier to add many functions because you don't have to define a new class for each function as with the alternative interface/subclass based mechanism. It also allows the engine to use reflection to add all the methods of a class in bulk. Defining a custom function requires three steps:
FormulaFunctionCall delegate FixedArgumentFormulaFunction or VariableArgumentFormulaFunction attribute Public Sub Hypotenuse(ByVal args() As Argument, ByVal result As FunctionResult,_
ByVal engine As FormulaEngine)
End Sub
Explanation of the three arguments:
Argument instances FunctionResult class where we will store our function's return value <FixedArgumentFormulaFunction(2, New OperandType() {OperandType.Double, _
OperandType.Double})> _
Public Sub Hypotenuse(ByVal args() As Argument, ByVal result As FunctionResult,_
ByVal engine As FormulaEngine)
End Sub
We have now declared our method as requiring 2 arguments, both of type Double. The engine will only call our method if exactly 2 arguments were specified in the formula and both arguments can be converted to Double. This frees us from having to write argument validation code for every function we want to implement.
Finally, we must write the actual implementation of our function:
<FixedArgumentFormulaFunction(2, New OperandType() {OperandType.Double, _
OperandType.Double})> _
Public Sub Hypotenuse(ByVal args() As Argument, ByVal result As FunctionResult,_
ByVal engine As FormulaEngine)
' Get the value of the first argument as a double
Dim a As Double = args(0).ValueAsDouble
' Get the value of the second argument as a double
Dim b As Double = args(1).ValueAsDouble
' Compute the hypotenuse
Dim hyp As Double = System.Math.Sqrt(a ^ 2 + b ^ 2)
' Set the function's result
result.SetValue(hyp)
End Sub
We get the value of each of our arguments as a double, compute the hypotenuse, and set the value into the FunctionResult.
Next we add our custom function to the library:
engine.FunctionLibrary.AddFunction(AddressOf Hypotenuse)
And now we can use it in a function:
dim result as Double = engine.Evaluate("=10 + Hypotenuse(3, 4)")
Please note that all functions must return a value and you cannot define/undefine functions while formulas are defined.
I chose Grammatica because it has a clean separation between grammar and parser code, has easy to use grammar syntax, and it outputs VB .NET code. The project hasn't had any activity in a while but it is not dead and even though the version I'm using is an alpha, I found it to be very stable: no crashes and no incorrect functionality. I highly recommend it if you, like me, are new to parsers and grammars.
Also, since the grammatica parser is created at runtime, it is very easy to dynamically change the decimal and argument separator patterns to use the values of the current culture. This means that instead of =1.2 + sum(1,2,3), a user in Estonia can enter =1,2 + sum(1;2;3) and have it be a valid formula.
=Offset(A1,1,1) entered into cell A1 is not treated as a circular reference by Excel even though it depends on its own cell. Another example is that the concatenate formula does not work with non-cell ranges. Whereas you can say =Sum(A1:B2), you cannot say =Concatenate(A1:B2).
This project is currently in the alpha phase because it hasn't had any real-world testing/usage. As mentioned in the introduction, the project is hosted on SourceForge. Any bugs or feature requests should be reported there using the provided tools. Any new releases will be also posted there.
Well, I hope you guys find this project useful!
General
News
Question
Answer
Joke
Rant
Admin
|
PermaLink |
Privacy |
Terms of Use
Last Updated: 17 Mar 2007 Editor: Sean Ewington |
Copyright 2007 by Eugene Ciloci Everything else Copyright © CodeProject, 1999-2009 Web17 | Advertise on the Code Project |