Logician: A Table-based Rules Engine Suite in C++/.NET/JavaScript using XML

Eric D Schmidt

Rate me:

4.90/5 (52 votes)

8 Mar 2017GPL314 min read

104.1K

6.8K

173

Overview of a cross-platform spreadsheet-based rules enigne with algorithms implemented in C++ and JavaScript, and wrappers to C#/.NET/WinRT

Introduction

Application logic, particularly business rules, can be messy and time consuming to maintain in code. If all your application logic is hard-coded, it can eventually lead to massive if-then-else or select-case code segments that could grow into huge nightmares. Developers have more important problems to solve and things to do than to maintain a mountain of string compares, boolean tests, or Stored Procedures. There have been numerous attempts at rules engines (a.k.a. "inference engines"), but many of them require the writing of even more cryptic looking code that is hard, if not impossible, for non-developers to maintain. De-coupling business logic from an application certainly makes for more robust and maintainable code, and provides you the ability to let non-developer subject matter experts maintain the data and rules model, provided it is logical and easy to understand. Of course, you can always link parts of your application data to a database, but that still requires a lot of developer work to define the data model and queries needed for every unique "rule-driven" event in your application, not to mention the possibility of database performance bottlenecks in a networked or limited resource environment. A table-based rules engine can be a very powerful and flexible solution for your application logic and automation needs. In a web environment, the Logician JavaScript libraries can also offload a lot of server CPU onto the user's browser and eliminate lagging server callbacks.

Decision Tables

A Decision Table, or "Truth Table" as a mathematician or electrical/computer engineer might call it, is simply a spreadsheet defining the possible solutions to a problem given a set of input conditions. For example, suppose we are mixing paint, and I show you the following spreadsheet:

PaintColor1  PaintColor2  ResultColor
Red          Blue         Purple
Red          Yellow       Orange
Blue         Yellow       Green

Without writing a single line of code, or even offering any more background information about the problem, the meaning of the data is clear. Programmatically, we can think of this as a series of if-then-else statements reading left to right, top to bottom:

C++

if (PaintColor1 == "Red" && PaintColor2 == "Blue")
{
  ResultColor == "Purple"
}
else if (PaintColor1 == "Red" && PaintColor2 == "Yellow")
{
  ResultColor = "Orange"
}
//....etc

Or in SQL:

SQL

SELECT ResultColor WHERE PaintColor1 = @PaintColor1 AND PaintColor2 = @PaintColor2

Any one of these three solutions gets the job done, but the first is certainly easier to comprehend for non-developers and is the way a lot of practical engineering and/or business data is maintained in the real world. Using a database to drive the rule certainly de-couples the logic from your source code to some extent, but in a web environment, you will have to use server callbacks or Web Services to retrieve the data, slowing the application down. What if the logic needs to change, and we start mixing three colors of paint? For the database, you might have to go back and change the database table schema and your Select statement/Stored Procedures. If you had a few hundred combinations of colors and then added a third input parameter to the hard coding method, you are in for a whole lot of monotonous error-prone coding. If you had gone with the decision table, you likely would have very little work to do, other than to copy the new rules XML file (that a non-developer/subject matter expert likely edited for you) with the added third input to your website or application. In this tutorial, I'll show you how to use the Open Source Logician Suite to accomplish this task. With the Logician package, you get three basic components, a decision table evaluator library, a decision table editor, and a dynamic data/class modeler and rules engine library.

Decision Table Engine Background

As stated before, the table rules read sequentially; for a given series of inputs, the output(s) are determined. In the example above, they work top to bottom, left to right. The decision table evaluator (EDSEngine library) creates a truth table given the information supplied to it by your code and the stored XML rule data. By itself it is stateless, but it is easy to determine and supply the necessary variables. The basic steps that occur are:

Code determines it needs to evaluate a table, asks EDSEngine what input values from the current application "state" it needs.
EDSEngine loads the table and returns the list of inputs in the table.
Code provides the corresponding list of current values for those inputs.
EDSEngine evaluates the decision table and returns results.

You should be able to automate steps 1-3 in your code, depending on how you design your data model. Something as simple as this might work:

C++

//C++
map<string, string> mAppData; //application state as attribute-value pairs
CKnowledgeBase m_TableEvaluator;

//...application stuff, you loaded the rules file, etc

string GetResultingColor()
{  
  return GetSingleSolution("ColorMixingTable", "ResultColor");
}

string GetSingleSolution(string tableToEvaluate, string nameOfOutput)
//could reuse this function for all similar aplication events
{
  vector<string> inputsNeeded = m_TableEvaluator.GetInputDependencies(tableToEvaluate);
  //from our application data, obtain the values
  for (int i = 0; i < inputsNeeded.size(); i++)
    m_TableEvaluator.SetInputValue(inputsNeeded[i], mAppData[inputsNeeded[i]]);
  
  vector<string> results = m_TableEvaluator.EvaluateTable(tableToEvaluate, nameOfOutput);
  //EDSEngine supports returning multiple true results on a sigle line,
  // but in this case we expect just a single result (the first one it finds)
  
  if (results.size() > 0)
    return results[0];
  else
    return "";
}

See the source code for this example in the ColorMixConsole application. Rule tables are stored as XML, and when "compiled" by the DecisionLogic table editor utility, linked together in a single XML file. All the values stored within the rules engine are natively strings since they get serialized to XML. In order to optimize performance, string compares are avoided when possible by numerically tokenizing all the stored values in the rules table and any input values passed in. That way, it is just comparing numbers most of the time. So in memory, the previous paint color table looks more like:

PaintColor1   PaintColor2   ResultColor
0             1             3
0             2             4
1             2             5

Suppose we pass "Blue" and "Yellow" to the previous paint table. The values for PaintColor1 and PaintColor2 that we are testing are likewise assigned 1 (Blue) and 2 (Yellow). You can also perform the following boolean operations on an input value, and de-tokenization will occur:

> : greater-than, alpha or numerical
< : less-than, alpha or numerical
!= or <> : not-equal to
[x,y] : range of values, inclusive ends
(x,y) : range of values, exclusive ends

You can mix [] and (). = is not used explicitly, this is the default behavior for a rule cell and does not require the string to be de-tokenized.

At run-time, once you pass in the input values for the table, it is broken down sequentially into a series of boolean cells, where the value of each cell is either true or false. Any input cell that you leave blank is always considered true. So if we passed the values PaintColor1 = "Blue" and PaintColor2 = "Yellow", our previous decision table looks a lot like a logical AND gate:

PaintColor1   PaintColor2   ResultColor
F             F             F
F             T             F
T             T             T <==This is our solution, corresponding
                              to the tokenized memory value of 5, 
                              whose string value is "Green"

Other EDSEngine Features of Note

You can specify more than one value in an input cell, this is called an "OR", and the test will check them against the input value just like an "or" in code: if (value1 = test || value2 = test || value3 = test ) then do something... is abbreviated as value1|value2|value3 in a cell. There is also a notion of "Global ORs" if you design the rules XML using the table designer tool, DecisionLogic. Listing many values out can be a lot of extra typing, so you can define a single list of values as a variable and reuse that variable in all your project tables. In an output cell, the "|" delimiter acts like an "and" (&&). In this way, your solution can return multiple values. The results of a table evaluation are always returned as an array (vector in C++). There is also the notion of a table being "Get One" or "Get All", which means the table designer intended for you to either return just the result of the first true row, or the combined unique results of all true rows. This is selectable in the DesicionLogic designer for every table. You of course always have the option to override it in code.

You can dynamically concatenate values into cells at run-time using the get() keyword. For instance, suppose we need an output of text for a price list display and want to drive the text by rule. We might want it to read: "You have purchased X items of price P". In the table, we could create an output: "You have purchased get(QtyOfItems) of price get(ItemPrice)" where QtyOfItems and ItemPrice are values in my application state that would have been supplied. You can also use a get() in an input to create a more dynamic test. Instead of an input cell of ">55", it could be ">get(SomeValue)".

Run-time scripting with Python (C++/C#) and JavaScript (all ports) are supported in output cells so you can perform mathematical calculations and implement more advanced rules. Your output cell will just contain the Python or JavaScript code snippet within the proper keyword, js() or py(). For a single line of code, it might look like:

JavaScript

js(return (56 * 3).toString())
//Note: you can actually omit the "return"
//and ".toString()" for a single line of code:
js(56 * 3)
//Combine eqautions and variables
js(56 * get(MultValueFromCode))

If your code has multiple lines/functions, make sure it explicitly returns a string at the end, or you will get a type-casting error. This becomes more useful when combined with the get() keyword like: js(get(value1) * get(value2)). Also note that Python is only supported in the C++/C# implementation of EDSEngine. JavaScript-based scripting is a bit more portable for the web being the native run-time scripting language of web browsers.

A rather advanced but flexible feature is callback parameters. There are special table evaluation functions with overloads to support passing additional data to EDSEngine, that is also passed to the JavaScript or Python code (EvaluateTableWithParameter). The basic idea is maybe you want to send some text or XML data from your application to a rule, modify it in the script, and pass the modifications back along with the usual result. You can find more details in the developer's documentation if the feature might be useful to you.

Relational Object Model and Implementing a Rules Engine

The use of the Relational Object Model library will demonstrate how you can extend EDSEngine with your own features for a full-blown rules engine. Instead of writing explicit classes to model physical products in an eCommerce setting, it may be useful to model the product using a tree-like object structure, similar to XML. For instance, suppose we were modeling a car. We might write C++ classes like:

C++

class CPriceableItem
{
public:
  CPriceableItem();
  string CatalogNumber;
  double Price;
  double Cost; 
};

class CEngine : public CPriceableItem
{
public:
  CEngine();
  string EngineType;
};

class CTires : public CPriceableItem
{
public:
  CTires();
  string TireType;
};
//etc, keep inheriting and adding special attributes to each class

If we model the whole Car as XML and work with it directly, the final state could instead be formed like:

XML

<Object name='Car'>
  <Object name='Engine'>
    <Attribute EngineType='V6' Price='9000' Cost='4000' CatalogNumber='V6-OCTC-GM'></Attribute> 
  </Object>
  <Object name='Tires'>
    <Attribute TiresType='17inch' Price='500' Cost='175' CatalogNumber='GY17'></Attribute> 
  </Object>
  <!--And so forth.....-->
</Object>

In the remainder of the tutorial, we will take advantage of ROM's built in automation with EDSEngine. You may also find it convenient not to have to write code for many algorithms you may need, such as to sum up the total price of the Car object. ROM supports treating the data like XML and supports XPath queries. You could just use an XPath query to get the total price of the Car object, and could even do it in an output cell of a table rule using the eval() keyword: "Total price is eval(sum(//Attribute[@Price]))", yielding the final text result of "Total price is 9500".

When using the ROM component, decision tables are evaluated against a particular "Object" node context, and can drill down into the parent nodes when an input dependency value is not found in the current context. You can also use XPath queries in your input column headers instead of dealing with input values in code, or to specify a relation between multiple "Object" nodes. See the project documentation for more information. It should be noted that the internal data storage mechanism is not XML since that would create a performance bottleneck. However, the current state can be serialized at any time, and is updated whenever a query is made.

Tutorial

In this tutorial we will demonstrate how to model a real world product configurator in both a JavaScript-enabled webpage and a C# WinForms application using the Logician tools (A C++ and Android example is also available). For this demonstration, we will attempt to model the properties and catalog number generation for a commercially available "Heavy-Duty Mill Type" Hydraulic Cylinder (technical specification PDF with source code). Given the formula for the generation of a product catalog number, a simple graphical layout, or group of selections can be defined. The names of each control will match the "attribute" names we use in the decision tables.

Before any rule evaluation can take place, you have to substantiate an instance of the Relational Object Modeler. You can pass the path to the rules XML file we will create also to load them in the second step, and finally set up the built-in rules engine implementation and apply the rules:

//C# port, Javascript is exactly the same without the namepsaces
ROMNET.ROMNode m_rootNode = new ROMNode("HydraulicCylinder");
m_rootNode.LoadRules("HydraulicCylinderRules.xml");
ROMNET.LinearEngine m_engine = 
  new ROMNET.LinearEngine(m_rootNode, "HydraulicCylinderDictionary");
m_engine.EvaluateAll();

With a minimal GUI interface of various drop-down and edit boxes built according to the properties in the product's documentation, let's start defining some simple rules based on the product literature so we can test out the application event/evaluation code we will build shortly. Open the DecisionLogic table editor and start a new project. For each control in the interface, we will define a separate decision table with the same name that specifies the available values given any necessary input conditions. Also, ROM requires that we create at least one "Dictionary" table that contains a list of all the attribute names we are using, the captions/descriptions for each, etc. Create a new table named "HydraulicCylinderDictionary", and fill it out with the attributes we will be using to model the product. They should match the control names created in the configurator GUI. The "Name" and "Description" columns are self-explanatory. The "DefaultValue" column will set the value of the attribute on application start-up if no rule table is defined for it in the "RuleTable" column. The "AttributeType" column must be one of the following types:

SINGLESELECT - for controls such as combo-boxes/drop-downs, radio buttons, etc.
MULTISELECT - multi-selection list boxes
BOOLEAN - checkboxes
EDIT - text editing fields
STATIC - read-only attributes

Values can be set by the "DefaultValue" column, or evaluated by rule.

Figure 1

The first input is "CylinderSeries". Being the first parameter of the cylinder configuration, it has no input parameters so all the "input" columns can be removed from the table. From the product documentation, we can see that there are three possibilities: 3000 psi Hydraulic, 2000 psi Hydraulic, and 250 psi Pneumatic. We can list them out in three separate output cells in a "get all" table, or put all three in a single output cell separated by a "|" as described earlier. In this case, it's just a matter of your personal style. Here we will do the latter:

Figure 2

We continue filling out a rules table for each attribute we have defined in the "dictionary". An example of an attribute that will have input conditions would be "RodDiameter". In the documentation, the available rod sizes appear to be limited by the chosen BoreDiameter. Since there are multiple possibilities for each BoreDiameter, it might save the user time if we automatically set a RodDiameter default value as well when they pick a bore. This can be done by prefixing a value with the "@" symbol. It then follows that we could create the following rules table:

Figure 3

Any decent rules engine should be able to identify invalid conditions and appropriate "triggers" to force a re-validation of the current selections. When you test the sample application, you will notice that if you change the BoreDiameter to a different value, and the currently selected RodDiameter would be out of the valid range, the value of that on-screen selection will be cleared or changed to a correct value depending on the current conditions. Any changes to the dictionary attributes are routed through event handling and the "EvaluateForAttribute" function of the rules engine which will apply any necessary changes to the current attribute values based on the rule set. The object modeler has methods available to us in order to get or set the value of an attribute, or to dump the entire application state to XML as may be needed.

In order to generate the appropriate catalog number for a given set of attribute selections, or "state", we will define another set of rules tables and call for a table evaluation directly:

//C# port
private void UpdateCatalog()
{
  //catalog number is the concat of all the chars returned
  //from the CatalogNumber table evaluation

  string[] allChars = m_rootNode.EvaluateTable("CatalogNumber", "Code", true);
  string Catnum = "";
  foreach (string subStr in allChars)
    Catnum += subStr;

  if (Catalog != null)
    Catalog.Text = Catnum;
}

Since putting all the rules for the catalog number string in one table would be a mess, we can create a "get all" style table and evaluate a separate table for each character in the catalog number using the eval(TableName, OuputColumnName) table function as shown here to branch from one table to another:

Figure 4

Figure 5

Figure 6

One of the great advantages of using these packages on the web is you can effectively eliminate server callbacks to run business logic and perform page updates. You have the ability to offload a lot of the application logic and CPU cycles onto the user's browser. With all of these components working in concert, you have Logician, a powerful, flexible, and Open Source rules engine application framework.

Download the sample application code in C#, C++, JavaScript, Flash and WinRT along with the rule table samples to get a better understanding of the application logic. Visit our project webpage at http://logician.sourceforge.net for the windows binaries and installer package. Note that the Boost C++ libraries are required to compile from source.

History

23 June, 2011: Updated the samples download.
8 April 2013: Updates for WinRT support and x64 builds
7 August 2016: Deprecated the Silverlight and Flash wrappers. New simple example product.

License

This article, along with any associated source code and files, is licensed under The GNU General Public License (GPLv3)

Written By

Eric D Schmidt

Software Developer (Senior)

United States

I have extensive experience developing software on both Linux and Windows in C++ and Python. I have also done a lot of work in the C#/.NET ecosystem. I currently work in the fields of robotics and machine learning, and also have a strong background in business automation/rules engines.

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.