An extensible math expression parser with plug-ins

13 Mar 2008, CPOL, 51 min read
Design and code for an extensible, maintainable, robust, and easy to use math parser.
/**************************************************

  History
______________________________________________
  Date				Author			Description
______________________________________________

  June 1, 2004		M.J.			First version

  June 16, 2004		M.J.			- Performance optimization: removal of the
										unnecessary function parameters, in order to reduce
										the call time of frequently called functions like
										"evaluate".  Removal of the unnecessary use of the
										vector[] accessor.  Instead, the item is accessed once and put in
										a temp variable.
										NOTE on performance: I tried a non-recursive method for the evaluation, but the
											performance decreased in release mode.  In debug, performance is much better,
											but it seems that the compiler can better optimize the recursive version...
									- Bug fix: when popping the remaining operators at the end of the parsing step,
										the precedence was wrong.  This was an index problem.
									- Function and operator usage validation has been moved into the parsing step.
										getHelpString is now more helpful.
									- Portability enhancement: all CComBSTR replaced by std::wstring and const wchar_t*

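  July 11, 2004		M.J.			- Support for the unary minus operator: a negated operand is now accepted at the start of an expression and right after a binary operator (negated x alone, x minus negated y, x plus negated y).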
									The unary minus operator uses a different symbol than the minus operator.  So, no special
									preprocessing code was needed, only a new operator called UnaryMinusOp.
									- Performance optimization: constant expressions are not reevaluated at each evaluate call.
									The result is computed only once during the parsing step and cached.									
									- Performance optimization: validation code removed from the evaluate method.  All validations
									should have been done during the parsing step.
									- Parsing Error message enhancement: the position in the expression string of the bad item has been added
									to the error message.
									- Minor bug fix: the isOnlyNum method used the isalpha API function to check that there was
									no alphabetic character, so a character such as '!' was treated as numeric.  isOnlyNum now uses the
									iswdigit API.
									- Trivial bug fix: if the function argument separator character ',' was used outside a function, the parser crashed.  Proper
									validations have been added.
									- Minor bug fix in the parsing validation logic
 
  July 18, 2004		M.J.			- New feature: Configurable syntax-> decimal point character and function argument separator character.
									- New feature: Repackaged as a static library
									- Enhancement validation: the number of function arguments is now validated.
									- Enhancement validation: don't allow things like : cos(x)sin(x).  The * operator is not implicit.
									- Enhancement error report: new method "getDescription" to return a short description of functions and operators.
									- Enhancement validation: validate that an operator symbol or that a function symbol is not already used when
									defining a new operator or a new function.

  July 22, 2004		M.J.			- New feature: Operator symbols can be more than one character long.
									- Enhancement: New logical operators; >, <, >=, <=

  July 29, 2004		M.J.			- Enhancement: Operator overloading; Operators can have only one argument (no left argument).  
									This allows the unary minus operator to use the same symbol as the minus operator.  Other operators
									like the not (!x) operator could be added.

  July 31, 2004		M.J.			- New feature: function overloading.  
									- New feature: function with an undefined number of arguments.  You can now define
									functions like average(v1, v2, v3,...).
									- Change: the evaluate method now takes an argument array and doesn't need to pop 
									values from the parser anymore.  This makes the code more robust and eases the validation.
									The IMathExprEvaluator interface has been removed.  Operators and functions don't need to query the parser
									for argument values anymore, so this interface was useless.  Hopefully, this simplifies the code.
									- Enhancement: Unicode is no longer mandatory.  If UNICODE is not defined, std::string
									is used instead of std::wstring.
									- Enhancement: Test cases created!  With so many special cases to take care of for the parsing
									I want to make sure to not break anything when modifying the code. So, when an expression that breaks
									the code is found, I will add a test case for it to make sure it will never happen again.
									- Performance optimization: use of vector::iterator instead of the [] operator during the evaluation
									step.  The code is less clear but this saves some CPU.
									- Enhancement: all error strings have been moved outside of the parser.  This will ease the internationalization task.
									Now there are exception ids with parameters and you format the error messages as you want, in the language you want.
									- New feature: Constants are now supported.  So instead of using variables to represent constant values, there are specific
									methods to handle constants and to optimize the evaluation speed.
									- New feature: The variables used in the expression can be retrieved via the getExpressionVars method.  This can be useful,
									for example, when solving equations, where you must know the variables.


 August 2, 2004		M.J.			-Huge! Performance optimization: Iterative evaluation algorithm instead of the recursive one.  Not only that, but the item values
									are automatically put in the parent arguments.  So, there is no more loop needed to place all argument values in the
									item argument buffer.


 August 3, 2004		M.J.			- Bug fix parsing: Problem with supplementary brackets like: sin(((1))).  Fixed.
									- Enhancement validation: don't allow syntax like: "avg(x,,1,)".  The argument separators are useless and confusing.
									 
		
 August 6, 2004		M.J.			- Enhancement: Functions with an undefined number of arguments are handled like "default" functions when no other overloaded function
									matches the number of arguments.
									- Enhancement: Functions with an undefined number of arguments need to have more than one argument.  This avoids, in most cases, having to handle
									the special case of 0 arguments.
									- New functions: overloaded functions->min, max.  The goal is to provide fixed argument number functions for the
									most common cases, because fixed argument functions are faster than functions with an undefined number of arguments (no loop).

 August 8, 2004		M.J.			- Enhancement validation: language consistency check when adding operators, functions, variables and constants.  Validation is also done
									when setting the syntax. Creation of a new type of exception to handle definition errors. (8h)
									- Enhancement validation: catch invalid numbers like "5.2.3".  If there is more than one decimal point character, then
									this is not a number. 

 August 19, 2004	M.J				- New feature: COM PLUGINS!  A plugin can contain constants, operators and functions: all in the same COM object.  This is to minimize the
									number of plugin DLLs to distribute, because it is hard to manage.  The parser has not been modified: COM operator and function adaptors have been
									created and a plugin loader.  That's all!

 September 1, 2004	M.J.			-Bug fix: Thanks to Mikie for reporting this bug.  In the MathExpression constructor, the setSyntax method
									must be called before defining functions and operators.  Otherwise, the syntax structure is uninitialized and 
									the syntax validation can fail wrongly.

 September 10, 2004 M.J.			-New feature (requested by Mikie, thanx!): variable name begin and end delimiter characters.  For example: [varName].  This allows
									a variable name to contain reserved characters like operator names and syntax characters.  Thus, there is no more
									validation for conflict with operator names and syntax elements.  If you think there is a chance that conflicts occur, then
									always use the delimiter characters.  The rationale to add this feature is that your application doesn't know which operators and syntax
									the user will use.  So when the user loads a plugin, there is a chance for conflicts with application defined variables.   
									-Performance Enhancement: the variable list is now handled with a map.  This speeds up the parsing step when using thousands of variables.  Using maps
									for other items (operators, functions and constants) is not necessary since these other items will not come in thousands. 

 September 24, 2004	M.J.			-Enhancement validation: Validate that constant, operator and function names don't contain bracket characters.

 October 4, 2004	M.J.			-New feature (requested by Randall Parker, thanx): Undefining variables.  The new method undefineVar undefines a defined variable given its name. (1h)
									-New operators: The equal(==), not equal(!=), and (&), or (|), not (!) logical operators have been added. (1h)
									-New feature (requested by Randall Parker, thanx): variable autodefinition.  When a variable is not defined, the parser can automatically define it instead of
									throwing a parsing exception.  When having a lot of variables, this allows not defining all variables up front, but defining only the variables used in
									the expression.  This feature is turned off by default.  It can be activated by using the new setConfig method.  The rationale to include this feature is to continue
									the support of a huge amount of user defined variables.  Hence, this capability can distinguish this parser. (3h)
									-Maintenance: renaming of MATHEXPRESSIONSYNTAX to MTSYNTAX.  This is more in line with the library name.
									-Maintenance: renaming of the getExpressionVars method to getExpVars to be consistent with the new method getExpUndefinedVars
									-Maintenance: the getExpVars method returns a pointer to a vector containing the names of the variables used in the current expression instead of
									a copy of the vector.  This can save some CPU when there are many variables used.
  
 October 5, 2004	M.J.			-Maintenance: Because the parser class was becoming too large, it has been split into two classes: a compiler and a registrar.  These are two different responsibilities
									and the division will allow the substitution of different compiling and registration behaviours easily.  The MathExpression class is now a facade that delegates the work to a compiler
									and a registrar. (4h)
									-Maintenance: signed/unsigned mismatch warnings removed by using the "unsigned int" type instead of "int".  This has introduced some subtle bugs because there were inverse
									loops like "for(int t=n-1; t>=0;t--)" that use the fact that 0-1 = -1, but with unsigned int, 0-1 = 2^32-1...!  Unit tests caught it! (1h)
									-New feature (Inspired by Stephen Lundmark. Thanx!): Custom variable evaluator interface to allow variable values to be gotten from various sources like database or memory. (2h)
									-New feature: Variable factory.  Used with the variable autodefinition feature, this allows a custom variable factory to be provided to create the proper variable object type. (1h)

 October 7, 2004	M.J.			-Bug fix (reported by Randall Parker, thank you!): When declaring a variable in a for loop, with the Microsoft compiler the scope of the variable doesn't stop after the loop.  
									This is not compliant with the C++ standard.  All usages of this MS specific property have been removed.
									-Enhancement validation: variable names can contain all syntax elements but variable delimiters.  If a variable name would contain a delimiter, the parser would interpret it wrongly.
									 

 November 12, 2004	M.J.			-Usability improvement based on the usability inspection result: (5h)							
										- MTParser.h file reordered based on the task frequencies
										- Item sharing feature removed (can't share function, operator and variable objects anymore).
										This feature was not used.
										- Renaming:
											- MathExpression to MTParser
											- IMathFunction, IMathOperator and IMathVariable to MTxxxI
											- getExpVars to getUsedVars
											- getExpUndefinedVars to getUndefinedVars
											- oneArgumentOp to isUnaryOp
											- EXPR_VALTYPE to MTDOUBLE
											- const std::vector<MTSTRING>* to LPCMTSTRINGVECT
										- New method to ease the evaluation of an expression only once
										- New method to ease the use of the locale settings
										- MTCONFIG structure removed and replaced by a specific method to enable the autoVarDef feature
										- New method to configure a parser object using an existing object
										- Default error message provided with all exceptions
										- Exception hierarchy created with a simple text exception at its top to allow simple
										error management when no more details are needed

										
  
  TODO:

   - positions are wrong! spaces have been removed... 
   - Units
   - hex, oct, bin: new method bool isValue(const MTSTRING &word, EXPRVAL &val)
   
  

**************************************************/
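
The October 5, 2004 entry describes a subtle bug: reverse loops written as "for(int t=n-1; t>=0;t--)" stop working once the index type becomes unsigned, because 0-1 wraps around instead of going negative. The snippet below is a minimal sketch of one common way to write such a loop safely. The names reversedCopy and zeroMinusOne are hypothetical and are not taken from the MTParser sources, and the fix that was actually applied in the library is not shown in this header.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not from MTParser): visit the elements of `items`
// from last to first using an unsigned index and return them in visit order.
//
// The naive port of "for (int t = n - 1; t >= 0; t--)" to an unsigned type,
//     for (std::size_t t = n - 1; t >= 0; t--)
// is broken: the condition t >= 0 is always true for an unsigned type, and
// decrementing 0 wraps around to the largest representable value.
std::vector<int> reversedCopy(const std::vector<int>& items)
{
    std::vector<int> out;
    out.reserve(items.size());

    // Count down from size() to 1 and index with t - 1, so the loop
    // condition never relies on the index going below zero.
    for (std::size_t t = items.size(); t > 0; --t)
    {
        out.push_back(items[t - 1]);
    }
    return out;
}

// Shows the wrap-around itself: for unsigned int, 0 - 1 yields the maximum
// value of the type (2^32 - 1 when unsigned int is 32 bits wide).
unsigned int zeroMinusOne()
{
    unsigned int zero = 0u;
    return zero - 1u;
}
```

Calling reversedCopy on an empty vector simply skips the loop, which is the case where the naive unsigned version misbehaves first (n - 1 is already the wrapped value). Reverse iterators (rbegin/rend) are another option when no index is needed.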


License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Web Developer
Canada
Software Engineer working at a fun and smart startup company
