Introduction
The class presented in this article can be used to transform a mathematical expression from infix notation to Reverse Polish notation. By default, there is no validation of the expression, but some simple validation may be turned on through an Options parameter.
See Shunting yard algorithm for more information.
Background
I needed this for the ParseInfix method for my Rational type.
The class exposes only one public method: the InfixToRpn method.
string result = PIEBALD.Lib.InfixToRpn
(
source
,
PIEBALD.Lib.InfixToRpn.RpnOptions.None
) ;
string result = PIEBALD.Lib.InfixToRpn
(
source
,
PIEBALD.Lib.InfixToRpn.RpnOptions.FailOnMalformedExpression
) ;
The mathematical operators (+, -, *, /, %, ^ (exponentiation)) and parentheses are supported.
Verbatim text (which can be used for terms in Scientific notation) can be specified between the ' (ASCII 96) and ' delimiters.
Supporting Characters
The following items are also in the file:
RpnOptions
This enumeration is used to provide the user the ability to control some optional features of the InfixToRpn method.
RetainWhiteSpace
Retain the whitespace, control characters, and null characters from the source string.
FailOnMismatchedParenthesis
Throw an InfixToRpnFormatException if a mismatched parenthesis is detected.
FailOnMalformedExpression
Throw an InfixToRpnFormatException when various other situations occur.
All
All of the options.
None
None of the options.
InfixToRpnFormatException
If FailOnMismatchedParenthesis and/or FailOnMalformedExpression is specified and such a condition is detected, then an instance of this exception will be thrown. The class has a public readonly field named Position that will contain the (one-based) position of the character that appears to be errant.
OperatorPrecedence
This enumeration is used to associate a precedence with an operator in the following class.
Operator
An instance of the Operator class contains only the operator character and its precedence value.
OperatorStack
An OperatorStack is a Stack<Operator>. It has its own versions of Push and Clear which will append removed Operators directly to the output.
OperatorStackStack
An OperatorStackStack is a Stack<OperatorStack>. The InfixToRpn method uses one of these, which is initialized with one OperatorStack. Each OperatorStack represents one pair of parentheses. Each time a left parenthesis is encountered, a new OperatorStack is instantiated and pushed onto the OperatorStackStack. Each time a right parenthesis is encountered, the top OperatorStack is popped off and cleared (the bottom-most OperatorStack does not get popped off).
ValidatorState
This enumeration defines states for the validator.
NoMoreOpsAllowed
We had a second operator.
OnlyOneOpAllowed
We had a left-parenthesis.
UnaryOpAllowed
We had a first operator.
BinaryOpAllowed
We had a right-parenthesis.
TermBegun
We had a term character.
TermEnded
We had a term-ending whitespace operator.
VerbatimTextBegun
In verbatim text mode.
InfixToRpn
The method is rather lengthy so I won't present the whole thing. It consists primarily of a foreach loop across the characters of the Subject string and a switch. Here are the major pieces (result is a StringBuilder member field):
OperatorStackStack ops = new OperatorStackStack ( new OperatorStack ( 0 ) ) ;
int index = 0 ;
ValidatorState validatorstate = ValidatorState.OnlyOneOpAllowed ;
result.Length = 0 ;
result.EnsureCapacity ( Subject.Length * 2 ) ;
foreach ( char thischar in Subject )
{
index++ ;
switch ( thischar )
{
}
while ( ops.Count > 0 )
{
ops.Pop().Clear() ;
}
return ( result.ToString() ) ;
}
Inside the switch are (of course) case statements for the various characters that need to be interpreted. Here is the case for multiplication, division, and modulo:
case '*' :
case '/' :
case '%' :
{
if ( failonmalformedexpression )
{
if
(
( validatorstate == ValidatorState.UnaryOpAllowed )
||
( validatorstate == ValidatorState.NoMoreOpsAllowed )
)
{
throw ( new InfixToRpnFormatException
(
"Too many operators in a row"
,
index
) ) ;
}
}
ops.Peek().Push
(
new Operator ( thischar , OperatorPrecedence.Medium )
) ;
validatorstate = ValidatorState.UnaryOpAllowed ;
break ;
}
Here is the case for addition, subtraction, unary positive, and unary negative. Notice that unary operators receive Highest precedence while their binary counterparts receive Low precedence:
case '-' :
case '+' :
{
if ( failonmalformedexpression )
{
if ( validatorstate == ValidatorState.NoMoreOpsAllowed )
{
throw ( new InfixToRpnFormatException
(
"Too many operators in a row"
,
index
) ) ;
}
}
if
(
( validatorstate == ValidatorState.UnaryOpAllowed )
||
( validatorstate == ValidatorState.OnlyOneOpAllowed )
)
{
ops.Peek().Push
(
new Operator ( thischar , OperatorPrecedence.Highest )
) ;
validatorstate = ValidatorState.NoMoreOpsAllowed ;
}
else
{
ops.Peek().Push
(
new Operator ( thischar , OperatorPrecedence.Low )
) ;
validatorstate = ValidatorState.UnaryOpAllowed ;
}
break ;
}
A left-parenthesis causes a new OperatorStack to be pushed onto the OperatorStackStack. Here is the case for left-parenthesis:
case '(' :
{
if ( failonmalformedexpression )
{
if
(
( validatorstate == ValidatorState.TermBegun )
||
( validatorstate == ValidatorState.TermEnded )
||
( validatorstate == ValidatorState.BinaryOpAllowed )
)
{
throw ( new InfixToRpnFormatException
(
"No operator preceding left-parenthesis"
,
index
) ) ;
}
}
ops.Push ( new OperatorStack ( index ) ) ;
validatorstate = ValidatorState.OnlyOneOpAllowed ;
break ;
}
A right-parenthesis causes the top OperatorStack to be popped from the OperatorStackStack (unless it's the only OperatorStack) and Cleared. Here is the case for right-parenthesis:
case ')' :
{
if ( failonmalformedexpression )
{
if
(
( validatorstate == ValidatorState.UnaryOpAllowed )
||
( validatorstate == ValidatorState.NoMoreOpsAllowed )
)
{
throw ( new InfixToRpnFormatException
(
"Operator followed by right-parenthesis"
,
index
) ) ;
}
if ( validatorstate == ValidatorState.OnlyOneOpAllowed )
{
throw ( new InfixToRpnFormatException
(
"Empty parentheses"
,
index
) ) ;
}
}
if ( ops.Count == 1 )
{
if ( failonmismatchedparenthesis )
{
throw ( new InfixToRpnFormatException
(
"Mismatched right-parenthesis"
,
index
) ) ;
}
ops.Peek().Clear() ;
}
else
{
ops.Pop().Clear() ;
}
validatorstate = ValidatorState.BinaryOpAllowed ;
break ;
}
The last case I'll show is for the terms of the expression, the operands. A term is basically any consecutive characters that are neither operators nor whitespace. (Verbatim text forms a term, and may contain operator characters and whitespace; but verbatim text has its own special processing and I won't cover it here.)
default :
{
if ( failonmalformedexpression )
{
if ( validatorstate == ValidatorState.BinaryOpAllowed )
{
throw ( new InfixToRpnFormatException
(
"No operator preceding term"
,
index
) ) ;
}
if ( validatorstate == ValidatorState.TermEnded )
{
throw ( new InfixToRpnFormatException
(
"Second consecutive term detected"
,
index
) ) ;
}
}
result.Append ( thischar ) ;
validatorstate = ValidatorState.TermBegun ;
break ;
}
Using the Code
The zip also contains Rpn.cs, a very simple console program. Compile it and LibRpn.cs with your choice of C# compiler.
Sample output:
C:\>rpn 2+3*4
2 3 4 * +
C:\>rpn a/b-c
a b / c -
C:\>rpn x * -4
x
C:\>rpn "x * -4"
x 0 4 - *
C:\>rpn "x - -4"
x 0 4 - -
C:\>rpn "x - 1/4"
x 1 4 / -
C:\>rpn "x - /4"
Too many operators in a row at position 5
C:\>rpn "x ---4"
Too many operators in a row at position 5
History
- First posted - 2006-01-16
- Updated - 2006-11-27
- Updated - 2009-03-10 - Reworked the article. There are only minor changes to the code, mostly to the comments.