|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Announcements
Chapters
Services
Feature Zones
|
IntroductionModern calculators solve differential equations and draw plots. They work with trigonometry, hyperbolic, logarithmic, statistic and other functions. Modern calculators contain processors that are much stronger than their predecessors in old "Apollo"-like spaceships. This article describes the easiest way to build a parser with the ability to analyze complex mathematical expressions and evaluate them. The algorithm is the one that is commonly used in the calculators. Problem DefinitionLet's decide what kind of expressions our calculator should handle.
Reverse Polish Notation and Shunting Yard AlgorithmA Prefix (Polish) notation was introduced in 1920 by the Polish mathematician Jan Lukasiewicz. It is a mathematical notation wherein every operator follows all of its operands. It is also known as Postfix notation. Reverse Polish notation (RPN) was invented by Australian philosopher and computer scientist Charles Hamblin in the mid-1950s, to enable zero-address memory stores. Hamblin presented his work at a conference in June 1957, and published it in 1957 and 1962. In Reverse Polish notation, the operators follow their operands. Infix a + b changes to ab+; (a + b)*c changes to ab+c*. One of the RPN advantages is that it obviates the need for parentheses. The main advantage is that it's easy to use RPN with stack (first-in-last-out) structures. The shunting yard algorithm is a method for parsing mathematical equations specified in infix notation. It can be used to produce output in RPN. The algorithm was invented by Edsger Dijkstra and named the "shunting yard" because its operation resembles that of a railroad shunting yard. Like the evaluation of RPN, the shunting yard algorithm is stack-based. One of the stacks contains operands (digits and variables). Another stack contains all operators (+, -, *, /, etc. and functions that are treated as operators). Inside the Shunting Yard AlgorithmThis is a brief example of how "shunting yard" works. Input expression: (2+3)*4/5; First of all, we have to push a special "sentinel" operator into the operators stack. The sentinel is used to separate operations inside the parentheses and to indicate the bottom of the stack.
The parser reads the expression from left to right. The first token is "(" - right parenthesis. The sentinel is pushed into the operators stack. When the parser finishes processing the current token (speaking about any token), it reads the next token, stores it into the "token" variable and processes it.
Calculation: Pop "+" from the stack. Compare the operator's precedence. "#" has lower precedence than "+". That's why "+" is processed. "+" is a binary operator; pop two operands ("3" and "2") from the Operands stack. Calculate 2+3 and push the result into the stack. The sentinel (#) is reached. Pop the sentinel from the stack and read the next token.
Calculation: Pop "*" from the stack. Compare the operators' precedence. "#" has lower precedence than "*". "*" is a binary operator; pop "5" and "4" from the operands stack. Calculate 5*4 and push the result back into the stack. Push "/" into the stack. Read the next token: (2+3)*4/5.
The result is 20/5 = 4 Let's take a look at any simple or complex mathematical expression. We know what kind of token should come first in infix notation. It's a unary operator (like unary minus), left parenthesis, a number or a variable. For instance: -10; a+b; (10-x); The
The beauty of the algorithm in its recursion. A listing below shows the basic algorithm's implementation in pseudo-C#: Stack Operands, Operators;
Token token; // current token
Parse()
{
// Indicates the beginning of the stack
Operators.Push(Sentinel);
Binary();
Expect(End)
return Operands.Peek(); //contains the result
}
Binary()
{
Primary();
while token is a binary operator
{
PushOperator(token);
NextToken(); // read next token
Primary(); // parse primary
}
// while there are operators to evaluate
while Operators.Peek != Sentinel
PopOperator();
}
Primary()
{
if token is operand //(digit or char)
ReadOperand();
else if token is unary
PushOperator(ToUnary(token));
NextToken();
Primary();
else if token is left parenthesis
Operators.Push(Sentinel);
NextToken();
Binary();
Expect (Right parentesis);
Operators.Pop();
else
throw exception;
}
PushOperator(op)
{
while Operators.Peek() > op // Precedence()
PopOperator()
Operators.Push(op)
}
PopOperator()
{
if Operators.Peek is binary
o2 = Operands.Pop();
o1 = Operands.Pop();
op = Operators.Pop();
Operands.Push( Calculate(op, o1, o2) );
else // unary operator
o1 = Operands.Pop();
op = Operators.Pop();
Operands.Push( Calculate(op, o1) );
}
private void Expect(expectedToken)
{
if (token == expectedToken)
{
NextToken();
return;
}
else
throw exception
}
As you can see, the basic implementation is very easy. This scheme is shown in the Basic calculator example. Variables, Functions and Other EnhancementsVariables and FunctionsLet's consider an expression: a+cos(pi) where a and pi are variables and cos is the cosine function. As defined above, the Each character sequence that is not a function (x, y1, sinX, etc.) is treated as a variable. The best way to store variables is in a In the case of a function (in our example, a+cos(pi) ), the parser reads the "c" token and does the same thing as was done with the variable: it reads "c", "o", "s", finds the escape "(" character and locates the cos function in the functions list. Then the parser calls (remember the recursion?) Variables AssignmentTherefore there are stored variables (pi, e) and there should be variables assignment: a=b=10^2. The "=" operator calls MultiplicationIn mathematical expressions, multiplication is sometimes omitted. For example: the 10*x expression could be replaced with 10x. Let's define all cases in which multiplication could be omitted: x*(y), (y)*x, (y)*(x), x*f(x). When the parser finds a number, a variable or a right parenthesis, it calls the Right-side OperatorsOne of the unary right-side operators in mathematics is the Factorial operator. Factorial is the product of all positive integers from one up to n and including a given integer. Factorial zero is assigned the value of one, factorial n (n!) is 1*2*...*(n-1)*n. To check if the next token is a right-side operator, the Power OperationThere is a basic math rule which I had forgotten. The power operator could not be evaluated the same way as multiplication or addition operators are evaluated. 3^4^5 = 3^(4^5), but 3*4*5 = (3*4)*5 or 3*(4*5). This is called associative operation. When the Using the CodeThere is one partial class
Points of InterestThe next step is to slightly modify the code and make the calculator able to solve logical expressions. History
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||