I am going to write an expression evaluator which only does addition and subtraction. I have a simple algorithm to do that; but, I have some implementation problems.
I considered an expression as (it is a String)
"(" <expression1> <operator> <expression2> ")"
Here is my algorithm
String evaluate( String expression )
if expression is digit
return expression
else if expression is "(" <expression1> <operator> <expression2> ")"
cut the brackets out of it
expression1 = evaluate( <expression1> )
operator = <operator>
expression2 = evaluate( <expression2> )
if operator is +
expression1 + expression2
else if operator is -
expression1 - expression2
My problem is parsing <expression1>
, <operator>
and <expression2>
from the expression. How can I do that?
Note: I'm not asking for a code. All I need is an idea to do that.
Thank you,
-Ali
Use a StringTokenizer to split your input string into parenthesis, operators and numbers, then iterate over your tokens, making a recursive call for every open-parens, and exiting your method for every close parenthesis.
I know you didn't ask for code, but this works for valid input:
It would need better handling of invalid input, but you get the idea...
Either you use a parser generator such as JavaCUP or ANTLR. Write up a BNF of your expression and generate a parser. Here is a sample grammar that would get you started:
A "hacky" way of doing it yourself would be to look for the first
)
backtrack to the closest(
look at the parenthesis free expression in between, simply split on the operator symbols and evaluate.Perhaps create regular expressions for expression and operator and then use matching to identify and break out your content.
Don't do that, then :) When you see an opening bracket, do your recursive call to expression. At the end of the expresssion, either you find another operator (and so you're not at the end of the expression after all), or a right-bracket, in which case you return from the evaluate.
I would recommend changing the infix input into postfix and then evaluating it, rather than reducing the expression infix-wise. There are already well defined algorithms for this and it doesn't come with the inherent multiple-nested-parentheses parsing problems.
Take a look at the Shunting Yard Algorithm to convert to postfix/RPN then evaluate it using a stack using Postfix Operations. This is fast (O(n)) and reliable.
HTH
I would suggest taking an approach that more closely resembles the one described in this old but (in my opinion) relevant series of articles on compiler design. I found that the approach of using small functions/methods that parse parts of the expression to be highly effective.
This approach allows you to decompose your parsing method into many sub-methods whose names and order of execution closely follows the EBNF you might use to describe the expressions to be parsed.