Writing a simple equation parser

Posted 2019-01-08 12:51

What sorts of algorithms would be used to evaluate an expression like this (that is, the expression is given as a string, and I want to compute the answer):

((5 + (3 + (7 * 2))) - (8 * 9)) / 72

Say someone typed that in; how could I deal with so many nested parentheses?

10 answers
贼婆χ
Reply #2 · 2019-01-08 13:24

Or you can just do this in one line in R:

> eval(parse(text = '((5 + (3 + (7*2))) - (8 * 9))/72' ))
[1] -0.6944444

Evening l夕情丶
Reply #3 · 2019-01-08 13:34

You could use either a state machine parser (yacc LALR, etc.), or a recursive descent parser.

The parser could emit RPN tokens to evaluate or compile later. Or, in an immediate interpreter implementation, a recursive descent parser could calculate subexpressions on the fly as it returns from the leaf tokens, and end up with the result.
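
As a rough illustration of the "immediate interpreter" variant, here is a minimal recursive descent sketch in Python that computes each subexpression as it returns. The grammar (expression, term, factor) and the names `tokenize`, `Parser`, and `evaluate` are assumptions made for this example; only non-negative numbers and the four binary operators are handled.

```python
import re

# One token: a number or a single-character operator/parenthesis.
TOKEN = re.compile(r"\s*(\d+\.?\d*|[-+*/()])")
OPERATORS = {"+", "-", "*", "/"}

def tokenize(text):
    """Split the input string into number and operator tokens."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        token = self.peek()
        self.pos += 1
        return token

    def expression(self):
        # expression := term (('+' | '-') term)*
        value = self.term()
        while self.peek() in ("+", "-"):
            if self.take() == "+":
                value += self.term()
            else:
                value -= self.term()
        return value

    def term(self):
        # term := factor (('*' | '/') factor)*
        value = self.factor()
        while self.peek() in ("*", "/"):
            if self.take() == "*":
                value *= self.factor()
            else:
                value /= self.factor()
        return value

    def factor(self):
        # factor := NUMBER | '(' expression ')'
        token = self.take()
        if token == "(":
            value = self.expression()
            if self.take() != ")":
                raise ValueError("expected ')'")
            return value
        if token is None or token in OPERATORS or token == ")":
            raise ValueError(f"unexpected token {token!r}")
        return float(token)

def evaluate(text):
    """Evaluate an infix arithmetic expression given as a string."""
    parser = Parser(tokenize(text))
    value = parser.expression()
    if parser.peek() is not None:
        raise ValueError(f"unexpected token {parser.peek()!r}")
    return value
```

Each grammar rule becomes one method, so nested parentheses are handled by ordinary recursion: `factor` calls back into `expression` whenever it sees a "(".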

劳资没心,怎么记你
Reply #4 · 2019-01-08 13:35

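You can use the shunting-yard algorithm, which converts the expression to Reverse Polish Notation (RPN). Both the conversion and the evaluation of the RPN output use stacks to handle the nesting. Wikipedia explains it better than I can.

From Wikipedia: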

While there are tokens to be read:

    Read a token.
    If the token is a number, then add it to the output queue.
    If the token is a function token, then push it onto the stack.
    If the token is a function argument separator (e.g., a comma):

        Until the token at the top of the stack is a left parenthesis, pop operators off the stack onto the output queue. If no left parentheses are encountered, either the separator was misplaced or parentheses were mismatched.

    If the token is an operator, o1, then:

        while there is an operator token, o2, at the top of the stack, and

                either o1 is left-associative and its precedence is less than or equal to that of o2,
                or o1 is right-associative and its precedence is less than that of o2,

            pop o2 off the stack, onto the output queue;

        push o1 onto the stack.

    If the token is a left parenthesis, then push it onto the stack.
    If the token is a right parenthesis:

        Until the token at the top of the stack is a left parenthesis, pop operators off the stack onto the output queue.
        Pop the left parenthesis from the stack, but not onto the output queue.
        If the token at the top of the stack is a function token, pop it onto the output queue.
        If the stack runs out without finding a left parenthesis, then there are mismatched parentheses.

When there are no more tokens to read:

    While there are still operator tokens in the stack:

        If the operator token on the top of the stack is a parenthesis, then there are mismatched parentheses.
        Pop the operator onto the output queue.

Exit.
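
The steps above can be sketched in Python roughly as follows. This is a simplified illustration, not a complete implementation: function tokens and argument separators are left out, all four operators are treated as left-associative, unknown characters are silently skipped by the tokenizer, and the names `OPS`, `to_rpn`, and `eval_rpn` are made up for the example.

```python
import operator
import re

# Precedence, associativity, and implementation of each binary operator.
OPS = {
    "+": (1, "left", operator.add),
    "-": (1, "left", operator.sub),
    "*": (2, "left", operator.mul),
    "/": (2, "left", operator.truediv),
}

def to_rpn(text):
    """Convert an infix expression string into a list of RPN tokens."""
    output, stack = [], []
    # Simplification: findall ignores any character it does not recognize.
    for token in re.findall(r"\d+\.?\d*|[-+*/()]", text):
        if token in OPS:
            prec, assoc, _ = OPS[token]
            # Pop operators from the stack according to precedence/associativity.
            while stack and stack[-1] in OPS:
                top_prec = OPS[stack[-1]][0]
                pop_left = assoc == "left" and prec <= top_prec
                pop_right = assoc == "right" and prec < top_prec
                if not (pop_left or pop_right):
                    break
                output.append(stack.pop())
            stack.append(token)
        elif token == "(":
            stack.append(token)
        elif token == ")":
            while stack and stack[-1] != "(":
                output.append(stack.pop())
            if not stack:
                raise ValueError("mismatched parentheses")
            stack.pop()  # discard the "(" itself
        else:
            output.append(float(token))  # a number goes straight to the output
    while stack:
        top = stack.pop()
        if top == "(":
            raise ValueError("mismatched parentheses")
        output.append(top)
    return output

def eval_rpn(tokens):
    """Evaluate a list of RPN tokens with a single value stack."""
    stack = []
    for token in tokens:
        if isinstance(token, float):
            stack.append(token)
        else:
            if len(stack) < 2:
                raise ValueError("malformed expression")
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[token][2](left, right))
    if len(stack) != 1:
        raise ValueError("malformed expression")
    return stack[0]
```

For example, `to_rpn("3 + 4 * 2")` produces the token order 3, 4, 2, `*`, `+`, and passing that list to `eval_rpn` gives 11.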

唯我独甜
Reply #5 · 2019-01-08 13:35

James has provided a good answer. Wikipedia has a good article on this as well.

If (and I don't recommend this) you wanted to parse that expression directly, given that it seems orderly in that every set of parens has no more than one pair of operands, I think you could approach it like this:

Scan forward to the first ")", then scan back to the nearest preceding "(". Evaluate what's inside and replace the whole parenthesized group with its value. Then repeat until no parentheses are left.

So in this example, you would first parse "(7 * 2)" and replace it with 14. Then you would get (3 + 14) and replace it with 17. And so on.

You can do that with Regex or even .IndexOf and .Substring.

I'm going without benefit of checking my syntax here, but something like this:

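string expr = "((5 + (3 + (7 * 2))) - (8 * 9)) / 72";
int y = expr.IndexOf(")");                      // first closing paren
int x = expr.Substring(0, y).LastIndexOf("(");  // nearest opening paren before it
string z = expr.Substring(x + 1, y - x - 1);    // This should result in "7 * 2"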

You'll need to evaluate the resulting expression and loop this until the parens are exhausted and then evaluate that last part of the string.
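
A minimal Python sketch of that loop might look like the following. It leans on the same assumption as the answer (every parenthesized group holds at most two operands), and the names `NUM`, `BINARY`, `eval_flat`, and `evaluate` are invented for the illustration. Intermediate values are written back into the string with `repr`, so the number pattern also accepts a leading minus sign and an exponent.

```python
import operator
import re

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}

# A (possibly negative) number, optionally with an exponent as produced by repr().
NUM = r"-?\d+\.?\d*(?:e[-+]?\d+)?"
# A flat expression with exactly two operands, e.g. "7 * 2" or "-50.0 / 72".
BINARY = re.compile(rf"\s*({NUM})\s*([-+*/])\s*({NUM})\s*$")

def eval_flat(text):
    """Evaluate 'a op b', or a lone number, with no parentheses inside."""
    match = BINARY.match(text)
    if match is None:
        return float(text)  # raises ValueError if it is not a plain number
    left, op, right = match.groups()
    return OPS[op](float(left), float(right))

def evaluate(text):
    """Repeatedly replace the innermost parenthesized group with its value."""
    while ")" in text:
        close = text.index(")")            # first closing paren
        start = text.rfind("(", 0, close)  # nearest opening paren before it
        if start == -1:
            raise ValueError("mismatched parentheses")
        value = eval_flat(text[start + 1:close])
        text = text[:start] + repr(value) + text[close + 1:]
    return eval_flat(text)  # whatever is left has no parentheses
```

On the expression from the question, the string shrinks step by step: `(7 * 2)` becomes `14.0`, then `(3 + 14.0)` becomes `17.0`, and so on until only `-50.0 / 72` remains. A group such as `(1 + 2 * 3)` is rejected, because this sketch has no notion of operator precedence.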

Animai°情兽
Reply #6 · 2019-01-08 13:41

I would use the tools that are available nearly everywhere.
I like lex/yacc because I know them, but there are equivalents for most languages. So before you write complex code, see if there are tools that can help you keep it simple (problems like this have been solved before, so don't re-invent the wheel).

So, using lex(flex)/yacc(bison) I would do:

e.l

%option noyywrap

Number      [0-9]+
WhiteSpace  [ \t\v\r]+
NewLine     \n
%{
#include <stdio.h>
#include <stdlib.h>   /* for exit() */
%}

%%

\(              return '(';
\)              return ')';
\+              return '+';
\-              return '-';
\*              return '*';
\/              return '/';

{Number}        return 'N';
{NewLine}       return '\n';
{WhiteSpace}    /* Ignore */

.               fprintf(stdout,"Error\n");exit(1);


%%

e.y

%{
#include <stdio.h>
    typedef double (*Operator)(double,double);
    double mulOp(double l,double r)  {return l*r;}
    double divOp(double l,double r)  {return l/r;}
    double addOp(double l,double r)  {return l+r;}
    double subOp(double l,double r)  {return l-r;}
extern char* yytext;
extern int yylex(void);   /* generated by flex; declared here so the parser can call it */
extern void yyerror(char const * msg);
%}

%union          
{
    Operator        op;
    double          value;
}

%type   <op>        MultOp AddOp
%type   <value>     Expression MultExpr AddExpr BraceExpr

%%

Value:          Expression '\n'   { fprintf(stdout, "Result: %le\n", $1);return 0; }

Expression:     AddExpr                          { $$ = $1;}

AddExpr:        MultExpr                         { $$ = $1;}
            |   AddExpr   AddOp  MultExpr        { $$ = ($2)($1, $3);}

MultExpr:       BraceExpr                        { $$ = $1;}
            |   MultExpr  MultOp BraceExpr       { $$ = ($2)($1, $3);}

BraceExpr:      '(' Expression ')'               { $$ = $2;}
            |   'N'                              { sscanf(yytext,"%le", &$$);}

MultOp:         '*'                              { $$ = &mulOp;}
            |   '/'                              { $$ = &divOp;}
AddOp:          '+'                              { $$ = &addOp;}
            |   '-'                              { $$ = &subOp;}
%%

void yyerror(char const * msg)
{
    fprintf(stdout,"Error: %s", msg);
}

int main()
{
    yyparse();
}

Build

> flex e.l
> bison e.y
> gcc *.c
> ./a.out
((5 + (3 + (7 * 2))) - (8 * 9)) / 72
Result: -6.944444e-01
>

The above also handles the normal operator precedence rules.
That is not because of anything I did: somebody smart worked this out ages ago, and now you can get the grammar rules for expression parsing easily (just google "C grammar" and pull out the part you need).

> ./a.out
2 + 3 * 4
Result: 1.400000e+01

ゆ 、 Hurt°
Reply #7 · 2019-01-08 13:41

If the expressions are known to be fully-parenthesized (that is, all possible parentheses are there), then this can easily be done using recursive-descent parsing. Essentially, each expression is either of the form

 number

or of the form

 (expression operator expression)

These two cases can be distinguished by their first token, and so a simple recursive descent suffices. I've actually seen this exact problem given out as a way of testing recursive thinking in introductory programming classes.

If you don't necessarily have this guarantee, then some form of precedence parsing might be a good idea. Many of the other answers to this question discuss various flavors of algorithms for doing this.
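
For the fully-parenthesized case, the two-way split can be sketched in Python like this. It is only an illustration: the names `parse` and `evaluate` are made up, numbers are non-negative, and the input must wrap every operation in parentheses, including the outermost one.

```python
import operator
import re

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}

def parse(tokens, pos):
    """Parse one expression starting at tokens[pos].

    Returns (value, position of the token after the expression).
    """
    if pos >= len(tokens):
        raise ValueError("unexpected end of input")
    token = tokens[pos]
    if token == "(":
        # Case 2: (expression operator expression)
        left, pos = parse(tokens, pos + 1)
        if pos >= len(tokens) or tokens[pos] not in OPS:
            raise ValueError("expected an operator")
        op = OPS[tokens[pos]]
        right, pos = parse(tokens, pos + 1)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise ValueError("expected ')'")
        return op(left, right), pos + 1
    if token in OPS or token == ")":
        raise ValueError(f"unexpected token {token!r}")
    # Case 1: a bare number
    return float(token), pos + 1

def evaluate(text):
    """Evaluate a fully-parenthesized expression given as a string."""
    # Simplification: findall ignores any character it does not recognize.
    tokens = re.findall(r"\d+\.?\d*|[-+*/()]", text)
    value, pos = parse(tokens, 0)
    if pos != len(tokens):
        raise ValueError("input is not fully parenthesized")
    return value
```

Note that the expression in the question would need one more outer pair to fit this grammar, for example `evaluate("(((5 + (3 + (7 * 2))) - (8 * 9)) / 72)")`. Something like `2 + 3` without parentheses is rejected rather than guessed at.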
