How to parse mathematical expressions involving pa

Posted 2020-05-26 10:42

Question:

This isn't a school assignment or anything, but I realize it's a mostly academic question. But, what I've been struggling to do is parse 'math' text and come up with an answer.

For Example - I can figure out how to parse '5 + 5' or '3 * 5' - but I fail when I try to correctly chain operations together.

(5 + 5) * 3

It's mostly just bugging me that I can't figure it out. If anyone can point me in a direction, I'd really appreciate it.

EDIT Thanks for all of the quick responses. I'm sorry I didn't do a better job of explaining.

First - I'm not using regular expressions. I also know there are already libraries available that will take, as a string, a mathematical expression and return the correct value. So, I'm mostly looking at this because, sadly, I don't "get it".

Second - What I've tried (probably misguided) was counting '(' and ')' and evaluating the deepest items first. In simple examples this worked, but my code is not pretty and more complicated input crashes it. When I 'calculated' the lowest level, I was modifying the string.

So... (5 + 5) * 3

Would turn into 10 * 3

Which would then evaluate to 30

But it just felt 'wrong'.

I hope that helps clarify things. I'll certainly check out the links provided.

Answer 1:

Ages ago when working on a simple graphing app, I used this algorithm (which is reasonably easy to understand and works great for simple math expressions like these) to first turn the expression into RPN and then calculated the result. RPN was nice and fast to execute for different variable values.
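
The link behind "this algorithm" is not preserved here. One well-known way to do the infix-to-RPN step is Dijkstra's shunting-yard algorithm, so the following is only a minimal sketch assuming that is the sort of algorithm meant. The class and method names (RpnSketch, toRpn, evalRpn) are made up for illustration, and the sketch handles only unsigned numbers, + - * / and parentheses.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch: infix -> RPN via shunting-yard, then evaluate the RPN.
// No unary minus, no variables, and only basic error checks.
final class RpnSketch {

    // Higher number binds tighter; all four operators are left-associative.
    static int precedence(char op) {
        return (op == '+' || op == '-') ? 1 : 2;
    }

    static boolean isOperator(char c) {
        return c == '+' || c == '-' || c == '*' || c == '/';
    }

    static List<String> toRpn(String input) {
        List<String> output = new ArrayList<>();
        Deque<Character> ops = new ArrayDeque<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
            } else if (Character.isDigit(c) || c == '.') {
                int start = i;
                while (i < input.length()
                        && (Character.isDigit(input.charAt(i)) || input.charAt(i) == '.')) {
                    i++;
                }
                output.add(input.substring(start, i));
            } else if (isOperator(c)) {
                // Pop operators of greater or equal precedence (left associativity).
                while (!ops.isEmpty() && ops.peek() != '('
                        && precedence(ops.peek()) >= precedence(c)) {
                    output.add(String.valueOf(ops.pop()));
                }
                ops.push(c);
                i++;
            } else if (c == '(') {
                ops.push(c);
                i++;
            } else if (c == ')') {
                while (!ops.isEmpty() && ops.peek() != '(') {
                    output.add(String.valueOf(ops.pop()));
                }
                if (ops.isEmpty()) {
                    throw new IllegalArgumentException("Unbalanced ')'");
                }
                ops.pop(); // discard the matching '('
                i++;
            } else {
                throw new IllegalArgumentException("Unexpected character: " + c);
            }
        }
        while (!ops.isEmpty()) {
            char op = ops.pop();
            if (op == '(') {
                throw new IllegalArgumentException("Unbalanced '('");
            }
            output.add(String.valueOf(op));
        }
        return output;
    }

    // Evaluates RPN with a value stack: operands are pushed, operators pop two values.
    static double evalRpn(List<String> rpn) {
        Deque<Double> stack = new ArrayDeque<>();
        for (String token : rpn) {
            if (token.length() == 1 && isOperator(token.charAt(0))) {
                double right = stack.pop();
                double left = stack.pop();
                switch (token.charAt(0)) {
                    case '+': stack.push(left + right); break;
                    case '-': stack.push(left - right); break;
                    case '*': stack.push(left * right); break;
                    default:  stack.push(left / right); break;
                }
            } else {
                stack.push(Double.parseDouble(token));
            }
        }
        return stack.pop();
    }

    public static void main(String[] args) {
        List<String> rpn = toRpn("(5 + 5) * 3");
        System.out.println(rpn + " = " + evalRpn(rpn)); // prints [5, 5, +, 3, *] = 30.0
    }
}
```

Because the RPN list is built once, it can be evaluated repeatedly without re-parsing, which is the speed benefit mentioned above for graphing.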

Of course, language parsing is a very wide topic and there are many other ways of going about it (and pre-made tools for it too).



Answer 2:

@Rising Star [I hoped to add this as a comment, but the formatting failed]

It may seem counterintuitive, but a binary tree is both simpler and more flexible. A node, in this case, would be either a constant (number) or an operator. A binary tree makes life somewhat easier when you decide to extend the language with elements like control flow, and functions.

Example:

((3 + 4 - 1) * 5 + 6 * -7) / 2

                  '/'
                /     \
              +        2
           /     \
         *         *
       /   \     /   \
      -     5   6     -7
    /   \
   +     1
 /   \
3     4

In the case above, the scanner has been programmed to read '-' followed by a series of digits as a single number, so "-7" is returned as the value component of the "number" token. A '-' followed by whitespace is returned as a "minus" token. This makes the parser somewhat easier to write. It fails where you want "-(x * y)", but you can easily rewrite that expression as "0 - exp".
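
As a rough illustration of the tree idea, here is a minimal sketch in Java. The names (TreeDemo, Node, Num, BinOp, negate, sample) are hypothetical, and the tree is built by hand rather than by a parser, so this only shows the data structure and how evaluating it recurses.

```java
// Hypothetical sketch: a binary expression tree whose nodes are numbers or operators.
final class TreeDemo {

    interface Node {
        double eval();
    }

    // Leaf node: a constant such as 3 or -7.
    static final class Num implements Node {
        final double value;
        Num(double value) { this.value = value; }
        public double eval() { return value; }
    }

    // Inner node: an operator with a left and a right subtree.
    static final class BinOp implements Node {
        final char op;
        final Node left;
        final Node right;

        BinOp(char op, Node left, Node right) {
            this.op = op;
            this.left = left;
            this.right = right;
        }

        public double eval() {
            double l = left.eval();
            double r = right.eval();
            switch (op) {
                case '+': return l + r;
                case '-': return l - r;
                case '*': return l * r;
                case '/': return l / r;
                default: throw new IllegalStateException("Unknown operator: " + op);
            }
        }
    }

    // The "0 - exp" rewrite for a leading minus such as -(x * y).
    static Node negate(Node exp) {
        return new BinOp('-', new Num(0), exp);
    }

    // Builds, by hand, the tree drawn above for ((3 + 4 - 1) * 5 + 6 * -7) / 2.
    static Node sample() {
        Node sum = new BinOp('+', new Num(3), new Num(4));
        Node diff = new BinOp('-', sum, new Num(1));
        Node leftProduct = new BinOp('*', diff, new Num(5));
        Node rightProduct = new BinOp('*', new Num(6), new Num(-7));
        return new BinOp('/', new BinOp('+', leftProduct, rightProduct), new Num(2));
    }

    public static void main(String[] args) {
        System.out.println(sample().eval());         // prints -6.0
        System.out.println(negate(sample()).eval()); // prints 6.0
    }
}
```

Adding a new kind of node (a function call, say) would mean adding another Node implementation, which is the flexibility the tree gives you.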



Answer 3:

Here is a simple (naive operator precedence) grammar for what you want.

expression = 
    term
    | expression "+" term
    | expression "-" term .
term = 
    factor
    | term "*" factor
    | term "/" factor .
factor = 
    number
    | "(" expression ")" .

When you process "factor", you check whether the next token is a number or "(". If it is "(", you parse "expression" again; when that call returns, you check that the next token is ")". The calculated (or read) values can bubble up to the caller through out or ref parameters, or you can build an expression tree instead.

Here is the same thing in EBNF:

expression = 
    term
    { "+" term | "-" term  } .

term = 
    factor
    { "*" factor | "/" factor }.

factor = 
    number
    | "(" expression ")" .
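
The EBNF form maps fairly directly onto a recursive-descent parser: one method per rule, with each { ... } repetition becoming a loop. The following is a minimal sketch of that idea in Java, returning the value directly instead of using out or ref parameters. The class name DescentSketch and its methods are hypothetical, and the sketch reads characters directly rather than using a separate tokenizer.

```java
// Hypothetical recursive-descent evaluator mirroring the EBNF grammar:
// one method per rule (expression, term, factor). Numbers are unsigned decimals.
final class DescentSketch {
    private final String src;
    private int pos = 0;

    private DescentSketch(String src) {
        this.src = src;
    }

    static double evaluate(String text) {
        DescentSketch p = new DescentSketch(text);
        double value = p.expression();
        if (p.peek() != '\0') {
            throw new IllegalArgumentException("Unexpected input at position " + p.pos);
        }
        return value;
    }

    // Returns the next non-space character without consuming it, or '\0' at end of input.
    private char peek() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) {
            pos++;
        }
        return pos < src.length() ? src.charAt(pos) : '\0';
    }

    // expression = term { "+" term | "-" term } .
    private double expression() {
        double value = term();
        while (peek() == '+' || peek() == '-') {
            char op = src.charAt(pos++);
            double rhs = term();
            value = (op == '+') ? value + rhs : value - rhs;
        }
        return value;
    }

    // term = factor { "*" factor | "/" factor } .
    private double term() {
        double value = factor();
        while (peek() == '*' || peek() == '/') {
            char op = src.charAt(pos++);
            double rhs = factor();
            value = (op == '*') ? value * rhs : value / rhs;
        }
        return value;
    }

    // factor = number | "(" expression ")" .
    private double factor() {
        if (peek() == '(') {
            pos++; // consume "("
            double value = expression();
            if (peek() != ')') {
                throw new IllegalArgumentException("Expected ')' at position " + pos);
            }
            pos++; // consume ")"
            return value;
        }
        int start = pos;
        while (pos < src.length()
                && (Character.isDigit(src.charAt(pos)) || src.charAt(pos) == '.')) {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("Expected number at position " + pos);
        }
        return Double.parseDouble(src.substring(start, pos));
    }

    public static void main(String[] args) {
        System.out.println(evaluate("(5 + 5) * 3")); // prints 30.0
    }
}
```

The loops in expression() and term() accumulate from the left, so operators of equal precedence come out left-associative, and the call depth (expression calls term, term calls factor) is what encodes precedence.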


Answer 4:

Did you ever take a class on formal languages in school? Effectively you need a grammar to parse by.

EDIT: Oh crap, Wikipedia says I'm wrong, but now I forget the correct name :( http://en.wikipedia.org/wiki/Formal_grammar



Answer 5:

Last year-ish I wrote a basic math evaluator for reasons I can't remember. It is not in any way a "proper" parser by any stretch of the term, and .. like all old code, I'm not that proud of it now.

But you can take a look and see if it helps you.

You run some input tests by launching this standalone Java app



Answer 6:

When I wanted to parse something I decided to use the GOLD Parser:

  • Self-contained documentation (don't need a book to understand it)
  • Various run-time engines, in various programming languages including the one I wanted.

The parser includes sample grammars, including e.g. one for operator precedence.


Apart from GOLD there are also other, more famous parsers, e.g. ANTLR, which I haven't used.



Answer 7:

Take your pick, Code Golf: Mathematical expression evaluator (that respects PEMDAS)



Answer 8:

As many answers have already stated, the issue is that you need a recursive parser with associativity rules because you can end up with expressions like:

val = (2-(2+4+(3-2)))/(2+1)*(2-1)

and your parser needs to know that:

  1. The parenthetic expressions are evaluated from the inside out
  2. Division and multiplication have equal precedence and are evaluated left to right (here you first divide, then multiply the result)
  3. Multiplication and division take precedence over addition/subtraction
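
A small check of how "/" and "*" interact may help here. In Java, as in conventional arithmetic, the two share one precedence level and group left to right, so the order in which they appear is what decides the result. The class and method names below are hypothetical.

```java
// Hypothetical check: '/' and '*' have equal precedence and group left to right.
final class AssociativityCheck {

    // Java parses this as (8 / 4) * 2.
    static double leftToRight() {
        return 8.0 / 4.0 * 2.0;
    }

    // What you would get if multiplication wrongly bound tighter than division.
    static double multiplyFirst() {
        return 8.0 / (4.0 * 2.0);
    }

    public static void main(String[] args) {
        System.out.println(leftToRight());   // prints 4.0
        System.out.println(multiplyFirst()); // prints 1.0
    }
}
```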

As you can imagine, writing a (good) parser is an art. The good thing is that there are several tools, called parser generators, which let you easily define the grammar of your language and the parsing rules. You may want to check the Wikipedia entry for BNF to see how a grammar is defined.

Finally, if you are doing this for learning experience, go ahead. If this is for production code, do not reinvent the wheel, and find an existing library, otherwise you risk spending 1000 lines of code to add 2+2.



Answer 9:

For anyone seeing this question nine years into the future from when this post was made: If you don't want to re-invent the wheel, there are many exotic math parsers out there.

There is one that I wrote years ago in Java, which supports arithmetic operations, equation solving, differential calculus, integral calculus, basic statistics, function/formula definition, graphing, etc.

It's called ParserNG and it's free.

Evaluating an expression is as simple as:

    MathExpression expr = new MathExpression("(34+32)-44/(8+9(3+2))-22"); 
    System.out.println("result: " + expr.solve());

    result: 43.16981132075472

Or using variables and calculating simple expressions:

MathExpression expr = new MathExpression("r=3;P=2*pi*r;"); 
System.out.println("result: " + expr.getValue("P"));

Or using functions:

MathExpression expr = new MathExpression("f(x)=39*sin(x^2)+x^3*cos(x);f(3)"); 
System.out.println("result: " + expr.solve());

result: -10.65717648378352

Or to evaluate the derivative at a given point (note that it does symbolic, not numerical, differentiation behind the scenes, so the accuracy is not limited by the errors of numerical approximations):

MathExpression expr = new MathExpression("f(x)=x^3*ln(x); diff(f,3,1)"); 
System.out.println("result: " + expr.solve());

 result: 38.66253179403897

This differentiates x^3 * ln(x) once at x=3. For now, you can differentiate only once.

Or for numerical integration:

MathExpression expr = new MathExpression("f(x)=2*x; intg(f,1,3)"); 
System.out.println("result: " + expr.solve());

result: 7.999999999998261... approx: 8

This parser is decently fast and has lots of other functionality.

DISCLAIMER: ParserNG is authored by me.



Answer 10:

Essentially, you are asking us how to write a "parser." Here is another Stack Overflow question about parsers: hand coding a parser



Answer 11:

I did something similar to what you describe. I use recursion to parse all the parentheses. I then use a ternary tree to represent the different segments. The left branch is the left-hand side of the operator. The center branch is the operator. The right branch is the right-hand side of the operator.

Short answer: recursion and ternary trees.



Answer 12:

There is always the option of using a math parser library, such as mXparser. You can:

1 - Check expression syntax

import org.mariuszgromada.math.mxparser.*;
...
...
Expression e = new Expression("2+3-");
e.checkSyntax();
mXparser.consolePrintln(e.getErrorMessage());

Result:

[mXparser-v.4.0.0] [2+3-] checking ...
[2+3-] lexical error 

Encountered "<EOF>" at line 1, column 4.
Was expecting one of:
    "(" ...
    "+" ...
    "-" ...
    <UNIT> ...
    "~" ...
    "@~" ...
    <NUMBER_CONSTANT> ...
    <IDENTIFIER> ...
    <FUNCTION> ...
    "[" ...

[2+3-] errors were found.

[mXparser-v.4.0.0]

2 - Evaluate expression

import org.mariuszgromada.math.mxparser.*;
...
...
Expression e = new Expression("2+3-(10+2)");
mXparser.consolePrintln(e.getExpressionString() + " = " + e.calculate());

Result:

[mXparser-v.4.0.0] 2+3-(10+2) = -7.0

3 - Use built-in functions, constants, operators, etc.

import org.mariuszgromada.math.mxparser.*;
...
...
Expression e = new Expression("sin(pi)+e");
mXparser.consolePrintln(e.getExpressionString() + " = " + e.calculate());

Result:

[mXparser-v.4.0.0] sin(pi)+e = 2.718281828459045

4 - Define your own functions, arguments and constants

import org.mariuszgromada.math.mxparser.*;
...
...
Argument z = new Argument("z = 10");
Constant a = new Constant("b = 2");
Function p = new Function("p(a,h) = a*h/2");
Expression e = new Expression("p(10, 2)-z*b/2", p, z, a);
mXparser.consolePrintln(e.getExpressionString() + " = " + e.calculate());

Result:

[mXparser-v.4.0.0] p(10, 2)-z*b/2 = 0.0

5 - Tokenize expression string and play with expression tokens

import org.mariuszgromada.math.mxparser.*;
...
...
Argument x = new Argument("x");
Argument y = new Argument("y");
Expression e = new Expression("2*sin(x)+(3/cos(y)-e^(sin(x)+y))+10", x, y);
mXparser.consolePrintTokens( e.getCopyOfInitialTokens() );

Result:

[mXparser-v.4.0.0]  --------------------
[mXparser-v.4.0.0] | Expression tokens: |
[mXparser-v.4.0.0]  ---------------------------------------------------------------------------------------------------------------
[mXparser-v.4.0.0] |    TokenIdx |       Token |        KeyW |     TokenId | TokenTypeId |  TokenLevel |  TokenValue |   LooksLike |
[mXparser-v.4.0.0]  ---------------------------------------------------------------------------------------------------------------
[mXparser-v.4.0.0] |           0 |           2 |       _num_ |           1 |           0 |           0 |         2.0 |             |
[mXparser-v.4.0.0] |           1 |           * |           * |           3 |           1 |           0 |         NaN |             |
[mXparser-v.4.0.0] |           2 |         sin |         sin |           1 |           4 |           1 |         NaN |             |
[mXparser-v.4.0.0] |           3 |           ( |           ( |           1 |          20 |           2 |         NaN |             |
[mXparser-v.4.0.0] |           4 |           x |           x |           0 |         101 |           2 |         NaN |             |
[mXparser-v.4.0.0] |           5 |           ) |           ) |           2 |          20 |           2 |         NaN |             |
[mXparser-v.4.0.0] |           6 |           + |           + |           1 |           1 |           0 |         NaN |             |
[mXparser-v.4.0.0] |           7 |           ( |           ( |           1 |          20 |           1 |         NaN |             |
[mXparser-v.4.0.0] |           8 |           3 |       _num_ |           1 |           0 |           1 |         3.0 |             |
[mXparser-v.4.0.0] |           9 |           / |           / |           4 |           1 |           1 |         NaN |             |
[mXparser-v.4.0.0] |          10 |         cos |         cos |           2 |           4 |           2 |         NaN |             |
[mXparser-v.4.0.0] |          11 |           ( |           ( |           1 |          20 |           3 |         NaN |             |
[mXparser-v.4.0.0] |          12 |           y |           y |           1 |         101 |           3 |         NaN |             |
[mXparser-v.4.0.0] |          13 |           ) |           ) |           2 |          20 |           3 |         NaN |             |
[mXparser-v.4.0.0] |          14 |           - |           - |           2 |           1 |           1 |         NaN |             |
[mXparser-v.4.0.0] |          15 |           e |           e |           2 |           9 |           1 |         NaN |             |
[mXparser-v.4.0.0] |          16 |           ^ |           ^ |           5 |           1 |           1 |         NaN |             |
[mXparser-v.4.0.0] |          17 |           ( |           ( |           1 |          20 |           2 |         NaN |             |
[mXparser-v.4.0.0] |          18 |         sin |         sin |           1 |           4 |           3 |         NaN |             |
[mXparser-v.4.0.0] |          19 |           ( |           ( |           1 |          20 |           4 |         NaN |             |
[mXparser-v.4.0.0] |          20 |           x |           x |           0 |         101 |           4 |         NaN |             |
[mXparser-v.4.0.0] |          21 |           ) |           ) |           2 |          20 |           4 |         NaN |             |
[mXparser-v.4.0.0] |          22 |           + |           + |           1 |           1 |           2 |         NaN |             |
[mXparser-v.4.0.0] |          23 |           y |           y |           1 |         101 |           2 |         NaN |             |
[mXparser-v.4.0.0] |          24 |           ) |           ) |           2 |          20 |           2 |         NaN |             |
[mXparser-v.4.0.0] |          25 |           ) |           ) |           2 |          20 |           1 |         NaN |             |
[mXparser-v.4.0.0] |          26 |           + |           + |           1 |           1 |           0 |         NaN |             |
[mXparser-v.4.0.0] |          27 |          10 |       _num_ |           1 |           0 |           0 |        10.0 |             |
[mXparser-v.4.0.0]  ---------------------------------------------------------------------------------------------------------------

6 - You can find much more in the mXparser tutorial, the mXparser math collection and the mXparser API definition.

7 - mXparser supports:

  • JAVA
  • .NET/MONO
  • .NET Core
  • .NET Standard
  • .NET PCL
  • Xamarin.Android
  • Xamarin.iOS

Additionally, the Scalar Calculator app also uses mXparser, so you can learn the syntax there.

Best regards