I want to tokenize a given mathematical expression into a parse tree like this:
((3 + 4 - 1) * 5 + 6 * -7) / 2
'/'
/ \
+ 2
/ \
* *
/ \ / \
- 5 6 -7
/ \
+ 1
/ \
3 4
Is there any pure Python way to do this? Like passing as a string to Python and then get back as a tree like mentioned above.
Thanks.
There are many good, established algorithms for parsing mathematical expressions like this one. One particularly good one is Dijkstra's shunting-yard algorithm, which can be used to produce such a tree. I don't know of a particular implementation in Python, but the algorithm is not particularly complex and it shouldn't take too long to whip one up.
By the way, the more precise term for the tree you're constructing is a parse tree or abstract syntax tree.
I don't know of a "pure python" way to do this, that is already implemented for you. However you should check out ANTLR (http://www.antlr.org/) it's an open source parser an lexer and it has an API for a number of languages, including python. Also this website has some great video tutorials that will show you how to do exactly what you are asking. It's a very useful tool to know how to use in general.
Several parser frameworks exist for Python; some common ones are PLY and pyparsing. Ned Batchelder has a pretty complete list.
Yes, the Python
ast
module provides facilities to do this. You'll have to look up the exact interface for your version of Python, since theast
module seems to change regularly.In particular, the
ast.parse()
method will be helpful for your application:You can do this with the Python ast module.
https://docs.python.org/3.6/library/ast.html
theoperation is our mathematical operation we want to evaluate, we use the isinstance in order to know the type it is, if its a number, if its a binary operator(+,*,..). You can read at https://greentreesnakes.readthedocs.io/en/latest/tofrom.html , how the ast work
And in order to make the method work we sould use: evaluate(ast.parse(theoperation, mode='eval').body)