I'm trying to develop an equation parser using a compiler approach in Python. The main issue that I encounter is that it is more likely that I don't have all the variables and need therefore to look for sub-formulas. Let's show an example that is worth a thousand words ;)
I have four variables whom I know the values: vx, vy, vz and c:
list_know_var = ['vx', 'vy', 'vz', 'c']
and I want to compute the Mach number (M) defined as
equation = 'M = V / c'
I already know the c variable but I don't know V. However, I know that the velocity V that can be computed using the vx, vy and vz and this is stored in a dictionary with other formulas (here only one sub formula is shown)
know_equations = {'V': '(vx ** 2 + vy ** 2 + vz ** 2) ** 0.5'}
Therefore, what I need is to parse the equation and check if I have all the variables. If not, I shall look into the know_equations dictionary to see if an equation is defined for it and this recursively until my equation is well defined.
For now on, I have been able using the answer given here to parse my equation and know if I know all the variables. The issue is that I did not find a way to replace the unknown variable (here V) by its expression in know_equation:
parsed_equation = compiler.parse(equation)
list_var = re.findall("Name\(\'(\w*)\'\)", str(parsed_equation.getChildNodes()[0]))
list_unknow_var = list(set(list_var) - set(list_know_var))
for var in list_unknow_var:
if var in know_equations:
# replace var in equation by its expression given in know_equations
# and repeate until all the variables are known or raise Error
pass
Thank you in advance for your help, much appreciate!
Adrien
So i'm spitballing a bit, but here goes.
The compiler.parse
function returns an instance of compiler.ast.Module
which contains an abstract syntax tree. You can traverse this instance using the getChildNodes
method. By recursively examining the left
and right
attributes of the nodes as you traverse the tree you can isolate compiler.ast.Name
instances and swap them out for your substitution expressions.
So a worked example might be:
import compiler
def recursive_parse(node,substitutions):
# look for left hand side of equation and test
# if it is a variable name
if hasattr(node.left,"name"):
if node.left.name in substitutions.keys():
node.left = substitutions[node.left.name]
else:
# if not, go deeper
recursive_parse(node.left,substitutions)
# look for right hand side of equation and test
# if it is a variable name
if hasattr(node.right,"name"):
if node.right.name in substitutions.keys():
node.right = substitutions[node.right.name]
else:
# if not, go deeper
recursive_parse(node.right,substitutions)
def main(input):
substitutions = {
"r":"sqrt(x**2+y**2)"
}
# each of the substitutions needs to be compiled/parsed
for key,value in substitutions.items():
# this is a quick ugly way of getting the data of interest
# really this should be done in a programatically cleaner manner
substitutions[key] = compiler.parse(substitutions[key]).getChildNodes()[0].getChildNodes()[0].getChildNodes()[0]
# compile the input expression.
expression = compiler.parse(input)
print "Input: ",expression
# traverse the selected input, here we only pass the branch of interest.
# again, as with above, this done quick and dirty.
recursive_parse(expression.getChildNodes()[0].getChildNodes()[0].getChildNodes()[1],substitutions)
print "Substituted: ",expression
if __name__ == "__main__":
input = "t = r*p"
main(input)
I have admittedly only tested this on a handful of use cases, but I think the basis is there for a generic implementation that can handle a wide variety of inputs.
Running this, I get the output:
Input: Module(None, Stmt([Assign([AssName('t', 'OP_ASSIGN')], Mul((Name('r'), Name('p'))))]))
Substituted: Module(None, Stmt([Assign([AssName('t', 'OP_ASSIGN')], Mul((CallFunc(Name('sqrt'), [Add((Power((Name('x'), Const(2))), Power((Name('y'), Const(2)))))], None, None), Name('p'))))]))
EDIT:
So the compiler module is depreciated in Python 3.0, so a better (and cleaner) solution would be to use the ast
module:
import ast
from math import sqrt
# same a previous recursion function but with looking for 'id' not 'name' attribute
def recursive_parse(node,substitutions):
if hasattr(node.left,"id"):
if node.left.id in substitutions.keys():
node.left = substitutions[node.left.id]
else:
recursive_parse(node.left,substitutions)
if hasattr(node.right,"id"):
if node.right.id in substitutions.keys():
node.right = substitutions[node.right.id]
else:
recursive_parse(node.right,substitutions)
def main(input):
substitutions = {
"r":"sqrt(x**2+y**2)"
}
for key,value in substitutions.items():
substitutions[key] = ast.parse(substitutions[key], mode='eval').body
# As this is an assignment operation, mode must be set to exec
module = ast.parse(input, mode='exec')
print "Input: ",ast.dump(module)
recursive_parse(module.body[0].value,substitutions)
print "Substituted: ",ast.dump(module)
# give some values for the equation
x = 3
y = 2
p = 1
code = compile(module,filename='<string>',mode='exec')
exec(code)
print input
print "t =",t
if __name__ == "__main__":
input = "t = r*p"
main(input)
This will compile the expression and execute it in the local space. The output should be:
Input: Module(body=[Assign(targets=[Name(id='t', ctx=Store())], value=BinOp(left=Name(id='r', ctx=Load()), op=Mult(), right=Name(id='p', ctx=Load())))])
Substituted: Module(body=[Assign(targets=[Name(id='t', ctx=Store())], value=BinOp(left=Call(func=Name(id='sqrt', ctx=Load()), args=[BinOp(left=BinOp(left=Name(id='x', ctx=Load()), op=Pow(), right=Num(n=2)), op=Add(), right=BinOp(left=Name(id='y', ctx=Load()), op=Pow(), right=Num(n=2)))], keywords=[], starargs=None, kwargs=None), op=Mult(), right=Name(id='p', ctx=Load())))])
t = r*p
t = 3.60555127546