可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效,请关闭广告屏蔽插件后再试):
问题:
I'm currently trying to create a Python script that will autogenerate space-delimited arithmetic expressions which are valid. However, I get sample output that looks like this: ( 32 - 42 / 95 + 24 ( ) ( 53 ) + ) 21
While the empty parentheses are perfectly OK by me, I can't use this autogenerated expression in calculations since there's no operator between the 24 and the 53, and the + before the 21 at the end has no second argument.
What I want to know is, is there a way to account for/fix these errors using a Pythonic solution? (And before anyone points it out, I'll be the first to acknowledge that the code I posted below is probably the worst code I've pushed and conforms to...well, very few of Python's core tenets.)
import random
parentheses = ['(',')']
ops = ['+','-','*','/'] + parentheses
lines = 0
while lines < 1000:
fname = open('test.txt','a')
expr = []
numExpr = lines
if (numExpr % 2 == 0):
numExpr += 1
isDiv = False # Boolean var, makes sure there's no Div by 0
# isNumber, isParentheses, isOp determine whether next element is a number, parentheses, or operator, respectively
isNumber = random.randint(0,1) == 0 # determines whether to start sequence with number or parentheses
isParentheses = not isNumber
isOp = False
# Counts parentheses to ensure parentheses are matching
numParentheses = 0
while (numExpr > 0 or numParentheses > 0):
if (numExpr < 0 and numParentheses > 0):
isDiv = False
expr.append(')')
numParentheses -= 1
elif (isOp and numParentheses > 0):
rand = random.randint(0,5)
expr.append(ops[rand])
isDiv = (rand == 3) # True if div op was just appended
# Checks to see if ')' was appended
if (rand == 5):
isNumber = False
isOp = True
numParentheses -= 1
# Checks to see if '(' was appended
elif (rand == 4):
isNumber = True
isOp = False
numParentheses += 1
# All other operations go here
else:
isNumber = True
isOp = False
# Didn't add parentheses possibility here in case expression in parentheses somehow reaches 0
elif (isNumber and isDiv):
expr.append(str(random.randint(1,100)))
isDiv = False
isNumber = False
isOp = True
# If a number's up, decides whether to append parentheses or a number
elif (isNumber):
rand = random.randint(0,1)
if (rand == 0):
expr.append(str(random.randint(0,100)))
isNumber = False
isOp = True
elif (rand == 1):
if (numParentheses == 0):
expr.append('(')
numParentheses += 1
else:
rand = random.randint(0,1)
expr.append(parentheses[rand])
if rand == 0:
numParentheses += 1
else:
numParentheses -= 1
isDiv = False
numExpr -= 1
fname.write(' '.join(expr) + '\n')
fname.close()
lines += 1
回答1:
Yes, you can generate random arithmetic expressions in a Pythonic way. You need to change your approach, though. Don't try to generate a string and count parens. Instead generate a random expression tree, then output that.
By an expression tree, I mean an instance of a class called, say, Expression
with subclasses Number
, PlusExpression,
MinusExpression, 'TimesExpression
, DivideExpression
, and ParenthesizedExpression
. Each of these, except Number
will have fields of type Expression
. Give each a suitable __str__
method. Generate some random expression objects and just print the "root."
Can you take it from here or would you like me to code it up?
ADDENDUM: Some sample starter code. Doesn't generate random expressions (yet?) but this can be added....
# This is just the very beginning of a script that can be used to process
# arithmetic expressions. At the moment it just defines a few classes
# and prints a couple example expressions.
# Possible additions include methods to evaluate expressions and generate
# some random expressions.
class Expression:
pass
class Number(Expression):
def __init__(self, num):
self.num = num
def __str__(self):
return str(self.num)
class BinaryExpression(Expression):
def __init__(self, left, op, right):
self.left = left
self.op = op
self.right = right
def __str__(self):
return str(self.left) + " " + self.op + " " + str(self.right)
class ParenthesizedExpression(Expression):
def __init__(self, exp):
self.exp = exp
def __str__(self):
return "(" + str(self.exp) + ")"
e1 = Number(5)
print e1
e2 = BinaryExpression(Number(8), "+", ParenthesizedExpression(BinaryExpression(Number(7), "*", e1)))
print e2
** ADDENDUM 2 **
Getting back into Python is really fun. I couldn't resist implementing the random expression generator. It is built on the code above. SORRY ABOUT THE HARDCODING!!
from random import random, randint, choice
def randomExpression(prob):
p = random()
if p > prob:
return Number(randint(1, 100))
elif randint(0, 1) == 0:
return ParenthesizedExpression(randomExpression(prob / 1.2))
else:
left = randomExpression(prob / 1.2)
op = choice(["+", "-", "*", "/"])
right = randomExpression(prob / 1.2)
return BinaryExpression(left, op, right)
for i in range(10):
print(randomExpression(1))
Here is the output I got:
(23)
86 + 84 + 87 / (96 - 46) / 59
((((49)))) + ((46))
76 + 18 + 4 - (98) - 7 / 15
(((73)))
(55) - (54) * 55 + 92 - 13 - ((36))
(78) - (7 / 56 * 33)
(81) - 18 * (((8)) * 59 - 14)
(((89)))
(59)
Ain't tooooo pretty. I think it puts out too many parents. Maybe change the probability of choosing between parenthesized expressions and binary expressions might work well....
回答2:
Actually, as long as Ray Toal's response is formally correct, for such a simple problem you don't have to subclass each operator*. I came up with the following code which works pretty well:
import random
import math
class Expression(object):
OPS = ['+', '-', '*', '/']
GROUP_PROB = 0.3
MIN_NUM, MAX_NUM = 0, 20
def __init__(self, maxNumbers, _maxdepth=None, _depth=0):
"""
maxNumbers has to be a power of 2
"""
if _maxdepth is None:
_maxdepth = math.log(maxNumbers, 2) - 1
if _depth < _maxdepth and random.randint(0, _maxdepth) > _depth:
self.left = Expression(maxNumbers, _maxdepth, _depth + 1)
else:
self.left = random.randint(Expression.MIN_NUM, Expression.MAX_NUM)
if _depth < _maxdepth and random.randint(0, _maxdepth) > _depth:
self.right = Expression(maxNumbers, _maxdepth, _depth + 1)
else:
self.right = random.randint(Expression.MIN_NUM, Expression.MAX_NUM)
self.grouped = random.random() < Expression.GROUP_PROB
self.operator = random.choice(Expression.OPS)
def __str__(self):
s = '{0!s} {1} {2!s}'.format(self.left, self.operator, self.right)
if self.grouped:
return '({0})'.format(s)
else:
return s
for i in range(10):
print Expression(4)
It can although be improved to take into considerations things like divisions by zero (not handled currently), customization of all parameters through attributes, allowing any value for the maxNumbers
argument and so on.
* By "simple problem" I mean "generating valid expressions"; if you are adding any other functionality (for example, expression evaluation), then Ray's approach will pay of because you can define the behavior of each subclass in a much cleaner way.
Edit (output):
(5 * 12 / 16)
6 * 3 + 14 + 0
13 + 15 - 1
19 + (8 / 8)
(12 + 3 - 5)
(4 * 0 / 4)
1 - 18 / (3 * 15)
(3 * 16 + 3 * 1)
(6 + 16) / 16
(8 * 10)
回答3:
I found this thread on a similar quest, namely to generate random expressions for unit testing of symbolic calculations. In my version, I included unary functions and allowed the symbols to be arbitrary strings, i.e. you can use numbers or variable names.
from random import random, choice
UNARIES = ["sqrt(%s)", "exp(%s)", "log(%s)", "sin(%s)", "cos(%s)", "tan(%s)",
"sinh(%s)", "cosh(%s)", "tanh(%s)", "asin(%s)", "acos(%s)",
"atan(%s)", "-%s"]
BINARIES = ["%s + %s", "%s - %s", "%s * %s", "%s / %s", "%s ** %s"]
PROP_PARANTHESIS = 0.3
PROP_BINARY = 0.7
def generate_expressions(scope, num_exp, num_ops):
scope = list(scope) # make a copy first, append as we go
for _ in xrange(num_ops):
if random() < PROP_BINARY: # decide unary or binary operator
ex = choice(BINARIES) % (choice(scope), choice(scope))
if random() < PROP_PARANTHESIS:
ex = "(%s)" % ex
scope.append(ex)
else:
scope.append(choice(UNARIES) % choice(scope))
return scope[-num_exp:] # return most recent expressions
As copied from pervious answers, I just throw in some paranthesis around binary operators with probability PROP_PARANTHESIS
(that is a bit of cheating). Binary operators are more common than unary ones, so I left it also for configuration (PROP_BINARY
). An example code is:
scope = [c for c in "abcde"]
for expression in generate_expressions(scope, 10, 50):
print expression
This will generate something like:
e / acos(tan(a)) / a * acos(tan(a)) ** (acos(tan(a)) / a + a) + (d ** b + a)
(a + (a ** sqrt(e)))
acos((b / acos(tan(a)) / a + d) / (a ** sqrt(e)) * (a ** sinh(b) / b))
sin(atan(acos(tan(a)) ** (acos(tan(a)) / a + a) + (d ** b + a)))
sin((b / acos(tan(a)) / a + d)) / (a ** sinh(b) / b)
exp(acos(tan(a)) / a + acos(e))
tan((b / acos(tan(a)) / a + d))
acos(tan(a)) / a * acos(tan(a)) ** (acos(tan(a)) / a + a) + (d ** b + a) + cos(sqrt(e))
(acos(tan(a)) / a + acos(e) * a + e)
((b / acos(tan(a)) / a + d) - cos(sqrt(e))) + sinh(b)
Putting PROP_BINARY = 1.0
and applying with
scope = range(100)
brings us back to output like
43 * (50 * 83)
34 / (29 / 24)
66 / 47 - 52
((88 ** 38) ** 40)
34 / (29 / 24) - 27
(16 + 36 ** 29)
55 ** 95
70 + 28
6 * 32
(52 * 2 ** 37)
回答4:
Ok, I couldn't resist adding my own implementation using some of the ideas we discussed in Ray's answer. I approached a few things differently than Ray did.
I added some handling of the probability of the incidence of each operator. The operators are biased so that the lower priority operators (larger precedence values) are more common than the higher order ones.
I also implemented parentheses only when precedence requires. Since the integers have the highest priority (lowest precedence value) they never get wrapped in parentheses. There is no need for a parenthesized expression as a node in the expression tree.
The probability of using an operator is biased towards the initial levels (using a quadratic function) to get a nicer distribution of operators. Choosing a different exponent gives more potential control of the quality of the output, but I didn't play with the possibilities much.
I further implemented an evaluator for fun and also to filter out indeterminate expressions.
import sys
import random
# dictionary of operator precedence and incidence probability, with an
# evaluator added just for fun.
operators = {
'^': {'prec': 10, 'prob': .1, 'eval': lambda a, b: pow(a, b)},
'*': {'prec': 20, 'prob': .2, 'eval': lambda a, b: a*b},
'/': {'prec': 20, 'prob': .2, 'eval': lambda a, b: a/b},
'+': {'prec': 30, 'prob': .25, 'eval': lambda a, b: a+b},
'-': {'prec': 30, 'prob': .25, 'eval': lambda a, b: a-b}}
max_levels = 3
integer_range = (-100, 100)
random.seed()
# A node in an expression tree
class expression(object):
def __init__(self):
super(expression, self).__init__()
def precedence(self):
return -1
def eval(self):
return 0
@classmethod
def create_random(cls, level):
if level == 0:
is_op = True
elif level == max_levels:
is_op = False
else:
is_op = random.random() <= 1.0 - pow(level/max_levels, 2.0)
if is_op:
return binary_expression.create_random(level)
else:
return integer_expression.create_random(level)
class integer_expression(expression):
def __init__(self, value):
super(integer_expression, self).__init__()
self.value = value
def __str__(self):
return self.value.__str__()
def precedence(self):
return 0
def eval(self):
return self.value
@classmethod
def create_random(cls, level):
return integer_expression(random.randint(integer_range[0],
integer_range[1]))
class binary_expression(expression):
def __init__(self, symbol, left_expression, right_expression):
super(binary_expression, self).__init__()
self.symbol = symbol
self.left = left_expression
self.right = right_expression
def eval(self):
f = operators[self.symbol]['eval']
return f(self.left.eval(), self.right.eval())
@classmethod
def create_random(cls, level):
symbol = None
# Choose an operator based on its probability distribution
r = random.random()
cumulative = 0.0
for k, v in operators.items():
cumulative += v['prob']
if r <= cumulative:
symbol = k
break
assert symbol != None
left = expression.create_random(level + 1)
right = expression.create_random(level + 1)
return binary_expression(symbol, left, right)
def precedence(self):
return operators[self.symbol]['prec']
def __str__(self):
left_str = self.left.__str__()
right_str = self.right.__str__()
op_str = self.symbol
# Use precedence to determine if we need to put the sub expressions in
# parentheses
if self.left.precedence() > self.precedence():
left_str = '('+left_str+')'
if self.right.precedence() > self.precedence():
right_str = '('+right_str+')'
# Nice to have space around low precedence operators
if operators[self.symbol]['prec'] >= 30:
op_str = ' ' + op_str + ' '
return left_str + op_str + right_str
max_result = pow(10, 10)
for i in range(10):
expr = expression.create_random(0)
try:
value = float(expr.eval())
except:
value = 'indeterminate'
print expr, '=', value
I got these results:
(4 + 100)*41/46 - 31 - 18 - 2^-83 = -13.0
(43 - -77)/37^-94 + (-66*67)^(-24*49) = 3.09131533541e+149
-32 - -1 + 74 + 74 - 15 + 64 - -22/98 = 37.0
(-91*-4*45*-55)^(-9^2/(82 - -53)) = 1.0
-72*-85*(75 - 65) + -100*19/48*22 = 61198.0
-57 - -76 - -54*76 + -38 - -23 + -17 - 3 = 4088.0
(84*-19)^(13 - 87) - -10*-84*(-28 + -49) = 64680.0
-69 - -8 - -81^-51 + (53 + 80)^(99 - 48) = 2.07220963807e+108
(-42*-45)^(12/87) - -98 + -23 + -67 - -37 = 152.0
-31/-2*-58^-60 - 33 - -49 - 46/12 = -79.0
There are a couple of things the program does, that although are valid, a human wouldn't do. For example:
- It can create long strings of sequential divides (e.g. 1/2/3/4/5).
- +/- of a negative number is common (e.g. 1 - -2)
These can be corrected with a clean-up pass.
Also, there is no guarantee that the answer is determinate. Divides by 0 and 0^0 are possible, although with the exception handling these can be filtered out.
回答5:
import random
def expr(depth):
if depth==1 or random.random()<1.0/(2**depth-1):
return str(int(random.random() * 100))
return '(' + expr(depth-1) + random.choice(['+','-','*','/']) + expr(depth-1) + ')'
for i in range(10):
print expr(4)
回答6:
Generate an array at random in RPN with mixtures of operators and numbers (always valid). Then start from middle of the array and generate the corresponding evaluation tree.