Where might I find a method to convert an arbitrar

2019-03-25 10:42发布

问题:

I've written a little app that parses expressions into abstract syntax trees. Right now, I use a bunch of heuristics against the expression in order to decide how to best evaluate the query. Unfortunately, there are examples which make the query plan extremely bad.

I've found a way to provably make better guesses as to how queries should be evaluated, but I need to put my expression into CNF or DNF first in order to get provably correct answers. I know this could result in potentially exponential time and space, but for typical queries my users run this is not a problem.

Now, converting to CNF or DNF is something I do by hand all the time in order to simplify complicated expressions. (Well, maybe not all the time, but I do know how that's done using e.g. demorgan's laws, distributive laws, etc.) However, I'm not sure how to begin translating that into a method that is implementable as an algorithm. I've looked at papers on query optimization, and several start with "well first we put things into CNF" or "first we put things into DNF", and they never seem to explain their method for accomplishing that.

Where should I start?

回答1:

The naive vanilla algorithm, for quantifier-free formulae, is :

  • for CNF, convert to negation normal form with De Morgan laws then distribute OR over AND
  • for DNF, convert to negation normal form with De Morgan laws then distribute AND over OR

It's unclear to me if your formulae are quantified. But even if they aren't, it seems the end of the wikipedia articles on conjunctive normal form, and its - roughly equivalent in the automated theorm prover world - clausal normal form alter ego outline a usable algorithm (and point to references if you want to make this transformation a bit more clever). If you need more than that, please do tell us more about where you encounter a difficulty.



回答2:

Look at https://github.com/bastikr/boolean.py Example:

def test(self):
    expr = parse("a*(b+~c*d)")
    print(expr)

    dnf_expr = normalize(boolean.OR, expr)
    print(list(map(str, dnf_expr)))

    cnf_expr = normalize(boolean.AND, expr)
    print(list(map(str, cnf_expr)))

Output is:

a*(b+(~c*d))
['a*b', 'a*~c*d']
['a', 'b+~c', 'b+d']

UPDATE: Now I prefer this sympy logic package:

>>> from sympy.logic.boolalg import to_dnf
>>> from sympy.abc import A, B, C
>>> to_dnf(B & (A | C))
Or(And(A, B), And(B, C))
>>> to_dnf((A & B) | (A & ~B) | (B & C) | (~B & C), True)
Or(A, C)


回答3:

I've came across this page: How to Convert a Formula to CNF. It shows the algorithm to convert a Boolean expression to CNF in pseudo code. Helped me to get started into this topic.