Unexpected output using Python's ternary operator

Posted 2019-06-16 03:18

Question:

I have a specific situation in which I would like to do the following (actually it is more involved than this, but I reduced the problem to the essence):

>>> (lambda e: 1)(0) if (lambda e: True)(0) else (lambda e: 2)(0)
True

which is a roundabout way of writing:

>>> 1 if True else 2
1

In reality, '1', 'True' and '2' are more involved expressions that depend on the variable 'e', which I set to '0' for this simplified example.

Note the difference in output between the two expressions above, even though each lambda on its own returns the expected value:

>>> (lambda e: 1)(0)
1
>>> (lambda e: True)(0)
True
>>> (lambda e: 2)(0)
2

The funny thing is that this is a special case, because if I replace '1' by '3' I get the expected/desired result:

>>> (lambda e: 3)(0) if (lambda e: True)(0) else (lambda e: 2)(0)
3

The result is also correct if I replace '1' by '0' (which could have been a special case too, since 1==True and 0==False):

>>> (lambda e: 0)(0) if (lambda e: True)(0) else (lambda e: 2)(0)
0

Also, if I replace 'True' by 'not False' or 'not not True', it still works:

>>> (lambda e: 1)(0) if (lambda e: not False)(0) else (lambda e: 2)(0)
1
>>> (lambda e: 1)(0) if (lambda e: not not True)(0) else (lambda e: 2)(0)
1

An alternative formulation with an ordinary if/else statement also gives the expected result:

>>> if (lambda e: True)(0):
    (lambda e: 1)(0)
else:
    (lambda e: 2)(0)

1

What explains this behavior? How can I work around it cleanly (that is, without resorting to 'not not True' or something similar)?

Thanks!

Answer 1:

I think I figured out why the bug is happening, and why your repro is Python 3 specific.

Code objects do equality comparisons by value, rather than by pointer, strangely enough:

static PyObject *
code_richcompare(PyObject *self, PyObject *other, int op)
{
    ...

    co = (PyCodeObject *)self;
    cp = (PyCodeObject *)other;

    eq = PyObject_RichCompareBool(co->co_name, cp->co_name, Py_EQ);
    if (eq <= 0) goto unequal;
    eq = co->co_argcount == cp->co_argcount;
    if (!eq) goto unequal;
    eq = co->co_kwonlyargcount == cp->co_kwonlyargcount;
    if (!eq) goto unequal;
    eq = co->co_nlocals == cp->co_nlocals;
    if (!eq) goto unequal;
    eq = co->co_flags == cp->co_flags;
    if (!eq) goto unequal;
    eq = co->co_firstlineno == cp->co_firstlineno;
    if (!eq) goto unequal;

    ...

In Python 2, lambda e: True does a global name lookup and lambda e: 1 loads a constant 1, so the code objects for these functions don't compare equal. In Python 3, True is a keyword and both lambdas load constants. Since 1 == True, the code objects are sufficiently similar that all the checks in code_richcompare pass, and the code objects compare the same. (One of the checks is for line number, so the bug only appears when the lambdas are on the same line.)
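Since the comparison above includes both the function name and the first line number, one possible workaround is to give each branch its own named function on its own line, so the code objects can never compare equal. This is only a sketch: the names condition, value_if_true, value_if_false and choose are hypothetical and stand in for the more involved expressions in the question.

```python
# Hypothetical stand-ins for the three expressions in the question.
# Each def has a distinct name and starts on a distinct line, so the
# name and line-number checks keep their code objects apart.
def condition(e):
    return True


def value_if_true(e):
    return 1


def value_if_false(e):
    return 2


def choose(e):
    # Same ternary as in the question, but built from named functions.
    return value_if_true(e) if condition(e) else value_if_false(e)


result = choose(0)
print(repr(result))  # 1
```

Putting the three lambdas on separate lines should have the same effect, but named functions make the intent easier to see.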

The bytecode compiler calls ADDOP_O(c, LOAD_CONST, (PyObject*)co, consts) to create the LOAD_CONST instruction that loads a lambda's code onto the stack, and ADDOP_O uses a dict to keep track of objects it has already added, in an attempt to save space on things like duplicate constants. It has some handling to distinguish values like 0.0, 0, and -0.0 that would otherwise compare equal, but nobody expected that it would ever need to handle equal-but-inequivalent code objects. The code objects aren't distinguished properly, and the two lambdas end up sharing a single code object.
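The effect of deduplicating through a dict can be illustrated in plain Python. The following is a rough sketch, not CPython's actual code: both helper functions are hypothetical, and the typed variant only separates values by type (it would still merge 0.0 and -0.0, which the real compiler handles separately).

```python
def add_const_by_equality(pool, value):
    """Return the slot for value, reusing the slot of any equal key."""
    # dict lookup uses hash and ==, so True finds the slot taken by 1.
    return pool.setdefault(value, len(pool))


def add_const_by_type_and_value(pool, value):
    """Return the slot for value, keeping equal values of different types apart."""
    # Including the type in the key stops 1, True and 1.0 from colliding.
    return pool.setdefault((type(value), value), len(pool))


naive_pool, typed_pool = {}, {}
naive_slots = [add_const_by_equality(naive_pool, v) for v in (1, True, 1.0)]
typed_slots = [add_const_by_type_and_value(typed_pool, v) for v in (1, True, 1.0)]

print(naive_slots)  # [0, 0, 0]
print(typed_slots)  # [0, 1, 2]
```

In the naive pool all three values land in the slot first claimed by 1, which is the same kind of merging that happens to the two lambdas' code objects.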

By replacing True with 1.0, we can reproduce the bug on Python 2:

>>> f1, f2 = lambda: 1, lambda: 1.0
>>> f2()
1

I don't have Python 3.5, so I can't check whether the bug is still present in that version. I didn't see anything in the bug tracker about the bug, but I could have just missed the report. If the bug is still there and hasn't been reported, it should be reported.
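One way to check a given interpreter is to compile the expression from the question and count the code objects stored among its constants. This is a sketch that assumes the lambdas' code objects show up in co_consts of the compiled expression; the helper name lambda_code_objects is made up for this example.

```python
import types

# The expression from the question, kept on a single line on purpose.
SOURCE = "(lambda e: 1)(0) if (lambda e: True)(0) else (lambda e: 2)(0)"


def lambda_code_objects(source):
    """Return the code objects stored as constants of the compiled expression."""
    compiled = compile(source, "<demo>", "eval")
    return [c for c in compiled.co_consts if isinstance(c, types.CodeType)]


codes = lambda_code_objects(SOURCE)

# Three code objects means each lambda kept its own code;
# fewer than three means at least two lambdas were merged.
print(len(codes), "code objects for 3 lambdas")
print("affected" if len(codes) < 3 else "not affected")
```

Calling repr(eval(SOURCE)) on the same interpreter gives a second check: an affected version returns True, while an unaffected one returns 1.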