Why isn't “is” comparison used in place of “==”?

Posted 2020-07-18 11:51

Question:

When I'm using Pytest for Python formatting, it complains about doing something like:

>>> assert some_function_ret_val() == True
E712 comparison to True should be 'if cond is True:' or 'if cond:'

and wants:

assert some_function_ret_val() is True

I know there can only be one copy of True/False/None, but I thought all primitives are immutable types.

Under what circumstances would "==" and "is" comparisons give different results for primitive types?

Otherwise, why has "==" become the norm in comparison tasks?

I found a Stack Overflow post, "Comparison with boolean numpy arrays VS PEP8 E712", that talks about comparison with non-primitive types, but I can't seem to find a reason why "is" comparison might be dangerous with primitive types.

If it's just convention, I would think that "is" is more legible than "==", but I feel like there could be some crazy edge cases where there is more than one copy of a primitive type.

Answer 1:

Python does not have primitive types. Everything in Python is an object.

Generally, the only places you should use "is" are with language-guaranteed singletons, like True, False, and None, or when, say for debugging purposes, you actually want to check object identity.
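To make the E712 case from the question concrete, here is a minimal sketch of when the two assertions disagree. The function names returns_one and returns_true are hypothetical, invented for this illustration; the point is only that "==" asks whether two values are equal, while "is" asks whether they are the very same object.

```python
# Hypothetical functions, made up for this illustration.
def returns_one():
    return 1       # truthy and equal to True, but an int, not the bool singleton

def returns_true():
    return True    # the actual bool singleton

one = returns_one()
flag = returns_true()

equal_to_true = (one == True)       # True: bool is a subclass of int, and True == 1
identical_to_true = (one is True)   # False: the int 1 is not the True object
flag_is_true = (flag is True)       # True: there is exactly one True object

print(equal_to_true, identical_to_true, flag_is_true)
```

So "assert f() is True" is the stricter check: it fails when f() returns 1, whereas "assert f() == True" passes. If any truthy value is acceptable, plain "assert f()" says so directly.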

In every other case, using "is" to mean equality relies on implementation details and implementation-specific optimizations (e.g. the peephole optimizer and string interning). The equality operator is "==", and it should be used in those cases. Although the Python interpreter often reuses objects of immutable types, you should still not rely on identity when you mean equality, because for the most part that is not a language guarantee.
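Here is a minimal sketch of the difference that does not depend on any interpreter optimization. It uses a hypothetical Point class, invented for this example: two separately constructed values compare equal with "==", yet they are two distinct objects, so "is" reports False.

```python
from dataclasses import dataclass

# Hypothetical immutable value type, made up for this illustration.
@dataclass(frozen=True)
class Point:
    x: int
    y: int

p = Point(1, 2)
q = Point(1, 2)   # a second, separately constructed object
r = p             # another name for the same object

values_equal = (p == q)   # True: the generated __eq__ compares the fields
same_object = (p is q)    # False: p and q are two distinct objects
alias_same = (p is r)     # True: r refers to the very object p refers to

print(values_equal, same_object, alias_same)
```

Equality is whatever the type's __eq__ defines; identity is a property of the object itself and cannot be overridden. This is why "==" is the norm whenever the question is about values.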

As an example, on CPython 3.7 you may be tempted to "safely" use "is" to compare small integers because they are cached, but this is an implementation detail that should not be relied upon. It is free to change in Python 3.9 or whenever. Also, see the comment by @user2357112 about how it isn't even necessarily safe for the small integers that are cached! To reiterate: it is not a language guarantee - it is a side-effect of how it was implemented.

And, again, it only applies to small integers, those in the range [-5, 256], so:

>>> def add(a, b): return a + b
...
>>> 16 is add(8, 8)
True
>>> 1000 is add(500, 500)
False

Note that I put the actual addition in a function, because CPython frequently folds arithmetic on literals at compile time and reuses equal immutable constants:

>>> 1000 is (500 + 500)
True

But it should be obvious now why you cannot rely on that.

Another case where it is appropriate to use "is" for "equalityish" comparisons is comparing enum members, which are guaranteed singletons:

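import enum

class Color(enum.Enum):
    RED = 1
    BLUE = 2

# Module-level aliases for the enum members
RED = Color.RED
BLUE = Color.BLUE

# Looking a member up by value returns the same singleton object
print(Color(1) is RED)  # True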