Difference between if <obj> and if <obj> is not None

Posted 2019-02-06 15:33

Question:

In writing some XML parsing code, I received the warning:

FutureWarning: The behavior of this method will change in future versions.  Use specific 'len(elem)' or 'elem is not None' test instead.

where I used if <elem>: to check if a value was found for a given element.

Can someone elaborate on the difference between if <obj>: vs if <obj> is not None: and why Python cares which I use?

I almost always use the former because it's shorter and not a double-negative, but often see the latter in other people's source code.

Answer 1:

if obj is not None tests whether the object is not None. if obj tests whether bool(obj) is True.

There are many objects which are not None but for which bool(obj) is False: for instance, an empty list, an empty dict, an empty set, an empty string, or the number 0.

Use if obj is not None when you want to test if an object is not None. Use if obj only if you want to test for general "falseness" -- whose definition is object-dependent.
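
As a small illustration of the point above, the following sketch prints both tests side by side for a handful of empty values. The list name candidates is only for illustration.

```python
# Values that are not None but are still falsy.
candidates = [[], {}, set(), "", 0, 0.0]

for value in candidates:
    # "is not None" is an identity check; bool() asks the object itself.
    print(repr(value), value is not None, bool(value))

# None is the only value for which the identity check fails.
print(repr(None), None is not None, bool(None))
```

Every entry in candidates passes the is not None test and fails the plain truth test, which is exactly the gap between the two forms.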



Answer 2:

This answer addresses the FutureWarning specifically.

When lxml was first written, the lxml.etree._Element was considered falsey if it had no children.

As a result, this can happen:

>>> from lxml import etree
>>>
>>> root = etree.fromstring('<body><h1>Hello</h1></body>')
>>> print(root)
<Element body at 0x41d7680>
>>> print("root is not Falsey" if root else "root is Falsey")
<string>:1: FutureWarning: The behavior of this method will change in future versions. Use specific 'len(elem)' or 'elem is not None' test instead.
root is not Falsey
>>> # that's odd, a warning
>>> h1 = root.find('.//h1')
>>> print(h1)
<Element h1 at 0x41d7878>
>>> print("h1 is not Falsey" if h1 else "h1 is Falsey")
h1 is Falsey
>>> # huh, that is weird! In most of python, an object is rarely False
>>> # we did see a warning though, didn't we?
>>> # let's see how the different elements output
>>> print("root is not None" if root is not None else "root is None")
root is not None
>>> print("h1 is not None" if h1 is not None else "h1 is None")
h1 is not None
>>> print("Length of root is ", len(root))
Length of root is  1
>>> print("Length of h1 is ", len(h1))
Length of h1 is  0
>>> # now to look for something that's not there!
>>> h2 = root.find('.//h2')
>>> print(h2)
None
>>> print("h2 is not Falsey" if h2 else "h2 is Falsey")
h2 is Falsey
>>> print("h2 is not None" if h2 is not None else "h2 is None")
h2 is None
>>> print("Length of h2 is ", len(h2))
Traceback (most recent call last):
  File "<console>", line 1, in <module>
TypeError: object of type 'NoneType' has no len()

lxml has been promising for 7+ years that this change is going to happen (after going through several versions) but has never followed through on the threat, no doubt because of how central lxml is and fears it would break a lot of existing code.

However, to be both explicit and sure you don't make a mistake, never use if obj or if not obj if that object has a type of lxml.etree._Element.

Instead, use one of the following checks:

obj = root.find('.//tag')

if obj is not None:
    print("Object exists")

if obj is None:
    print("Object does not exist/was not found")

if len(obj):  # warning: raises TypeError if obj is None (no match found)
    print("Object has children")

if not len(obj):  # warning: raises TypeError if obj is None (no match found)
    print("Object does not have children")
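
One way to avoid the TypeError noted in the comments is to test for None before calling len(). The following is a minimal sketch, not part of lxml: the helper names exists and has_children are hypothetical, and it uses the standard library's xml.etree.ElementTree as a stand-in for lxml.etree (an assumption, made so the example runs without lxml installed; find() and len() behave the same way for this purpose).

```python
import xml.etree.ElementTree as ET  # stand-in for lxml.etree (assumption)


def exists(elem):
    """True if find() returned an element, whether or not it has children."""
    return elem is not None


def has_children(elem):
    """True only for an element that was found and has at least one child."""
    return elem is not None and len(elem) > 0


root = ET.fromstring("<body><h1>Hello</h1></body>")
h1 = root.find(".//h1")  # found, but has no children
h2 = root.find(".//h2")  # not found, so find() returns None

print(exists(root), has_children(root))
print(exists(h1), has_children(h1))
print(exists(h2), has_children(h2))
```

Neither helper relies on the element's truth value, so the result does not depend on whether the library ever changes that behavior.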


Answer 3:

For a full explanation consider the following example:

>>> import dis
>>> def is_truthy(x):
...     return "Those sweet words!" if x else "All lies!"
...
>>> is_truthy(None)
'All lies!'
>>> is_truthy(1)
'Those sweet words!'
>>> is_truthy([])
'All lies!'
>>> is_truthy(object())
'Those sweet words!'

What's happening in is_truthy()? Let's find out. Running dis.dis(is_truthy) gives you:

   2           0 LOAD_FAST                0 (x)
               3 POP_JUMP_IF_FALSE       10
               6 LOAD_CONST               1 ('Those sweet words!')
               9 RETURN_VALUE
         >>   10 LOAD_CONST               2 ('All lies!')
              13 RETURN_VALUE

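As you can see, x is pushed onto the stack, then POP_JUMP_IF_FALSE is executed. If x is falsy, the interpreter jumps to offset 10, which pushes 'All lies!' and returns it; otherwise execution falls through to offset 6, which pushes the other string and returns it.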
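
The exact listing depends on the CPython version: offsets, opcode names, and the number of instructions have all changed over time. A hedged sketch for checking this on your own interpreter is to collect the opcode names with dis.get_instructions() and look for the conditional jump, rather than compare against a fixed listing. The function is redefined here so the snippet is self-contained.

```python
import dis


def is_truthy(x):
    return "Those sweet words!" if x else "All lies!"


# Collect opcode names instead of relying on one version's exact output.
opnames = [instr.opname for instr in dis.get_instructions(is_truthy)]
print(opnames)

# The conditional jump is named POP_JUMP_IF_FALSE on many versions, but the
# name has varied, so only the common prefix is matched here.
jumps = [name for name in opnames if name.startswith("POP_JUMP")]
print(jumps)
```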

POP_JUMP_IF_FALSE is defined in ceval.c:

TARGET(POP_JUMP_IF_FALSE) {
    PyObject *cond = POP();
    int err;
    if (cond == Py_True) {
        Py_DECREF(cond);
        FAST_DISPATCH();
    }
    if (cond == Py_False) {
        Py_DECREF(cond);
        JUMPTO(oparg);
        FAST_DISPATCH();
    }
    err = PyObject_IsTrue(cond);
    Py_DECREF(cond);
    if (err > 0)
        err = 0;
    else if (err == 0)
        JUMPTO(oparg);
    else
        goto error;
    DISPATCH();
}

As you can see, if the object consumed by POP_JUMP_IF_FALSE is already either True or False, the answer is simple. Otherwise the interpreter tries to find out if the object is truthy by calling PyObject_IsTrue() which is defined in the object protocol. The code in object.c shows you exactly how it works:

int
PyObject_IsTrue(PyObject *v)
{
    Py_ssize_t res;
    if (v == Py_True)
        return 1;
    if (v == Py_False)
        return 0;
    if (v == Py_None)
        return 0;
    else if (v->ob_type->tp_as_number != NULL &&
             v->ob_type->tp_as_number->nb_bool != NULL)
        res = (*v->ob_type->tp_as_number->nb_bool)(v);
    else if (v->ob_type->tp_as_mapping != NULL &&
             v->ob_type->tp_as_mapping->mp_length != NULL)
        res = (*v->ob_type->tp_as_mapping->mp_length)(v);
    else if (v->ob_type->tp_as_sequence != NULL &&
             v->ob_type->tp_as_sequence->sq_length != NULL)
        res = (*v->ob_type->tp_as_sequence->sq_length)(v);
    else
        return 1;
    /* if it is negative, it should be either -1 or -2 */
    return (res > 0) ? 1 : Py_SAFE_DOWNCAST(res, Py_ssize_t, int);
}

Again, if the object is True or False itself, the answer is simple. None is also considered false. Then the number protocol, the mapping protocol, and the sequence protocol are checked, in that order. Otherwise the object is considered true.

To wrap it up: x is considered true if it is True, if it is true according to the number, mapping, or sequence protocol, or if it is any other kind of object that implements none of them. If you want your object to evaluate to false, implement one of these protocols; in Python code that means defining __bool__ or __len__.

Comparing to None, as in if x is None, is an explicit identity comparison. None of the logic above applies to it.
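
The lookup order described above can be observed from Python itself. In this sketch the class names Plain, Sized, and Flagged are made up for illustration; __bool__ fills the nb_bool slot and __len__ fills the length slots.

```python
class Plain:
    """Defines neither __bool__ nor __len__: instances are always truthy."""


class Sized:
    """Defines only __len__: truthiness follows the reported length."""

    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n


class Flagged(Sized):
    """Defines __bool__ as well: it is consulted before __len__."""

    def __bool__(self):
        return True


print(bool(Plain()))     # no protocol implemented
print(bool(Sized(0)))    # length 0
print(bool(Sized(2)))    # length 2
print(bool(Flagged(0)))  # __bool__ wins over a length of 0

# The identity test never consults any of these methods.
print(Sized(0) is not None)
```

Sized(0) is the interesting case: it is falsy under if x, yet it passes if x is not None.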



Answer 4:

The behavior of if x is sort of interesting:

In [1]: def truthy(x):
...:     if x:
...:         return 'Truthy!'
...:     else:
...:         return 'Not truthy!'
...:     

In [2]: truthy(True)
Out[2]: 'Truthy!'

In [3]: truthy(False)
Out[3]: 'Not truthy!'

In [4]: truthy(0)
Out[4]: 'Not truthy!'

In [5]: truthy(1)
Out[5]: 'Truthy!'

In [6]: truthy(None)
Out[6]: 'Not truthy!'

In [7]: truthy([])
Out[7]: 'Not truthy!'

In [8]: truthy('')
Out[8]: 'Not truthy!'

So, for example, statements under the conditional if x will not execute if x is 0, None, the empty list, or the empty string. On the other hand, statements under if x is not None will execute for every value except None itself, including 0, the empty list, and the empty string.
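
Where this matters in practice is an optional argument whose falsy values are still meaningful. The following is a hypothetical sketch: the functions and the DEFAULT_RETRIES constant are invented for illustration.

```python
DEFAULT_RETRIES = 3  # hypothetical default, for illustration only


def retries_truthy(retries=None):
    # Bug-prone: 0 is falsy, so an explicit 0 is replaced by the default.
    return retries if retries else DEFAULT_RETRIES


def retries_explicit(retries=None):
    # Only a missing value (None) falls back to the default.
    return retries if retries is not None else DEFAULT_RETRIES


print(retries_truthy(0), retries_explicit(0))
print(retries_truthy(None), retries_explicit(None))
print(retries_truthy(5), retries_explicit(5))
```

The two functions agree for None and for 5 and differ only for 0, where the truthiness test silently discards the caller's value.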