Why is the use of len(SEQUENCE) in condition values considered incorrect by Pylint?

Posted 2019-03-08 17:30

Question:

Considering this code snippet:

from os import walk

def collect_files(mydir):  # wrapper added so that the return below is valid
    files = []
    for (dirpath, _, filenames) in walk(mydir):
        pass  # more code that modifies files
    if len(files) == 0:  # <-- C1801
        return None
    return files

Pylint flagged the line containing the if statement with this message:

[pylint] C1801:Do not use len(SEQUENCE) as condition value

At first glance, rule C1801 did not sound very reasonable to me, and the definition in the reference guide does not explain why this is a problem. In fact, it downright calls it an incorrect use.

len-as-condition (C1801): Do not use len(SEQUENCE) as condition value.
Used when Pylint detects incorrect use of len(sequence) inside conditions.

My search attempts have also failed to provide me a deeper explanation. I do understand that a sequence's length property may be lazily evaluated, and that __len__ can be programmed to have side effects, but it is questionable whether that alone is problematic enough for Pylint to call such a use incorrect. Hence, before I simply configure my project to ignore the rule, I would like to know whether I am missing something in my reasoning.

When is the use of len(SEQ) as a condition value problematic? What major situations is Pylint attempting to avoid with C1801?

Answer 1:

When is the use of len(SEQ) as a condition value problematic? What major situations is Pylint attempting to avoid with C1801?

It’s not really problematic to use len(SEQUENCE), though it may not be as efficient (see chepner’s comment). Regardless, Pylint checks code for compliance with the PEP 8 style guide, which states:

For sequences, (strings, lists, tuples), use the fact that empty sequences are false.

Yes: if not seq:
     if seq:

No:  if len(seq):
     if not len(seq):

As an occasional Python programmer who flits between languages, I’d consider the len(SEQUENCE) construct to be more readable and explicit (“Explicit is better than implicit”). However, using the fact that an empty sequence evaluates to False in a Boolean context is considered more “Pythonic”.
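
To make the PEP 8 recommendation concrete, here is a minimal sketch of how truth-value testing behaves for built-in sequences and for a user-defined container. The names first_or_none and Playlist are hypothetical, introduced only for illustration; they do not come from the question's code.

```python
def first_or_none(files):
    """Return the first entry, or None when the sequence is empty."""
    if not files:  # same outcome as `len(files) == 0` for built-in sequences
        return None
    return files[0]


class Playlist:
    """Hypothetical container that defines __len__ but not __bool__."""

    def __init__(self, tracks):
        self.tracks = list(tracks)

    def __len__(self):
        return len(self.tracks)


# Empty built-in sequences are falsy; non-empty ones are truthy,
# regardless of what their elements are.
empty_checks = [bool([]), bool(()), bool("")]
non_empty_checks = [bool([0]), bool((None,)), bool(" ")]

# Without __bool__, Python falls back to __len__ for truth-value testing.
empty_playlist_is_truthy = bool(Playlist([]))
full_playlist_is_truthy = bool(Playlist(["intro"]))
```

One difference worth keeping in mind: `not files` is also true when files is None, whereas len(None) raises a TypeError, so the two checks are only interchangeable when the value is known to be a sequence.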



Answer 2:

Note that the use of len(seq) is in fact required (instead of just checking the bool value of seq) when using NumPy arrays.

import numpy

a = numpy.array(range(10))
if a:  # raises ValueError
    print("a is not empty")

results in an exception: ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

And hence for code that uses both Python lists and NumPy arrays, the C1801 message is less than helpful.
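
For code that has to accept both lists and arrays, one option is to keep the explicit length comparison in a single helper. The sketch below rests on stated assumptions: is_empty is a hypothetical helper, and AmbiguousArray is a stand-in class that only imitates the way NumPy refuses truth-value testing, so the example runs without NumPy installed.

```python
class AmbiguousArray:
    """Stand-in that imitates NumPy's refusal of truth-value testing."""

    def __init__(self, values):
        self.values = list(values)

    def __len__(self):
        return len(self.values)

    def __bool__(self):
        # Raised unconditionally here for simplicity; real NumPy arrays raise
        # this kind of error when they hold more than one element.
        raise ValueError("truth value is ambiguous")


def is_empty(seq):
    """Hypothetical helper: explicit length check that never calls bool(seq)."""
    return len(seq) == 0


results = [is_empty([]), is_empty([1, 2]), is_empty(AmbiguousArray(range(10)))]

# Truth-value testing on the stand-in fails, just as `if a:` does above.
try:
    bool(AmbiguousArray(range(10)))
    truthiness_raised = False
except ValueError:
    truthiness_raised = True
```

With a real NumPy array, len reports the size of the first axis only, so a helper like this treats an array as empty exactly when that axis has length zero.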