Is there a need for range(len(a))?

Posted 2019-01-21 03:27

Question:

One frequently finds expressions of this type in Python questions on SO, either for just accessing all items of the iterable:

for i in range(len(a)):
    print(a[i])

Which is just a cumbersome way of writing:

for e in a:
    print(e)

Or for assigning to elements of the iterable:

for i in range(len(a)):
    a[i] = a[i] * 2

Which should be the same as:

for i, e in enumerate(a):
    a[i] = e * 2
# Or if it isn't too expensive to create a new iterable
a = [e * 2 for e in a]

Or for filtering over the indices:

for i in range(len(a)):
    if i % 2 == 1: continue
    print(a[i])

Which could be expressed like this:

for e in a[::2]:
    print(e)

Or when you just need the length of the list, and not its content:

for _ in range(len(a)):
    doSomethingUnrelatedToA()

Which could be:

for _ in a:
    doSomethingUnrelatedToA()

In Python we have enumerate, slicing, filter, sorted, etc. As Python's for constructs are intended to iterate over iterables and not only ranges of integers, are there real-world use cases where you need in range(len(a))?

Answer 1:

If you need to work with the indices of a sequence, then yes, you use it, e.g. for the equivalent of numpy.argsort:

>>> a = [6, 3, 1, 2, 5, 4]
>>> sorted(range(len(a)), key=a.__getitem__)
[2, 3, 1, 5, 4, 0]


Answer 2:

Short answer: mathematically speaking, no; in practical terms, yes, for example for Intentional Programming.

Technically, I think the mathematically correct answer would be "no, it's not needed" because it's expressible using other constructs, i.e. it's equivalent to other constructs... something like if a language is Turing complete, it doesn't really matter which syntactic/paradigmatic constructs it has because everything can be expressed in it anyway.

But in practice, I use for i in range(len(a)) (or for _ in range(len(a)) if I don't need the index) to make it explicit that I want to iterate as many times as there are items in a sequence, without needing to use the items in the sequence for anything.

So to answer the "Is there a need?" part: I need it to express the meaning/intent of the code for readability purposes.

See also: https://en.wikipedia.org/wiki/Intentional_programming

P.S. but at the same time, the following seems to be semantically equivalent, from an Intentional Programming point of view:

for _ in a:
    ...

or

b = ["hello" for _ in a]

...all in all, I guess the difference is whether you want to be really explicit about "repeat AS MANY TIMES as there are items in a" as opposed to "for every element in a, regardless of the content of a" ...so just an Intentional Programming nuance in the end.



Answer 3:

What if you need to access two elements of the list simultaneously?

for i in range(len(a) - 1):
    something_new[i] = a[i] * a[i+1]

You can use this, but it's probably less clear:

for i, _ in enumerate(a[:-1]):
    something_new[i] = a[i] * a[i+1]

Personally I'm not 100% happy with either!
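
One possible alternative for the pairwise case, as a minimal sketch: pair each element with its successor using zip. The sample values in a are made up for illustration, and something_new is built as a fresh list rather than assigned into by index.

```python
a = [1, 2, 3, 4]  # sample data, assumed for illustration

# zip stops at the shorter input, so the last element of a
# is never paired with a missing neighbour.
something_new = [x * y for x, y in zip(a, a[1:])]

print(something_new)  # [2, 6, 12]
```

Note that a[1:] copies the list, so for very large sequences the index-based loop may still be the cheaper option.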



Answer 4:

Going by the comments as well as personal experience, I say no, there is no need for range(len(a)). Everything you can do with range(len(a)) can be done in another (usually far more efficient) way.

You gave many examples in your post, so I won't repeat them here. Instead, I will give an example for those who say "What if I want just the length of a, not the items?". This is one of the only times you might consider using range(len(a)). However, even this can be done like so:

>>> a = [1, 2, 3, 4]
>>> for _ in a:
...     print(True)
...
True
True
True
True
>>>

Clements' answer (as shown by Allik) can also be reworked to remove range(len(a)):

>>> a = [6, 3, 1, 2, 5, 4]
>>> sorted(range(len(a)), key=a.__getitem__)
[2, 3, 1, 5, 4, 0]
>>> # Note however that, in this case, range(len(a)) is more efficient.
>>> [x for x, _ in sorted(enumerate(a), key=lambda i: i[1])]
[2, 3, 1, 5, 4, 0]
>>>

So, in conclusion, range(len(a)) is not needed. Its only upside is readability (its intention is clear). But that is just preference and code style.



Answer 5:

I have a use case I don't believe any of your examples cover.

boxes = [b1, b2, b3]
items = [i1, i2, i3, i4, i5]
for j in range(len(boxes)):
    boxes[j].putitemin(items[j])

I'm relatively new to Python, though, so I'm happy to learn a more elegant approach.
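
A more compact form is probably zip, which pairs the two lists and stops at the shorter one. The sketch below assumes a hypothetical Box class, since the answer does not show one; only the putitemin method name comes from the snippet above, and the string items stand in for i1 to i5.

```python
class Box:
    """Hypothetical stand-in for the box objects in the answer."""

    def __init__(self):
        self.contents = []

    def putitemin(self, item):
        self.contents.append(item)


boxes = [Box(), Box(), Box()]
items = ["i1", "i2", "i3", "i4", "i5"]

# zip stops once boxes is exhausted, so the surplus items
# "i4" and "i5" are simply left unused.
for box, item in zip(boxes, items):
    box.putitemin(item)

print([box.contents for box in boxes])  # [['i1'], ['i2'], ['i3']]
```

One behavioural difference worth knowing: if items were the shorter list, the range(len(boxes)) version would raise IndexError, while zip would silently stop early.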



Answer 6:

It's nice to have when you need to use the index for some kind of manipulation and having the current element doesn't suffice. Take for instance a binary tree that's stored in an array. If you have a method that asks you to return a list of tuples that contains each node's direct children, then you need the index.

# 0 -> 1,2 : 1 -> 3,4 : 2 -> 5,6 : 3 -> 7,8 ...
nodes = [0,1,2,3,4,5,6,7,8,9,10]

def get_children(nodes):  # hypothetical name; wraps the loop so that return is valid
  children = []
  for i in range(len(nodes)):
    leftNode = None
    rightNode = None
    if i*2 + 1 < len(nodes):
      leftNode = nodes[i*2 + 1]
    if i*2 + 2 < len(nodes):
      rightNode = nodes[i*2 + 2]
    children.append((leftNode,rightNode))
  return children

Of course, if the element you're working on is an object, you can just call a get-children method. But yeah, you only really need the index if you're doing some sort of manipulation.



Answer 7:

Sometimes matplotlib requires range(len(y)). For example, with y=array([1,2,5,6]), plot(y) works fine but scatter(y) does not; one has to write scatter(range(len(y)),y). (Personally, I think this is a bug in scatter; plot and its friends scatter and stem should use the same calling sequences as much as possible.)



Answer 8:

Sometimes, you really don't care about the collection itself. For instance, creating a simple model fit line to compare an "approximation" with the raw data:

from math import sqrt

fib_raw = [1, 1, 2, 3, 5, 8, 13, 21] # Fibonacci numbers

phi = (1 + sqrt(5)) / 2
phi2 = (1 - sqrt(5)) / 2

def fib_approx(n): return (phi**n - phi2**n) / sqrt(5)

x = range(len(fib_raw))
y = [fib_approx(n) for n in x]

# Now plot to compare fib_raw and y
# Compare error, etc

In this case, the values of the Fibonacci sequence itself were irrelevant. All we needed here was the size of the input sequence we were comparing with.



Answer 9:

Very simple example:

def loadById(self, id):
    if id in range(len(self.itemList)):
        self.load(self.itemList[id])

I can't quickly think of a solution that does not use the range-len composition.

But this should probably be done with try .. except instead, to stay Pythonic, I guess.
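
One way to drop the range-len composition here might be a chained comparison against len. The sketch below wraps the method in a hypothetical Loader class so that it runs on its own; only itemList, load and loadById come from the answer, and the loaded list is an assumption added to make the effect visible.

```python
class Loader:
    """Hypothetical container; only itemList, load and loadById mirror the answer."""

    def __init__(self, items):
        self.itemList = list(items)
        self.loaded = []  # assumed bookkeeping, not in the original

    def load(self, item):
        self.loaded.append(item)

    def loadById(self, id):
        # Chained comparison: accepts valid non-negative indices only.
        if 0 <= id < len(self.itemList):
            self.load(self.itemList[id])


loader = Loader(["a", "b", "c"])
loader.loadById(1)   # in range, gets loaded
loader.loadById(5)   # too large, ignored
loader.loadById(-1)  # negative, ignored
print(loader.loaded)  # ['b']
```

A try .. except IndexError version would not behave identically: Python lists accept negative indices, so loadById(-1) would load the last item instead of being ignored.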



Answer 10:

If you have to iterate over the first len(a) items of an object b (that is larger than a), you should probably use range(len(a)):

for i in range(len(a)):
    do_something_with(b[i])
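
For comparison, here is a sketch of three ways to reach the first len(a) items of b without range(len(a)). The sample lists are made up, and collecting the items into a list stands in for do_something_with.

```python
from itertools import islice

a = [10, 20, 30]                # sample data, assumed for illustration
b = ["p", "q", "r", "s", "t"]   # longer than a

# Option 1: slicing copies the first len(a) items and needs b to support slices.
first_by_slice = b[:len(a)]

# Option 2: islice works on any iterable and does not copy up front.
first_by_islice = list(islice(b, len(a)))

# Option 3: zip stops when a, the shorter sequence, runs out.
first_by_zip = [y for _, y in zip(a, b)]

print(first_by_slice)  # ['p', 'q', 'r']
```

Which one reads best likely depends on whether the elements of a are needed alongside those of b; if they are, zip gives both at once.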


Answer 11:

My code is:

s=["9"]*int(input())
for i in range(len(s)):
    while not set(s[i])<=set('01'):s[i]=input(i)
print(bin(sum([int(x,2)for x in s]))[2:])

It is a binary adder, but I don't think the range(len(s)) or the loop body can be replaced to make it smaller/better.