This code is from Python's Documentation. I'm a little confused.
words = ['cat', 'window', 'defenestrate']
for w in words[:]:
    if len(w) > 6:
        words.insert(0, w)
print(words)
And the following is what I thought at first:
words = ['cat', 'window', 'defenestrate']
for w in words:
    if len(w) > 6:
        words.insert(0, w)
print(words)
Why does this code create an infinite loop while the first one doesn't?
This is one of Python's gotchas that can trip up beginners.
The words[:] is the magic sauce here.
Observe:
>>> words = ['cat', 'window', 'defenestrate']
>>> words2 = words[:]
>>> words2.insert(0, 'hello')
>>> words2
['hello', 'cat', 'window', 'defenestrate']
>>> words
['cat', 'window', 'defenestrate']
And now without the [:]:
>>> words = ['cat', 'window', 'defenestrate']
>>> words2 = words
>>> words2.insert(0, 'hello')
>>> words2
['hello', 'cat', 'window', 'defenestrate']
>>> words
['hello', 'cat', 'window', 'defenestrate']
The main thing to note here is that words[:] returns a copy of the existing list, so you are iterating over a copy, which is not modified.
You can check whether you are referring to the same list using id():
In the first case:
>>> words2 = words[:]
>>> id(words2)
4360026736
>>> id(words)
4360188992
>>> words2 is words
False
In the second case:
>>> id(words2)
4360188992
>>> id(words)
4360188992
>>> words2 is words
True
It is worth noting that [i:j] is slice notation: it returns a new list containing the elements from index i up to (but not including) index j.
So, with the modified list from the second case, words[0:2] gives you
>>> words[0:2]
['hello', 'cat']
Omitting the starting index means it defaults to 0, while omitting the last index means it defaults to len(words), and the end result is that you receive a copy of the entire list.
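The defaults described above can be checked directly. This is a minimal sketch that starts from the original, unmodified three-word list; the variable names first_two, same_two, tail and whole are made up for the illustration.

```python
words = ['cat', 'window', 'defenestrate']

first_two = words[0:2]   # explicit start and stop
same_two = words[:2]     # start omitted, so it defaults to 0
tail = words[1:]         # stop omitted, so it defaults to len(words)
whole = words[:]         # both omitted: a copy of the entire list

print(first_two)         # ['cat', 'window']
print(tail)              # ['window', 'defenestrate']
print(whole == words)    # True: same contents
print(whole is words)    # False: a different list object
```

The last two lines show the point that matters for the loop: the slice has equal contents but is a separate object.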
If you want to make your code a little more readable, I recommend the copy module.
from copy import copy

words = ['cat', 'window', 'defenestrate']
for w in copy(words):
    if len(w) > 6:
        words.insert(0, w)
print(words)
This basically does the same thing as your first code snippet, and is much more readable.
Alternatively (as mentioned by DSM in the comments), on Python 3.3 and later you may also use words.copy(), which does the same thing.
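As a quick sanity check, the copying spellings mentioned so far can be compared side by side. This is only a sketch: list(words) is not mentioned above and is added here as one more common way to make a shallow copy, and the copies dictionary is just a holder for the illustration.

```python
from copy import copy

words = ['cat', 'window', 'defenestrate']

# Each entry builds a new list holding the same elements (a shallow copy).
copies = {
    'words[:]': words[:],
    'list(words)': list(words),      # extra spelling, not from the answer above
    'copy(words)': copy(words),
    'words.copy()': words.copy(),    # Python 3.3+
}

for label, c in copies.items():
    # Equal contents, but never the same object as words.
    print(label, c == words, c is words)
```

Any of these can stand in for words[:] in the loop, since each one leaves the loop iterating over a list that the insertions do not touch.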
words[:] copies all the elements in words into a new list. So when you iterate over words[:], you're actually iterating over all the elements that words had at the moment the copy was made. When you then modify words, the effects of those modifications are not visible in the copy (because words[:] was evaluated before you started modifying words).
In the latter example, you are iterating over words itself, which means that any changes you make to words are visible to your iterator. As a result, when you insert at index 0 of words, you "bump up" every other element in words by one index. So when you move on to the next iteration of your for loop, you get the element at the next index of words, but that's just the element you just saw (because you inserted an element at the beginning of the list, moving all the other elements up by one index).
To see this in action, try the following code:
# Warning: this loop never terminates; interrupt it with Ctrl+C.
words = ['cat', 'window', 'defenestrate']
for w in words:
    print("The list is:", words)
    print("I am looking at this word:", w)
    if len(w) > 6:
        print("inserting", w)
        words.insert(0, w)
        print("the list now looks like this:", words)
print(words)  # never reached
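Since the loop above never finishes on its own, a capped variant may be easier to experiment with. This is a sketch, not part of the original answer: max_steps and seen are hypothetical names added only so the demonstration terminates and records what the loop looked at.

```python
words = ['cat', 'window', 'defenestrate']
seen = []        # the word examined on each pass
max_steps = 5    # hypothetical cap so the demo terminates

for step, w in enumerate(words):
    if step == max_steps:
        break
    seen.append(w)
    if len(w) > 6:
        words.insert(0, w)   # shifts every element one index to the right

print(seen)        # ['cat', 'window', 'defenestrate', 'defenestrate', 'defenestrate']
print(len(words))  # 6
```

After the first long word is reached, every later pass sees 'defenestrate' again, which is exactly the "element you just saw" effect described above.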
(In addition to @Coldspeed's answer)
Look at the examples below:
words = ['cat', 'window', 'defenestrate']
words2 = words
words2 is words
results: True
This means the names words and words2 refer to the same object.
words = ['cat', 'window', 'defenestrate']
words2 = words[:]
words2 is words
results: False
In this case, we have created a new object.
Let's have a look at iterators and iterables:
An iterable is an object that has an __iter__ method which returns an iterator, or which defines a __getitem__ method that can take sequential indexes starting from zero (and raises an IndexError when the indexes are no longer valid). So an iterable is an object that you can get an iterator from.
An iterator is an object with a next (Python 2) or __next__ (Python 3) method.
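Both halves of that definition can be tried out in Python 3. The following is a minimal sketch; Countdown is a hypothetical class invented for the illustration, and it deliberately defines only __getitem__ to show the fallback route.

```python
class Countdown:
    """Hypothetical sequence-like class: defines __getitem__ but no __iter__."""

    def __init__(self, start):
        self.start = start

    def __getitem__(self, index):
        # Valid indexes are 0 .. start - 1; anything else ends the iteration.
        if index < 0 or index >= self.start:
            raise IndexError(index)
        return self.start - index


items = list(Countdown(3))      # iteration asks for indexes 0, 1, 2, ... in turn
print(items)                    # [3, 2, 1]

it = iter(['a', 'b'])           # a list has __iter__, which returns an iterator
print(hasattr(it, '__next__'))  # True
print(next(it))                 # a
```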
iter(iterable) returns an iterator object, and list_obj[:] returns a new list object that is a shallow copy of list_obj.
In your first case:
for w in words[:]
The for loop iterates over a new copy of the list, not the original words. Any change to words has no effect on the loop iteration, and the loop terminates normally.
This is how the loop does its work:
1. The loop calls iter() on the iterable to obtain an iterator.
2. The loop calls next() on the iterator object to get the next item. This step is repeated until there are no more elements left.
3. The loop terminates when a StopIteration exception is raised.
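Those steps can be written out by hand. The sketch below is roughly what the first-case for loop amounts to, spelled with an explicit while loop; it is an illustration of the protocol, not how the interpreter is literally implemented.

```python
words = ['cat', 'window', 'defenestrate']

iterator = iter(words[:])      # get an iterator from the iterable (a copy here)
while True:
    try:
        w = next(iterator)     # ask the iterator for the next item
    except StopIteration:      # the iterator is exhausted: leave the loop
        break
    if len(w) > 6:
        words.insert(0, w)     # modifies words, not the copy being iterated

print(words)  # ['defenestrate', 'cat', 'window', 'defenestrate']
```

Because the iterator was created from the copy, the insertion into words never changes what next() returns, so StopIteration arrives after three items.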
In your second case:
words = ['cat', 'window', 'defenestrate']
for w in words:
    if len(w) > 6:
        words.insert(0, w)
print(words)
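You are iterating over the original list words, so adding elements to words has a direct impact on the iteration. The list iterator keeps track of its position by index, so every time you insert at the front of words, the element it just returned shifts to the index it visits next. The loop therefore sees the same long word again and again, which creates an infinite loop.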
Look at this:
>>> l = [2, 4, 6, 8]
>>> i = iter(l) # returns list_iterator object which has next method
>>> next(i)
2
>>> next(i)
4
>>> l.insert(2, 'A')
>>> next(i)
'A'
Every time you update the original list before StopIteration is raised, the iterator sees the updated list, and next returns items accordingly. That's why your loop runs forever.
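One way to picture this is a simplified model of a list iterator that remembers nothing but an index into the list. The generator below is a hypothetical stand-in written for illustration (list_iter_model is not a real Python API), but it reproduces the session above.

```python
def list_iter_model(lst):
    """Simplified, hypothetical model of a list iterator: it only tracks an index."""
    index = 0
    while index < len(lst):   # the length is re-checked on every step
        yield lst[index]      # whatever currently sits at that index
        index += 1


l = [2, 4, 6, 8]
i = list_iter_model(l)
first = next(i)       # index 0
second = next(i)      # index 1
l.insert(2, 'A')      # the list is now [2, 4, 'A', 6, 8]
third = next(i)       # index 2 now holds 'A'

print(first, second, third)  # 2 4 A
```

With this picture in mind, inserting at index 0 during a loop keeps pushing the current element to the very index that will be read next, so the end of the list is never reached.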
For more on iteration and the iteration protocol you can look here.