I've got a string with words that are separated by spaces (all words are unique, no duplicates). I turn this string into a list:
s = "#one cat #two dogs #three birds"
out = s.split()
And count how many values are created:
print len(out) # Says 192
Then I try to delete everything from the list:
for x in out:
    out.remove(x)
And then count again:
print len(out) # Says 96
Can someone explain please why it says 96 instead of 0?
MORE INFO
Each line starts with '#' and is in fact a space-separated pair of words: the first in the pair is the key and the second is the value.
So, what I am doing is:
for x in out:
    if '#' in x:
        ind = out.index(x) # Get current index
        nextValue = out[ind+1] # Get next value
        myDictionary[x] = nextValue
        out.remove(nextValue)
        out.remove(x)
The problem is I cannot move all key,value-pairs into a dictionary since I only iterate through 96 items.
As for what actually happened in the for loop, I think it is best shown step by step.
Suppose you have an iterable object (such as a list) with six elements. When you do for x in out, Python creates an internal indexer that starts at the first element. Normally, as you finish one cycle of the loop, the indexer moves forward by one position.
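Thus, when you call remove inside the loop, this is what happens internally: the first cycle removes the first element, the remaining elements shift one position to the left, and the indexer still moves forward, so it lands on what was originally the third element and the second one is never visited. Notice that there are only 3 cycles instead of 6 (the number of elements in the original list). That is why you are left with half the len of your original list: 3 is the number of cycles it takes to complete the loop when you remove one element per cycle.
If you want to clear the list, simply empty it in a single step instead of removing its elements in a loop.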
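The exact statement this answer used for clearing the list is not preserved here. As a minimal sketch, assuming the goal is to empty the existing list object in place, a slice deletion does it in one step (the name alias is only for illustration):

```python
out = "#one cat #two dogs #three birds".split()
alias = out          # second name bound to the same list object

del out[:]           # removes every element in place (out[:] = [] is equivalent)

print(len(out))      # 0
print(alias is out)  # True: the same, now empty, list object
```

Because the list is emptied in place, any other name that refers to it sees the empty list as well.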
Or, alternatively, to remove the elements one by one, you need to do it the other way around, from the end to the beginning, by iterating over reversed(out).
Now, why would reversed work? If the indexer keeps moving, shouldn't reversed fail as well, because the number of elements is reduced by one per cycle anyway? No, it is not like that. With reversed, the indexer starts at the last element and moves backward. Each removal takes away the element at the indexer's current position, which is the last one remaining, so none of the elements still to be visited change position. Thus one removal per cycle doesn't affect how the indexer works.
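A short sketch of the reversed approach, using the sample string from the question. It relies on the question's statement that all words are unique, since remove deletes the first element equal to its argument:

```python
out = "#one cat #two dogs #three birds".split()

# reversed(out) walks the positions 5, 4, ..., 0. Each removal deletes the
# element at the current (last) position, so no unvisited element moves.
for x in reversed(out):
    out.remove(x)

print(len(out))  # 0
```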
Hope this helps you to understand what's going on internally...
The problem is that whenever you delete a value from the list, the list re-indexes its remaining values dynamically. That is, when you remove the elements at ind and ind+1, those values are deleted, but their indexes are immediately taken over by the values that followed them. Therefore, to avoid this, transfer the pairs to the dictionary without removing anything from the list inside the loop.
After you are done transferring the values from the list to the dictionary, you can safely empty out by using out = []
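The code this answer originally showed is not preserved, so the following is only a sketch of the idea under one assumption: keys and values strictly alternate in the list. The name my_dictionary is illustrative.

```python
s = "#one cat #two dogs #three birds"
out = s.split()
my_dictionary = {}  # illustrative name

# Step through the list two items at a time: key at i, value at i + 1.
for i in range(0, len(out) - 1, 2):
    my_dictionary[out[i]] = out[i + 1]

out = []  # nothing was removed during the loop, so empty the list afterwards

print(my_dictionary)
```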
The problem is that you are calling remove(x) while iterating: the same out list is used by both the remove call and the for loop.
Just use a copy of the list for the loop.
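As a hedged sketch of that fix, here is the loop from the question run over a shallow copy (out[:]) while the removals still act on out itself. The variable names follow the question, adapted to snake_case:

```python
s = "#one cat #two dogs #three birds"
out = s.split()
my_dictionary = {}

# out[:] is a shallow copy, so removing from out does not disturb the loop.
for x in out[:]:
    if '#' in x:
        ind = out.index(x)          # position of the key in the shrinking list
        next_value = out[ind + 1]   # the value follows its key
        my_dictionary[x] = next_value
        out.remove(next_value)
        out.remove(x)

print(my_dictionary)
print(len(out))  # 0
```

The loop still visits the value words, but they contain no '#', so they are skipped.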
First you split on '#' to get each record (a string holding one key,value pair). Then you split each record o on spaces, which gives you a list of [key,value].
dict() allows you to construct the dict directly from a list of key,value pairs. (Note: we have to use s.split('#')[1:] to skip the first (blank) record.)
The problem you're encountering is the result of modifying a list while iterating over it. When an item is removed, everything after it gets moved forward by one index, but the iterator does not account for the change and continues by incrementing the index it last accessed. The iterator thus skips every second element in the list, which is why you're left with half the number of elements.
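To make the skipping visible, the sketch below emulates the iterator with an explicit position counter. This is a simplified model of the behaviour described above, not the interpreter's actual implementation, and the six-letter list is an illustrative stand-in for the real data:

```python
out = ['a', 'b', 'c', 'd', 'e', 'f']  # illustrative six-element list
visited = []

i = 0                      # the position the list iterator remembers
while i < len(out):        # the length is re-checked on every step
    x = out[i]
    visited.append(x)
    out.remove(x)          # later elements shift one index to the left
    i += 1                 # the position still advances by one

print(visited)  # ['a', 'c', 'e']
print(out)      # ['b', 'd', 'f']
```

Only three of the six elements are visited, and the other three remain in the list.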
The simplest direct solution to your problem is to iterate over a copy of out, using slice notation (for x in out[:]).
However, there is a deeper question here: why do you need to remove items from the list at all? With your algorithm, you are guaranteed to end up with an empty list, which is of no use to you. It would be both simpler and more efficient to just iterate over the list without removing items.
When you're done with the list (after the for-loop block) you can explicitly delete it (using the del keyword) or simply leave it for Python's garbage collection system to deal with.
A further issue remains: you're combining direct iteration over a list with index-based references. The use of for x in out should typically be restricted to situations where you want to access each element independently of the others. If you want to work with indices, use for i in range(len(out)) and access elements with out[i].
Furthermore, you can use a dictionary comprehension to accomplish your entire task in a one-line Pythonic expression.
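The original one-liner is not preserved here. One possible form, combining the index-based access and the '#' test from the question, might look like this (my_dictionary is an illustrative name):

```python
out = "#one cat #two dogs #three birds".split()

# For every key (an item containing '#'), take the item right after it as the value.
# Stopping at len(out) - 1 guarantees that out[i + 1] always exists.
my_dictionary = {out[i]: out[i + 1] for i in range(len(out) - 1) if '#' in out[i]}

print(my_dictionary)
```

Nothing is removed from out, so the skipping problem never arises.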
Another Pythonic alternative would be to make use of the fact that each even-numbered element is a key and each odd-numbered element is a value (you'd have to assume that the list returned by str.split() consistently follows this pattern), and use zip on the even and odd sub-lists, for example dict(zip(out[::2], out[1::2])).
If you just need to clear the list, use out = [] or out.clear() (list.clear() is available from Python 3.3 onward).
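Anyway, what you describe happens because the remove function of a list modifies the list while the loop is still running over it. For example, if you print each element of a six-element list of the letters a to f and remove it inside the same for loop, the result is shown below: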
a c e
It is exactly half of the full list. So, in your case, you were left with 96 (half of 192) out of 192.