What is the difference between list and iterator i

Posted 2019-01-19 07:06

Question:

I am reading the book Think Python: How to think like a computer scientist, which says that in Python 3.x, dict([list of tuples]) returns an iterator instead of a list (as is the case in Python 2.7).

The book did not explain it any further, which has left me confused. In particular, I would like to know:

  1. How are iterators and lists different, and

  2. What is the advantage of returning an iterator over a list?

Answer 1:

First of all, your book is wrong (or you've misunderstood it):

>>> dict([(1, 2), (3, 4), (5, 6)])
{1: 2, 3: 4, 5: 6}

As you can see, dict([list of tuples]) returns a dictionary in both Python 2.x and 3.x.

The fundamental difference between a list and an iterator is that a list contains a number of objects in a specific order - so you can, for instance, pull one of them out from somewhere in the middle:

>>> my_list = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
>>> my_list
['a', 'b', 'c', 'd', 'e', 'f', 'g']
>>> my_list[3]
'd'

... whereas an iterator yields a number of objects in a specific order, often creating them on the fly as requested:

>>> my_iter = iter(range(1000000000000))
>>> my_iter
<range_iterator object at 0x7fa291c22600>
>>> next(my_iter)
0
>>> next(my_iter)
1
>>> next(my_iter)
2

I'm using next() here for demonstration purposes; in real code it's more common to iterate over an iterator with a for loop:

for x in my_iter:
    pass  # do something with x

Notice the trade-off: a list of a trillion integers would use more memory than most machines have available, which makes the iterator much more efficient ... at the cost of not being able to ask for an object somewhere in the middle:

>>> my_iter[37104]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'range_iterator' object is not subscriptable
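
To make the trade-off concrete, here is a rough sketch that compares the two with sys.getsizeof. The exact byte counts are a CPython implementation detail and will vary between versions, and the names (n, as_list, as_iter) are only for illustration. It also shows that reaching an element "in the middle" of an iterator means consuming everything before it, here with itertools.islice:

```python
import sys
from itertools import islice

n = 100_000
as_list = list(range(n))   # every element is stored up front
as_iter = iter(range(n))   # only the iteration state is stored

# getsizeof reports the container object itself, not the integers it
# refers to, so the real gap is even larger than this suggests.
list_size = sys.getsizeof(as_list)
iter_size = sys.getsizeof(as_iter)
print(list_size > iter_size)

# A list can be indexed directly...
middle_from_list = as_list[500]

# ...while an iterator has to be advanced past the earlier elements.
middle_from_iter = next(islice(as_iter, 500, None))

# Those earlier elements are now gone from the iterator for good.
next_value = next(as_iter)
```

After this runs, as_iter has moved past index 500 and cannot be rewound, whereas as_list is unchanged.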


Answer 2:

A list is a data structure that holds a sequence of values. An iterator is an object that provides an interface to retrieve values one at a time, via the next function.

An iterable object is one that provides a __iter__ method, which is invoked when you pass an iterable to the iter function. You don't often need to do this explicitly; a for loop, for example, does it implicitly. A loop like

for x in [1, 2, 3]:
    print(x)

automatically invokes the list's __iter__ method. You can do so explicitly with

for x in iter([1, 2, 3]):
    print(x)

or even more explicitly with

for x in [1, 2, 3].__iter__():
    print(x)

One way to see the difference is to create two iterators from a single list.

l = [1, 2, 3, 4, 5]
i1 = iter(l)
i2 = iter(l)
print(next(i1))   # 1
print(next(i1))   # 2
print(next(i2))   # 1 again; i2 is separate from i1
print(l)          # [1, 2, 3, 4, 5]; l is unaffected by i1 or i2
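
A related point worth sketching: an iterator can only be walked once, while the list it came from can be iterated as many times as you like. The variable names below are just for illustration:

```python
numbers = [1, 2, 3]
it = iter(numbers)

first_pass = list(it)     # consumes every element of the iterator
second_pass = list(it)    # the iterator is exhausted, so this is empty
list_again = list(numbers)  # the list itself can be iterated again

# Calling iter() on an iterator hands back the very same object,
# which is why a for loop works on iterators as well as on lists.
same_object = iter(it) is it
```

Here first_pass is [1, 2, 3], second_pass is [], and list_again is [1, 2, 3] once more.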


Answer 3:

An iterator is the mechanism by which you iterate over a list or some other collection of objects or values using for. A list is iterable: it can supply an iterator over its elements. But you can also implement iterators that produce number sequences, random strings, etc.

When you return an iterator, you are merely returning the iteration object; the receiving code doesn't know anything about the underlying container or generator algorithm.

Iterators are lazy; they only return the next element in the sequence or list when asked to do so. You can therefore implement infinite sequences with them.
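
As a minimal sketch of an infinite sequence, here is a generator function (generators are one convenient way to write iterators). The name naturals is made up for this example; itertools.islice is used to take a finite slice, since calling list() on the generator directly would never finish:

```python
from itertools import islice


def naturals(start=0):
    """Yield start, start + 1, start + 2, ... without ever stopping."""
    n = start
    while True:
        yield n
        n += 1


# Only the five requested values are ever computed.
first_five = list(islice(naturals(), 5))
print(first_five)
```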

Further Reading
Iterator Types
The for statement



Answer 4:

An iterator is an object that yields values, but is not necessarily associated with an in-memory data structure containing all the values to be yielded. A list, by contrast, is fully constructed and resident in memory. Iterators are therefore usually more memory efficient, because the elements don't all need to be resident in memory at once. They can also be faster in practice, because any per-element calculation is done when the element is accessed instead of being front-loaded, so elements that are never requested are never computed.
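
One way to see the "calculation happens on access" point is to record when a function is actually called. This is only a sketch; expensive and calls are hypothetical names, and a generator expression stands in for the iterator:

```python
calls = []


def expensive(x):
    """Stand-in for a costly computation; records each call."""
    calls.append(x)
    return x * x


# List comprehension: all five calls happen immediately.
eager = [expensive(x) for x in range(5)]
calls_after_list = len(calls)

calls.clear()

# Generator expression: nothing is computed until a value is requested.
lazy = (expensive(x) for x in range(5))
calls_after_creating = len(calls)

first = next(lazy)
calls_after_one_next = len(calls)
```

The list costs five calls up front, while the generator costs none until next() is called and then one call per element requested.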



Answer 5:

The critical definitions here are:

  • List: Fully stored in memory, and also iterable, i.e. you can go from one element to the next.
  • Iterable: Any object that can supply an iterator, i.e. that allows you to go from one element to the next. It could use data stored in memory, it could be a file, or each step could be calculated.

Many things are iterable that aren't lists, but all lists are iterable.
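
If you want to check these categories yourself, the abstract base classes in collections.abc can be used with isinstance. A small sketch (the variable names are just for illustration):

```python
from collections.abc import Iterable, Iterator

values = [1, 2, 3]
values_iter = iter(values)

list_is_iterable = isinstance(values, Iterable)        # lists can be looped over
list_is_iterator = isinstance(values, Iterator)        # but have no __next__
iter_is_iterator = isinstance(values_iter, Iterator)   # iter() gives an iterator
iter_is_iterable = isinstance(values_iter, Iterable)   # iterators are iterable too

print(list_is_iterable, list_is_iterator, iter_is_iterator, iter_is_iterable)
```

So a list is an iterable but not itself an iterator, while every iterator is also an iterable.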



Answer 6:

You should read the Python documentation's guide to the iterator protocol here: https://docs.python.org/2/library/stdtypes.html#iterator-types

Basically, iterators in Python are objects that conform to a general protocol for stepping through the elements of a container. A list is a specific container type that supports that protocol: calling iter() on a list returns such an iterator.
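
For illustration, here is a minimal sketch of a class that implements the protocol by hand in Python 3: __iter__ returns the iterator and __next__ returns the next value or raises StopIteration. (In Python 2, which the linked page documents, the second method is spelled next.) Countdown is a made-up example class:

```python
class Countdown:
    """Iterator that counts down from start to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself from __iter__.
        return self

    def __next__(self):
        if self.current <= 0:
            # Signals to for loops, list(), etc. that we are done.
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


result = list(Countdown(3))
print(result)
```

Because the class follows the protocol, it works anywhere a list would in a for loop, even though no list of values ever exists.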



Answer 7:

Lists are a subset of all iterable objects; you can have iterable objects that are not lists.