How to slice a deque? [duplicate]

2019-02-05 21:48发布

问题:

This question already has an answer here:

  • Use slice notation with collections.deque 5 answers

I've changed some code that used a list to using a deque. I can no longer slice into it, as I get the error:

TypeError: sequence index must be integer, not 'slice'

Here's a REPL that shows the problem.

>>> import collections
>>> d = collections.deque()
>>> for i in range(3):
...     d.append(i)
...
>>> d
deque([0, 1, 2])
>>> d[2:]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: sequence index must be integer, not 'slice'

So, is there a workaround to support slicing into deques in Python?

回答1:

Try itertools.islice().

 deque_slice = collections.deque(itertools.islice(my_deque, 10, 20))

Indexing into a deque requires following a linked list from the beginning each time, so the islice() approach, skipping items to get to the start of the slice, will give the best possible performance (better than coding it as an index operation for each element).

You could easily write a deque subclass that does this automagically for you.

class sliceable_deque(collections.deque):
    def __getitem__(self, index):
        if isinstance(index, slice):
            return type(self)(itertools.islice(self, index.start,
                                               index.stop, index.step))
        return collections.deque.__getitem__(self, index)

Note that you can't use negative indices or step values with islice. It's possible to code around this, and might be worthwhile to do so if you take the subclass approach. For negative start or stop you can just add the length of the deque; for negative step, you'll need to throw a reversed() in there somewhere. I'll leave that as an exercise. :-)

The performance of retrieving individual items from the deque will be slightly reduced by the if test for the slice. If this is an issue, you can use an EAFP pattern to ameliorate this somewhat -- at the cost of making the slice path slightly less performant due to the need to process the exception:

class sliceable_deque(collections.deque):
    def __getitem__(self, index):
        try:
            return collections.deque.__getitem__(self, index)
        except TypeError:
            return type(self)(itertools.islice(self, index.start,
                                               index.stop, index.step))

Of course there's an extra function call in there still, compared to a regular deque, so if you really care about performance, you really want to add a separate slice() method or the like.



回答2:

If performance is a concern, consider a direct access/comprehension method as suggested in this answer. It's much faster than islice on large collections:

import timeit

setup = """
import collections, itertools
d = collections.deque(range(10000))
"""

print timeit.timeit('list(itertools.islice(d, 9000, 9010))', setup, number=10000)
## 0.631947040558
print timeit.timeit('[d[i] for i in range(9000, 9010)]', setup, number=10000)
## 0.0292208194733

As per @RaymondHettinger comment below, the comprehension method is only better when slices are short. On longer slices, islice convincingly wins. For example, here are timings for slicing a 10,000 items deque from the offset 6000:

offset  length      islice       compr
 6000      10      400.496      46.611
 6000      50      424.600     183.988
 6000      90      432.277     237.894
 6000     130      441.289     352.383
 6000     170      431.299     404.596
 6000     210      456.405     546.503
 6000     250      448.895     575.995
 6000     290      485.802     778.294
 6000     330      483.704     781.703
 6000     370      490.904     948.501
 6000     410      500.011     875.807
 6000     450      508.213    1045.299
 6000     490      518.894    1010.203
 6000     530      530.887    1192.784
 6000     570      534.415    1151.013
 6000     610      530.887    1504.779
 6000     650      539.279    1486.802
 6000     690      536.084    1650.810
 6000     730      549.698    1454.687
 6000     770      564.909    1576.114
 6000     810      545.001    1588.297
 6000     850      564.504    1711.607
 6000     890      584.197    1760.793
 6000     930      564.480    1963.091
 6000     970      586.390    1955.199
 6000    1010      590.706    2117.491

The comprehension does first few slices very fast, but the performance falls down dramatically as the length grows. islice is slower on smaller slices, but its average speed is much better.

This is how I tested:

import timeit

size = 10000
repeats = 100

setup = """
import collections, itertools
d = collections.deque(range(%d))
""" % size

print '%5s\t%5s\t%10s\t%10s' % ('offset', 'length', 'islice', 'compr')

for offset in range(0, size - 2000, 2000):
    for length in range(10, 2000, 40):
        t1 = timeit.timeit('list(itertools.islice(d, %d, %d))' % (offset, offset + length), setup, number=repeats)
        t2 = timeit.timeit('[d[i] for i in range(%d, %d)]' % (offset, offset + length), setup, number=repeats)
        print '%5d\t%5d\t%10.3f\t%10.3f' % (offset, length, t1 * 100000, t2  * 100000)