Elegant way to remove items from sequence in Python

When writing code in Python, I often need to remove items from a list or other sequence type based on some criteria. I haven't found a solution that is both elegant and efficient, because removing items from a list while you are iterating over it makes the loop skip elements. For example, you can't do this:

for name in names:
    if name[-5:] == 'Smith':
        names.remove(name)
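
To make the problem concrete, here is a minimal sketch of what the loop above does. The sample names are made up for illustration. Removing an element shifts the rest of the list left, so the iterator steps over whatever slid into the freed slot.

```python
# Hypothetical sample data: two matching entries sit next to each other.
names = ["John Smith", "Jane Smith", "Alice Jones"]

for name in names:
    if name[-5:] == "Smith":
        # Removing shifts later items left, so the loop's internal index
        # now points past the element that moved into the freed slot.
        names.remove(name)

# "Jane Smith" was never examined, so it is still in the list.
print(names)  # ['Jane Smith', 'Alice Jones']
```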

I usually end up doing something like this:

toremove = []
for name in names:
    if name[-5:] == 'Smith':
        toremove.append(name)
for name in toremove:
    names.remove(name)
del toremove

This is inefficient, fairly ugly, and possibly buggy (how does it handle multiple 'John Smith' entries?). Does anyone have a more elegant solution, or at least a more efficient one?
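
On the duplicates question, a quick check suggests the two-pass version does cope with repeated entries, because every occurrence is collected and each remove() call deletes one of them. The sketch below uses made-up sample data and condenses the collection loop into a list comprehension.

```python
# Hypothetical sample data with a duplicated entry.
names = ["John Smith", "Alice Jones", "John Smith", "Bob Smith"]

# Pass 1: collect every occurrence that should go (duplicates included).
toremove = [name for name in names if name[-5:] == "Smith"]

# Pass 2: list.remove() deletes the first equal item and scans from the
# start each time, so the loop as a whole is O(n^2) in the worst case.
for name in toremove:
    names.remove(name)

print(names)  # ['Alice Jones']
```

So the result is correct here, but the cost grows quadratically with the number of removals.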

How about one that works with dictionaries?

Tags: python optimization set series

14 answers

叛逆

#2 · 2020-01-29 03:57

If the list should be filtered in place and is large, then the algorithms in the previous answers that are based on list.remove() may be unsuitable, because their computational complexity is O(n^2). In that case you can use the following not-so-Pythonic function:

def filter_inplace(func, original_list):
  """ Filters the original_list in-place.

  Removes elements from the original_list for which func() returns False.

  Algorithm's computational complexity is O(N), where N is the size
  of the original_list.
  """

  # Compact the list in-place.
  new_list_size = 0
  for item in original_list:
    if func(item):
      original_list[new_list_size] = item
      new_list_size += 1

  # Remove trailing items from the list.
  tail_size = len(original_list) - new_list_size
  while tail_size:
    original_list.pop()
    tail_size -= 1


a = [1, 2, 3, 4, 5, 6, 7]

# Remove even numbers from a in-place.
filter_inplace(lambda x: x & 1, a)

# Prints [1, 3, 5, 7]
print(a)

Edit: Actually, the solution at https://stackoverflow.com/a/4639748/274937 is superior to my solution. It is more Pythonic and runs faster. So, here is a new filter_inplace() implementation:

def filter_inplace(func, original_list):
  """ Filters the original_list in-place.

  Removes elements from the original_list for which func() returns False.

  Algorithm's computational complexity is O(N), where N is the size
  of the original_list.
  """
  original_list[:] = [item for item in original_list if func(item)]


爷的心禁止访问

#3 · 2020-01-29 03:58

Two easy ways to accomplish just the filtering are:

Using filter:

names = filter(lambda name: name[-5:] != "Smith", names)

Using list comprehensions:

names = [name for name in names if name[-5:] != "Smith"]

Note that both cases keep the values for which the predicate function evaluates to True, so you have to reverse the logic (i.e. you say "keep the people who do not have the last name Smith" instead of "remove the people who have the last name Smith").
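
One thing to watch for on Python 3: filter() returns a lazy iterator rather than a list, so it needs to be wrapped in list() if a list is wanted. The sketch below uses made-up sample data, and swaps the slice comparison for str.endswith(), which is an equivalent test and not part of the original answer.

```python
# Hypothetical sample data.
names = ["John Smith", "Alice Jones", "Bob Smith", "Carol White"]

# In Python 3, filter() returns an iterator, so wrap it in list().
kept_with_filter = list(filter(lambda name: not name.endswith("Smith"), names))

# Equivalent list comprehension: keep names that do NOT end in "Smith".
kept_with_comprehension = [name for name in names if not name.endswith("Smith")]

print(kept_with_filter)         # ['Alice Jones', 'Carol White']
print(kept_with_comprehension)  # ['Alice Jones', 'Carol White']
```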

Edit: Funny... two people independently posted both of the answers I suggested while I was posting mine.


时光不老，我们不散

#4 · 2020-01-29 04:01

Using a list comprehension:

names = [name for name in names if name[-5:] != "Smith"]


家丑人穷心不美

#5 · 2020-01-29 04:04

The obvious answer is the one that John and a couple of other people gave, namely:

>>> names = [name for name in names if name[-5:] != "Smith"]       # <-- slower

But that has the disadvantage that it creates a new list object, rather than reusing the original object. I did some profiling and experimentation, and the most efficient method I came up with is:

>>> names[:] = (name for name in names if name[-5:] != "Smith")    # <-- faster

Assigning to "names[:]" basically means "replace the contents of the names list with the following value". Unlike plain assignment to names, it doesn't rebind the name to a new list object, so every other reference to the list sees the change. The right-hand side of the assignment is a generator expression (note the use of parentheses rather than square brackets), which Python consumes by iterating across the list.
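
The difference between the two forms of assignment is easiest to see with a second reference to the same list. This is a small sketch with made-up sample data; alias is a hypothetical name introduced only for the demonstration.

```python
names = ["John Smith", "Alice Jones", "Bob Smith"]
alias = names  # a second reference to the same list object

# Slice assignment mutates the existing list, so the alias sees the change.
names[:] = (name for name in names if name[-5:] != "Smith")
print(alias)           # ['Alice Jones']
print(alias is names)  # True

# Plain assignment rebinds the name to a new list; the alias keeps the old one.
names = [name for name in names if name != "Alice Jones"]
print(alias)           # ['Alice Jones']
print(alias is names)  # False
```

This matters whenever another part of the program holds on to the list, for example when it was passed in as a function argument.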

Some quick profiling suggests that this is about 30% faster than the list comprehension approach, and about 40% faster than the filter approach.
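
Timings like these depend on the Python version, the list size, and the share of matching items, so it is worth measuring on your own data. The following is a rough timeit sketch, not the harness used for the figures above. The helper names and the generated data are assumptions made for illustration, and no output is shown because results are machine-dependent.

```python
import timeit

def make_names(n=10000):
    # Hypothetical test data: every third entry ends in "Smith".
    return ["Person %d %s" % (i, "Smith" if i % 3 == 0 else "Jones")
            for i in range(n)]

def with_comprehension(names):
    # Builds and returns a new list object.
    return [name for name in names if name[-5:] != "Smith"]

def with_slice_assignment(names):
    # Replaces the contents of the existing list object.
    names[:] = (name for name in names if name[-5:] != "Smith")
    return names

def with_filter(names):
    # list() is needed on Python 3, where filter() returns an iterator.
    return list(filter(lambda name: name[-5:] != "Smith", names))

for func in (with_comprehension, with_slice_assignment, with_filter):
    # A fresh list is built on every call so the in-place variant cannot
    # affect the others; that setup cost is included in each timing.
    seconds = timeit.timeit(lambda: func(make_names()), number=20)
    print("%-22s %.4f s" % (func.__name__, seconds))
```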

Caveat: while this solution is faster than the obvious solution, it is more obscure, and relies on more advanced Python techniques. If you do use it, I recommend accompanying it with a comment. It's probably only worth using in cases where you really care about the performance of this particular operation (which is pretty fast no matter what). (In the case where I used this, I was doing A* beam search, and used this to remove search points from the search beam.)


孤傲高冷的网名

#6 · 2020-01-29 04:05

To answer your question about working with dictionaries, you should note that Python 3.0 will include dict comprehensions:

>>> {i : chr(65+i) for i in range(4)}

In the meantime, you can do a quasi-dict comprehension this way:

>>> dict([(i, chr(65+i)) for i in range(4)])

Or as a more direct answer:

dict([(key, name) for key, name in some_dictionary.iteritems() if name[-5:] != 'Smith'])
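
On Python 3 (and 2.7), where dict comprehensions are available and items() replaces iteritems(), the same filter might look like the sketch below. The people dictionary is made-up sample data. The second half shows one way to delete in place, which the answer above does not cover.

```python
# Hypothetical sample dictionary mapping IDs to names.
people = {1: "John Smith", 2: "Alice Jones", 3: "Bob Smith"}

# Build a filtered copy with a dict comprehension.
filtered = {key: name for key, name in people.items() if name[-5:] != "Smith"}

# In place: iterate over a snapshot of the keys, because deleting from a
# dict while iterating over it directly raises RuntimeError.
for key in list(people):
    if people[key][-5:] == "Smith":
        del people[key]

print(filtered)  # {2: 'Alice Jones'}
print(people)    # {2: 'Alice Jones'}
```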


小情绪 Triste *

#7 · 2020-01-29 04:05

Well, this is clearly an issue with the data structure you are using. Use a hash table, for example. Some implementations support multiple entries per key, so one can either pop the newest element off or remove all of them.

But the solution you're going to find is elegance through a different data structure, not a different algorithm. Maybe you can do better if the list is sorted, or something, but iterating over the list is your only method here.

Edit: I realize he asked for 'efficiency'... all of these suggested methods just iterate over the list, which is the same as what he suggested.
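
As a rough illustration of the data-structure idea, the names could be grouped in a dictionary keyed by surname, so that dropping every 'Smith' becomes a single lookup. This is only a sketch: the sample data is made up, and treating the last word of each name as the surname is an assumption.

```python
from collections import defaultdict

# Hypothetical sample data.
names = ["John Smith", "Alice Jones", "Bob Smith", "Carol White"]

# Group full names by surname (assumed here to be the last word).
by_surname = defaultdict(list)
for name in names:
    by_surname[name.split()[-1]].append(name)

# Dropping every "Smith" is now one hash lookup instead of a list scan.
removed = by_surname.pop("Smith", [])

print(removed)           # ['John Smith', 'Bob Smith']
print(dict(by_surname))  # {'Jones': ['Alice Jones'], 'White': ['Carol White']}
```

Building the index still takes one pass over the list, and the original ordering across groups is lost, so this only pays off when removals by key happen repeatedly.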

