Retaining order while using Python's set difference

Posted 2019-01-25 07:05

I'm doing a set difference operation in Python:

from sets import Set
from mongokit import ObjectId
x = [ObjectId("4f7aba8a43f1e51544000006"), ObjectId("4f7abaa043f1e51544000007"), ObjectId("4f7ac02543f1e51a44000001")]
y = [ObjectId("4f7acde943f1e51fb6000003")]
print list(Set(x).difference(Set(y)))

I'm getting:

[ObjectId('4f7abaa043f1e51544000007'), ObjectId('4f7ac02543f1e51a44000001'), ObjectId('4f7aba8a43f1e51544000006')]

I need the first element for the next operation, so the order matters. How can I keep the result in the original order of the list x?

Tags: python set order
3 answers
叼着烟拽天下
Reply #2 · 2019-01-25 07:44

It looks like you need an ordered set instead of a regular set.

>>> x = [ObjectId("4f7aba8a43f1e51544000006"), ObjectId("4f7abaa043f1e51544000007"), ObjectId("4f7ac02543f1e51a44000001")]
>>> y = [ObjectId("4f7acde943f1e51fb6000003")]
>>> print list(OrderedSet(x) - OrderedSet(y))
[ObjectId('4f7aba8a43f1e51544000006'), ObjectId('4f7abaa043f1e51544000007'), ObjectId('4f7ac02543f1e51a44000001')]

Python doesn't come with an ordered set, but it is easy to make one:

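import collections

try:
    from collections.abc import Set  # Python 3.3+
except ImportError:
    from collections import Set  # Python 2.6 and 2.7

class OrderedSet(Set):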

    def __init__(self, iterable=()):
        self.d = collections.OrderedDict.fromkeys(iterable)

    def __len__(self):
        return len(self.d)

    def __contains__(self, element):
        return element in self.d

    def __iter__(self):
        return iter(self.d)

Hope this helps :-)
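To see the ordering behaviour without needing mongokit, here is a minimal sketch of the same class written for Python 3. The short strings are hypothetical stand-ins for the ObjectId values in the question, and `remaining` is just an illustrative name.

```python
from collections import OrderedDict
from collections.abc import Set


class OrderedSet(Set):
    """Python 3 version of the class above."""

    def __init__(self, iterable=()):
        # OrderedDict keys remember insertion order and drop duplicates.
        self.d = OrderedDict.fromkeys(iterable)

    def __len__(self):
        return len(self.d)

    def __contains__(self, element):
        return element in self.d

    def __iter__(self):
        return iter(self.d)


# Hypothetical stand-ins for the ObjectId values in the question.
x = ["id6", "id7", "id1", "id7"]
y = ["id7"]

# The Set mixin supplies "-", which iterates over the left operand in order.
remaining = list(OrderedSet(x) - OrderedSet(y))
print(remaining)     # ['id6', 'id1']
print(remaining[0])  # id6
```

Note that, like any set, this drops duplicates: the repeated "id7" in x would appear only once even if it were not removed by the difference.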

倾城 Initia
Reply #3 · 2019-01-25 07:54

You could just compute the difference and then filter x with it:

diff = set(x) - set(y)
[item for item in x if item in diff]

or

list(filter(diff.__contains__, x))  # list() is needed on Python 3, where filter returns an iterator
女痞
Reply #4 · 2019-01-25 08:03

Sets are unordered, so you will need to put the results back in the correct order after doing your set difference. Fortunately you already have the elements in the order you want, so this is easy.

diff = set(x) - set(y)
result = [o for o in x if o in diff]

But this can be streamlined; you can do the difference as part of the list comprehension (though it is arguably slightly less clear that that's what you're doing).

sety = set(y)
result = [o for o in x if o not in sety]

You could even do it without creating the set from y, but the set will provide fast membership tests, which will save you significant time if either list is large.
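As a rough illustration of that last point, the sketch below wraps the comprehension in a helper and times it against the version that tests membership directly in the list. The name `ordered_difference` and the sample data are made up for this example, and the timings will vary by machine, so no figures are claimed here.

```python
import timeit


def ordered_difference(x, y):
    """Return the items of x that are not in y, in x's original order."""
    excluded = set(y)  # built once; each lookup is O(1) on average
    return [item for item in x if item not in excluded]


x = ["c", "a", "b", "a"]
y = ["b"]
result = ordered_difference(x, y)
print(result)  # ['c', 'a', 'a']

# Rough timing: set lookup versus scanning the list y for every item of x.
big_x = list(range(2000))
big_y = list(range(0, 2000, 2))
with_set = timeit.timeit(lambda: ordered_difference(big_x, big_y), number=5)
with_list = timeit.timeit(lambda: [i for i in big_x if i not in big_y], number=5)
print("set lookup:  %.4f s" % with_set)
print("list lookup: %.4f s" % with_list)
```

Unlike a true set difference, this keeps duplicates from x (the second "a" above survives), which may or may not be what you want.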
