Python: removing duplicates from a list when 1 == 1.0 is True

Posted 2019-08-11 05:29

Question:

I need to remove duplicate values from a list, but with set() or a for ... if not in ... loop I only get partially correct results. For example, for ['asd', 'dsa', 1, '1', 1.0]

I get:

['asd', 'dsa', 1, '1']

but the required result is:

['asd', 'dsa', 1, '1', 1.0]

How can I achieve this?

Answer 1:

You can try

In [2]: l = ['asd', 'dsa', 1, '1', 1.0]

In [3]: [value for _, value in frozenset((type(x), x) for x in l)]
Out[3]: [1.0, '1', 1, 'dsa', 'asd']

This builds a temporary frozenset of (type, element) tuples, so elements that compare equal but have different types (such as 1, 1.0 and True) are kept as separate entries. The comprehension then iterates over the set, unpacks each tuple and keeps only the element (value).

An ordinary set, which is mutable, would work just as well; frozenset is used only because the temporary set never needs to be modified.

Note that this won't necessarily preserve the original order.
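
To see why set() drops 1.0 in the first place, here is a small sketch (the variable names are only illustrative). 1, 1.0 and True compare equal and share the same hash, so a set treats them as one element; pairing each value with its type makes the keys distinct.

```python
values = [1, 1.0, True]

# The three values are equal and hash identically, so a set keeps only one.
same_value = (1 == 1.0 == True)
same_hash = (hash(1) == hash(1.0) == hash(True))
collapsed = set(values)

# Tagging each value with its type gives three different tuples.
tagged = {(type(x), x) for x in values}

print(same_value, same_hash, len(collapsed), len(tagged))
```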


If you need the original order preserved, use collections.OrderedDict, which is a hash map (just like a regular dict) and therefore deduplicates keys the same way frozenset/set does, while also remembering insertion order:

In [16]: from collections import OrderedDict

In [17]: [value for _, value in OrderedDict.fromkeys((type(x), x) for x in l)]
Out[17]: ['asd', 'dsa', 1, '1', 1.0]
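
On Python 3.7 and later a plain dict also preserves insertion order, so the same idea should work without the import. A minimal sketch, where dedupe_by_type is a hypothetical helper name and not something from the answer:

```python
def dedupe_by_type(items):
    """Drop repeats while keeping first-seen order; 1 and 1.0 stay distinct."""
    # dict keys are unique and (on Python 3.7+) keep insertion order.
    keyed = dict.fromkeys((type(x), x) for x in items)
    # Iterating a dict yields its keys, here the (type, value) tuples.
    return [value for _, value in keyed]


result = dedupe_by_type(['asd', 'dsa', 1, '1', 1.0, 1, 'asd'])
print(result)
```

The input here has two real duplicates (the second 1 and the second 'asd'); those are removed, while 1, '1' and 1.0 all survive.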


Answer 2:

This is a good case for the decorate-sort-undecorate pattern, with the sort part modified to just create a set:

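src = ['asd', 'dsa', 1, '1', 1.0]
# Decorate each element with its type, dedupe via set, then undecorate.
dest = [el for el, ignore
        in set((x, type(x))
               for x in src)]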

The decoration step pairs each element with its type, so that e.g. 1 and 1.0 compare as different. The final list is then obtained by undecorating the set, i.e. removing the type objects that are no longer needed. Because it goes through a set, this version does not preserve the original order.
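
The same decoration can also be applied to the explicit loop mentioned in the question. This is only a sketch, and unique_keep_order is a made-up name; it keeps the first occurrence of each (value, type) pair and preserves order.

```python
def unique_keep_order(src):
    seen = set()   # decorated keys already encountered
    dest = []      # undecorated results, in first-seen order
    for x in src:
        key = (x, type(x))  # decorate: value plus its type
        if key not in seen:
            seen.add(key)
            dest.append(x)
    return dest


dest = unique_keep_order([True, 1, 1.0, 1, 'a', 'a', True])
print(dest)
```

Like the set-based versions, this assumes every element is hashable.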