I need to remove duplicate values from a list, but with set() or a for ... if not in ... loop I only get partially correct results.
For example for ['asd', 'dsa', 1, '1', 1.0]
I get:
['asd', 'dsa', 1, '1']
but the required result is:
['asd', 'dsa', 1, '1', 1.0]
How can I achieve this?
You can try:
In [3]: [value for _, value in frozenset((type(x), x) for x in l)]
Out[3]: [1.0, '1', 1, 'dsa', 'asd']
We create a (temporary) frozenset of tuples containing both the element and its type, which keeps elements that are equal (such as 1, 1.0 and True) but have different types. Then we iterate over it, unpack the tuples and retrieve the elements (value).
Sure, we could just as well use an ordinary set, which is mutable, but we don't need mutability because our set is temporary.
Note that this won't necessarily preserve the original order.
If you need the original order preserved, use collections.OrderedDict, which is a hash map (just like a regular dict) and therefore works similarly to frozenset/set:
In [16]: from collections import OrderedDict
In [17]: [value for _, value in OrderedDict.fromkeys((type(x), x) for x in l)]
Out[17]: ['asd', 'dsa', 1, '1', 1.0]
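If you would rather not build an intermediate mapping, the same idea can be written as an explicit loop. This is only a sketch: the name dedupe_by_type and the sample list are made up for illustration, and it assumes every element is hashable.

```python
def dedupe_by_type(items):
    """Drop repeats, treating equal values of different types as distinct."""
    seen = set()   # (type, value) pairs that were already emitted
    result = []
    for item in items:
        key = (type(item), item)
        if key not in seen:
            seen.add(key)
            result.append(item)
    return result


# Sample input with real repeats (1 and 'asd' appear twice).
items = ['asd', 'dsa', 1, '1', 1.0, 1, 'asd', True]
deduped = dedupe_by_type(items)
print(deduped)  # ['asd', 'dsa', 1, '1', 1.0, True]
```

The first occurrence of each (type, value) pair is kept, so the original order survives. On Python 3.7 and later a plain dict also preserves insertion order, so dict.fromkeys could stand in for OrderedDict.fromkeys in the one-liner.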
This is a good case for the decorate-sort-undecorate pattern, with the sort part modified to just create a set:
dest = [el for el, ignore
        in set((x, type(x))
               for x in src)]
The decoration step pairs each element with its type, so that e.g. 1 and 1.0 compare as different. The final list is then obtained by undecorating the set, i.e. removing the no-longer-needed type objects.
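To see why the decoration is needed at all, here is a small check using the list from the question (src is the same name as above; plain and decorated are just illustrative names):

```python
src = ['asd', 'dsa', 1, '1', 1.0]

# 1 and 1.0 compare equal and have the same hash,
# so a plain set keeps only one of them.
assert 1 == 1.0 and hash(1) == hash(1.0)
plain = set(src)

# Paired with their types, the tuples differ because int is not float.
decorated = set((x, type(x)) for x in src)
dest = [el for el, _ in decorated]

print(len(plain))      # 4
print(len(decorated))  # 5
```

As with any set-based approach, the order of dest is not guaranteed to match the order of src.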