I have a list of tuples, each of which contains one string and two integers. The list looks like this:
x = [('a',1,2), ('b',3,4), ('x',5,6), ('a',2,1)]
The list contains thousands of such tuples. To get the unique combinations regardless of element order, I can apply frozenset to each tuple in my list as follows:
y = set(map(frozenset, x))
This gives me the following result:
{frozenset({'a', 2, 1}), frozenset({'x', 5, 6}), frozenset({3, 'b', 4})}
I know that a set is an unordered data structure and that this is expected, but I want to preserve the order of the elements so that I can then insert them into a pandas DataFrame. The DataFrame will look like this:
  Name  Marks1  Marks2
0    a       1       2
1    b       3       4
2    x       5       6
Frozensets have no ordering. You can instead build a sorted tuple from each item and use it only to check whether that combination has already been seen, adding the original tuple to the result if its sorted form is not yet in the set.
The first occurrence of each combination is preserved.
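The code from this answer did not survive, so here is a minimal sketch of the idea rather than the original. One assumption to flag: on Python 3, plain sorted() raises a TypeError when a tuple mixes str and int, so the helper canonical() (a name made up for this sketch) sorts by type name first and by value second.

```python
x = [('a', 1, 2), ('b', 3, 4), ('x', 5, 6), ('a', 2, 1)]

def canonical(tup):
    # Hypothetical helper: order-independent key for a tuple.
    # Sorting by (type name, value) avoids comparing str with int directly.
    return tuple(sorted(tup, key=lambda v: (type(v).__name__, v)))

seen = set()   # canonical keys encountered so far
unique = []    # original tuples, in first-seen order

for tup in x:
    key = canonical(tup)
    if key not in seen:
        seen.add(key)
        unique.append(tup)

print(unique)
# [('a', 1, 2), ('b', 3, 4), ('x', 5, 6)]
```

Because unique is an ordinary list of the original tuples, it can be passed to the DataFrame constructor as-is. Unlike a frozenset key, the sorted-tuple key also keeps repeated values distinct, so ('a', 1, 1) and ('a', 1) would not be treated as the same combination.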
There are some quite useful functions in NumPy which can help you to solve this problem.
Instead of operating on the set of frozensets directly, you could use that set only as a helper data structure, as in the unique_everseen recipe in the itertools section of the Python documentation.
Basically, this solves the issue when you use key=frozenset.
This returns the elements as-is, and it also maintains the relative order between the elements.
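For reference, a simplified sketch in the spirit of that recipe might look like the following. It is not the verbatim recipe from the documentation (which has a separate fast path for key=None), just enough to show how the helper set is used.

```python
def unique_everseen(iterable, key=None):
    """Yield unique elements in first-seen order (simplified sketch)."""
    seen = set()  # helper set: holds keys only, never the results
    for element in iterable:
        k = element if key is None else key(element)
        if k not in seen:
            seen.add(k)
            yield element  # the original element, untouched

x = [('a', 1, 2), ('b', 3, 4), ('x', 5, 6), ('a', 2, 1)]
result = list(unique_everseen(x, key=frozenset))

print(result)
# [('a', 1, 2), ('b', 3, 4), ('x', 5, 6)]
```

One thing to keep in mind with key=frozenset is that repeated values inside a tuple collapse, so ('a', 1, 1) and ('a', 1) would count as duplicates of each other.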