I currently have a set like the following:
{(a,b), (b,a), (c,b), (b,c)}
What I would like to have is:
{(a,b), (c,b)}
As you may notice, the duplicates have been removed so that no two tuples contain the same elements, regardless of order.
How can I tell the set to disregard the order of the elements in the tuple and just check the values between the tuples?
Okay, so you've got a set {c1, c2, c3, ...}, where each cN is itself a collection of some sort.
If you don't care about the order of the elements in cN, but do care that it is unique (disregarding order), then cN should be a frozenset [1] rather than a tuple:
>>> orig = {("a", "b"), ("b", "a"), ("c", "b"), ("b", "c")}
>>> uniq = {frozenset(c) for c in orig}
>>> uniq
{frozenset(['b', 'a']), frozenset(['b', 'c'])}
As a general rule, choosing an appropriate data type from those provided by Python is going to be more straightforward than defining and maintaining custom classes.
[1] It can't be a set, because as a member of a larger set it needs to be hashable.
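To make the footnote concrete, here is a small sketch of what happens when a plain set is used as a member, and why frozenset works instead. The variable names are only for illustration.

```python
# A plain set is mutable and therefore unhashable,
# so trying to put one inside another set raises TypeError.
try:
    nested = {set(("a", "b"))}
except TypeError as exc:
    error = str(exc)

# frozenset is immutable and hashable, and it compares equal
# regardless of the order in which its elements were given.
pair = frozenset(("a", "b"))
same_pair = frozenset(("b", "a"))
uniq = {pair, same_pair}  # holds a single member
```

One caveat: a frozenset also collapses repeated elements, so ("a", "a") becomes a one-element frozenset. If repeats inside a tuple matter to you, this approach loses that information.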
A rather ugly but straightforward solution: implement __eq__ so that (2, 3) and (3, 2) compare as equal objects, and implement __hash__ so that equal objects hash identically, which keeps the set from holding both. You access the members as in the assertions below.
I'm unhappy with how the hashing function looks, but it's just a proof of concept. Hopefully you'll find a more elegant way to calculate it without collisions.
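class WhateverItIs(object):
    def __init__(self, a, b):
        self.a = a
        self.b = b

    def __eq__(self, other):
        # Equal when both objects hold the same two values, in either order.
        if not isinstance(other, WhateverItIs):
            return NotImplemented
        return ((self.a == other.a and self.b == other.b) or
                (self.a == other.b and self.b == other.a))

    def __hash__(self):
        # Sort first so that (2, 3) and (3, 2) hash identically;
        # this requires a and b to be orderable.
        return hash(tuple(sorted((self.a, self.b))))

o1 = WhateverItIs(2, 3)
o2 = WhateverItIs(3, 2)
o3 = WhateverItIs(4, 3)

# o1 and o2 are equal, so the set keeps only one of them.
assert {o1, o2, o3} in [{o1, o3}, {o2, o3}]
assert o1 == o2
assert o1.a == 2
assert o1.b == 3
assert o2.a == 3
assert o2.b == 2
assert o3.a == 4
assert o3.b == 3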
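One possible way to tidy up the hash, offered as a sketch rather than the definitive answer: hash a frozenset of the two values instead of a sorted tuple. That drops the requirement that a and b be orderable, so mixed types such as 1 and "x" work under Python 3. The class name UnorderedPair is hypothetical and stands in for the class above.

```python
class UnorderedPair(object):  # hypothetical name, for illustration only
    def __init__(self, a, b):
        self.a = a
        self.b = b

    def __eq__(self, other):
        if not isinstance(other, UnorderedPair):
            return NotImplemented
        return ((self.a == other.a and self.b == other.b) or
                (self.a == other.b and self.b == other.a))

    def __hash__(self):
        # frozenset ignores order, so equal pairs always hash the same,
        # and the values only need to be hashable, not sortable.
        return hash(frozenset((self.a, self.b)))

p = UnorderedPair(1, "x")
q = UnorderedPair("x", 1)
r = UnorderedPair(2, "x")
members = {p, q, r}  # p and q are equal, so only two members remain
```

The members keep their original a and b attributes, which is the one thing this class-based approach offers over storing bare frozensets.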
>>> aa = [('a', 'b'), ('c', 'd'), ('b', 'a')]
>>> seen = set()
>>> for x, y in aa:
...     # Keep a pair only if neither orientation has been seen yet.
...     if (x, y) not in seen and (y, x) not in seen:
...         seen.add((x, y))
...
>>> sorted(seen)
[('a', 'b'), ('c', 'd')]
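If you want the result as plain tuples in their first-seen orientation and order, as in the question's expected output, the same idea can be wrapped in a small helper. This is a sketch under the assumption that the input is any iterable of tuples with hashable elements; the function name unique_unordered is made up for this example.

```python
def unique_unordered(pairs):
    """Return the tuples from pairs, dropping any whose elements
    were already seen in a different (or the same) order."""
    seen = set()
    result = []
    for pair in pairs:
        key = frozenset(pair)  # order-insensitive lookup key
        if key not in seen:
            seen.add(key)
            result.append(pair)  # keep the tuple as it first appeared
    return result

pairs = [("a", "b"), ("b", "a"), ("c", "b"), ("b", "c")]
deduped = unique_unordered(pairs)
# deduped == [("a", "b"), ("c", "b")]
```

Because the key is a frozenset, tuples that differ only in how often an element repeats, such as ("a", "a", "b") and ("a", "b"), are treated as duplicates here.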