Please explain “set difference” in Python

Posted 2019-08-29 04:16

Question:

While trying to learn Python, I encountered the following:

>>> set('spam') - set('ham')
set(['p', 's'])

Why is it set(['p', 's'])? I mean: why is 'h' missing?

Answer 1:

The - operator on Python sets is mapped to the difference method, which is defined as the members of set A which are not members of set B. So in this case, the members of "spam" which are not in "ham" are "s" and "p". Notice that this operation is not commutative (that is, a - b == b - a is not always true).

You may be looking for the symmetric_difference method or the ^ operator:

>>> set("spam") ^ set("ham")
{'h', 'p', 's'} 

This operator is commutative.
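
As a minimal sketch of the point above, the following script compares both operand orders for - and ^. The variable names (spam, ham, left, right) are only illustrative and do not come from the original post; sorted() is used so the printed order is predictable.

```python
# Illustrative names; not part of the original question.
spam = set("spam")
ham = set("ham")

# Plain difference depends on operand order.
left = spam - ham    # members of spam that are not in ham
right = ham - spam   # members of ham that are not in spam
print(sorted(left))   # ['p', 's']
print(sorted(right))  # ['h']

# Symmetric difference gives the same result either way.
print((spam ^ ham) == (ham ^ spam))  # True
```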



Answer 2:

Because that is the definition of a set difference. In plain English, it is equivalent to "what elements are in A that are not also in B?".

Running the difference in the reverse order makes this more obvious:

>>> set('spam') - set('ham')
{'s', 'p'}

>>> set('ham') - set('spam')
{'h'}

To get the elements that appear in exactly one of the two sets, regardless of the order in which you ask, you can use symmetric_difference:

>>> set('spam').symmetric_difference(set('ham'))
{'s', 'h', 'p'}


Answer 3:

There are two different operators:

  • Set difference. This is defined as the elements of A not present in B, and is written as A - B or A.difference(B).
  • Symmetric set difference. This is defined as the elements of either set not present in the other set, and is written as A ^ B or A.symmetric_difference(B).

Your code is using the former, whereas you seem to be expecting the latter.
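
The operator and method spellings listed above can be checked against each other with a short script. This is only a sketch with illustrative variable names. One practical difference worth knowing: the methods accept any iterable (here a plain string), while the operators require both operands to be sets.

```python
a = set("spam")
b = set("ham")

# Operator and method forms produce the same sets.
print(a - b == a.difference(b))            # True
print(a ^ b == a.symmetric_difference(b))  # True

# The method form accepts any iterable, such as a string.
via_method = a.difference("ham")
print(sorted(via_method))  # ['p', 's']

# The operator form raises TypeError for a non-set operand.
try:
    a - "ham"
    operator_accepts_str = True
except TypeError:
    operator_accepts_str = False
print(operator_accepts_str)  # False
```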



Answer 4:

The set difference is the set of all characters in the first set that are not in the second set. 'p' and 's' appear in the first set but not in the second, so they are in the set difference. 'h' does not appear in the first set, so it is not in the set difference (regardless of whether or not it is in the second set).
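
To make that definition concrete, here is a small sketch that rebuilds the difference with a set comprehension. The helper name difference_by_definition is hypothetical and exists only for illustration; in real code the built-in - operator is the usual choice.

```python
def difference_by_definition(first, second):
    """Return the elements of `first` that are not in `second`."""
    return {item for item in first if item not in second}


first = set("spam")
second = set("ham")

result = difference_by_definition(first, second)
print(sorted(result))            # ['p', 's']
print(result == first - second)  # True

# Only elements of the first set are ever considered,
# so 'h' cannot end up in the result.
print("h" in result)             # False
```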



Answer 5:

You can also obtain the desired result as:

>>> (set('spam') | set('ham')) - (set('spam') & set('ham'))
set(['p', 's', 'h'])

This builds the union with | and the intersection with &, then takes the set difference between them, i.e. all elements minus the common elements.
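
As a quick check of that identity, the sketch below compares the union-minus-intersection form with the ^ operator and with the union of the two one-sided differences. Variable names are illustrative only.

```python
a = set("spam")
b = set("ham")

union = a | b    # every letter that appears in either word
common = a & b   # letters that appear in both words

via_union = union - common
via_differences = (a - b) | (b - a)

print(sorted(common))     # ['a', 'm']
print(sorted(via_union))  # ['h', 'p', 's']

# All three spellings describe the same set.
print(via_union == (a ^ b) == via_differences)  # True
```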



Tags: python set