Jaccard distance - union and intersection of lists

Posted 2019-09-08 08:56

Question:

I am implementing hierarchical clustering using Jaccard distance. The transactions I want to compare are represented as binary lists. For example:

t1=['0','1','1','0','1']

t2=['1','0','1','0','0']

I looked at this SO question, which is very similar to what I want, but I am not getting the right answer.

Basically, this is what I am looking for:
1. Find the intersection and union of the two lists above.

Besides looking at numerous other online resources, I have tried the following:

1. s1=sets.Set(['0','1','1','0','1'])
   s2=sets.Set(['1','0','1','0','0'])  
2. s1.intersection(s2)  ---> Set(['1', '0'])  
   s1.union(s2)         ---> Set(['1', '0'])  
3. Set(s1) & Set(s2)      ---> TypeError: unsupported operand type(s) for /: 'Set' and 'Set'

   Set(s1) | Set(s2)

Please guide me.

Thanks.

Answer 1:

As you said:

s1=sets.Set(['0','1','1','0','1'])

Let's check s1:

print s1
---->Set(['1', '0'])

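The sets module provides classes for constructing and manipulating unordered collections of unique elements. Building a set from your list discards the duplicates and the positions, so s1 and s2 both collapse to the same two-element set, Set(['1', '0']). That is why the intersection and the union return the same result.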
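One possible way around this, assuming each list is meant as a binary indicator vector (position i is '1' when item i is present in the transaction), is to build the sets from the positions of the '1' entries rather than from the raw values. The sketch below uses the built-in set type instead of the sets module; the helper names to_index_set and jaccard_distance are made up for illustration, and returning 0.0 for two all-zero lists is an assumed convention.

```python
def to_index_set(bits):
    """Return the set of positions at which the binary list holds '1'."""
    return {i for i, b in enumerate(bits) if b == '1'}


def jaccard_distance(a, b):
    """Jaccard distance 1 - |A & B| / |A | B| over the positions set to '1'."""
    sa, sb = to_index_set(a), to_index_set(b)
    union = sa | sb
    if not union:
        # Assumed convention: two all-zero lists are treated as identical.
        return 0.0
    # float() keeps the division exact under Python 2 as well.
    return 1.0 - float(len(sa & sb)) / len(union)


t1 = ['0', '1', '1', '0', '1']
t2 = ['1', '0', '1', '0', '0']

intersection = to_index_set(t1) & to_index_set(t2)  # positions where both are '1'
union = to_index_set(t1) | to_index_set(t2)         # positions where either is '1'
distance = jaccard_distance(t1, t2)
```

For t1 and t2 the intersection is {2} and the union is {0, 1, 2, 4}, so the Jaccard similarity is 1/4 and the distance is 0.75.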